* Cycle of send/receive for backup/restore is incomplete...
From: Robert White @ 2014-04-23 22:30 UTC
  To: linux-btrfs

So the backup/restore system described using snapshots is incomplete 
because the final restore is a copy operation. As such, the act of 
restoring from the backup will require restarting the entire backup 
cycle because the copy operation will scramble the metadata consanguinity.

The real choice is to restore by sending the snapshot back via send and 
receive so that all the UUIDs and metadata continue to match up.

But there's no way to "promote" the final snapshot to a non-snapshot 
subvolume identical to the one made by the original btrfs subvolume 
create operation.

Consider a file system with __System as the default mount (e.g. btrfs 
subvolume create /__System). You make a snapshot (btrfs sub snap -r 
/__System /__System_BACKUP). Then you send the backup to another file 
system with send receive. Nothing new here.

The thing is, if you want to restore from that backup, you'd 
send/receive /__System_BACKUP to the new/restore drive. But that 
snapshot is _forced_ to be read only. So then your only choice is to 
make a writable snapshot called /__System. At this point you have a tiny 
problem: the three drives aren't really the same.

The __System and __System_BACKUP on the final drive are subvolumes of /, 
while on the original system / and /__System were full subvolumes.

It's dumb, it's a tiny difference, but it's annoying. There needs to be 
a way to promote /__System to a non-snapshot status.

If you look at the output of "btrfs subvolume list -s /" on the various 
drives it's not possible to end up with the exact same system as the 
original.

There needs to be either an option to btrfs subvolume create that takes 
a snapshot as an argument to base the new subvolume on, or an option to 
receive that will make a read-write non-snapshot subvolume.

Ideally, from "HOST_A":

mkfs.btrfs /dev/sda  # main device
mount /dev/sda /drivea
cd /drivea
btrfs subvolume create __System
btrfs subvolume set-default __System
# ... use the system with __System as root ...
mount -o subvol=/ /dev/sda /drivea
cd /drivea
btrfs subvolume snapshot -r __System __System_BACKUP

mkfs.btrfs /dev/sdb # some backup device (presumably shared here)
mount /dev/sdb /driveb
cd /driveb
btrfs subvolume create HOST_A # host specific region
cd HOST_A
btrfs send /drivea/__System_BACKUP | btrfs receive /driveb/HOST_A
# etc.

## Restoring drive.
mkfs.btrfs /dev/sdc
mount /dev/sdc /drivec
mount /dev/sdb /driveb
btrfs send /driveb/HOST_A/__System_BACKUP | btrfs receive /drivec

## What I've been doing is creating a non-read-only snapshot of
##   the backup snapshot. But this is now _not_ identical to the
##   original /drivea because __System is listed as a snapshot,
##   not a subvolume.
cd /drivec
btrfs subvolume snapshot __System_BACKUP __System

## So ideally I should instead be able to do
btrfs subvolume create -model /drivec/__System_BACKUP /drivec/__System

## Or I should have been able to do
btrfs send /driveb/HOST_A/__System_BACKUP |
   btrfs subvolume create --receive /drivec/__System

## Or a promote/populate option that takes the writable snapshot
##   and rearranges its flags and the various connections to other
##   snapshots, e.g. properly handling __System_BACKUP et al.
##   when doing something like:
btrfs subvolume promote __System


The real goal here is that a well-designed system is going to use 
incremental backups. If a copy operation is used, then the whole 
HOST_A hierarchy would need to be recreated, which lowers the integrity 
of the whole backup cycle by interrupting the history.

Imagine a (dated or numbered) history of snapshots: any copy-based 
restore breaks it all.
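
To make that concrete, here's a rough sketch of the incremental cycle I 
have in mind (the dated names are just illustrative):

# first cycle: full send of the initial read-only snapshot
btrfs subvolume snapshot -r /drivea/__System /drivea/__System_2014-04-22
btrfs send /drivea/__System_2014-04-22 | btrfs receive /driveb/HOST_A

# later cycles: send only the delta against the previous snapshot
btrfs subvolume snapshot -r /drivea/__System /drivea/__System_2014-04-23
btrfs send -p /drivea/__System_2014-04-22 /drivea/__System_2014-04-23 |
   btrfs receive /driveb/HOST_A

# a plain copy restore produces a subvolume that shares no ancestry with
# any of these, so the next incremental send has no usable parent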

ASIDE: A harder problem is when a snapshot is a child of the subvolume 
itself, e.g. "btrfs subvolume snapshot -r . BACKUP". Getting the contents 
of . back seems more or less impossible without copying.


* Re: Cycle of send/receive for backup/restore is incomplete...
From: Brendan Hide @ 2014-04-24  4:28 UTC
  To: Robert White, linux-btrfs

Replied inline:

On 2014/04/24 12:30 AM, Robert White wrote:
> So the backup/restore system described using snapshots is incomplete 
> because the final restore is a copy operation. As such, the act of 
> restoring from the backup will require restarting the entire backup 
> cycle because the copy operation will scramble the metadata 
> consanguinity.
>
> The real choice is to restore by sending the snapshot back via send 
> and receive so that all the UUIDs and metadata continue to match up.
>
> But there's no way to "promote" the final snapshot to a non-snapshot 
> subvolume identical to the one made by the original btrfs subvolume 
> create operation.

btrfs doesn't differentiate snapshots and subvolumes. They're both the 
same kind of first-class citizen. A snapshot is a subvolume that just 
happens to have some data (automagically/naturally) deduplicated with 
another subvolume.

> Consider a file system with __System as the default mount (e.g. btrfs 
> subvolume create /__System). You make a snapshot (btrfs sub snap -r 
> /__System /__System_BACKUP). Then you send the backup to another file 
> system with send receive. Nothing new here.
>
> The thing is, if you want to restore from that backup, you'd 
> send/receive /__System_BACKUP to the new/restore drive. But that 
> snapshot is _forced_ to be read only. So then your only choice is to 
> make a writable snapshot called /__System. At this point you have a 
> tiny problem, the three drives aren't really the same.
>
> The __System and __System_BACKUP on the final drive are subvolumes of 
> /, while on the original system / and /__System were full subvolumes.

There's no such thing as a "full" subvolume. Again, they're all 
first-class citizens. The "real" root of a btrfs is always treated as a 
subvolume, as are the subvolumes inside it too. Just because other 
subvolumes are contained therein it doesn't mean they're diminished 
somehow. You cannot have multiple subvolumes *without* having them be a 
"sub" volume of the real "root" subvolume.

> It's dumb, it's a tiny difference, but it's annoying. There needs to 
> be a way to promote /__System to a non-snapshot status.
>
> If you look at the output of "btrfs subvolume list -s /" on the 
> various drives it's not possible to end up with the exact same system 
> as the original.

From a user application perspective, the system *is* identical to the 
original. That's the important part.

If you want the disk to be identical bit for bit then you want a 
different backup system entirely, one that backs up the hard disk, not 
the files/content.

On the other hand, if you just want to have all your snapshots restored 
as well, that's not too difficult. It's pointless from most perspectives 
- but not difficult.

> There needs to be either an option to btrfs subvolume create that 
> takes a snapshot as an argument to base the new device on, or an 
> option to receive that will make a read-write non-snapshot subvolume.

This feature already exists. This is a very important aspect of how 
snapshots work with send / receive and why it makes things very 
efficient. They work just as well for a restore as they do for a backup. 
The flag you are looking for is "-p" for "parent", which you should 
already be using for the backups in the first place:

From the backup host:
$ btrfs send -p /backup/path/yesterday /backup/path/last_backup | 
<netcat or whatever you choose>

From the restored host:
$ <netcat or whatever you choose> | btrfs receive /tmp/btrfs_root/

Then you make the non-read-only snapshot of the restored subvolume.
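
Concretely, the restore side ends up something like this (just a sketch; 
the mount point and names are illustrative):

# on the restored host: receive the backup stream into the new filesystem
$ <netcat or whatever you choose> | btrfs receive /mnt/restore

# then make the writable working copy from the received read-only snapshot
$ btrfs subvolume snapshot /mnt/restore/__System_BACKUP /mnt/restore/__System
# (and point set-default at it if that's how the original was set up)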

> [snip]
>


-- 
__________
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97



* Re: Cycle of send/receive for backup/restore is incomplete...
From: Robert White @ 2014-04-24  7:08 UTC
  To: Brendan Hide, linux-btrfs

I am, then, quite confused...

On 04/23/2014 09:28 PM, Brendan Hide wrote:
> Replied inline:
> btrfs doesn't differentiate snapshots and subvolumes. They're the same
> first-class citizen. A snapshot is a subvolume that just happens to have
> some data (automagically/naturally) deduplicated with another subvolume.

What then is the -s for in

btrfs subvolume list -s <mount point>

Clearly some information of bias and status exists or the option and its 
functional behavior wouldn't exist.

I understand the use of -p for doing diffs, but here's the thing...

mount /dev/sda /drivea
mount /dev/sdb /driveb

btrfs subvolume create /drivea/Base
btrfs subvolume snapshot -r /drivea/Base /drivea/Base_BACKUP

btrfs subvolume list -a /drivea
[blah blah blah] Base
[blah blah blah] Base_BACKUP

* skipping the network and delta-T nonsense as irrelevant
btrfs send /drivea/Base_BACKUP | btrfs receive /driveb/

*No way to make /driveb/Base_BACKUP _not_ end up read only
*So make a writeable snapshot

btrfs subvolume snapshot /driveb/Base_BACKUP /driveb/Base

btrfs subvolume list -a /driveb
[blah blah blah] Base_BACKUP
[blah blah blah] Base

**** Confusing and/or problematic bit:
btrfs subvolume list -s /drivea
[blah blah blah] Base_BACKUP

btrfs subvolume list -s /driveb
[blah blah blah] Base

So if I want to, say, write a backup script that rotates through the 
subvolumes to rotate backups, the restored drive (driveb in this example) 
automatically fails. There is no apparent way to coerce the relationship 
such that both drives end up with "Base" being the writable base and 
"Base_BACKUP" being the read-only snapshot returned when doing list -s.

So the systems are "the same" but they aren't really the same according 
to this clearly visible symptom.

As such, various automations that one might write for an original system 
could then choke and die after restoration from this backup. Either that 
or you have to use /bin/cp (or similar) and lose the backup history when 
you restore.

It's a surprise waiting to happen. It surprised me.

It's _impossible_ to strip Base of its subvolume status on /driveb. If 
you delete the Base_BACKUP element so that Base is the only thing on the 
drive, it's still a snapshot according to -s. What does this status even 
mean if it's as meaningless as it seems?
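
That is (sketch):

btrfs subvolume delete /driveb/Base_BACKUP
btrfs subvolume list -s /driveb
# Base is still listed as a snapshot even though the thing it was
# snapshotted from no longer exists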

That seems like a second surprise.

---

Is this a common case? It could easily be if you use NAS or movable USB 
to do your backups and restore-or-media-migration operations.

Are we sure it doesn't matter? I find it problematic but fixable in 
concept. I've got no information whether the internal parentage 
relationship could be reversed so that the before and after of the 
subvolume list -s result are the same.

No I'm not looking for byte-level identical status, I know that's 
ridiculous.

I want semantically identical status.

My experience with list -s says I'm not getting semantically identical 
status after the fact and I have no clear way to coerce it.

====

Why it matters...

If I am doing monthly and weekly archiving I don't want to interrupt the 
rolling archive(s) if I end up having to do a restore. I don't want to 
create a catastrophe point or an interrupting epoch in the archive history.

It sounds like it doesn't matter once you know not to use the -s status 
for anything...


* Re: Cycle of send/receive for backup/restore is incomplete...
From: Chris Murphy @ 2014-04-24 19:37 UTC
  To: Btrfs BTRFS


On Apr 24, 2014, at 1:08 AM, Robert White <rwhite@pobox.com> wrote:


> So the systems are "the same" but they aren't really the same according to this clearly visible symptom.

I don't think of btrfs send/receive as sending/receiving whole subvolumes, but rather just their contents.

Try btrfs sub show <subvol> on the original rw subvolume, its ro snapshot, and the sent/received ro snapshot on the target drive. It looks like this:

Drive [a]

[a]# btrfs sub show pictures
/mnt/a/pictures
	Name: 			pictures
	uuid: 			f15ebd77-1151-1740-a7fb-7fd82a0de4aa
	Parent uuid: 		-
	Creation time: 		2014-04-24 08:39:53
	Object ID: 		257
	Generation (Gen): 	11
	Gen at creation: 	7
	
[a]# btrfs sub show pictures_1ro
/mnt/a/pictures_1ro
	Name: 			pictures_1ro
	uuid: 			7bfc06b2-dcc1-0346-a36f-a6305d45934b
	Parent uuid: 		f15ebd77-1151-1740-a7fb-7fd82a0de4aa
	Creation time: 		2014-04-24 08:45:58
	Object ID: 		260
	Generation (Gen): 	11
	Gen at creation: 	11
	
	
	[send from drive a to drive b]
	
[b]# btrfs sub show pictures_1ro/
/mnt/b/pictures_1ro
	Name: 			pictures_1ro
	uuid: 			a814808d-18f7-924b-ba11-68bd3e032bf6
	Parent uuid: 		-
	Creation time: 		2014-04-24 08:46:54
	Object ID: 		257
	Generation (Gen): 	8
	Gen at creation: 	7


Nothing except the name is the same between them. The subvolume uuids, object ids, and generation numbers are all different.



> 
> It's _impossible_ to strip Base of it's subvolume status on /driveb. If you delete the Base_BACKUP element so that Base is the only thing on the drive, it's still a shapshot according to -s. What does this status even mean if it's as meaningless as it seems.
> 
> That seems like a second surprise.

I don't know what the first sentence means. A subvolume isn't a state, it's an object, and so it either exists or not. It isn't a flag that can be set and unset.

The snapshot state and parent uuid aren't exactly foolproof; they're easily spoofed. So I take them to mean "how it was created" - subvolume create vs subvolume snapshot. There is no actual difference between a subvolume and a snapshot.

For example, take subvolume A with some files in it. I subvolume create B, and I --reflink files from A to B. I've just functionally made B a snapshot of A, yet its metadata doesn't indicate this. It's a subvolume that doesn't proclaim being a snapshot, nor does it have a parent uuid.

Conversely, if I subvolume snapshot A C, and then rm -rf C/* and fill it with some new files, I have a subvolume claiming to be a snapshot with a parent uuid, but it has nothing at all in common with that parent.
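
A rough sketch of both cases (the paths are made up):

# B behaves like a snapshot of A but carries no snapshot metadata
btrfs subvolume create /mnt/B
cp -a --reflink=always /mnt/A/. /mnt/B/

# C carries A's uuid as its parent uuid but no longer shares anything with A
btrfs subvolume snapshot /mnt/A /mnt/C
rm -rf /mnt/C/*
cp -a /some/new/files/. /mnt/C/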

So the snapshot flag and parent uuid are conveniences; they indicate a truth at the time of creation. It's not a dynamic status.


> 
> ---
> 
> Is this a common case? It could easily be if you use NAS or movable USB to do your backups and restore-or-media-migration operations.
> 
> Are we sure it doesn't matter? I find it problematic but fixable in concept. I've got no information whether the internal parentage relationship could be reversed so that the before and after of the subvolume list -s result are the same.
> 
> No I'm not looking for byte-level identical status, I know that's ridiculous.
> 
> I want semantically identical status.

I don't see how that's presently possible since subvolumes themselves aren't being sent. Instead their contents are, with an efficient incremental/clone stream feature. We can't use send on a rw subvolume, so from the outset we have a semantic disconnect between source and destination.

I really think what you're after is seed device. Or possibly some hybrid between seed and send.

> 

> ====
> 
> Why it matters...
> 
> If I am doing monthly and weekly archiving I don't want to interrupt the rolling archive(s) if I end up having to do a restore. I don't want to create a catastrophe point or an interrupting epoch in the archive history.
> 
> It sounds like it doesn't matter once you know not to use the -s status for anything…

I'd say it's not a status because the word status implies present state and thus is a dynamic indicator. Instead, -s indicates how the subvolume was created: empty or prefilled. And it's really mostly useful for ro snapshots because we're assured they haven't changed since they were created.



Chris Murphy


