* receive failing for incremental streams
From: Eric Levy @ 2021-12-15 20:27 UTC
To: linux-btrfs

Hello.

I have been experiencing very confusing problems with incremental
streams.

For a subvolume, I have a simple incremental backup created from two
stages:

    btrfs send old/@ > base.btrfs
    btrfs send new/@ -p old/@ > update.btrfs

The two source subvolumes are snapshots captured at separate times from
the same actively mounted subvolume.

On the target, I attempt to restore:

    btrfs receive ./ < base.btrfs
    btrfs receive ./ < update.btrfs

The expectation is that the prior command would create a restored
snapshot of the initial backup stage, and that the latter would apply
the updated stage.

The prior command succeeds, but the latter fails:

    ERROR: creating snapshot ./@ -> @ failed: File exists

Since it is obvious I cannot usefully apply the second stage to a
target that does not exist, I am puzzled about why the process performs
this check, as well as what is expected to have success applying the
update.

How may I apply the update stage to the target generated from restoring
the initial stage?
* Re: receive failing for incremental streams
From: Graham Cobb @ 2021-12-15 23:35 UTC
To: Eric Levy, linux-btrfs

On 15/12/2021 20:27, Eric Levy wrote:
> Hello.
>
> I have been experiencing very confusing problems with incremental
> streams.

There is no such thing as an incremental stream. Send sends all the
information necessary to create a subvolume. Some of that includes
instructions to share data in other subvolumes but it is not
incremental.

> For a subvolume, I have a simple incremental backup created from two
> stages:
>
>     btrfs send old/@ > base.btrfs
>     btrfs send new/@ -p old/@ > update.btrfs
>
> The two source subvolumes are snapshots captured at separate times from
> the same actively mounted subvolume.
>
> On the target, I attempt to restore:
>
>     btrfs receive ./ < base.btrfs
>     btrfs receive ./ < update.btrfs
>
> The expectation is that the prior command would create a restored
> snapshot of the initial backup stage,

Yes

> and that the latter would apply
> the updated stage.

No. Receive always creates a brand new subvolume - it doesn't update
anything. Of course, the new subvolume may include clones of data stored
in other subvolumes but it doesn't modify any existing subvolumes.

> The prior command succeeds, but the latter fails:
>
>     ERROR: creating snapshot ./@ -> @ failed: File exists
>
> Since it is obvious I cannot usefully apply the second stage to a
> target that does not exist, I am puzzled about why the process performs
> this check, as well as what is expected to have success applying the
> update.
>
> How may I apply the update stage to the target generated from restoring
> the initial stage?

You don't. Receive will create a new subvolume - which will include
unchanged data from the initial stage and whatever changes have
happened. If you want, you can then snapshot that (read-only or
read-write as you wish) into any position you want in your destination
filesystem.

Graham
* Re: receive failing for incremental streams
From: Eric Levy @ 2021-12-15 23:52 UTC
To: linux-btrfs

Thank you for the reply. Please see my questions, below.

On Wed, 2021-12-15 at 23:35 +0000, Graham Cobb wrote:

> There is no such thing as an incremental stream. Send sends all the
> information necessary to create a subvolume. Some of that includes
> instructions to share data in other subvolumes but it is not
> incremental.

Perhaps you would clarify the distinction, as to me an incremental
backup is a minimal set of data needed to recreate the original volume
when combined with the previous capture.

> You don't. Receive will create a new subvolume - which will include
> unchanged data from the initial stage and whatever changes have
> happened. If you want, you can then snapshot that (read-only or
> read-write as you wish) into any position you want in your destination
> filesystem.

How should I use the latter stream? From the stream length it is
obvious it does not contain most of the data from the earlier one.
* Re: receive failing for incremental streams
From: Graham Cobb @ 2021-12-16 0:55 UTC
To: Eric Levy, linux-btrfs

On 15/12/2021 23:52, Eric Levy wrote:
> Thank you for the reply. Please see my questions, below.
>
> On Wed, 2021-12-15 at 23:35 +0000, Graham Cobb wrote:
>
>> There is no such thing as an incremental stream. Send sends all the
>> information necessary to create a subvolume. Some of that includes
>> instructions to share data in other subvolumes but it is not
>> incremental.
>
> Perhaps you would clarify the distinction, as to me an incremental
> backup is a minimal set of data needed to recreate the original volume
> when combined with the previous capture.

Maybe it isn't a real difference. I mean that the stream is not intended
to make **changes** to an existing subvolume to create the new version.
It is intended to **create** a new version, reusing some of the extents
from the earlier version (but, not changing the earlier version at all).

>> You don't. Receive will create a new subvolume - which will include
>> unchanged data from the initial stage and whatever changes have
>> happened. If you want, you can then snapshot that (read-only or
>> read-write as you wish) into any position you want in your destination
>> filesystem.
>
> How should I use the latter stream? From the stream length it is
> obvious it does not contain most of the data from the earlier one.

Imagine you have a subvolume called /data on the source system. One day
you snapshot it to create /data-1. You then send /data-1 to the second
system to create a read-only subvolume on that system - let's call it
/copy-data-1.

Later you snapshot /data again to create /data-2 on the source system.
You btrfs-send /data-2 to the other system again and it creates a new
read-only subvolume - you tell btrfs-receive what to call it and where
to put it, let's say you call it /copy-data-2 - using the data in the
stream and reusing some extents from the existing /copy-data-1.
/copy-data-2 is now a (read-only) copy of /data-2 from the source
system.

How you use that copy is up to you. If you are just taking backups you
probably do nothing with it unless you have a problem (it will form part
of the source for data for any future /copy-data-3). If you want to use
it to initialize a read-write subvolume on the destination system you
can take a read-write snapshot of /copy-data-2 to create a new subvolume
(say /my-new-data) on the destination system.
* Re: receive failing for incremental streams
From: Eric Levy @ 2021-12-16 1:13 UTC
To: linux-btrfs

> Later you snapshot /data again to create /data-2 on the source
> system.
> You btrfs-send /data-2 to the other system again and it creates a new
> read-only subvolume - you tell btrfs-receive what to call it and where
> to put it, let's say you call it /copy-data-2 - using the data in the
> stream and reusing some extents from the existing /copy-data-1.
> /copy-data-2 is now a (read-only) copy of /data-2 from the source
> system.
>
> How you use that copy is up to you. If you are just taking backups you
> probably do nothing with it unless you have a problem (it will form
> part of the source for data for any future /copy-data-3). If you want
> to use it to initialize a read-write subvolume on the destination
> system you can take a read-write snapshot of /copy-data-2 to create a
> new subvolume (say /my-new-data) on the destination system.

Such is close to what I have always understood about receive, but the
confusion is that the second receive command makes no reference to the
subvolume created by the first command. How do I ultimately create a
restore target that combines the original full capture with the
incremental differences?

When I ask how I use it, I mean what commands do I enter into the
system.

Note in my case I archive the streams into regular (compressed) files
for later recovery.
* Re: receive failing for incremental streams
From: Graham Cobb @ 2021-12-16 10:24 UTC
To: Eric Levy, linux-btrfs

On 16/12/2021 01:13, Eric Levy wrote:
>> Later you snapshot /data again to create /data-2 on the source
>> system.
>> You btrfs-send /data-2 to the other system again and it creates a new
>> read-only subvolume - you tell btrfs-receive what to call it and where
>> to put it, let's say you call it /copy-data-2 - using the data in the
>> stream and reusing some extents from the existing /copy-data-1.
>> /copy-data-2 is now a (read-only) copy of /data-2 from the source
>> system.
>>
>> How you use that copy is up to you. If you are just taking backups you
>> probably do nothing with it unless you have a problem (it will form
>> part of the source for data for any future /copy-data-3). If you want
>> to use it to initialize a read-write subvolume on the destination
>> system you can take a read-write snapshot of /copy-data-2 to create a
>> new subvolume (say /my-new-data) on the destination system.
>
> Such is close to what I have always understood about receive, but the
> confusion is that the second receive command makes no reference to the
> subvolume created by the first command. How do I ultimately create a
> restore target that combines the original full capture with the
> incremental differences?

It's just magic. Seriously, as long as you have already restored the
parent (and any clone sources, if you have specified those) to the same
filesystem, btrfs will find them and clone the necessary files into the
new subvolume.

> When I ask how I use it, I mean what commands do I enter into the
> system.

Assume a subvolume called /data.

On the sending side...

    btrfs subv snapshot -r /data /data-1
    btrfs send /data-1 -f data-1.send

Later, to generate the incremental stream from /data-1...

    btrfs subv snapshot -r /data /data-2
    btrfs send -p /data-1 /data-2 -f data-2.send

When you want to restore...

    btrfs receive -f data-1.send /recv-data-1
    btrfs receive -f data-2.send /recv-data-2

If you want read-write access to the data you need to create a new
subvolume...

    btrfs subv snapshot /recv-data-2 /new-data

[I haven't tested these so sorry for any mistakes - hopefully you get
the idea]

> Note in my case I archive the streams into regular (compressed) files
> for later recovery.

I considered doing that but I don't recommend it. The biggest issue is
that you have to keep all the incrementals since the last full backup,
as all the steps must complete in order to restore. This means that if
something has gone wrong with the archive (even a single bit corruption,
or an unexpected truncation) all the incremental streams after that
point are useless. btrfs receive doesn't have a "try hard" mode - it
will just fail unless all the sources it needs, and the stream it is
processing, are perfect. And you don't know, unless you do regular test
restorations.

In the end I decided I would keep the archive subvolumes themselves, not
the streams. Even in the worst case, this takes very little more space
(assuming you have turned on compression) - after all the cloned data is
still cloned. And even if something has been corrupted you can still get
at undamaged files in the various subvolumes. And if you make sure that
each send stream is only using the directly previous snapshot as its
clone source, you can remove any older snapshots that you like without
making later subvolumes useless.

Once I decided that, I ended up using btrbk - which makes a good job of
managing the backup and archive subvolumes, on both the source system
and the destination system. Of course, many other tools are available.
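[Editorial sketch] The concern about silent corruption or truncation of
archived stream files can be partly addressed by keeping a checksum next
to each stream and checking the whole chain before attempting a restore.
This is not something proposed in the thread; the helper names
record_checksum and verify_chain are hypothetical, and the sketch assumes
a sha256sum utility is installed. It only detects a damaged file; it does
not replace the regular test restorations recommended above.

```shell
# Hypothetical helpers (not from the thread): keep a SHA-256 sidecar file
# next to each archived stream so damage is noticed before a restore.

# record_checksum FILE: write FILE.sha256 in the same directory as FILE.
record_checksum() {
    (
        cd "$(dirname "$1")" &&
        sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
    )
}

# verify_chain FILE...: succeed only if every stream matches its sidecar.
# An incremental stream is only useful if all earlier streams are intact,
# so the function stops at the first mismatch.
verify_chain() {
    for stream in "$@"; do
        (
            cd "$(dirname "$stream")" &&
            sha256sum -c "$(basename "$stream").sha256" >/dev/null 2>&1
        ) || {
            echo "checksum mismatch or missing sidecar: $stream" >&2
            return 1
        }
    done
}
```

A possible use would be to call record_checksum right after each
"btrfs send ... -f data-N.send", and to call verify_chain on the full
list of streams, oldest first, before running any "btrfs receive".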
* Re: receive failing for incremental streams
From: Hugo Mills @ 2021-12-16 11:38 UTC
To: Graham Cobb
Cc: Eric Levy, linux-btrfs

On Thu, Dec 16, 2021 at 10:24:09AM +0000, Graham Cobb wrote:
> On 16/12/2021 01:13, Eric Levy wrote:
> >> Later you snapshot /data again to create /data-2 on the source
> >> system.
> >> You btrfs-send /data-2 to the other system again and it creates a new
> >> read-only subvolume - you tell btrfs-receive what to call it and
> >> where to put it, let's say you call it /copy-data-2 - using the data
> >> in the stream and reusing some extents from the existing
> >> /copy-data-1. /copy-data-2 is now a (read-only) copy of /data-2 from
> >> the source system.
> >>
> >> How you use that copy is up to you. If you are just taking backups
> >> you probably do nothing with it unless you have a problem (it will
> >> form part of the source for data for any future /copy-data-3). If you
> >> want to use it to initialize a read-write subvolume on the
> >> destination system you can take a read-write snapshot of /copy-data-2
> >> to create a new subvolume (say /my-new-data) on the destination
> >> system.
> >
> > Such is close to what I have always understood about receive, but the
> > confusion is that the second receive command makes no reference to the
> > subvolume created by the first command. How do I ultimately create a
> > restore target that combines the original full capture with the
> > incremental differences?
>
> It's just magic. Seriously, as long as you have already restored the
> parent (and any clone sources, if you have specified those) to the same
> filesystem, btrfs will find them and clone the necessary files into the
> new subvolume.

This is what happens:

Sending machine                     Receiving machine

$ send A
   Send all the data of A
   plus its UUID (uA)
                                    $ receive
                                       Make a new subvol, A'
                                       Write all the data to it
                                       Set "received_uuid" on A' to uA
                                       Make A' read-only

$ send B -p A
   Send the differences between
   A and B, plus their UUIDs,
   uA and uB
                                    $ receive
                                       Find the subvol with
                                         "received_uuid" == uA
                                         (this is A')
                                       Snapshot it to B'
                                       Modify B' using the differences
                                       Set "received_uuid" of B' to uB
                                       Make B' read-only

> > When I ask how I use it, I mean what commands do I enter into the
> > system.
>
> Assume subvolume called /data.
>
> On the sending side...
>
>     btrfs subv snapshot -r /data /data-1
>     btrfs send /data-1 -f data-1.send
>
> Later, to generate the incremental stream from /data-1...
>
>     btrfs subv snapshot -r /data /data-2
>     btrfs send -p /data-1 /data-2 -f data-2.send
>
> When you want to restore...
>
>     btrfs receive -f data-1.send /recv-data-1
>     btrfs receive -f data-2.send /recv-data-2
>
> If you want read-write access to the data you need to create a new
> subvolume...
>
>     btrfs subv snapshot /recv-data-2 /new-data
>
> [I haven't tested these so sorry for any mistakes - hopefully you get
> the idea]
>
> > Note in my case I archive the streams into regular (compressed) files
> > for later recovery.
>
> I considered doing that but I don't recommend it. The biggest issue is
> that you have to keep all the incrementals since the last full backup,
> as all the steps must complete in order to restore. This means that if
> something has gone wrong with the archive (even a single bit corruption,
> or an unexpected truncation) all the incremental streams after that
> point are useless. btrfs receive doesn't have a "try hard" mode - it
> will just fail unless all the sources it needs, and the stream it is
> processing, are perfect. And you don't know, unless you do regular test
> restorations.
>
> In the end I decided I would keep the archive subvolumes themselves, not
> the streams. Even in the worst case, this takes very little more space
> (assuming you have turned on compression) - after all the cloned data is
> still cloned. And even if something has been corrupted you can still get
> at undamaged files in the various subvolumes. And if you make sure that
> each send stream is only using the directly previous snapshot as its
> clone source, you can remove any older snapshots that you like without
> making later subvolumes useless.
>
> Once I decided that, I ended up using btrbk - which makes a good job of
> managing the backup and archive subvolumes, on both the source system
> and the destination system. Of course, many other tools are available.

-- 
Hugo Mills             | Computer Science is not about computers, any more
hugo@... carfax.org.uk | than astronomy is about telescopes.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                   Edsger Dijkstra
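[Editorial sketch] The UUID matching described in this message can be
checked by hand, since "btrfs subvolume show" prints the UUID and the
Received UUID of a subvolume. The helper below is a minimal sketch, not
part of btrfs-progs: the name subvol_field and the BTRFS override
variable are hypothetical, and the field labels "UUID" and
"Received UUID" are assumed to match the output of the installed
btrfs-progs version.

```shell
# subvol_field PATH FIELD: print the value of one labelled field from
# `btrfs subvolume show PATH`, for example "UUID" or "Received UUID".
# BTRFS can be set to another command name for testing (assumption:
# the real tool prints lines such as "<tab>Received UUID: <tabs>value").
subvol_field() {
    "${BTRFS:-btrfs}" subvolume show "$1" |
        sed -n "s/^[[:space:]]*$2:[[:space:]]*//p"
}
```

Under those assumptions, "subvol_field /data-1 UUID" on the sending
machine and "subvol_field <received copy> 'Received UUID'" on the
receiving machine should print the same value when the copy is the one
that a later incremental receive will look for as its parent.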
* Re: receive failing for incremental streams
From: Eric Levy @ 2021-12-18 23:53 UTC
To: linux-btrfs

Thank you for the explanation about streams.

My first observation is that the details clarified in this conversation
are not easily understood from the man page, nor from any official
online documentation I had found, nor even from any other discussion or
documentation I had found through web searches. Thus, even if it were
the only change to result from these considerations, I would suggest
that the man page should include a more robust explanation of the
design.

Next, the child stream being restored to a new subvolume, with the
result sharing references with the parent, may be practical from a
standpoint of underlying implementation, but may not be intuitive for a
user in a typical work flow. It might be helpful for users to have some
direct support for the use case of updating an existing stream in place.

Finally, the constraint that a restore target must have the same file
name as the original subvolume is, at least to my thinking,
inconvenient, if not also in many cases challenging, as when the
original name is not known, perhaps having been chosen arbitrarily. A
useful feature would be an option in the administrative tool to choose
the name of the restored subvolume, not simply the parent directory.

Whether any such enhancements require changes to the file system
functionality is beyond my knowledge, but it is certainly worthwhile to
consider any that are possible through changing only tools in user
space.
* Re: receive failing for incremental streams
From: Andrei Borzenkov @ 2021-12-16 5:36 UTC
To: Eric Levy, linux-btrfs

On 15.12.2021 23:27, Eric Levy wrote:
> Hello.
>
> I have been experiencing very confusing problems with incremental
> streams.
>
> For a subvolume, I have a simple incremental backup created from two
> stages:
>
>     btrfs send old/@ > base.btrfs
>     btrfs send new/@ -p old/@ > update.btrfs
>
> The two source subvolumes are snapshots captured at separate times from
> the same actively mounted subvolume.
>
> On the target, I attempt to restore:
>
>     btrfs receive ./ < base.btrfs
>     btrfs receive ./ < update.btrfs
>
> The expectation is that the prior command would create a restored
> snapshot of the initial backup stage, and that the latter would apply
> the updated stage.
>
> The prior command succeeds, but the latter fails:
>
>     ERROR: creating snapshot ./@ -> @ failed: File exists

You need to restore it in a different directory. Each send stream
defines a subvolume and you cannot have two subvolumes with the same
name in the same directory.

> Since it is obvious I cannot usefully apply the second stage to a
> target that does not exist, I am puzzled about why the process performs
> this check, as well as what is expected to have success applying the
> update.
>
> How may I apply the update stage to the target generated from restoring
> the initial stage?

You misunderstand what happens. btrfs receive does not update an
existing subvolume. It always creates a new subvolume by cloning the
parent replica and applying changes to this clone. The parent remains in
its original state and read-only.
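[Editorial sketch] Applied to the streams from the original question,
where both snapshots are named "@", the advice above amounts to
receiving each stream into its own directory. The helper below is a
minimal sketch under stated assumptions: the name restore_streams, the
BTRFS override variable and the one-directory-per-stream layout are
hypothetical, the stream files are assumed to end in ".btrfs", and all
target directories are assumed to be on the same btrfs filesystem so
that the parent can be found.

```shell
# restore_streams TARGET STREAM...: receive each stream, oldest first,
# into TARGET/<stream file name without .btrfs>, so that subvolumes with
# the same name (here "@") never land in the same directory.
# BTRFS can be set to "echo" for a dry run that only prints the commands.
restore_streams() {
    target=$1
    shift
    for stream in "$@"; do
        name=$(basename "$stream" .btrfs)
        mkdir -p "$target/$name" || return 1
        "${BTRFS:-btrfs}" receive -f "$stream" "$target/$name" || return 1
    done
}
```

With "restore_streams /mnt/restore base.btrfs update.btrfs" (the mount
point is only an example), the expected result is /mnt/restore/base/@
and /mnt/restore/update/@, both read-only. A writable copy of the newer
state could then be taken with "btrfs subvolume snapshot
/mnt/restore/update/@ <new path>".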