* kvm / virsh snapshot management
@ 2019-06-02  0:12 Gary Dale
  2019-06-10 12:19   ` [Qemu-devel] " Stefan Hajnoczi
  0 siblings, 1 reply; 14+ messages in thread
From: Gary Dale @ 2019-06-02  0:12 UTC (permalink / raw)
  To: kvm

A while back I converted a raw disk image to qcow2 to be able to use 
snapshots. However I realize that I may not really understand exactly 
how snapshots work. In this particular case, I'm only talking about 
internal snapshots currently as there seem to be some differences of 
opinion as to whether internal or external are safer/more reliable. I'm 
also only talking about shutdown state snapshots, so it should just be 
the disk that is snapshotted.
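
For reference, I've been creating them while the VM is shut off with 
something along these lines (the names are placeholders):

     virsh snapshot-create-as <vm name> <snapshot name>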

As I understand it, the first snapshot freezes the base image and 
subsequent changes in the virtual machine's disk are stored elsewhere in 
the qcow2 file (remember, only internal snapshots). If I take a second 
snapshot, that freezes the first one, and subsequent changes are now in a 
third location. Each new snapshot is incremental to the one that 
preceded it rather than differential to the base image. Each new 
snapshot is a child of the previous one.

One explanation I've seen of the process is that if I delete a snapshot, the 
changes it contains are merged with its immediate child. So if I deleted 
the first snapshot, the base image stays the same but any data that has 
changed since the base image is now in the second snapshot's location. 
The merge with children explanation also implies that the base image is 
never touched even if the first snapshot is deleted.

But if I delete a snapshot that has no children, is that essentially the 
same as reverting to the point that snapshot was created and all 
subsequent disk changes are lost? Or does it merge down to the parent 
snapshot? If I delete all snapshots, would that revert to the base image?

I've seen it explained that a snapshot is very much like a timestamp so 
deleting a timestamp removes the dividing line between writes that 
occurred before and after that time, so that data is really only removed 
if I revert to some timestamp - all writes after that point are 
discarded. In this explanation, deleting the oldest timestamp is 
essentially updating the base image. Deleting all snapshots would leave 
me with the base image fully updated.

Frankly, the second explanation sounds more reasonable to me, without 
having to figure out how copy-on-write works. But I'm dealing with 
important data here and I don't want to mess it up by mishandling the 
snapshots.

Can someone provide a little clarity on this? Thanks!




* Re: kvm / virsh snapshot management
  2019-06-02  0:12 kvm / virsh snapshot management Gary Dale
@ 2019-06-10 12:19   ` Stefan Hajnoczi
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2019-06-10 12:19 UTC (permalink / raw)
  To: Gary Dale; +Cc: kvm, qemu-devel, Kevin Wolf, John Snow


On Sat, Jun 01, 2019 at 08:12:01PM -0400, Gary Dale wrote:
> A while back I converted a raw disk image to qcow2 to be able to use
> snapshots. However I realize that I may not really understand exactly how
> snapshots work. In this particular case, I'm only talking about internal
> snapshots currently as there seem to be some differences of opinion as to
> whether internal or external are safer/more reliable. I'm also only talking
> about shutdown state snapshots, so it should just be the disk that is
> snapshotted.
> 
> As I understand it, the first snapshot freezes the base image and subsequent
> changes in the virtual machine's disk are stored elsewhere in the qcow2 file
> (remember, only internal snapshots). If I take a second snapshot, that
> freezes the first one, and subsequent changes are now in a third location.
> Each new snapshot is incremental to the one that preceded it rather than
> differential to the base image. Each new snapshot is a child of the previous
> one.

Internal snapshots are not incremental or differential at the qcow2
level, they are simply a separate L1/L2 table pointing to data clusters.
In other words, they are an independent set of metadata showing the full
state of the image at the point of the snapshot.  qcow2 does not track
relationships between snapshots and parents/children.
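
Each internal snapshot shows up as an independent entry if you list them, 
for example (the image path is illustrative):

    qemu-img snapshot -l /path/to/image.qcow2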

> 
> One explanation I've seen of the process is that if I delete a snapshot, the
> changes it contains are merged with its immediate child.

Nope.  Deleting a snapshot decrements the reference count on all its
data clusters.  If a data cluster's reference count reaches zero it will
be freed.  That's all, there is no additional data movement or
reorganization aside from this.
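
For example, deleting one offline (the image must not be in use by a 
running guest; the name and path are illustrative):

    qemu-img snapshot -d <snapshot name> /path/to/image.qcow2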

> So if I deleted the
> first snapshot, the base image stays the same but any data that has changed
> since the base image is now in the second snapshot's location. The merge
> with children explanation also implies that the base image is never touched
> even if the first snapshot is deleted.
> 
> But if I delete a snapshot that has no children, is that essentially the
> same as reverting to the point that snapshot was created and all subsequent
> disk changes are lost? Or does it merge down to the parent snapshot? If I
> delete all snapshots, would that revert to the base image?

No.  qcow2 has the concept of the current disk state of the running VM 
(what you get when you boot the guest) and of snapshots, which are 
read-only.

When you delete snapshots the current disk state (running VM) is
unaffected.

When you apply a snapshot this throws away the current disk state and
uses the snapshot as the new current disk state.  The read-only snapshot
itself is not modified in any way and you can apply the same snapshot
again as many times as you wish later.
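
For example (names and paths illustrative):

    virsh snapshot-revert --domain <vm name> --snapshotname <snapshot name>

or, operating on the image directly while the guest is shut off:

    qemu-img snapshot -a <snapshot name> /path/to/image.qcow2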

> 
> I've seen it explained that a snapshot is very much like a timestamp so
> deleting a timestamp removes the dividing line between writes that occurred
> before and after that time, so that data is really only removed if I revert
> to some timestamp - all writes after that point are discarded. In this
> explanation, deleting the oldest timestamp is essentially updating the base
> image. Deleting all snapshots would leave me with the base image fully
> updated.
> 
> Frankly, the second explanation sounds more reasonable to me, without having
> to figure out how copy-on-write works. But I'm dealing with important data
> here and I don't want to mess it up by mishandling the snapshots.
> 
> Can someone provide a little clarity on this? Thanks!

If you want an analogy then git(1) is a pretty good one.  qcow2 internal
snapshots are like git tags.  Unlike branches, tags are immutable.  In
qcow2 you only have a master branch (the current disk state) from which
you can create a new tag or you can use git-checkout(1) to apply a
snapshot (discarding whatever your current disk state is).

Stefan


* Re: kvm / virsh snapshot management
  2019-06-10 12:19   ` [Qemu-devel] " Stefan Hajnoczi
@ 2019-06-10 15:54     ` Gary Dale
  -1 siblings, 0 replies; 14+ messages in thread
From: Gary Dale @ 2019-06-10 15:54 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm, qemu-devel, Kevin Wolf, John Snow

On 2019-06-10 8:19 a.m., Stefan Hajnoczi wrote:
> On Sat, Jun 01, 2019 at 08:12:01PM -0400, Gary Dale wrote:
>> A while back I converted a raw disk image to qcow2 to be able to use
>> snapshots. However I realize that I may not really understand exactly how
>> snapshots work. In this particular case, I'm only talking about internal
>> snapshots currently as there seem to be some differences of opinion as to
>> whether internal or external are safer/more reliable. I'm also only talking
>> about shutdown state snapshots, so it should just be the disk that is
>> snapshotted.
>>
>> As I understand it, the first snapshot freezes the base image and subsequent
>> changes in the virtual machine's disk are stored elsewhere in the qcow2 file
>> (remember, only internal snapshots). If I take a second snapshot, that
>> freezes the first one, and subsequent changes are now in a third location.
>> Each new snapshot is incremental to the one that preceded it rather than
>> differential to the base image. Each new snapshot is a child of the previous
>> one.
> Internal snapshots are not incremental or differential at the qcow2
> level, they are simply a separate L1/L2 table pointing to data clusters.
> In other words, they are an independent set of metadata showing the full
> state of the image at the point of the snapshot.  qcow2 does not track
> relationships between snapshots and parents/children.
Which sounds to me like they are incremental. Each snapshot starts a new 
L1/L2 table so that the state of the previous one is preserved.
>
>> One explanation I've seen of the process is that if I delete a snapshot, the
>> changes it contains are merged with its immediate child.
> Nope.  Deleting a snapshot decrements the reference count on all its
> data clusters.  If a data cluster's reference count reaches zero it will
> be freed.  That's all, there is no additional data movement or
> reorganization aside from this.
Perhaps not physically but logically it would appear that the data 
clusters were merged.
>
>> So if I deleted the
>> first snapshot, the base image stays the same but any data that has changed
>> since the base image is now in the second snapshot's location. The merge
>> with children explanation also implies that the base image is never touched
>> even if the first snapshot is deleted.
>>
>> But if I delete a snapshot that has no children, is that essentially the
>> same as reverting to the point that snapshot was created and all subsequent
>> disk changes are lost? Or does it merge down to the parent snapshot? If I
>> delete all snapshots, would that revert to the base image?
> No.  qcow2 has the concept of the current disk state of the running VM 
> (what you get when you boot the guest) and of snapshots, which are 
> read-only.
>
> When you delete snapshots the current disk state (running VM) is
> unaffected.
>
> When you apply a snapshot this throws away the current disk state and
> uses the snapshot as the new current disk state.  The read-only snapshot
> itself is not modified in any way and you can apply the same snapshot
> again as many times as you wish later.
So in essence the current state is a pointer to the latest data cluster, 
which is the only data cluster that can be modified.
>
>> I've seen it explained that a snapshot is very much like a timestamp so
>> deleting a timestamp removes the dividing line between writes that occurred
>> before and after that time, so that data is really only removed if I revert
>> to some timestamp - all writes after that point are discarded. In this
>> explanation, deleting the oldest timestamp is essentially updating the base
>> image. Deleting all snapshots would leave me with the base image fully
>> updated.
>>
>> Frankly, the second explanation sounds more reasonable to me, without having
>> to figure out how copy-on-write works. But I'm dealing with important data
>> here and I don't want to mess it up by mishandling the snapshots.
>>
>> Can someone provide a little clarity on this? Thanks!
> If you want an analogy then git(1) is a pretty good one.  qcow2 internal
> snapshots are like git tags.  Unlike branches, tags are immutable.  In
> qcow2 you only have a master branch (the current disk state) from which
> you can create a new tag or you can use git-checkout(1) to apply a
> snapshot (discarding whatever your current disk state is).
>
> Stefan

That's just making things less clear - I've never tried to understand 
git either. Thanks for the attempt though.

If I've gotten things correct, once the base image is established, there 
is a current disk state that points to a table containing all the writes 
since the base image. Creating a snapshot essentially takes that pointer 
and gives it the snapshot name, while creating a new current disk state 
pointer and data table where subsequent writes are recorded.

Deleting snapshots removes your ability to refer to a data table by 
name, but the table itself still exists anonymously as part of a chain 
of data tables between the base image and the current state.

This leaves a problem. The chain will very quickly get quite long, which 
will impact performance. To combat this, you can use blockcommit to 
merge a child with its parent or blockpull to merge a parent with its child.

In my situation, I want to keep a week of daily snapshots in case 
something goes horribly wrong with the VM (I recently had a database 
file become corrupt, and reverting to the previous working day's image 
would have been a quick and easy solution, faster than recovering all 
the data tables from the previous day). I've been shutting down the VM, 
deleting the oldest snapshot and creating a new one before restarting 
the VM.
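
In script form, the nightly rotation looks roughly like this (the 
snapshot names are placeholders):

     virsh shutdown <vm name>
     # wait until 'virsh domstate <vm name>' reports "shut off"
     virsh snapshot-delete --domain <vm name> --snapshotname <oldest snapshot>
     virsh snapshot-create-as <vm name> <new snapshot name>
     virsh start <vm name>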

While your explanation confirms that this is safe, it also implies that 
I need to manage the data table chains. My first instinct is to use 
blockcommit before deleting the oldest snapshot, such as:

     virsh blockcommit <vm name> <qcow2 file path> --top <oldest snapshot> --delete --wait
     virsh snapshot-delete --domain <vm name> --snapshotname <oldest snapshot>

so that the base image contains the state as of one week earlier and the 
snapshot chains are limited to 7 links.

1) does this sound reasonable?

2) I note that the syntax in the virsh man page is different from the syntax 
at 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-backing-chain 
(RedHat uses --top and --base while the man page just has optional base 
and top names). I believe the RedHat guide is correct because the man 
page doesn't allow distinguishing between the base and the top for a commit.

However the need for specifying the path isn't obvious to me. Isn't the 
path contained in the VM definition?

Since blockcommit would make it impossible for me to revert to an 
earlier state (because I'm committing the oldest snapshot; if it screws 
up, I can't undo within virsh), I need to make sure this command is correct.



* Re: kvm / virsh snapshot management
  2019-06-10 15:54     ` [Qemu-devel] " Gary Dale
@ 2019-06-10 21:27       ` Gary Dale
  -1 siblings, 0 replies; 14+ messages in thread
From: Gary Dale @ 2019-06-10 21:27 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kvm, qemu-devel, Kevin Wolf, John Snow

On 2019-06-10 11:54 a.m., Gary Dale wrote:
> On 2019-06-10 8:19 a.m., Stefan Hajnoczi wrote:
>> On Sat, Jun 01, 2019 at 08:12:01PM -0400, Gary Dale wrote:
>>> A while back I converted a raw disk image to qcow2 to be able to use
>>> snapshots. However I realize that I may not really understand 
>>> exactly how
>>> snapshots work. In this particular case, I'm only talking about 
>>> internal
>>> snapshots currently as there seem to be some differences of opinion 
>>> as to
>>> whether internal or external are safer/more reliable. I'm also only 
>>> talking
>>> about shutdown state snapshots, so it should just be the disk that is
>>> snapshotted.
>>>
>>> As I understand it, the first snapshot freezes the base image and 
>>> subsequent
>>> changes in the virtual machine's disk are stored elsewhere in the 
>>> qcow2 file
>>> (remember, only internal snapshots). If I take a second snapshot, that
>>> freezes the first one, and subsequent changes are now in a third 
>>> location.
>>> Each new snapshot is incremental to the one that preceded it rather 
>>> than
>>> differential to the base image. Each new snapshot is a child of the 
>>> previous
>>> one.
>> Internal snapshots are not incremental or differential at the qcow2
>> level, they are simply a separate L1/L2 table pointing to data clusters.
>> In other words, they are an independent set of metadata showing the full
>> state of the image at the point of the snapshot.  qcow2 does not track
>> relationships between snapshots and parents/children.
> Which sounds to me like they are incremental. Each snapshot starts a 
> new L1/L2 table so that the state of the previous one is preserved.
>>
>>> One explanation I've seen of the process is that if I delete a snapshot, the
>>> changes it contains are merged with its immediate child.
>> Nope.  Deleting a snapshot decrements the reference count on all its
>> data clusters.  If a data cluster's reference count reaches zero it will
>> be freed.  That's all, there is no additional data movement or
>> reorganization aside from this.
> Perhaps not physically but logically it would appear that the data 
> clusters were merged.
>>
>>> So if I deleted the
>>> first snapshot, the base image stays the same but any data that has 
>>> changed
>>> since the base image is now in the second snapshot's location. The 
>>> merge
>>> with children explanation also implies that the base image is never 
>>> touched
>>> even if the first snapshot is deleted.
>>>
>>> But if I delete a snapshot that has no children, is that essentially 
>>> the
>>> same as reverting to the point that snapshot was created and all 
>>> subsequent
>>> disk changes are lost? Or does it merge down to the parent snapshot? 
>>> If I
>>> delete all snapshots, would that revert to the base image?
>> No.  qcow2 has the concept of the current disk state of the running VM 
>> (what you get when you boot the guest) and of snapshots, which are 
>> read-only.
>>
>> When you delete snapshots the current disk state (running VM) is
>> unaffected.
>>
>> When you apply a snapshot this throws away the current disk state and
>> uses the snapshot as the new current disk state.  The read-only snapshot
>> itself is not modified in any way and you can apply the same snapshot
>> again as many times as you wish later.
> So in essence the current state is a pointer to the latest data 
> cluster, which is the only data cluster that can be modified.
>>
>>> I've seen it explained that a snapshot is very much like a timestamp so
>>> deleting a timestamp removes the dividing line between writes that 
>>> occurred
>>> before and after that time, so that data is really only removed if I 
>>> revert
>>> to some timestamp - all writes after that point are discarded. In this
>>> explanation, deleting the oldest timestamp is essentially updating 
>>> the base
>>> image. Deleting all snapshots would leave me with the base image fully
>>> updated.
>>>
>>> Frankly, the second explanation sounds more reasonable to me, 
>>> without having
>>> to figure out how copy-on-write works. But I'm dealing with 
>>> important data
>>> here and I don't want to mess it up by mishandling the snapshots.
>>>
>>> Can someone provide a little clarity on this? Thanks!
>> If you want an analogy then git(1) is a pretty good one.  qcow2 internal
>> snapshots are like git tags.  Unlike branches, tags are immutable.  In
>> qcow2 you only have a master branch (the current disk state) from which
>> you can create a new tag or you can use git-checkout(1) to apply a
>> snapshot (discarding whatever your current disk state is).
>>
>> Stefan
>
> That's just making things less clear - I've never tried to understand 
> git either. Thanks for the attempt though.
>
> If I've gotten things correct, once the base image is established, 
> there is a current disk state that points to a table containing all 
> the writes since the base image. Creating a snapshot essentially takes 
> that pointer and gives it the snapshot name, while creating a new 
> current disk state pointer and data table where subsequent writes are 
> recorded.
>
> Deleting snapshots removes your ability to refer to a data table by 
> name, but the table itself still exists anonymously as part of a chain 
> of data tables between the base image and the current state.
>
> This leaves a problem. The chain will very quickly get quite long, 
> which will impact performance. To combat this, you can use blockcommit 
> to merge a child with its parent or blockpull to merge a parent with 
> its child.
>
> In my situation, I want to keep a week of daily snapshots in case 
> something goes horribly wrong with the VM (I recently had a database 
> file become corrupt, and reverting to the previous working day's image 
> would have been a quick and easy solution, faster than recovering all 
> the data tables from the previous day). I've been shutting down the 
> VM, deleting the oldest snapshot and creating a new one before 
> restarting the VM.
>
> While your explanation confirms that this is safe, it also implies 
> that I need to manage the data table chains. My first instinct is to 
> use blockcommit before deleting the oldest snapshot, such as:
>
>     virsh blockcommit <vm name> <qcow2 file path> --top <oldest snapshot> --delete --wait
>     virsh snapshot-delete --domain <vm name> --snapshotname <oldest snapshot>
>
> so that the base image contains the state as of one week earlier and 
> the snapshot chains are limited to 7 links.
>
> 1) does this sound reasonable?
>
> 2) I note that the syntax in the virsh man page is different from the 
> syntax at 
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-backing-chain 
> (RedHat uses --top and --base while the man page just has optional 
> base and top names). I believe the RedHat guide is correct because the 
> man page doesn't allow distinguishing between the base and the top for 
> a commit.
>
> However the need for specifying the path isn't obvious to me. Isn't 
> the path contained in the VM definition?
>
> Since blockcommit would make it impossible for me to revert to an 
> earlier state (because I'm committing the oldest snapshot; if it 
> screws up, I can't undo within virsh), I need to make sure this 
> command is correct.
>
Trying this against a test VM, I ran into a roadblock. My command line 
and the results are:

# virsh blockcommit stretch "/home/secure/virtual/stretch.qcow2" --top stretchS3 --delete --wait
error: unsupported flags (0x2) in function qemuDomainBlockCommit

I get the same thing when the path to the qcow2 file isn't quoted.

I noted in 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain 
that the options use a single "-". However the results for that were:
# virsh blockcommit stretch /home/secure/virtual/stretch.qcow2 -top stretchS3 -delete -wait
error: Scaled numeric value '-top' for <--bandwidth> option is malformed or out of range

which looks like virsh doesn't like the single dashes and is trying to 
interpret them as positional arguments.

I also did a

# virsh domblklist stretch
Target     Source
------------------------------------------------
vda        /home/secure/virtual/stretch.qcow2
hda        -

and tried using vda instead of the full path in the blockcommit but got 
the same error.

Any ideas on what I'm doing wrong?



* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-10 15:54     ` [Qemu-devel] " Gary Dale
@ 2019-06-10 22:04     ` Eric Blake
  2019-06-10 22:47       ` Gary Dale
  -1 siblings, 1 reply; 14+ messages in thread
From: Eric Blake @ 2019-06-10 22:04 UTC (permalink / raw)
  To: Gary Dale, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm



On 6/10/19 10:54 AM, Gary Dale wrote:

>>> One explanation I've seen of the process is that if I delete a snapshot, the
>>> changes it contains are merged with its immediate child.
>> Nope.  Deleting a snapshot decrements the reference count on all its
>> data clusters.  If a data cluster's reference count reaches zero it will
>> be freed.  That's all, there is no additional data movement or
>> reorganization aside from this.
> Perhaps not physically but logically it would appear that the data
> clusters were merged.

No.

If I have an image that starts out as all blanks, then write to part of
it (top line showing cluster number, bottom line showing representative
data):

012345
AA----

then take internal snapshot S1, then write more:

ABB---

then take another internal snapshot S2, then write even more:

ABCC--

the single qcow2 image will have something like:

L1 table for S1 => {
  guest cluster 0 => host cluster 5 refcount 3 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for S2 => {
  guest cluster 0 => host cluster 5 refcount 3 content A
  guest cluster 1 => host cluster 7 refcount 2 content B
  guest cluster 2 => host cluster 8 refcount 1 content B
}
L1 table for active image => {
  guest cluster 0 => host cluster 5 refcount 3 content A
  guest cluster 1 => host cluster 7 refcount 2 content B
  guest cluster 2 => host cluster 9 refcount 1 content C
  guest cluster 3 => host cluster 10 refcount 1 content C
}


If I then delete S2, I'm left with:

L1 table for S1 => {
  guest cluster 0 => host cluster 5 refcount 2 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for active image => {
  guest cluster 0 => host cluster 5 refcount 2 content A
  guest cluster 1 => host cluster 7 refcount 1 content B
  guest cluster 2 => host cluster 9 refcount 1 content C
  guest cluster 3 => host cluster 10 refcount 1 content C
}

and host cluster 8 is no longer in use.

Or, if I instead use external snapshots, I have a chain of images:

base <- mid <- active

L1 table for image base => {
  guest cluster 0 => host cluster 5 refcount 1 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for image mid => {
  guest cluster 1 => host cluster 5 refcount 1 content B
  guest cluster 2 => host cluster 6 refcount 1 content B
}
L1 table for image active => {
  guest cluster 2 => host cluster 5 refcount 1 content C
  guest cluster 3 => host cluster 6 refcount 1 content C
}

If I then delete image mid, I can do so in one of two ways:

blockcommit mid into base:
base <- active
L1 table for image base => {
  guest cluster 0 => host cluster 5 refcount 1 content A
  guest cluster 1 => host cluster 6 refcount 1 content B
  guest cluster 2 => host cluster 7 refcount 1 content B
}
L1 table for image active => {
  guest cluster 2 => host cluster 5 refcount 1 content C
  guest cluster 3 => host cluster 6 refcount 1 content C
}


blockpull mid into active:
base <- active
L1 table for image base => {
  guest cluster 0 => host cluster 5 refcount 1 content A
  guest cluster 1 => host cluster 6 refcount 1 content A
}
L1 table for image active => {
  guest cluster 1 => host cluster 7 refcount 1 content B
  guest cluster 2 => host cluster 5 refcount 1 content C
  guest cluster 3 => host cluster 6 refcount 1 content C
}
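
In virsh terms those two operations look roughly like this (the device 
name vda and the image names are illustrative):

# commit mid into base:
virsh blockcommit <vm name> vda --base base.qcow2 --top mid.qcow2 --wait
# or pull mid into active:
virsh blockpull <vm name> vda --base base.qcow2 --wait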


>>> Can someone provide a little clarity on this? Thanks!
>> If you want an analogy then git(1) is a pretty good one.  qcow2 internal
>> snapshots are like git tags.  Unlike branches, tags are immutable.  In
>> qcow2 you only have a master branch (the current disk state) from which
>> you can create a new tag or you can use git-checkout(1) to apply a
>> snapshot (discarding whatever your current disk state is).
>>
>> Stefan
> 
> That's just making things less clear - I've never tried to understand
> git either. Thanks for the attempt though.
> 
> If I've gotten things correct, once the base image is established, there
> is a current disk state that points to a table containing all the writes
> since the base image. Creating a snapshot essentially takes that pointer
> and gives it the snapshot name, while creating a new current disk state
> pointer and data table where subsequent writes are recorded.

Not quite. Rather, for internal snapshots, there is a table pointing to
ALL the contents that should be visible to the guest at that point in
time (one table for each snapshot, which is effectively read-only, and
one table for the active image, which is updated dynamically as guest
writes happen).  But the table does NOT track provenance of a cluster,
only a refcount.
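
(The refcount metadata can be audited offline, for example with 
'qemu-img check /path/to/image.qcow2'; the path is illustrative.)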

> 
> Deleting snapshots removes your ability to refer to a data table by
> name, but the table itself still exists anonymously as part of a chain
> of data tables between the base image and the current state.

Wrong for internal snapshots. There is no chain of data tables, and if a
cluster's refcount goes to 0, you no longer have access to the
information that the guest saw at the time that cluster was created.

Also wrong for external snapshots - there, you do have a chain of data
between images, but when you delete an external snapshot, you should
only do so after moving the relevant data elsewhere in the chain, at
which point you reduced the length of the chain.

> 
> This leaves a problem. The chain will very quickly get quite long, which
> will impact performance. To combat this, you can use blockcommit to
> merge a child with its parent or blockpull to merge a parent with its
> child.

Wrong for internal snapshots, where blockcommit and blockpull do not
really work.

More accurate for external snapshots.

> 
> In my situation, I want to keep a week of daily snapshots in case
> something goes horribly wrong with the VM (I recently had a database
> file become corrupt, and reverting to the previous working day's image
> would have been a quick and easy solution, faster than recovering all
> the data tables from the previous day). I've been shutting down the VM,
> deleting the oldest snapshot and creating a new one before restarting
> the VM.
> 
> While your explanation confirms that this is safe, it also implies that
> I need to manage the data table chains. My first instinct is to use
> blockcommit before deleting the oldest snapshot, such as:
> 
>     virsh blockcommit <vm name> <qcow2 file path> --top <oldest snapshot> --delete --wait
>     virsh snapshot-delete --domain <vm name> --snapshotname <oldest snapshot>
> 
> so that the base image contains the state as of one week earlier and the
> snapshot chains are limited to 7 links.
> 
> 1) does this sound reasonable?

If you want to track WHICH clusters have changed since the last backup
(which is the goal of incremental/differential backups), you probably
also want to be using persistent bitmaps.  At the moment, internal
snapshots have very little upstream development compared to external
snapshots, and are less likely to have ways to do what you want.
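
As a sketch, a persistent bitmap can be created through the QMP monitor 
(the node name below is an assumption; check your guest's query-block 
output for the real one):

virsh qemu-monitor-command <vm name> \
  '{"execute": "block-dirty-bitmap-add",
    "arguments": {"node": "drive-virtio-disk0", "name": "bitmap0",
                  "persistent": true}}'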

> 
> 2) I note that the syntax in the virsh man page is different from the syntax
> at
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-backing-chain
> (RedHat uses --top and --base while the man page just has optional base
> and top names). I believe the RedHat guide is correct because the man
> page doesn't allow distinguishing between the base and the top for a
> commit.

Questions about virsh are outside the realm of what qemu does (that's
what libvirt adds on top of qemu); and the parameters exposed by virsh
may differ according to what versions you are running. Also be aware
that I'm trying to get a new incremental backup API
virDomainBackupBegin() added to libvirt that will make support for
incremental/differential backups by using qcow2 persistent bitmaps much
easier from libvirt's point of use.

> 
> However the need for specifying the path isn't obvious to me. Isn't the
> path contained in the VM definition?
> 
> Since blockcommit would make it impossible for me to revert to an
> earlier state (because I'm committing the oldest snapshot; if it screws
> up, I can't undo within virsh), I need to make sure this command is
> correct.
> 
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-10 21:27       ` [Qemu-devel] " Gary Dale
@ 2019-06-10 22:07       ` Eric Blake
  2019-06-10 23:00         ` Gary Dale
  -1 siblings, 1 reply; 14+ messages in thread
From: Eric Blake @ 2019-06-10 22:07 UTC (permalink / raw)
  To: Gary Dale, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm



On 6/10/19 4:27 PM, Gary Dale wrote:

>>
> Trying this against a test VM, I ran into a roadblock. My command line
> and the results are:
> 
> # virsh blockcommit stretch "/home/secure/virtual/stretch.qcow2" --top
> stretchS3 --delete --wait
> error: unsupported flags (0x2) in function qemuDomainBlockCommit
> 
> I get the same thing when the path to the qcow2 file isn't quoted.

That's a libvirt limitation - the --delete flag is documented from the
generic API standpoint, but not (yet) implemented for the qemu driver
within libvirt. For now, you have to omit --delete from your virsh
command line, and then manually 'rm' the unused external file after the
fact.
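
In other words, something like this (the overlay file name is 
illustrative):

virsh blockcommit <vm name> vda --top /path/to/overlay.qcow2 --wait
rm /path/to/overlay.qcow2   # only after the commit has completed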

> 
> I noted in
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain
> that the options use a single "-".

Sounds like a bug in that documentation.

> However the results for that were:
> # virsh blockcommit stretch /home/secure/virtual/stretch.qcow2 -top
> stretchS3 -delete -wait
> error: Scaled numeric value '-top' for <--bandwidth> option is malformed
> or out of range
> 
> which looks like virsh doesn't like the single dashes and is trying to
> interpret them as positional arguments.
> 
> I also did a
> 
> # virsh domblklist stretch
> Target     Source
> ------------------------------------------------
> vda        /home/secure/virtual/stretch.qcow2
> hda        -
> 
> and tried using vda instead of the full path in the blockcommit but got
> the same error.
> 
> Any ideas on what I'm doing wrong?

Do you know for sure whether you have internal or external snapshots?
And at this point, your questions are starting to wander more into
libvirt territory.
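
One quick way to check, with the guest shut off and using the path from 
your domblklist output:

qemu-img snapshot -l /home/secure/virtual/stretch.qcow2
qemu-img info --backing-chain /home/secure/virtual/stretch.qcow2

Internal snapshots appear in the first listing; external snapshots show 
up as a backing chain in the second.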

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-10 22:04     ` Eric Blake
@ 2019-06-10 22:47       ` Gary Dale
  2019-06-10 22:54         ` Eric Blake
  0 siblings, 1 reply; 14+ messages in thread
From: Gary Dale @ 2019-06-10 22:47 UTC (permalink / raw)
  To: Eric Blake, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm

On 2019-06-10 6:04 p.m., Eric Blake wrote:
> On 6/10/19 10:54 AM, Gary Dale wrote:
>
>>>> One explanation I've seen of the process is that if I delete a snapshot, the
>>>> changes it contains are merged with its immediate child.
>>> Nope.  Deleting a snapshot decrements the reference count on all its
>>> data clusters.  If a data cluster's reference count reaches zero it will
>>> be freed.  That's all, there is no additional data movement or
>>> reorganization aside from this.
>> Perhaps not physically but logically it would appear that the data
>> clusters were merged.
> No.
>
> If I have an image that starts out as all blanks, then write to part of
> it (top line showing cluster number, bottom line showing representative
> data):
>
> 012345
> AA----
>
> then take internal snapshot S1, then write more:
>
> ABB---
>
> then take another internal snapshot S2, then write even more:
>
> ABCC--
>
> the single qcow2 image will have something like:
>
> L1 table for S1 => {
>    guest cluster 0 => host cluster 5 refcount 3 content A
>    guest cluster 1 => host cluster 6 refcount 1 content A
> }
> L1 table for S2 => {
>    guest cluster 0 => host cluster 5 refcount 3 content A
>    guest cluster 1 => host cluster 7 refcount 2 content B
>    guest cluster 2 => host cluster 8 refcount 1 content B
> }
> L1 table for active image => {
>    guest cluster 0 => host cluster 5 refcount 3 content A
>    guest cluster 1 => host cluster 7 refcount 2 content B
>    guest cluster 2 => host cluster 9 refcount 1 content C
>    guest cluster 3 => host cluster 10 refcount 1 content C
> }
>
>
> If I then delete S2, I'm left with:
>
> L1 table for S1 => {
>    guest cluster 0 => host cluster 5 refcount 2 content A
>    guest cluster 1 => host cluster 6 refcount 1 content A
> }
> L1 table for active image => {
>    guest cluster 0 => host cluster 5 refcount 2 content A
>    guest cluster 1 => host cluster 7 refcount 1 content B
>    guest cluster 2 => host cluster 9 refcount 1 content C
>    guest cluster 3 => host cluster 10 refcount 1 content C
> }
>
> and host cluster 8 is no longer in use.
>
> Or, if I instead use external snapshots, I have a chain of images:
>
> base <- mid <- active
>
> L1 table for image base => {
>    guest cluster 0 => host cluster 5 refcount 1 content A
>    guest cluster 1 => host cluster 6 refcount 1 content A
> }
> L1 table for image mid => {
>    guest cluster 1 => host cluster 5 refcount 1 content B
>    guest cluster 2 => host cluster 6 refcount 1 content B
> }
> L1 table for image active => {
>    guest cluster 2 => host cluster 5 refcount 1 content C
>    guest cluster 3 => host cluster 6 refcount 1 content C
> }
>
> If I then delete image mid, I can do so in one of two ways:
>
> blockcommit mid into base:
> base <- active
> L1 table for image base => {
>    guest cluster 0 => host cluster 5 refcount 1 content A
>    guest cluster 1 => host cluster 6 refcount 1 content B
>    guest cluster 2 => host cluster 7 refcount 1 content B
> }
> L1 table for image active => {
>    guest cluster 2 => host cluster 5 refcount 1 content C
>    guest cluster 3 => host cluster 6 refcount 1 content C
> }
>
>
> blockpull mid into active:
> base <- active
> L1 table for image base => {
>    guest cluster 0 => host cluster 5 refcount 1 content A
>    guest cluster 1 => host cluster 6 refcount 1 content A
> }
> L1 table for image active => {
>    guest cluster 1 => host cluster 7 refcount 1 content B
>    guest cluster 2 => host cluster 5 refcount 1 content C
>    guest cluster 3 => host cluster 6 refcount 1 content C
> }
>
>
>>>> Can someone provide a little clarity on this? Thanks!
>>> If you want an analogy then git(1) is a pretty good one.  qcow2 internal
>>> snapshots are like git tags.  Unlike branches, tags are immutable.  In
>>> qcow2 you only have a master branch (the current disk state) from which
>>> you can create a new tag or you can use git-checkout(1) to apply a
>>> snapshot (discarding whatever your current disk state is).
>>>
>>> Stefan
>> That's just making things less clear - I've never tried to understand
>> git either. Thanks for the attempt though.
>>
>> If I've gotten things correct, once the base image is established, there
>> is a current disk state that points to a table containing all the writes
>> since the base image. Creating a snapshot essentially takes that pointer
>> and gives it the snapshot name, while creating a new current disk state
>> pointer and data table where subsequent writes are recorded.
> Not quite. Rather, for internal snapshots, there is a table pointing to
> ALL the contents that should be visible to the guest at that point in
> time (one table for each snapshot, which is effectively read-only, and
> one table for the active image, which is updated dynamically as guest
> writes happen).  But the table does NOT track provenance of a cluster,
> only a refcount.
>
>> Deleting snapshots removes your ability to refer to a data table by
>> name, but the table itself still exists anonymously as part of a chain
>> of data tables between the base image and the current state.
> Wrong for internal snapshots. There is no chain of data tables, and if a
> cluster's refcount goes to 0, you no longer have access to the
> information that the guest saw at the time that cluster was created.
>
> Also wrong for external snapshots - there, you do have a chain of data
> between images, but when you delete an external snapshot, you should
> only do so after moving the relevant data elsewhere in the chain, at
> which point you reduced the length of the chain.
>
>> This leaves a problem. The chain will very quickly get quite long which
>> will impact performance. To combat this, you can use blockcommit to
>> merge a child with its parent or blockpull to merge a parent with its
>> child.
> Wrong for internal snapshots, where blockcommit and blockpull do not
> really work.
>
> More accurate for external snapshots.
>
>> In my situation, I want to keep a week of daily snapshots in case
>> something goes horribly wrong with the VM (I recently had a database
>> file become corrupt, and reverting to the previous working day's image
>> would have been a quick and easy solution, faster than recovering all
>> the data tables from the previous day). I've been shutting down the VM,
>> deleting the oldest snapshot and creating a new one before restarting
>> the VM.
>>
>> While your explanation confirms that this is safe, it also implies that
>> I need to manage the data table chains. My first instinct is to use
>> blockcommit before deleting the oldest snapshot, such as:
>>
>>      virsh blockcommit <vm name> <qcow2 file path> --top <oldest
>> snapshot> --delete --wait
>>      virsh snapshot-delete  --domain <vm name> --snapshotname <oldest
>> snapshot>
>>
>> so that the base image contains the state as of one week earlier and the
>> snapshot chains are limited to 7 links.
>>
>> 1) does this sound reasonable?
> If you want to track WHICH clusters have changed since the last backup
> (which is the goal of incremental/differential backups), you probably
> also want to be using persistent bitmaps.  At the moment, internal
> snapshots have very little upstream development compared to external
> snapshots, and are less likely to have ways to do what you want.
>
>> 2) I note that the syntax in virsh man page is different from the syntax
>> at
>> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-backing-chain
>> (RedHat uses --top and --base while the man page just has optional base
>> and top names). I believe the RedHat guide is correct because the man
>> page doesn't allow distinguishing between the base and the top for a
>> commit.
> Questions about virsh are outside the realm of what qemu does (that's
> what libvirt adds on top of qemu); and the parameters exposed by virsh
> may differ according to what versions you are running. Also be aware
> that I'm trying to get a new incremental backup API
> virDomainBackupBegin() added to libvirt that will make support for
> incremental/differential backups by using qcow2 persistent bitmaps much
> easier from libvirt's point of use.
>
>> However the need for specifying the path isn't obvious to me. Isn't the
>> path contained in the VM definition?
>>
>> Since blockcommit would make it impossible for me to revert to an
>> earlier state (because I'm committing the oldest snapshot, if it screws
>> up, I can't undo within virsh), I need to make sure this command is
>> correct.
>>
>>
Interesting. Your comments are quite different from what the Redhat 
online documentation suggests. It spends some time talking about 
flattening the chains (e.g. 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain) 
while you are saying the chains don't exist. I gather this is because 
Redhat doesn't like internal snapshots, so they focus purely on 
documenting external ones.

It does strike me as a little bizarre to handle internal and external 
snapshots differently since the essential difference only seems to be 
where the data is stored. Using chains for one and reference counts for 
the other sounds like a recipe for things not working right.

Anyway, if I understand what you are saying, with internal snapshots, I 
can simply delete old ones and create new ones without worrying about 
there being any performance penalty. All internal snapshots are one hop 
away from the base image.
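
So my weekly rotation should reduce to something like this (the snapshot
names are placeholders, and the VM is shut down first):

# virsh snapshot-delete stretch --snapshotname day-oldest
# virsh snapshot-create-as stretch day-new

if I've understood you correctly.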

Thanks.



* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-10 22:47       ` Gary Dale
@ 2019-06-10 22:54         ` Eric Blake
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2019-06-10 22:54 UTC (permalink / raw)
  To: Gary Dale, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm



On 6/10/19 5:47 PM, Gary Dale wrote:

>>>
>>> Since blockcommit would make it impossible for me to revert to an
>>> earlier state (because I'm committing the oldest snapshot, if it screws
>>> up, I can't undo within virsh), I need to make sure this command is
>>> correct.
>>>
>>>
> Interesting. Your comments are quite different from what the Redhat

It's "Red Hat", two words :)

> online documentation suggests. It spends some time talking about
> flattening the chains (e.g.
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain)

That is all about external snapshot file chains (Red Hat specifically
discourages the use of internal snapshots).

> while you are saying the chains don't exist. I gather this is because
> Redhat doesn't like internal snapshots, so they focus purely on
> documenting external ones.
> 
> It does strike me as a little bizarre to handle internal and external
> snapshots differently since the essential difference only seems to be
> where the data is stored. Using chains for one and reference counts for
> the other sounds like a recipe for things not working right.

If nothing else, it's a reason WHY Red Hat discourages the use of
internal snapshots.

> 
> Anyway, if I understand what you are saying, with internal snapshots, I
> can simply delete old ones and create new ones without worrying about
> there being any performance penalty. All internal snapshots are one hop
> away from the base image.

Still not quite right. All internal snapshots ARE complete base images,
they do not track a delta from any other point in time, but rather the
complete disk contents of the point in time in question.

Yes, you can delete internal snapshots at will, because nothing else
depends on them. We don't yet have good code for compacting unused
portions of a qcow2 image, though, so your file size may still appear
larger than necessary (hopefully it's sparse, though, so not actually
consuming extra storage).
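
If you ever do need to reclaim that space, one offline approach is to
copy the image (a sketch only - and note that 'qemu-img convert' copies
just the active disk state, so every internal snapshot is lost in the
copy):

# virsh shutdown stretch
# qemu-img convert -O qcow2 /home/secure/virtual/stretch.qcow2 /home/secure/virtual/stretch-compact.qcow2
# qemu-img check /home/secure/virtual/stretch-compact.qcow2

and only then swap the new file into place.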

Also, don't try to mix-and-match internal and external snapshots on a
single guest image - once you've used one style, trying to switch to the
other can cause data loss if you aren't precise about which files
require which clusters to stick around.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-10 22:07       ` Eric Blake
@ 2019-06-10 23:00         ` Gary Dale
  2019-06-11  0:10           ` Eric Blake
  0 siblings, 1 reply; 14+ messages in thread
From: Gary Dale @ 2019-06-10 23:00 UTC (permalink / raw)
  To: Eric Blake, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm

On 2019-06-10 6:07 p.m., Eric Blake wrote:
> On 6/10/19 4:27 PM, Gary Dale wrote:
>
>> Trying this against a test VM, I ran into a roadblock. My command line
>> and the results are:
>>
>> # virsh blockcommit stretch "/home/secure/virtual/stretch.qcow2" --top
>> stretchS3 --delete --wait
>> error: unsupported flags (0x2) in function qemuDomainBlockCommit
>>
>> I get the same thing when the path to the qcow2 file isn't quoted.
> That's a libvirt limitation - the --delete flag is documented from the
> generic API standpoint, but not (yet) implemented for the qemu driver
> within libvirt. For now, you have to omit --delete from your virsh
> command line, and then manually 'rm' the unused external file after the
> fact.
Which is not possible since I'm using internal snapshots.
>
>> I noted in
>> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain
>> that the options use a single "-".
> Sounds like a bug in that documentation.

Yes, and the man page also seems to be wrong. The section on blockcommit 
begins:

blockcommit domain path [bandwidth] [--bytes] [base] [--shallow] [top] [--delete]
        [--keep-relative] [--wait [--async] [--verbose]] [--timeout seconds]
        [--active] [{--pivot | --keep-overlay}]
            Reduce the length of a backing image chain, by committing changes
            at the top of the chain (snapshot or delta files) into backing
            images. By default, this command attempts to flatten the entire
            chain.

In addition to "[base]" actually being "[--base base]" and "[top]" being 
"[--top top]", the description of what it does only applies to external 
snapshots. Similar things are wrong in the blockpull section.
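
For the record, the form that matches the Red Hat guide (and which, as
noted, only makes sense for external snapshot chains - the paths here are
hypothetical) would be something like:

# virsh blockcommit stretch vda --base /path/to/base.qcow2 --top /path/to/mid.qcow2 --wait --verbose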

>
>> However the results for that were:
>> # virsh blockcommit stretch /home/secure/virtual/stretch.qcow2 -top
>> stretchS3 -delete -wait
>> error: Scaled numeric value '-top' for <--bandwidth> option is malformed
>> or out of range
>>
>> which looks like virsh doesn't like the single dashes and is trying to
>> interpret them as positional options.
>>
>> I also did a
>>
>> # virsh domblklist stretch
>> Target     Source
>> ------------------------------------------------
>> vda        /home/secure/virtual/stretch.qcow2
>> hda        -
>>
>> and tried using vda instead of the full path in the blockcommit but got
>> the same error.
>>
>> Any ideas on what I'm doing wrong?
> Do you know for sure whether you have internal or external snapshots?
> And at this point, your questions are starting to wander more into
> libvirt territory.
>
Yes. I'm using internal snapshots. From your other e-mail, I gather that 
the (only) benefit to blockcommit with internal snapshots would be to 
reduce the size of the various tables recording changed blocks. Without 
a blockcommit, the L1 tables get progressively larger over time since 
they record all changes to the base file. Eventually the snapshots could 
become larger than the base image if I don't do a blockcommit.


* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-10 23:00         ` Gary Dale
@ 2019-06-11  0:10           ` Eric Blake
  2019-06-11  3:47             ` Gary Dale
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Blake @ 2019-06-11  0:10 UTC (permalink / raw)
  To: Gary Dale, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm



On 6/10/19 6:00 PM, Gary Dale wrote:

>>> Any ideas on what I'm doing wrong?
>> Do you know for sure whether you have internal or external snapshots?
>> And at this point, your questions are starting to wander more into
>> libvirt territory.
>>
> Yes. I'm using internal snapshots. From your other e-mail, I gather that
> the (only) benefit to blockcommit with internal snapshots would be to
> reduce the size of the various tables recording changed blocks. Without
> a blockcommit, the L1 tables get progressively larger over time since
> they record all changes to the base file. Eventually the snapshots could
> become larger than the base image if I don't do a blockcommit.

Not quite. Blockcommit requires external images. It says to take this
image chain:

base <- active

and change it into this shorter chain:

base

by moving the clusters from active into base.  There is no such thing as
blockcommit on internal snapshots, because you don't have any backing
file to push into.
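
To make the contrast concrete (domain, file, and snapshot names below
are placeholders):

# qemu-img snapshot -d S2 /path/to/image.qcow2
    (internal: drops the snapshot's L1 table and decrements refcounts)
# virsh blockcommit dom vda --top /path/to/mid.qcow2 --wait
    (external: copies data down into the backing file, then drops a link
    from the chain)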

With internal snapshots, the longer an L1 table is active, the more
clusters you have to change compared to what was the case before the
snapshot was created - every time you change an existing cluster, the
refcount on the old cluster decreases and the change gets written into a
new cluster with refcount 1.  Yes, you can reach the point where there
are more clusters with refcount 1 associated with your current L1 table
than there are clusters with refcount > 1 that are shared with one or
more previous internal snapshots. But they are not recording a change to
the base file, rather, they are recording the current state of the file
where an internal snapshot says to not forget the old state of the file.
 And yes, a qcow2 file with internal snapshots can require more disk
space than the amount of space exposed to the guest.  But that's true
too with external snapshots (the sum of the space required by all images
in the chain may be larger than the space visible to the guest).
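
You can watch both numbers with plain qemu-img (this is the image path
from earlier in the thread):

# qemu-img info /home/secure/virtual/stretch.qcow2

which prints the virtual size, the actual 'disk size' on the host, and
the list of internal snapshots - so you can see when the file has grown
past what the guest is shown.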

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] kvm / virsh snapshot management
  2019-06-11  0:10           ` Eric Blake
@ 2019-06-11  3:47             ` Gary Dale
  0 siblings, 0 replies; 14+ messages in thread
From: Gary Dale @ 2019-06-11  3:47 UTC (permalink / raw)
  To: Eric Blake, Stefan Hajnoczi; +Cc: Kevin Wolf, John Snow, qemu-devel, kvm

On 2019-06-10 8:10 p.m., Eric Blake wrote:
> On 6/10/19 6:00 PM, Gary Dale wrote:
>
>>>> Any ideas on what I'm doing wrong?
>>> Do you know for sure whether you have internal or external snapshots?
>>> And at this point, your questions are starting to wander more into
>>> libvirt territory.
>>>
>> Yes. I'm using internal snapshots. From your other e-mail, I gather that
>> the (only) benefit to blockcommit with internal snapshots would be to
>> reduce the size of the various tables recording changed blocks. Without
>> a blockcommit, the L1 tables get progressively larger over time since
>> they record all changes to the base file. Eventually the snapshots could
>> become larger than the base image if I don't do a blockcommit.
> Not quite. Blockcommit requires external images. It says to take this
> image chain:
>
> base <- active
>
> and change it into this shorter chain:
>
> base
>
> by moving the clusters from active into base.  There is no such thing as
> blockcommit on internal snapshots, because you don't have any backing
> file to push into.
>
> With internal snapshots, the longer an L1 table is active, the more
> clusters you have to change compared to what was the case before the
> snapshot was created - every time you change an existing cluster, the
> refcount on the old cluster decreases and the change gets written into a
> new cluster with refcount 1.  Yes, you can reach the point where there
> are more clusters with refcount 1 associated with your current L1 table
> than there are clusters with refcount > 1 that are shared with one or
> more previous internal snapshots. But they are not recording a change to
> the base file, rather, they are recording the current state of the file
> where an internal snapshot says to not forget the old state of the file.
>   And yes, a qcow2 file with internal snapshots can require more disk
> space than the amount of space exposed to the guest.  But that's true
> too with external snapshots (the sum of the space required by all images
> in the chain may be larger than the space visible to the guest).


OK. I think I'm getting it now. Thanks for your help. I just wish there 
was some consistent documentation that explained all this. The Red Hat 
stuff seems to assume that you understand it only applies to external 
snapshots and the virsh man page seems to do the same.


