* [RFC] Consistent Snapshots Idea
From: Richard Laager @ 2011-11-21 12:01 UTC
  To: kvm

I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
question. If so, please let me know and I'll ask on a different list.

Background:

Assuming the block layer can make instantaneous snapshots of a guest's
disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
guest crashed) snapshots. To get a "fully consistent" snapshot, you need
to shut down the guest. For production VMs, this is obviously not ideal.

Idea:

What if KVM/QEMU were to fork() the guest and shut down one copy?

KVM/QEMU would momentarily halt the execution of the guest and take a
writable, instantaneous snapshot of each block device. Then it would
fork(). The parent would resume execution as normal. The child would
redirect disk writes to the snapshot(s). The RAM should have
copy-on-write behavior as with any other fork()ed process. Other
resources like the network, display, sound, serial, etc. would simply be
disconnected/bit-bucketed. Finally, the child would resume guest
execution and send the guest an ACPI power button press event. This
would cause the guest OS to perform an orderly shutdown.
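
The memory half of this idea can be illustrated with a minimal sketch. It
only shows that a fork()ed child gets a private, copy-on-write view of the
parent's memory; it is not QEMU code. The function name `fork_and_diverge`
and the `state` dictionary are hypothetical stand-ins for guest RAM. A real
QEMU process also has vCPU threads and KVM file descriptors, and fork()
duplicates only the calling thread, so an actual implementation would need
considerably more than this.

```python
import os


def fork_and_diverge(state):
    """Fork, let the child change its copy of `state`, and return
    (parent's value, child's value, child's exit code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the copy of the guest that gets shut down.
        exit_code = 1
        try:
            os.close(read_fd)
            state["role"] = "shutting-down copy"  # touches only the child's pages
            os.write(write_fd, state["role"].encode())
            os.close(write_fd)
            exit_code = 0
        finally:
            os._exit(exit_code)  # never fall back into the parent's code path

    # Parent: stands in for the guest that keeps running.
    os.close(write_fd)
    child_view = b""
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:  # EOF once the child has closed its end
            break
        child_view += chunk
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return state["role"], child_view.decode(), os.WEXITSTATUS(status)
```

Calling `fork_and_diverge({"role": "running guest"})` returns the parent's
unchanged value alongside the child's modified one, which is the isolation
the proposal relies on for RAM.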

I believe this would provide consistent snapshots in the vast majority
of real-world scenarios in a guest OS and application-independent way.

Implementation Nits:

      * A timeout on the child process would likely be a good idea.
      * It'd probably be best to disconnect the network (i.e. tell the
        guest the cable is unplugged) to avoid long timeouts. Likewise
        for the hardware flow-control lines on the serial port.
      * For correctness, fdatasync()ing or similar might be necessary
        after halting execution and before creating the snapshots.
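
The first nit (a timeout on the child) could look roughly like the sketch
below. It assumes the child is an ordinary forked process; `spawn` and
`wait_with_timeout` are hypothetical helper names, and the polling interval
is arbitrary.

```python
import os
import signal
import time


def spawn(fn):
    """Fork a child that runs fn() and exits 0 on success, 1 on error."""
    pid = os.fork()
    if pid == 0:
        exit_code = 1
        try:
            fn()
            exit_code = 0
        finally:
            os._exit(exit_code)
    return pid


def wait_with_timeout(pid, timeout_s, poll_s=0.05):
    """Return the child's exit code, or None if it was ended by a signal
    (including the SIGKILL sent here once timeout_s has elapsed)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        time.sleep(poll_s)
    # The shutdown did not finish in time: give up on this snapshot attempt.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
```

A caller would treat `None` as "the guest copy never finished shutting
down" and discard the corresponding disk snapshot.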

Richard


* Re: [RFC] Consistent Snapshots Idea
From: Avi Kivity @ 2011-11-21 12:31 UTC
  To: Richard Laager; +Cc: kvm, qemu-devel

On 11/21/2011 02:01 PM, Richard Laager wrote:
> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
> question. If so, please let me know and I'll ask on a different list.

It is a qemu question, yes (though fork()ing a guest also relates to kvm).

> Background:
>
> Assuming the block layer can make instantaneous snapshots of a guest's
> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
> to shut down the guest. For production VMs, this is obviously not ideal.
>
> Idea:
>
> What if KVM/QEMU were to fork() the guest and shut down one copy?
>
> KVM/QEMU would momentarily halt the execution of the guest and take a
> writable, instantaneous snapshot of each block device. Then it would
> fork(). The parent would resume execution as normal. The child would
> redirect disk writes to the snapshot(s). The RAM should have
> copy-on-write behavior as with any other fork()ed process. Other
> resources like the network, display, sound, serial, etc. would simply be
> disconnected/bit-bucketed. Finally, the child would resume guest
> execution and send the guest an ACPI power button press event. This
> would cause the guest OS to perform an orderly shutdown.
>
> I believe this would provide consistent snapshots in the vast majority
> of real-world scenarios in a guest OS and application-independent way.

Interesting idea.  Will the guest actually shut down nicely without a
network?  Things like NFS mounts will break.

> Implementation Nits:
>
>       * A timeout on the child process would likely be a good idea.
>       * It'd probably be best to disconnect the network (i.e. tell the
>         guest the cable is unplugged) to avoid long timeouts. Likewise
>         for the hardware flow-control lines on the serial port.

This is actually critical, otherwise the guest will shutdown(2) all
sockets and confuse the clients.

>       * For correctness, fdatasync()ing or similar might be necessary
>         after halting execution and before creating the snapshots.

Microsoft guests have an API to quiesce storage prior to a snapshot, and
I think there is work to bring this to Linux guests.  So it should be
possible to get consistent snapshots even without this, but it takes
more integration.

-- 
error compiling committee.c: too many arguments to function



* Re: [Qemu-devel] [RFC] Consistent Snapshots Idea
From: shu ming @ 2011-11-21 14:27 UTC
  To: Avi Kivity; +Cc: Richard Laager, qemu-devel, kvm

On 2011-11-21 20:31, Avi Kivity wrote:
> On 11/21/2011 02:01 PM, Richard Laager wrote:
>> I'm not an expert on the architecture of KVM, so perhaps this is a QEMU
>> question. If so, please let me know and I'll ask on a different list.
> It is a qemu question, yes (though fork()ing a guest also relates to kvm).
>
>> Background:
>>
>> Assuming the block layer can make instantaneous snapshots of a guest's
>> disk (e.g. lvcreate -s), one can get "crash consistent" (i.e. as if the
>> guest crashed) snapshots. To get a "fully consistent" snapshot, you need
>> to shut down the guest. For production VMs, this is obviously not ideal.
>>
>> Idea:
>>
>> What if KVM/QEMU were to fork() the guest and shut down one copy?
>>
>> KVM/QEMU would momentarily halt the execution of the guest and take a
>> writable, instantaneous snapshot of each block device. Then it would
>> fork(). The parent would resume execution as normal. The child would
>> redirect disk writes to the snapshot(s). The RAM should have
>> copy-on-write behavior as with any other fork()ed process. Other
>> resources like the network, display, sound, serial, etc. would simply be
>> disconnected/bit-bucketed. Finally, the child would resume guest
>> execution and send the guest an ACPI power button press event. This
>> would cause the guest OS to perform an orderly shutdown.
>>
>> I believe this would provide consistent snapshots in the vast majority
>> of real-world scenarios in a guest OS and application-independent way.
> Interesting idea.  Will the guest actually shut down nicely without a
> network?  Things like NFS mounts will break.

Do the child and parent processes run in parallel?  What happens if
the parent process tries to access the block device?  It looks like
the child process will write to a snapshot file, but where will the
parent process write?
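
One reading of the proposal, sketched as a toy model: the parent keeps
writing to the original device while the child's writes land in a
writable snapshot, so the two run in parallel without seeing each other's
changes. The class `SnapshotOverlay` and the dictionary-as-disk are
hypothetical illustrations, not how LVM or QEMU store data; in particular,
the eager `dict(origin)` copy stands in for a snapshot that a real block
layer would maintain lazily.

```python
class SnapshotOverlay:
    """Toy writable snapshot: reads fall through to a frozen point-in-time
    base unless the block has been rewritten in the snapshot."""

    def __init__(self, frozen_base):
        self._base = frozen_base  # never modified after creation
        self._delta = {}          # blocks written through the snapshot

    def read(self, block):
        return self._delta.get(block, self._base.get(block))

    def write(self, block, data):
        self._delta[block] = data


def demo():
    origin = {0: b"boot", 1: b"journal-dirty"}  # the live guest disk
    snapshot = SnapshotOverlay(dict(origin))    # taken while the guest is paused

    origin[1] = b"parent-keeps-running"         # parent: writes go to the origin
    snapshot.write(1, b"journal-clean")         # child: writes go to the snapshot

    return origin[1], snapshot.read(1), snapshot.read(0)
```

In this model the parent's write to block 1 is invisible to the child, the
child's clean-shutdown write is invisible to the parent, and untouched
blocks in the snapshot still read as they did at snapshot time.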

>
>> Implementation Nits:
>>
>>        * A timeout on the child process would likely be a good idea.
>>        * It'd probably be best to disconnect the network (i.e. tell the
>>          guest the cable is unplugged) to avoid long timeouts. Likewise
>>          for the hardware flow-control lines on the serial port.
> This is actually critical, otherwise the guest will shutdown(2) all
> sockets and confuse the clients.
>
>>        * For correctness, fdatasync()ing or similar might be necessary
>>          after halting execution and before creating the snapshots.
> Microsoft guests have an API to quiesce storage prior to a snapshot, and
> I think there is work to bring this to Linux guests.  So it should be
> possible to get consistent snapshots even without this, but it takes
> more integration.
>


