* [Qemu-devel] [RFC] live snapshot, live merge, live block migration
@ 2011-05-09 13:40 Dor Laor
From: Dor Laor @ 2011-05-09 13:40 UTC
  To: qemu-devel, Anthony Liguori, Avi Kivity, Marcelo Tosatti,
	jes sorensen, Kevin Wolf, Stefan Hajnoczi

No patch here (sorry), but a collection of thoughts about these features
and their potential building blocks. Please review (also at
http://wiki.qemu.org/Features/LiveBlockMigration)

Future qemu is expected to support these features (some are already
implemented):

  * Live block copy

    Ability to copy one or more virtual disks from the source backing
    file/block device to a new target that is accessible by the host.
    The copy is supposed to run transparently while the VM is running.

    Status: code exists (by Marcelo) today in qemu but needs refactoring
    due to a race condition at the end of the copy operation. We agreed
    that a re-implementation of the copy operation should take place
    that makes sure the image is completely mirrored until management
    decides which copy to keep.
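
    To make the "completely mirrored" requirement concrete, here is a
    minimal sketch of a background copy that also mirrors guest writes.
    It is a toy in-memory model, not Marcelo's implementation: the
    class and method names (MirroredCopy, guest_write, copy_step) are
    made up for illustration, and blocks are plain list entries.

```python
class MirroredCopy:
    """Toy model: copy a disk block by block while mirroring guest writes."""

    def __init__(self, source):
        self.source = source                  # list of block contents
        self.target = [None] * len(source)    # new image, initially empty
        self.next_block = 0                   # background copy cursor

    def guest_write(self, index, data):
        # Every guest write goes to the source. Blocks the cursor has
        # already passed are also written to the target, so the target
        # never falls behind. Blocks ahead of the cursor are picked up
        # later by copy_step().
        self.source[index] = data
        if index < self.next_block:
            self.target[index] = data

    def copy_step(self):
        # Copy one block in the background; return False once done.
        if self.next_block >= len(self.source):
            return False
        self.target[self.next_block] = self.source[self.next_block]
        self.next_block += 1
        return True

    def in_sync(self):
        # True once the bulk copy finished and both images are identical.
        return (self.next_block >= len(self.source)
                and self.target == self.source)
```

    Once copy_step() has returned False, every later guest write lands
    on both images, so management can pick either copy at any time
    without a window in which the two differ.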

  * Live snapshots and live snapshot merge

    Live snapshot is already incorporated (by Jes) in qemu (it still
    needs qemu-agent work to freeze the guest FS).

    Live snapshot merge is required in order to reduce the overhead
    caused by the additional snapshots (sometimes over a raw device).
    It is currently not implemented for a live running guest.

    Possibility: enhance live copy to be used for live snapshot merge.
                 It is almost the same mechanism.
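
    As an illustration of what a merge has to achieve, here is a small
    sketch that folds a snapshot overlay into its parent image. The
    merge direction (overlay into parent), the dict-based overlay and
    the function names are assumptions made for this example only. A
    live version would additionally have to handle guest writes that
    arrive during the merge, e.g. by mirroring them as in live copy.

```python
def read_block(base, overlay, index):
    # A read hits the overlay if the block was written after the
    # snapshot was taken, otherwise it falls through to the parent.
    return overlay.get(index, base[index])


def merge_overlay_into_base(base, overlay):
    # Copy every block allocated in the overlay down into the parent,
    # then drop it from the overlay. Afterwards the parent alone holds
    # the current disk contents and the overlay can be discarded.
    for index in sorted(overlay):
        base[index] = overlay[index]
    overlay.clear()
```

    The guest-visible contents are the same before and after the merge;
    only the number of images a read has to traverse changes.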

  * Copy on read (image streaming)
    Ability to start guest execution while the parent image resides
    remotely, with each accessed block replicated to a local copy
    (image format snapshot).

    It would be nice to have a general mechanism that can be used for
    all image formats. What about the protocol to access these blocks
    over the net? We can reuse existing ones (nbd/iscsi).

    Such functionality can be hooked together with live block migration
    instead of the 'post copy' method.
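
    A format-independent copy on read layer could look roughly like the
    sketch below. It is only a model: fetch_remote stands in for
    whatever protocol ends up being used (nbd/iscsi/other), and the
    class and method names are hypothetical.

```python
class CopyOnReadImage:
    """Toy model of a local image populated on demand from a remote parent."""

    def __init__(self, fetch_remote, num_blocks):
        self.fetch_remote = fetch_remote   # callable: block index -> data
        self.num_blocks = num_blocks
        self.local = {}                    # blocks already present locally

    def read(self, index):
        # First access pulls the block over the network and keeps it;
        # later reads are served from the local copy.
        if index not in self.local:
            self.local[index] = self.fetch_remote(index)
        return self.local[index]

    def write(self, index, data):
        # Whole-block writes never need the remote contents.
        self.local[index] = data

    def stream_step(self):
        # Background streaming: fetch one block that is still remote.
        # Returns False once the image is fully local.
        for index in range(self.num_blocks):
            if index not in self.local:
                self.local[index] = self.fetch_remote(index)
                return True
        return False
```

    Calling stream_step() until it returns False is what would let the
    remote parent be dropped eventually, which is the property the
    post copy case needs.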

  * Live block migration (pre/post)

    Beyond live block copy we'll sometimes need to move both the storage
    and the guest. There are two main approaches here:
    - pre copy
      First live copy the image and only then live migrate the VM.
      It is simple but if the purpose of the whole live block migration
      was to balance the cpu load, it won't be practical to use since
      copying an image of 100GB will take too long.
    - post copy
      First live migrate the VM, then live copy its blocks.
      It's a better approach for HA/load balancing but it might make
      management complex (need to keep the source VM alive, what happens
      on failures?)
      Using copy on read might simplify it -
      post copy = live snapshot + copy on read.

    In addition there are two cases for the storage access:
    1. The source block device is shared and can be easily accessed by
       the destination qemu-kvm process.
       That's the easy case, no special protocol is needed for copying
       the block devices.
    2. There is no shared storage at all.
       This means we should implement a block access protocol over the
       live migration fd :(

       We need to choose whether to implement a new one, or re-use NBD
       or iSCSI (target & initiator)
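
    For a feel of why pre copy is too slow for load balancing, here is
    a back-of-the-envelope estimate of the bulk copy time. The 1 Gbit/s
    link used in the note below is an assumed figure, not a
    measurement, and the estimate ignores protocol overhead and blocks
    the guest rewrites during the copy, so real times would be longer.

```python
def copy_time_seconds(image_bytes, link_bits_per_second):
    # Lower bound: every byte of the image crosses the link exactly once.
    return image_bytes * 8 / link_bits_per_second
```

    With these assumptions, copy_time_seconds(100 * 10**9, 10**9) gives
    800 seconds, i.e. more than 13 minutes before the VM itself can
    even start to migrate.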

  * Using external dirty block bitmap

    FVD has an option to use an external dirty block bitmap file in
    addition to the regular mapping/data files.

    We can consider using it for live block migration and live merge too.
    It can also allow additional usages of 3rd party tools to calculate
    diffs between the snapshots.
    There is a big downside, though: it will make management
    complicated, and there is a risk of the image and its bitmap file
    getting out of sync. It's a much better choice to have the qemu-img
    tool be the single interface to the dirty block bitmap data.
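
    For readers unfamiliar with the idea, a dirty block bitmap is just
    one bit per block that records whether the block was written since
    some reference point. The sketch below is a generic in-memory
    version; it does not reflect FVD's actual on-disk layout, and the
    class name and methods are invented for illustration.

```python
class DirtyBitmap:
    """One bit per block: set when the block has been written."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def mark_dirty(self, index):
        # Called on every write to the block.
        self.bits[index // 8] |= 1 << (index % 8)

    def is_dirty(self, index):
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def dirty_blocks(self):
        # The list of blocks a copy, merge or diff tool would still
        # have to transfer.
        return [i for i in range(self.num_blocks) if self.is_dirty(i)]
```

    A tool computing the diff between two snapshots would only need to
    read the blocks returned by dirty_blocks(), which is exactly why
    the bitmap and the image must never get out of sync.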

Summary:
   * We need Marcelo's new (to come) block copy implementation
     * should work in parallel to migration and hotplug
   * General copy on read is desirable
   * Live snapshot merge to be implemented using block copy
   * Need to utilize a remote block access protocol (iscsi/nbd/other)
     Which one is the best?
   * Keep qemu-img the single interface for dirty block mappings.
   * Live block migration pre copy == live copy + block access protocol
     + live migration
   * Live block migration post copy == live migration + block access
     protocol/copy on read.

Comments?

Regards,
Dor
