* [RFC V9 0/4] domain snapshot document
@ 2014-12-16  6:32 Chunyan Liu
  2014-12-16  6:32 ` [RFC V9 1/4] domain snapshot terms Chunyan Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 38+ messages in thread
From: Chunyan Liu @ 2014-12-16  6:32 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, Chunyan Liu

Changes to V8:
  - xl removes snapshot-delete/snapshot-list, keeps
    snapshot-create/snapshot-revert only.
  - libxl removes unnecessary domain snapshot functionality
    libxl_domain_snapshot_create/delete/revert. Instead,
    export disk snapshot functionality for xl/libvirt usage
    in libxl/libxlu.
  - Add more introduction to the overall work.

V8 is here:
   http://lists.xen.org/archives/html/xen-devel/2014-11/msg00734.html


* [RFC V9 1/4] domain snapshot terms
  2014-12-16  6:32 [RFC V9 0/4] domain snapshot document Chunyan Liu
@ 2014-12-16  6:32 ` Chunyan Liu
  2014-12-18 15:05   ` Ian Campbell
  2014-12-16  6:32 ` [RFC V9 2/4] domain snapshot overview Chunyan Liu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Chunyan Liu @ 2014-12-16  6:32 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, Chunyan Liu

Changes to V8:
  * add a document defining domain snapshot related terms; later
    documents refer to them.

=====================================================================
Terms

* Active domain: domain created and started

* Inactive domain: domain created but not started

* Domain snapshot:

  Domain snapshot is a system checkpoint of a domain. It contains
  the memory status at the checkpoint and the disk status.

* Disk-only snapshot:

  A disk-only snapshot keeps only the disk state; memory state is
  not saved.

  Contents of disks (whether a subset or all disks associated with
  the domain) are saved at a given point of time, and can be restored
  back to that state. On a running guest, a disk-only snapshot is
  likely to be only crash-consistent rather than clean (that is, it
  represents the state of the disk on a sudden power outage); on an
  inactive guest, a disk-only snapshot is clean if the disks were
  clean when the guest was last shut down.

* Live Snapshot:

  The domain is not paused while the snapshot is being created. As
  with live migration, this increases the size of the memory dump
  file but reduces downtime of the guest.

* Internal Disk Snapshot

  File formats such as qcow2 track both the snapshot and changes
  since the snapshot in a single file.

* External Disk Snapshot

  The snapshot is one file, and the changes since the snapshot
  are in another file.
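
To make the difference between the two disk snapshot kinds concrete,
here is a minimal C sketch that formats the qemu-img command lines one
could use on an offline qcow2 image. The helper names are hypothetical
and are not part of xl or libxl; paths are assumed to need no shell
quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Internal snapshot: the snapshot is recorded inside the image file itself.
 * Writes the command into buf; returns its length, or -1 if buf is too small.
 */
static int internal_snapshot_cmd(char *buf, size_t len,
                                 const char *image, const char *snap_name)
{
    assert(buf != NULL && image != NULL && snap_name != NULL);
    int n = snprintf(buf, len, "qemu-img snapshot -c %s %s", snap_name, image);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * External snapshot: 'base' is left untouched as the snapshot, and 'overlay'
 * is a new qcow2 file, backed by 'base', that receives all later changes.
 */
static int external_snapshot_cmd(char *buf, size_t len,
                                 const char *base, const char *base_fmt,
                                 const char *overlay)
{
    assert(buf != NULL && base != NULL && base_fmt != NULL && overlay != NULL);
    int n = snprintf(buf, len,
                     "qemu-img create -f qcow2 "
                     "-o backing_file=%s,backing_fmt=%s %s",
                     base, base_fmt, overlay);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For a running domain the same distinction applies, but the work would go
through qmp instead of qemu-img.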


* [RFC V9 2/4] domain snapshot overview
  2014-12-16  6:32 [RFC V9 0/4] domain snapshot document Chunyan Liu
  2014-12-16  6:32 ` [RFC V9 1/4] domain snapshot terms Chunyan Liu
@ 2014-12-16  6:32 ` Chunyan Liu
  2014-12-17 12:17   ` Wei Liu
  2014-12-18 15:10   ` Ian Campbell
  2014-12-16  6:32 ` [RFC V9 3/4] domain snapshot design: xl Chunyan Liu
  2014-12-16  6:32 ` [RFC V9 4/4] domain snapshot design: libxl/libxlu Chunyan Liu
  3 siblings, 2 replies; 38+ messages in thread
From: Chunyan Liu @ 2014-12-16  6:32 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, Chunyan Liu

Changes to V8:
  * add an overview document, so that one can get an overall view of
    the whole domain snapshot work: limits, requirements, how it is
    done, etc.

=====================================================================
Domain snapshot overview

1. Purpose

A domain snapshot is a system checkpoint of a domain. Later, one can
roll back the domain to that checkpoint. It's a very useful backup
function. A domain snapshot contains the memory status at the
checkpoint and the disk status (which we call a disk snapshot).

Domain snapshot functionality usually includes:
a) create a domain snapshot
b) roll back (or "revert") to a domain snapshot
c) delete a domain snapshot
d) list all domain snapshots

But following the existing xl idiom of managing storage and saved
VM images via existing CLI commands (qemu-img, lvcreate, ls, mv,
cp, etc.), xl snapshot functionality will be kept as simple as
possible:
* xl will do a) and b): create a snapshot and revert a domain to
  a snapshot.
* xl will NOT do c) and d): xl won't manage snapshots, just as it
  doesn't manage saved images created by 'xl save'. So xl will
  have no knowledge of which domain snapshots exist or of the
  chain relationship between them. It is up to the user to take
  care of the snapshots, know the snapshot chain info, and
  delete snapshots.

Domain snapshot: supported and not supported:
* supported: live snapshot
* supported: internal disk snapshot and external disk snapshot
* supported: different disk backend types.
  (Basic goal is to support 'raw' and 'qcow2' only).

* not supported: snapshot while the domain is shutting down or dying.
* not supported: disk-only snapshot [1].

 [1] xl only deals with active domains, and even when a domain is
 paused there is no operation that flushes data to disk. So after
 taking a disk-only snapshot and then resuming from it, it is as if
 the guest had crashed. For this reason, a disk-only snapshot is
 meaningless to xl and should not be supported.


2. Requirements

General Requirements:
* ability to save/restore domain memory
* ability to create/delete/apply disk snapshot [2]
* ability to parse user config file

  [2] Disk snapshot requirements:
  - external tools: qemu-img, lvcreate, vhd-util, etc.
  - for basic goal, we support 'raw' and 'qcow2' backend types
    only. Then it requires:
    libxl qmp command or "qemu-img" (when qemu process does not
    exist)


3. Interaction with other operations:

None.


4. General workflow

Create a snapshot:
  * parse user cfg file if passed in
  * check whether the snapshot operation is allowed
  * save domain, saving memory status to file (refer to: save_domain)
  * take disk snapshot (e.g. call qmp command)
  * unpause domain

Revert to snapshot:
  * parse user cfg file (xl doesn't manage snapshots, so it has no
    knowledge of which snapshots exist. The user MUST supply a
    configuration file)
  * destroy this domain
  * create a new domain from snapshot info
    - apply disk snapshot (e.g. call qemu-img)
    - a process like restore domain


* [RFC V9 3/4] domain snapshot design: xl
  2014-12-16  6:32 [RFC V9 0/4] domain snapshot document Chunyan Liu
  2014-12-16  6:32 ` [RFC V9 1/4] domain snapshot terms Chunyan Liu
  2014-12-16  6:32 ` [RFC V9 2/4] domain snapshot overview Chunyan Liu
@ 2014-12-16  6:32 ` Chunyan Liu
  2014-12-17 12:28   ` Wei Liu
  2014-12-18 15:15   ` Ian Campbell
  2014-12-16  6:32 ` [RFC V9 4/4] domain snapshot design: libxl/libxlu Chunyan Liu
  3 siblings, 2 replies; 38+ messages in thread
From: Chunyan Liu @ 2014-12-16  6:32 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, Chunyan Liu

Changes to V8:
  * xl won't manage snapshots, that means it won't maintain json files,
    won't maintain snapshot chain relationship, and then as a result
    won't take care of deleting snapshot and listing snapshots.
  * remove snapshot-delete and snapshot-list interface
  * update snapshot-revert interface
  * update snapshot-create/revert implementation

===========================================================================

XL Design

1. User Interface

xl snapshot-create:
  Create a snapshot (disk and RAM) of a domain.

  SYNOPSIS:
    snapshot-create <domain> [<cfgfile>] [--name <string>] [--live]

  OPTIONS:
    --name <string>  snapshot name
    --live           take a live snapshot

    If --live is given, the domain is not paused while the snapshot is
    being created, as in live migration. This increases the size of the
    memory dump file but reduces downtime of the guest.

    If --name is not given, a default name is generated from the
    creation time.

    If <cfgfile> is specified, its settings are used (e.g. if --name
    specifies a name and cfgfile also specifies a name, the name in
    cfgfile is used).
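
The name precedence described above (cfgfile first, then --name, then a
name derived from the creation time) could be sketched in C as follows.
The function name and the "snapshot-YYYYMMDD-HHMMSS" format are
assumptions made for illustration; the design does not fix the format of
the generated name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Pick the snapshot name: cfgfile name wins over the --name option, and a
 * time-based default is used when neither is set.
 * Returns 0 on success, -1 if 'out' is too small or the time is invalid.
 */
static int choose_snapshot_name(const char *cfg_name, const char *cli_name,
                                time_t created, char *out, size_t len)
{
    const char *chosen = NULL;

    assert(out != NULL && len > 0);

    if (cfg_name && cfg_name[0])
        chosen = cfg_name;            /* name from cfgfile takes precedence */
    else if (cli_name && cli_name[0])
        chosen = cli_name;            /* otherwise the --name option */

    if (chosen) {
        int n = snprintf(out, len, "%s", chosen);
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
    }

    /* Neither given: derive a default from the creation time (UTC). */
    struct tm *tm = gmtime(&created);
    if (!tm)
        return -1;
    return strftime(out, len, "snapshot-%Y%m%d-%H%M%S", tm) ? 0 : -1;
}
```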


xl snapshot-revert:
  Revert domain to status of a snapshot.

  SYNOPSIS:
      snapshot-revert <domain> <cfgfile> [--running] [--force]

  OPTIONS:
    --running        after reverting, change state to running
    --force          try harder on risky reverts

    Normally, the domain reverts to the state it was in when the
    snapshot was taken (running or paused).

    If --running is given, the snapshot state is overridden to
    guarantee a running domain after the revert.



2. cfgfile syntax

#snapshot name. If user doesn't provide a VM snapshot name, xl will generate
#a name automatically by the creation time.
name=""

#snapshot description. Default is NULL.
description=""

#memory location. This field should be filled when memory=1. Default is NULL.
memory_path=""

#disk snapshot information
#To simplify config parsing, reuse the disk configuration syntax from
#xl.cfg, but with different meanings.
#disk syntax meaning: 'external path, external format, target device'

#e.g. to specify external disk snapshot, like this:
#disks=['/tmp/hda_snapshot.qcow2,qcow2,hda',
#       '/tmp/hdb_snapshot.qcow2,qcow2,hdb',]

#e.g. to specify internal disk snapshot, like this:
disks=[',,hda',',,hdb',]
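
As an illustration of the 'external path, external format, target device'
syntax, a minimal C sketch of splitting one disk entry might look like
this. The struct and function names are hypothetical, the buffer sizes are
arbitrary, and whitespace around fields is not trimmed; an empty path is
taken to mean an internal snapshot, matching the two examples above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct snap_disk_spec {
    char path[256];    /* external path, empty for internal snapshot */
    char format[32];   /* external format, may be empty */
    char target[32];   /* target device, e.g. "hda" */
    bool external;
};

/* Parse "path,format,target". Returns 0 on success, -1 on malformed input. */
static int parse_snap_disk_spec(const char *spec, struct snap_disk_spec *out)
{
    assert(spec != NULL && out != NULL);

    const char *c1 = strchr(spec, ',');
    const char *c2 = c1 ? strchr(c1 + 1, ',') : NULL;
    if (!c2 || strchr(c2 + 1, ','))
        return -1;                       /* need exactly three fields */

    size_t plen = (size_t)(c1 - spec);
    size_t flen = (size_t)(c2 - c1 - 1);
    size_t tlen = strlen(c2 + 1);
    if (plen >= sizeof out->path || flen >= sizeof out->format ||
        tlen == 0 || tlen >= sizeof out->target)
        return -1;                       /* too long, or no target device */

    memcpy(out->path, spec, plen);
    out->path[plen] = '\0';
    memcpy(out->format, c1 + 1, flen);
    out->format[flen] = '\0';
    memcpy(out->target, c2 + 1, tlen + 1);
    out->external = plen > 0;            /* a path implies external snapshot */
    return 0;
}
```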


3. xl snapshot-xxx implementation

"xl snapshot-create"

    1), parse args or user configuration file.
    2), save domain (store saved memory to memory_path)
    3), create disk snapshots according to disk snapshot configuration
    4), unpause domain

"xl snapshot-revert"

    1), parse user configuration file
    2), destroy current domain
    3), revert disk snapshots according to disk snapshot configuration
    4), restore domain from saved memory.

4. Notes

* user should take care of those snapshots, like: saved memory file, disk
  snapshots info (internal, external, etc.), snapshot chain relationship
* user should delete snapshots by themselves with CLI commands like: rm,
  qemu-img, etc.


* [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-16  6:32 [RFC V9 0/4] domain snapshot document Chunyan Liu
                   ` (2 preceding siblings ...)
  2014-12-16  6:32 ` [RFC V9 3/4] domain snapshot design: xl Chunyan Liu
@ 2014-12-16  6:32 ` Chunyan Liu
  2014-12-17 14:09   ` Wei Liu
  2014-12-18 15:27   ` Ian Campbell
  3 siblings, 2 replies; 38+ messages in thread
From: Chunyan Liu @ 2014-12-16  6:32 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, Chunyan Liu

Changes to V8:
  * remove libxl_domain_snapshot_create/delete/revert API
  * export disk snapshot functionality for both xl and libvirt usage

===========================================================================
Libxl/libxlu Design

1. New Structures

libxl_disk_snapshot = Struct("disk_snapshot",[
    # target disk
    ("disk",            libxl_device_disk),

    # disk snapshot name
    ("name",            string),

    # internal/external disk snapshot?
    ("external",        bool),

    # for external disk snapshot, specify the following two fields
    ("external_format", string),
    ("external_path",   string),
    ])
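
For readers less familiar with the IDL, the following C sketch mirrors the
fields of the structure and checks the one constraint this design states
for them: external_path must be set when 'external' is true. Everything
prefixed "sketch_" or "SKETCH_" is a stand-in invented here so the example
compiles without libxl; the real type would be generated from the IDL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for libxl_device_disk, reduced to two of its fields. */
typedef struct {
    const char *vdev;       /* virtual device name, e.g. "hda" */
    const char *pdev_path;  /* backend image path */
} sketch_device_disk;

/* Mirrors the fields of the proposed libxl_disk_snapshot. */
typedef struct {
    sketch_device_disk disk;
    const char *name;
    bool external;
    const char *external_format;  /* optional, default chosen by backend */
    const char *external_path;    /* required when external is true */
} sketch_disk_snapshot;

enum { SKETCH_OK = 0, SKETCH_ERROR_INVAL = -1 };

/* Validate one entry before it is handed to a snapshot operation. */
static int sketch_disk_snapshot_check(const sketch_disk_snapshot *s)
{
    assert(s != NULL);
    if (!s->disk.vdev)
        return SKETCH_ERROR_INVAL;            /* no target disk */
    if (s->external && (!s->external_path || !s->external_path[0]))
        return SKETCH_ERROR_INVAL;            /* external needs a path */
    return SKETCH_OK;                         /* external_* ignored otherwise */
}
```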


2. New Functions

Since there are already APIs for saving memory (libxl_domain_suspend)
and for restoring a domain from saved memory
(libxl_domain_create_restore), the missing part for xl domain snapshot
tasks is disk snapshot functionality. Disk snapshot functionality
would be used by libvirt too.

Considering there is qmp handling in creating/deleting disk snapshots,
the following new functions will be added to libxl (?):

int libxl_disk_snapshot_create(libxl_ctx *ctx, uint32_t domid,
                               libxl_disk_snapshot *snapshot, int nb);

    Take disk snapshots of a group of domain disks according to
    configuration. For the qcow2 disk backend type, it will call the qmp
    "transaction" command to do the work. For other disk backend types,
    it might call other external commands.

    Parameters:
       ctx (INPUT):
           context
       domid (INPUT):
           domain id
       snapshot (INPUT):
           array of disk snapshot configuration. Has "nb" members.

           libxl_device_disk:
               structure to represent which disk.
           name:
               snapshot name.
           external:
               internal snapshot or external snapshot.
               'false' means internal disk snapshot. external_format and
               external_path will be ignored.
               'true' means external disk snapshot, then external_format
               and external_path should be provided.
           external_format:
               Should be provided when 'external' is true. If not provided,
               the default format appropriate to the backend file is used.
               Ignored when 'external' is false.
           external_path:
               Must be provided when 'external' is true.
               Ignored when 'external' is false.
       nb (INPUT):
           number of disks to take a disk snapshot of.

    Return:
       0 on success, -1 on error.
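
As a rough illustration of the qmp "transaction" approach, the sketch below
builds the JSON for one transaction covering several disks, using the QEMU
actions blockdev-snapshot-internal-sync and blockdev-snapshot-sync. It is
not the proposed libxl implementation: the struct, the helper names and the
device names are assumptions, strings are not JSON-escaped, and falling back
to "qcow2" when no format is given is a choice made only for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct qmp_snap_req {
    const char *device;           /* QEMU block device name (hypothetical) */
    const char *name;             /* snapshot name, used for internal */
    bool external;
    const char *external_format;  /* NULL means "qcow2" in this sketch */
    const char *external_path;
};

/* Append formatted text at *off. Returns 0, or -1 if buf would overflow. */
static int appendf(char *buf, size_t len, size_t *off, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *off, len - *off, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= len - *off)
        return -1;
    *off += (size_t)n;
    return 0;
}

/* Build one "transaction" command so all disks are snapshotted together. */
static int build_snapshot_transaction(const struct qmp_snap_req *reqs, int nb,
                                      char *buf, size_t len)
{
    size_t off = 0;

    assert(reqs != NULL && nb > 0 && buf != NULL && len > 0);

    if (appendf(buf, len, &off,
                "{\"execute\":\"transaction\",\"arguments\":{\"actions\":["))
        return -1;

    for (int i = 0; i < nb; i++) {
        const struct qmp_snap_req *r = &reqs[i];
        int rc;

        if (r->external)
            rc = appendf(buf, len, &off,
                "%s{\"type\":\"blockdev-snapshot-sync\",\"data\":"
                "{\"device\":\"%s\",\"snapshot-file\":\"%s\","
                "\"format\":\"%s\"}}",
                i ? "," : "", r->device, r->external_path,
                r->external_format ? r->external_format : "qcow2");
        else
            rc = appendf(buf, len, &off,
                "%s{\"type\":\"blockdev-snapshot-internal-sync\",\"data\":"
                "{\"device\":\"%s\",\"name\":\"%s\"}}",
                i ? "," : "", r->device, r->name);
        if (rc)
            return -1;
    }
    return appendf(buf, len, &off, "]}}");
}
```

Putting all disks into a single transaction means QEMU either takes every
disk snapshot or none of them, which is the reason for grouping the disks
in one call.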


/*  This API might not be used by xl, since xl won't take care of deleting
 *  snapshots. But for libvirt, since libvirt manages snapshots and will
 *  delete snapshot, this API will be used.
 */
int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid,
                               libxl_disk_snapshot *snapshot, int nb);

    Delete disk snapshots of a group of domain disks according to
    configuration. For the qcow2 disk backend type, it will call a qmp
    command to delete the internal disk snapshot. For other disk backend
    types, it might call other external commands.

    Parameters:
       ctx (INPUT):
           context
       domid (INPUT):
           domain id
       snapshot (INPUT):
           array of disk snapshot configuration. Has "nb" members.
       nb (INPUT):
           number of disks whose disk snapshots are to be deleted.

    Return:
       0 on success, -1 on error.


int libxl_disk_to_snapshot(libxl_ctx *ctx, uint32_t domid,
                           libxl_disk_snapshot **snapshot, int *num);

    This is for domain snapshot create. If the user doesn't specify disks,
    then by default an internal disk snapshot is taken of each domain
    disk. This function fills in libxl_disk_snapshot entries according to
    the domain's disk info.

    Parameters:
       ctx (INPUT):
           context
       domid (INPUT):
           domain id
       snapshot (OUTPUT):
           array of disk snapshot configuration.
       num (OUTPUT):
           number of disks.

    Return:
       0 on success, -1 on error.
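
The default-filling behaviour could look roughly like the C sketch below.
It is only an illustration: the disk list is passed in as an array of
device names so the example compiles without libxl (a real implementation
would presumably obtain the disks from libxl_device_disk_list), and the
"sketch_" names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *vdev;             /* target disk */
    const char *name;             /* snapshot name */
    bool external;
    const char *external_format;
    const char *external_path;
} sketch_snapshot;

/*
 * Produce one internal-snapshot entry per domain disk.
 * On success *out is a malloc'ed array of *num entries that the caller
 * frees. Returns 0 on success, -1 on allocation failure.
 */
static int sketch_disk_to_snapshot(const char *const *vdevs, int ndisks,
                                   const char *snap_name,
                                   sketch_snapshot **out, int *num)
{
    assert(out != NULL && num != NULL);
    *out = NULL;
    *num = 0;
    if (ndisks <= 0)
        return 0;                        /* no disks, nothing to fill */

    sketch_snapshot *arr = calloc((size_t)ndisks, sizeof(*arr));
    if (!arr)
        return -1;

    for (int i = 0; i < ndisks; i++) {
        arr[i].vdev = vdevs[i];
        arr[i].name = snap_name;
        arr[i].external = false;         /* default: internal snapshot */
        arr[i].external_format = NULL;
        arr[i].external_path = NULL;
    }
    *out = arr;
    *num = ndisks;
    return 0;
}
```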



For disk snapshot revert there is no qmp command; it always calls
external commands to do the work, so it is put in libxlu (?):

int xlu_disk_snapshot_revert(libxl_disk_snapshot *snapshot, int nb);

    Apply disk snapshots to a group of disks according to configuration.
    Different external commands are called for different disk backend
    types.

    Parameters:
       snapshot (INPUT):
           array of disk snapshot configuration. Has "nb" members.
       nb (INPUT):
           number of disks whose disk snapshots are to be applied.

    Return:
       0 on success, -1 on error.
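
For the qcow2 internal-snapshot case, the external command is
"qemu-img snapshot -a <name> <image>". A minimal C sketch of preparing
that argument vector follows. The function name is hypothetical, and
external snapshots are deliberately left out because this document does
not say which command sequence would revert them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill argv with the command that applies an internal qcow2 snapshot.
 * argv must have room for 6 entries and is NULL-terminated on success,
 * so it can be passed on to an exec-style call.
 * Returns 0 on success, -1 for cases this sketch does not cover.
 */
static int build_revert_argv(const char *image, const char *snap_name,
                             bool external, const char *argv[6])
{
    assert(image != NULL && snap_name != NULL && argv != NULL);

    if (external)
        return -1;          /* external revert is not covered here */

    argv[0] = "qemu-img";
    argv[1] = "snapshot";
    argv[2] = "-a";         /* apply (revert to) the named snapshot */
    argv[3] = snap_name;
    argv[4] = image;
    argv[5] = NULL;
    return 0;
}
```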



3. Simple Architecture View

Creating domain snapshot:
(* means new functions we will need in libxl/libxlu)

  "xl snapshot-create"
         |
  parse configuration ----> libxl_disk_to_snapshot (*)
         |
  saving memory ----> libxl_domain_suspend
         |
 taking disk snapshot ----> libxl_disk_snapshot_create (*)
         |                     |
         |                     --> libxl_qmp_disk_snapshot_transaction (*)
         |
    unpause domain ---->libxl_domain_unpause
         |
        End
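
The ordering in the create diagram can be expressed as a small C sketch
with the three libxl steps abstracted behind function pointers. The ops
struct, the tracing stubs and the error policy (still unpause the domain
when the disk snapshot step fails) are assumptions for illustration only;
the design above does not specify error handling.

```c
#include <assert.h>
#include <stddef.h>

struct snap_create_ops {
    int (*save_memory)(void *ctx);     /* stands in for libxl_domain_suspend */
    int (*snapshot_disks)(void *ctx);  /* libxl_disk_snapshot_create */
    int (*unpause)(void *ctx);         /* libxl_domain_unpause */
};

/* Run save -> disk snapshot -> unpause. Returns 0, or the first error. */
static int snapshot_create_flow(const struct snap_create_ops *ops, void *ctx)
{
    assert(ops && ops->save_memory && ops->snapshot_disks && ops->unpause);

    int rc = ops->save_memory(ctx);
    if (rc)
        return rc;                     /* nothing saved, stop here */

    rc = ops->snapshot_disks(ctx);
    /* Assumed policy: let the guest continue even if disks failed. */
    int urc = ops->unpause(ctx);
    return rc ? rc : urc;
}

/* Tracing stubs that record the call order instead of touching a domain. */
struct trace { char steps[8]; int n; int fail_disks; };

static int t_save(void *c)
{
    struct trace *t = c;
    t->steps[t->n++] = 'S';
    return 0;
}

static int t_disks(void *c)
{
    struct trace *t = c;
    t->steps[t->n++] = 'D';
    return t->fail_disks ? -1 : 0;
}

static int t_unpause(void *c)
{
    struct trace *t = c;
    t->steps[t->n++] = 'U';
    return 0;
}

static const struct snap_create_ops trace_ops = { t_save, t_disks, t_unpause };
```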


Reverting to a snapshot:
(* means new functions we will need in libxl/libxlu)

  "xl snapshot-revert"
         |
   parse configuration
         |
   destroy domain ---->libxl_domain_destroy
         |
 reverting disk snapshot ----> xlu_disk_snapshot_revert (*)
         |                       |
         |                       --> call 'qemu-img' to apply disk snapshot
         |
 restore domain from saved memory ----> libxl_domain_create_restore
         |
        End
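
The revert sequence differs from create in that every step depends on the
previous one, so a sketch would stop at the first failure and report where
it stopped. The enum, the step_fn type and the demo steps below are
hypothetical; in xl the three steps would correspond to
libxl_domain_destroy, xlu_disk_snapshot_revert and
libxl_domain_create_restore.

```c
#include <assert.h>
#include <stddef.h>

enum revert_step {
    STEP_DESTROY,        /* destroy the current domain */
    STEP_REVERT_DISKS,   /* apply disk snapshots */
    STEP_RESTORE,        /* restore domain from saved memory */
    STEP_DONE            /* all steps succeeded */
};

typedef int (*step_fn)(void *ctx);

/*
 * Run the three revert steps in order. Returns STEP_DONE on success,
 * otherwise the step that failed; later steps are not attempted.
 */
static enum revert_step snapshot_revert_flow(const step_fn steps[3], void *ctx)
{
    assert(steps != NULL);
    for (int i = 0; i < 3; i++) {
        assert(steps[i] != NULL);
        if (steps[i](ctx) != 0)
            return (enum revert_step)i;
    }
    return STEP_DONE;
}

/* Demo steps: count invocations through ctx and succeed or fail. */
static int step_ok(void *ctx)   { ++*(int *)ctx; return 0; }
static int step_fail(void *ctx) { ++*(int *)ctx; return -1; }
```

Note that once the destroy step has run there is no way back to the
previous domain, so a failure in a later step leaves no running domain.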


* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-16  6:32 ` [RFC V9 2/4] domain snapshot overview Chunyan Liu
@ 2014-12-17 12:17   ` Wei Liu
  2014-12-18  3:34     ` Chun Yan Liu
  2014-12-18 15:10   ` Ian Campbell
  1 sibling, 1 reply; 38+ messages in thread
From: Wei Liu @ 2014-12-17 12:17 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, xen-devel

On Tue, Dec 16, 2014 at 02:32:55PM +0800, Chunyan Liu wrote:
> Changes to V8:
>   * add an overview document, so that one can has a overall look
>     about the whole domain snapshot work, limits, requirements,
>     how to do, etc.
> 
> =====================================================================
> Domain snapshot overview
> 
> 1. Purpose
> 
> Domain snapshot is a system checkpoint of a domain. Later, one can
> roll back the domain to that checkpoint. It's a very useful backup
> function. A domain snapshot contains the memory status at the
> checkpoint and the disk status (which we called disk snapshot).
> 
> Domain snapshot functionality usually includes:
> a) create a domain snapshot
> b) roll back (or called "revert") to a domain snapshot
> c) delete a domain snapshot
> d) list all domain snapshots
> 
> But following the existing xl idioms of managing storage and saved
> VM images via existing CLI command (qemu-img, lvcreate, ls, mv,
> cp etc), xl snapshot functionality would be kept as simple as
> possible:
> * xl will do a) and b), creating a snapshot and reverting a
>   domain to a snapshot.
> * xl will NOT do c) and d), xl won't manage snapshots, as xl
>   doesn't maintain saved images created by 'xl save'. So xl
>   will have no idea of the existence of domain snapshots and
>   the chain relationship between snapshots. It will depends on
>   user to take care of the snapshots, know the snapshot chain
>   info, and delete snapshots.
> 
> Domain Snapshot Support and Not Support:

I think this list applies to xl (last item and [1]). If so please state
clearly to prevent confusion with other toolstack (say, libvirt) and
functionalities of the library (libxl).

> * support live snapshot
> * support internal disk snapshot and external disk snapshot
> * support different disk backend types.
>   (Basic goal is to support 'raw' and 'qcow2' only).
> 
> * not support snapshot when domain is shutdowning or dying.
> * not support disk-only snapshot [1].
> 
>  [1] To xl, it only concerns active domains, and even when domain
>  is paused, there is no data flush to disk operation. So, take
>  a disk-only snapshot and then resume, it is as if the guest
>  had crashed. For this reason, disk-only snapshot is meaningless
>  to xl. Should not support.
> 

I think I understand your reasoning, but it's a bit convoluted to me.

Domain can be in both active and inactive state (libvirt term) when
using xl.  When domain is active, we cannot guarantee in xl that domain
is quiesced so a disk-only snapshot may contain inconsistent data. When
domain is inactive, there's no point in taking a disk-only snapshot
because it would be the same as the base image. So the conclusion is
that xl doesn't need to support disk-only snapshot.

Does the above reasoning match yours? Is it clearer or more
confusing?

Wei.

> 
> 2. Requirements
> 
> General Requirements:
> * ability to save/restore domain memory
> * ability to create/delete/apply disk snapshot [2]
> * ability to parse user config file
> 
>   [2] Disk snapshot requirements:
>   - external tools: qemu-img, lvcreate, vhd-util, etc.
>   - for basic goal, we support 'raw' and 'qcow2' backend types
>     only. Then it requires:
>     libxl qmp command or "qemu-img" (when qemu process does not
>     exist)
> 
> 
> 3. Interaction with other operations:
> 
> No.
> 
> 
> 4. General workflow
> 
> Create a snapshot:
>   * parse user cfg file if passed in
>   * check snapshot operation is allowed or not
>   * save domain, saving memory status to file (refer to: save_domain)
>   * take disk snapshot (e.g. call qmp command)
>   * unpause domain
> 
> Revert to snapshot:
>   * parse use cfg file (xl doesn't manage snapshots, so it has no
>     idea of snapshot existence. User MUST supply configuration file)
>   * destroy this domain
>   * create a new domain from snapshot info
>     - apply disk snapshot (e.g. call qemu-img)
>     - a process like restore domain


* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-16  6:32 ` [RFC V9 3/4] domain snapshot design: xl Chunyan Liu
@ 2014-12-17 12:28   ` Wei Liu
  2014-12-18  3:23     ` Chun Yan Liu
  2014-12-18 15:15   ` Ian Campbell
  1 sibling, 1 reply; 38+ messages in thread
From: Wei Liu @ 2014-12-17 12:28 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, xen-devel

On Tue, Dec 16, 2014 at 02:32:56PM +0800, Chunyan Liu wrote:
> Changes to V8:
>   * xl won't manage snapshots, that means it won't maintain json files,
>     won't maintain snapshot chain relationship, and then as a result
>     won't take care of deleting snapshot and listing snapshots.
>   * remove snapshot-delete and snapshot-list interface
>   * update snapshot-revert interface
>   * update snapshot-create/revert implementaion
> 
> ===========================================================================
> 
> XL Design
> 
> 1. User Interface
> 
> xl snapshot-create:
>   Create a snapshot (disk and RAM) of a domain.
> 
>   SYNOPSIS:
>     snapshot-create <domain> [<cfgfile>] [--name <string>] [--live]
> 
>   OPTIONS:
>     --name <string>  snapshot name
>     --live           take a live snapshot
> 
>     If option includes --live, then the domain is not paused while creating
>     the snapshot, like live migration. This increases size of the memory
>     dump file, but reducess downtime of the guest.
> 
>     If option doens't include --name, a default name will be generated
>     according to the creation time.
> 
>     If specify @cfgfile, use cfgfile. (e.g. if --name specifies a name,
>     meanwhile there is name specified in cfgfile, name in cfgfile will
>     be used.)
> 
> 
> xl snapshot-revert:
>   Revert domain to status of a snapshot.
> 
>   SYNOPSIS:
>       snapshot-revert <domain> <cfgfile> [--running] [--force]
> 
>   OPTIONS:
>     --running        after reverting, change state to running
>     --force          try harder on risky reverts
> 
>     Normally, the domain will revert to the same state the domain was in while
>     the snapshot was taken (whether running, or paused).
> 
>     If option includes --running, then overrides the snapshot state to
>     guarantee a running domain after the revert.
> 
> 
> 
> 2. cfgfile syntax
> 
> #snapshot name. If user doesn't provide a VM snapshot name, xl will generate
> #a name automatically by the creation time.
> name=""
> 
> #snapshot description. Default is NULL.
> description=""
> 
> #memory location. This field should be filled when memory=1. Default is NULL.
> memory_path=""
> 
> #disk snapshot information
> #For easier parse config work, reuse disk configuration in xl.cfg, but
> #with different meanings.
> #disk syntax meaning: 'external path, external format, target device'
> 
> #e.g. to specify exernal disk snapshot, like this:
> #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda',
>         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',]
> 
> #e.g. to specify internal disk snapshot, like this:
> disks=[',,hda',',,hdb',]
> 

How is snapshot chain represented with this syntax? Does xl not need to
know about the chain? (Note, this is different than managing the chain)

Wei.


* Re: [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-16  6:32 ` [RFC V9 4/4] domain snapshot design: libxl/libxlu Chunyan Liu
@ 2014-12-17 14:09   ` Wei Liu
  2014-12-18  3:01     ` Chun Yan Liu
  2014-12-18 15:27   ` Ian Campbell
  1 sibling, 1 reply; 38+ messages in thread
From: Wei Liu @ 2014-12-17 14:09 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, ian.campbell, xen-devel

On Tue, Dec 16, 2014 at 02:32:57PM +0800, Chunyan Liu wrote:
> Changes to V8:
>   * remove libxl_domain_snapshot_create/delete/revert API
>   * export disk snapshot functionality for both xl and libvirt usage
> 
> ===========================================================================
> Libxl/libxlu Design
> 
> 1. New Structures
> 
> libxl_disk_snapshot = Struct("disk_snapshot",[
>     # target disk
>     ("disk",            libxl_device_disk),
> 
>     # disk snapshot name
>     ("name",            string),
> 
>     # internal/external disk snapshot?
>     ("external",        bool),
> 
>     # for external disk snapshot, specify following two field
>     ("external_format", string),
>     ("external_path",   string),
>     ])
> 

So you don't propose giving libxl knowledge of the snapshot
chains? And in libvirt (or other toolstack that's interested in snapshot
management) you represent a snapshot as chains (or trees) of
libxl_disk_snapshot? (Not suggesting you do things the other way around,
just to confirm)

> 
> 2. New Functions
> 
> Since there're already APIs for saving memory (libxl_domain_suspend)
> and restoring domain from saved memory (libxl_domain_create_restore), to
> xl domain snapshot tasks, the missing part is disk snapshot functionality.
> And the disk snapshot functionality would be used by libvirt too.
> 
> Considering there is qmp handling in creating/deleting disk snapshot,
> will add following new functions to libxl (?):
> 
> int libxl_disk_snapshot_create(libxl_ctx *ctx, uint32_t domid,
>                                libxl_disk_snapshot *snapshot, int nb);
> 
>     Taking disk snapshots to a group of domain disks according to
>     configuration. For qcow2 disk backend type, it will call qmp
>     "transaction" command to do the work. For other disk backend types,
>     might call other external commands.
> 
>     Parameters:
>        ctx (INPUT):
>            context
>        domid (INPUT):
>            domain id
>        snapshot (INPUT):
>            array of disk snapshot configuration. Has "nb" members.
> 
>            libxl_device_disk:
>                structure to represent which disk.
>            name:
>                snapshot name.
>            external:
>                internal snapshot or external snapshot.
>                'false' means internal disk snapshot. external_format and
>                external_path will be ignored.
>                'true' means external disk snapshot, then external_format
>                and external_path should be provided.
>            external_format:
>                Should be provided when 'external' is true. If not provided,
>                will use default format proper to the backend file.
>                Ignored when 'external' is false.
>            external_path:
>                Must be provided when 'external' is true.
>                Ignored when 'external' is false.
>        nb (INPUT):
>            number of disks that need to take disk snapshot.
> 
>     Return:
>        0 on success, -1 on error.
> 

It should return appropriate libxl error code (ERROR_*) on error.

> 
> /*  This API might not be used by xl, since xl won't take care of deleting
>  *  snapshots. But for libvirt, since libvirt manages snapshots and will
>  *  delete snapshot, this API will be used.
>  */
> int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid,
>                                libxl_disk_snapshot *snapshot, int nb);
> 
>     Delete disk snapshot of a group of domain disks according to
>     configuration. For qcow2 disk backend type, it will call qmp command
>     to delete internal disk snapshot. For other disk backend types, might
>     call other external commands.
> 
>     Parameters:
>        ctx (INPUT):
>            context
>        domid (INPUT):
>            domain id
>        snapshot (INPUT):
>            array of disk snapshot configuration. Has "nb" members.
>        nb (INPUT):
>            number of disks that need to take disk snapshot.
> 
>     Return:
>        0 on success, -1 on error.
> 
> 
> int libxl_disk_to_snapshot(libxl_ctx *ctx, uint32_t domid,
>                            libxl_disk_snapshot **snapshot, int *num);
> 
>     This is for domain snapshot create. If user doesn't specify disks,
>     then by default it will take internal disk snapshot to each domain
>     disk. This function will fill libxl_disk_snapshot according to domain
>     disks info.
> 
>     Parameters:
>        ctx (INPUT):
>            context
>        domid (INPUT):
>            domain id
>        snapshot (OUTPUT):
>            array of disk snapshot configuration.
>        num (OUTPUT):
>            number of disks.
> 
>     Return:
>        0 on success, -1 on error.
> 
> 
> 
> For disk snapshot revert, no qmp command for that, it always calls
> external commands to finish the work, so put in libxlu (?):
> 
> int xlu_disk_snapshot_revert(libxl_disk_snapshot *snapshot, int nb);
> 

IMHO it's fine for this to be in libxl. Calling out to other programs
is fine.

Wei.

>     Apply disk snapshot for a group of disks according to configuration. To
>     different disk backend types, call different external commands to do
>     the work.
> 
>     Parameters:
>        snapshot (INPUT):
>            array of disk snapshot configuration. Has "nb" members.
>        nb (INPUT):
>            number of disks that need to take disk snapshot.
> 
>     Return:
>        0 on success, -1 on error.
> 
> 
> 
> 3. Simple Architecture View
> 
> Creating domain snapshot:
> (* means new functions we will need in libxl/libxlu)
> 
>   "xl snapshot-create"
>          |
>   parse configuration ----> libxl_disk_to_snapshot (*)
>          |
>   saving memory ----> libxl_domain_suspend
>          |
>  taking disk snapshot ----> libxl_disk_snapshot_create (*)
>          |                     |
>          |                     --> libxl_qmp_disk_snapshot_transaction (*)
>          |
>     unpause domain ---->libxl_domain_unpause
>          |
>         End
> 
> 
> Reverting to a snapshot:
> (* means new functions we will need in libxl/libxlu)
> 
>   "xl snapshot-revert"
>          |
>    parse configuration
>          |
>    destroy domain ---->libxl_domain_destroy
>          |
>  reverting disk snapshot ----> xlu_disk_snapshot_revert (*)
>          |                       |
>          |                       --> call 'qemu-img' to apply disk snapshot
>          |
>  restore domain from saved memory ----> libxl_domain_create_restore
>          |
>         End


* Re: [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-17 14:09   ` Wei Liu
@ 2014-12-18  3:01     ` Chun Yan Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-18  3:01 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jim Fehlig, ian.jackson, ian.campbell, xen-devel



>>> On 12/17/2014 at 10:09 PM, in message
<20141217140958.GH1904@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
wrote: 
> On Tue, Dec 16, 2014 at 02:32:57PM +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * remove libxl_domain_snapshot_create/delete/revert API 
> >   * export disk snapshot functionality for both xl and libvirt usage 
> >  
> > =========================================================================== 
> > Libxl/libxlu Design 
> >  
> > 1. New Structures 
> >  
> > libxl_disk_snapshot = Struct("disk_snapshot",[ 
> >     # target disk 
> >     ("disk",            libxl_device_disk), 
> >  
> >     # disk snapshot name 
> >     ("name",            string), 
> >  
> >     # internal/external disk snapshot? 
> >     ("external",        bool), 
> >  
> >     # for external disk snapshot, specify following two field 
> >     ("external_format", string), 
> >     ("external_path",   string), 
> >     ]) 
> >  
>  
> So you don't propose making libxl to have knowledge of the snapshot 
> chains? 
Right.

> And in libvirt (or other toolstack that's interested in snapshot 
> management) you represent a snapshot as chains (or trees) of 
> libxl_disk_snapshot? (Not suggesting you do things the other way around, 
> just to confirm) 

Libvirt has its own data structures for managing domain snapshots, including
its own representation of disk snapshot information. libxl_disk_snapshot is
used by libvirt only as the argument type when it calls the libxl API to do
the disk snapshot work.


>  
> >  
> > 2. New Functions 
> >  
> > Since there're already APIs for saving memory (libxl_domain_suspend) 
> > and restoring domain from saved memory (libxl_domain_create_restore), to 
> > xl domain snapshot tasks, the missing part is disk snapshot functionality. 
> > And the disk snapshot functionality would be used by libvirt too. 
> >  
> > Considering there is qmp handling in creating/deleting disk snapshot, 
> > will add following new functions to libxl (?): 
> >  
> > int libxl_disk_snapshot_create(libxl_ctx *ctx, uint32_t domid, 
> >                                libxl_disk_snapshot *snapshot, int nb); 
> >  
> >     Taking disk snapshots to a group of domain disks according to 
> >     configuration. For qcow2 disk backend type, it will call qmp 
> >     "transaction" command to do the work. For other disk backend types, 
> >     might call other external commands. 
> >  
> >     Parameters: 
> >        ctx (INPUT): 
> >            context 
> >        domid (INPUT): 
> >            domain id 
> >        snapshot (INPUT): 
> >            array of disk snapshot configuration. Has "nb" members. 
> >  
> >            libxl_device_disk: 
> >                structure to represent which disk. 
> >            name: 
> >                snapshot name. 
> >            external: 
> >                internal snapshot or external snapshot. 
> >                'false' means internal disk snapshot. external_format and 
> >                external_path will be ignored. 
> >                'true' means external disk snapshot, then external_format 
> >                and external_path should be provided. 
> >            external_format: 
> >                Should be provided when 'external' is true. If not provided, 
> >                will use default format proper to the backend file. 
> >                Ignored when 'external' is false. 
> >            external_path: 
> >                Must be provided when 'external' is true. 
> >                Ignored when 'external' is false. 
> >        nb (INPUT): 
> >            number of disks that need to take disk snapshot. 
> >  
> >     Return: 
> >        0 on success, -1 on error. 
> >  
>  
> It should return appropriate libxl error code (ERROR_*) on error. 

Thanks. That's better.
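
As a rough illustration of what that could look like, here is a minimal
sketch of an argument check that reports a libxl-style error code instead
of a bare -1. Everything in it is an assumption for illustration: the
struct is a simplified stand-in for the proposed libxl_disk_snapshot (the
target disk is reduced to a vdev string), SKETCH_ERROR_INVAL stands in for
libxl's ERROR_INVAL, and disk_snapshot_check is a hypothetical helper, not
an existing libxl function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for libxl's ERROR_INVAL; the numeric value is an assumption
 * made for this sketch, real code would use the constant from libxl.h. */
enum { SKETCH_ERROR_INVAL = -6 };

/* Simplified, hypothetical mirror of the proposed libxl_disk_snapshot. */
struct disk_snapshot_sketch {
    const char *vdev;            /* target disk, e.g. "hda" */
    const char *name;            /* disk snapshot name */
    bool external;               /* false: internal, true: external */
    const char *external_format; /* optional when external is true */
    const char *external_path;   /* required when external is true */
};

/* Returns 0 if all nb entries are well formed, or a negative error code.
 * Passing a NULL array is treated as a caller bug. */
int disk_snapshot_check(const struct disk_snapshot_sketch *snapshot, int nb)
{
    assert(snapshot != NULL);

    if (nb <= 0)
        return SKETCH_ERROR_INVAL;

    for (int i = 0; i < nb; i++) {
        const struct disk_snapshot_sketch *s = &snapshot[i];

        if (s->vdev == NULL || s->name == NULL)
            return SKETCH_ERROR_INVAL;

        /* external_path must be provided for an external snapshot;
         * external_format may be left unset to get a default. */
        if (s->external &&
            (s->external_path == NULL || s->external_path[0] == '\0'))
            return SKETCH_ERROR_INVAL;
    }
    return 0;
}
```

The same convention would apply to the delete, revert and
libxl_disk_to_snapshot calls, so that callers can tell a bad argument from
a failed operation.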

>  
> >  
> > /*  This API might not be used by xl, since xl won't take care of deleting 
> >  *  snapshots. But for libvirt, since libvirt manages snapshots and will 
> >  *  delete snapshot, this API will be used. 
> >  */ 
> > int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid, 
> >                                libxl_disk_snapshot *snapshot, int nb); 
> >  
> >     Delete disk snapshot of a group of domain disks according to 
> >     configuration. For qcow2 disk backend type, it will call qmp command 
> >     to delete internal disk snapshot. For other disk backend types, might 
> >     call other external commands. 
> >  
> >     Parameters: 
> >        ctx (INPUT): 
> >            context 
> >        domid (INPUT): 
> >            domain id 
> >        snapshot (INPUT): 
> >            array of disk snapshot configuration. Has "nb" members. 
> >        nb (INPUT): 
> >            number of disks that need to take disk snapshot. 
> >  
> >     Return: 
> >        0 on success, -1 on error. 
> >  
> >  
> > int libxl_disk_to_snapshot(libxl_ctx *ctx, uint32_t domid, 
> >                            libxl_disk_snapshot **snapshot, int *num); 
> >  
> >     This is for domain snapshot create. If user doesn't specify disks, 
> >     then by default it will take internal disk snapshot to each domain 
> >     disk. This function will fill libxl_disk_snapshot according to domain 
> >     disks info. 
> >  
> >     Parameters: 
> >        ctx (INPUT): 
> >            context 
> >        domid (INPUT): 
> >            domain id 
> >        snapshot (OUTPUT): 
> >            array of disk snapshot configuration. 
> >        num (OUTPUT): 
> >            number of disks. 
> >  
> >     Return: 
> >        0 on success, -1 on error. 
> >  
> >  
> >  
> > For disk snapshot revert, no qmp command for that, it always calls 
> > external commands to finish the work, so put in libxlu (?): 
> >  
> > int xlu_disk_snapshot_revert(libxl_disk_snapshot *snapshot, int nb); 
> >  
>  
> IMHO it's fine for this to be in libxl. Calling out to other programs 
> is fine. 

OK. Thanks. Then we can put all disk snapshot APIs in libxl.

>  
> Wei. 
>  
> >     Apply disk snapshot for a group of disks according to configuration. To 
> >     different disk backend types, call different external commands to do 
> >     the work. 
> >  
> >     Parameters: 
> >        snapshot (INPUT): 
> >            array of disk snapshot configuration. Has "nb" members. 
> >        nb (INPUT): 
> >            number of disks that need to take disk snapshot. 
> >  
> >     Return: 
> >        0 on success, -1 on error. 
> >  
> >  
> >  
> > 3. Simple Architecture View 
> >  
> > Creating domain snapshot: 
> > (* means new functions we will need in libxl/libxlu) 
> >  
> >   "xl snapshot-create" 
> >          | 
> >   parse configuration ----> libxl_disk_to_snapshot (*) 
> >          | 
> >   saving memory ----> libxl_domain_suspend 
> >          | 
> >  taking disk snapshot ----> libxl_disk_snapshot_create (*) 
> >          |                     | 
> >          |                     --> libxl_qmp_disk_snapshot_transaction (*) 
> >          | 
> >     unpause domain ---->libxl_domain_unpause 
> >          | 
> >         End 
> >  
> >  
> > Reverting to a snapshot: 
> > (* means new functions we will need in libxl/libxlu) 
> >  
> >   "xl snapshot-revert" 
> >          | 
> >    parse configuration 
> >          | 
> >    destroy domain ---->libxl_domain_destroy 
> >          | 
> >  reverting disk snapshot ----> xlu_disk_snapshot_revert (*) 
> >          |                       | 
> >          |                       --> call 'qemu-img' to apply disk snapshot 
> >          | 
> >  restore domain from saved memory ----> libxl_domain_create_restore 
> >          | 
> >         End 
>  
>  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-17 12:28   ` Wei Liu
@ 2014-12-18  3:23     ` Chun Yan Liu
  2014-12-18 11:02       ` Wei Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-18  3:23 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jim Fehlig, ian.jackson, ian.campbell, xen-devel



>>> On 12/17/2014 at 08:28 PM, in message
<20141217122817.GG1904@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
wrote: 
> On Tue, Dec 16, 2014 at 02:32:56PM +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * xl won't manage snapshots, that means it won't maintain json files, 
> >     won't maintain snapshot chain relationship, and then as a result 
> >     won't take care of deleting snapshot and listing snapshots. 
> >   * remove snapshot-delete and snapshot-list interface 
> >   * update snapshot-revert interface 
> >   * update snapshot-create/revert implementation
> >  
> > =========================================================================== 
> >  
> > XL Design 
> >  
> > 1. User Interface 
> >  
> > xl snapshot-create: 
> >   Create a snapshot (disk and RAM) of a domain. 
> >  
> >   SYNOPSIS: 
> >     snapshot-create <domain> [<cfgfile>] [--name <string>] [--live] 
> >  
> >   OPTIONS: 
> >     --name <string>  snapshot name 
> >     --live           take a live snapshot 
> >  
> >     If option includes --live, then the domain is not paused while creating 
> >     the snapshot, like live migration. This increases size of the memory 
> >     dump file, but reduces downtime of the guest.
> >  
> >     If option doesn't include --name, a default name will be generated
> >     according to the creation time. 
> >  
> >     If specify @cfgfile, use cfgfile. (e.g. if --name specifies a name, 
> >     meanwhile there is name specified in cfgfile, name in cfgfile will 
> >     be used.) 
> >  
> >  
> > xl snapshot-revert: 
> >   Revert domain to status of a snapshot. 
> >  
> >   SYNOPSIS: 
> >       snapshot-revert <domain> <cfgfile> [--running] [--force] 
> >  
> >   OPTIONS: 
> >     --running        after reverting, change state to running 
> >     --force          try harder on risky reverts 
> >  
> >     Normally, the domain will revert to the same state the domain was
> >     in while the snapshot was taken (whether running, or paused).
> >  
> >     If option includes --running, then overrides the snapshot state to 
> >     guarantee a running domain after the revert. 
> >  
> >  
> >  
> > 2. cfgfile syntax 
> >  
> > #snapshot name. If user doesn't provide a VM snapshot name, xl will
> > #generate a name automatically by the creation time.
> > name="" 
> >  
> > #snapshot description. Default is NULL. 
> > description="" 
> >  
> > #memory location. This field should be filled when memory=1.
> > #Default is NULL.
> > memory_path="" 
> >  
> > #disk snapshot information 
> > #For easier parse config work, reuse disk configuration in xl.cfg, but 
> > #with different meanings. 
> > #disk syntax meaning: 'external path, external format, target device' 
> >  
> > #e.g. to specify external disk snapshot, like this:
> > #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda', 
> >         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',] 
> >  
> > #e.g. to specify internal disk snapshot, like this: 
> > disks=[',,hda',',,hdb',] 
> >  
>  
> How is snapshot chain represented with this syntax? Does xl not need to 
> know about the chain? (Note, this is different than managing the chain) 

If xl only supports creating a snapshot and restoring a domain from a
snapshot, it doesn't need to know the chain.
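
To make the cfgfile side concrete: each entry in disks=[...] carries only
an external path, an external format and a target device, with no
reference to a parent snapshot. Below is a minimal sketch of parsing one
such entry. The struct and function names are hypothetical (this is not
existing xl code), the buffer sizes are arbitrary, and the sketch assumes
there is no whitespace around the commas.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parsed form of one 'path,format,vdev' entry. */
struct snap_disk_spec {
    char path[256];   /* external path, empty for internal snapshot */
    char format[32];  /* external format, may be empty */
    char vdev[32];    /* target device, e.g. "hda" */
    bool external;    /* true when an external path was given */
};

/* Copy n bytes of src into dst and terminate; fail if it doesn't fit. */
static int copy_field(char *dst, size_t dstlen, const char *src, size_t n)
{
    if (n >= dstlen)
        return -1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return 0;
}

/* Returns 0 on success, -1 if the entry is malformed. */
int snap_disk_spec_parse(const char *spec, struct snap_disk_spec *out)
{
    const char *c1, *c2;

    assert(out != NULL);
    if (spec == NULL)
        return -1;

    /* Exactly the first two commas separate the three fields. */
    c1 = strchr(spec, ',');
    if (c1 == NULL)
        return -1;
    c2 = strchr(c1 + 1, ',');
    if (c2 == NULL)
        return -1;

    if (copy_field(out->path, sizeof out->path, spec, (size_t)(c1 - spec)) ||
        copy_field(out->format, sizeof out->format, c1 + 1,
                   (size_t)(c2 - c1 - 1)) ||
        copy_field(out->vdev, sizeof out->vdev, c2 + 1, strlen(c2 + 1)))
        return -1;

    /* The target device is the only mandatory field. */
    if (out->vdev[0] == '\0')
        return -1;

    out->external = out->path[0] != '\0';
    return 0;
}
```

With this, ',,hda' parses as an internal snapshot of hda, and
'/tmp/hda_snapshot.qcow2,qcow2,hda' as an external one. Nothing in the
parsed result refers to another snapshot.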

Creating a snapshot is the easy case: whether the domain is running from
the base image or from an earlier snapshot, saving memory and taking the
disk snapshot work the same way.

For restoring a domain from a snapshot, restoring memory is also the same
in every case. Applying the disk snapshot depends on the backend type; for
the types we can expect:
* qcow2 internal snapshot: no need to know the chain.
* vhd and qcow2 external disk snapshot: both are implemented with a
  backing file chain, so applying the disk snapshot is very simple: just
  use the external snapshot file.
* lvm: doesn't support a snapshot of a snapshot, so the problem doesn't
  arise.
So, overall, restoring doesn't need to know the chain either.
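
A minimal sketch of the per-disk revert decision, covering only the qcow2
internal case and the external (backing file) case, might look like this.
The function and enum names are hypothetical. For the internal case it
only builds the 'qemu-img snapshot -a' command line as text; a real
implementation would execute qemu-img with a proper argument vector
rather than a shell string, and other backends such as lvm are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum revert_action {
    REVERT_RUN_COMMAND,  /* buf holds a qemu-img command to run */
    REVERT_SWITCH_FILE,  /* buf holds the file to use as the disk */
    REVERT_ERROR
};

/* Decide how to apply one disk snapshot. Only the snapshot's own
 * description is consulted; no chain information is needed. */
enum revert_action plan_disk_revert(bool external, const char *image,
                                    const char *name,
                                    const char *external_path,
                                    char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL && buflen > 0);

    if (external) {
        /* External snapshot: just use the external snapshot file. */
        if (external_path == NULL || strlen(external_path) == 0)
            return REVERT_ERROR;
        n = snprintf(buf, buflen, "%s", external_path);
        if (n < 0 || (size_t)n >= buflen)
            return REVERT_ERROR;
        return REVERT_SWITCH_FILE;
    }

    /* Internal qcow2 snapshot: apply it by name to the image itself. */
    if (image == NULL || name == NULL)
        return REVERT_ERROR;
    n = snprintf(buf, buflen, "qemu-img snapshot -a %s %s", name, image);
    if (n < 0 || (size_t)n >= buflen)
        return REVERT_ERROR;
    return REVERT_RUN_COMMAND;
}
```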

> 
> Wei. 
>  
>  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-17 12:17   ` Wei Liu
@ 2014-12-18  3:34     ` Chun Yan Liu
  2014-12-18 10:57       ` Wei Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-18  3:34 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jim Fehlig, ian.jackson, ian.campbell, xen-devel



>>> On 12/17/2014 at 08:17 PM, in message
<20141217121750.GF1904@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
wrote: 
> On Tue, Dec 16, 2014 at 02:32:55PM +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * add an overview document, so that one can have an overall look
> >     about the whole domain snapshot work, limits, requirements, 
> >     how to do, etc. 
> >  
> > ===================================================================== 
> > Domain snapshot overview 
> >  
> > 1. Purpose 
> >  
> > Domain snapshot is a system checkpoint of a domain. Later, one can 
> > roll back the domain to that checkpoint. It's a very useful backup 
> > function. A domain snapshot contains the memory status at the 
> > checkpoint and the disk status (which we called disk snapshot). 
> >  
> > Domain snapshot functionality usually includes: 
> > a) create a domain snapshot 
> > b) roll back (or called "revert") to a domain snapshot 
> > c) delete a domain snapshot 
> > d) list all domain snapshots 
> >  
> > But following the existing xl idioms of managing storage and saved 
> > VM images via existing CLI command (qemu-img, lvcreate, ls, mv, 
> > cp etc), xl snapshot functionality would be kept as simple as 
> > possible: 
> > * xl will do a) and b), creating a snapshot and reverting a 
> >   domain to a snapshot. 
> > * xl will NOT do c) and d), xl won't manage snapshots, as xl 
> >   doesn't maintain saved images created by 'xl save'. So xl 
> >   will have no idea of the existence of domain snapshots and 
> >   the chain relationship between snapshots. It will depend on
> >   user to take care of the snapshots, know the snapshot chain 
> >   info, and delete snapshots. 
> >  
> > Domain Snapshot Support and Not Support: 
>  
> I think this list applies to xl (last item and [1]). If so please state 
> clearly to prevent confusion with other toolstack (say, libvirt) and 
> functionalities of the library (libxl). 
>  
> > * support live snapshot 
> > * support internal disk snapshot and external disk snapshot 
> > * support different disk backend types. 
> >   (Basic goal is to support 'raw' and 'qcow2' only). 
> >  
> > * not support snapshot when domain is shutting down or dying.
> > * not support disk-only snapshot [1]. 
> >  
> >  [1] To xl, it only concerns active domains, and even when domain 
> >  is paused, there is no data flush to disk operation. So, take 
> >  a disk-only snapshot and then resume, it is as if the guest 
> >  had crashed. For this reason, disk-only snapshot is meaningless 
> >  to xl. Should not support. 
> >  
>  
> I think I understand your reasoning, but it's a bit convoluted to me. 
>  
> Domain can be in both active and inactive state (libvirt term) when 
> using xl.  When domain is active, we cannot guarantee in xl that domain 
> is quiesced so a disk-only snapshot may contain inconsistent data.

That's right.

> When 
> domain is inactive, there's no point in taking a disk-only snapshot 
> because it would be the same as the base image.

xl doesn't have inactive domains; libvirt does. (In libvirt, one can 'define'
a domain without starting it, like old xend, which could 'new' a domain
without 'start'ing it.) xl can only 'create' a domain, and once the domain
is shut down it is no longer visible to the user.

For an inactive domain, a disk-only snapshot is useful: the user may later
run the VM from the base image, which will then change, and the disk-only
snapshot remains a usable backup.

That's why libvirt can support disk-only snapshots while xl won't.
Did I describe it clearly?

> So the conclusion is 
> that xl doesn't need to support disk-only snapshot. 
>  
> Does the above reasoning equals to yours? Is it clearer or more 
> confusing? 
>  
> Wei. 
>  
> >  
> > 2. Requirements 
> >  
> > General Requirements: 
> > * ability to save/restore domain memory 
> > * ability to create/delete/apply disk snapshot [2] 
> > * ability to parse user config file 
> >  
> >   [2] Disk snapshot requirements: 
> >   - external tools: qemu-img, lvcreate, vhd-util, etc. 
> >   - for basic goal, we support 'raw' and 'qcow2' backend types 
> >     only. Then it requires: 
> >     libxl qmp command or "qemu-img" (when qemu process does not 
> >     exist) 
> >  
> >  
> > 3. Interaction with other operations: 
> >  
> > No. 
> >  
> >  
> > 4. General workflow 
> >  
> > Create a snapshot: 
> >   * parse user cfg file if passed in 
> >   * check snapshot operation is allowed or not 
> >   * save domain, saving memory status to file (refer to: save_domain) 
> >   * take disk snapshot (e.g. call qmp command) 
> >   * unpause domain 
> >  
> > Revert to snapshot: 
> >   * parse user cfg file (xl doesn't manage snapshots, so it has no
> >     idea of snapshot existence. User MUST supply configuration file) 
> >   * destroy this domain 
> >   * create a new domain from snapshot info 
> >     - apply disk snapshot (e.g. call qemu-img) 
> >     - a process like restore domain 
>  
>  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-18  3:34     ` Chun Yan Liu
@ 2014-12-18 10:57       ` Wei Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Liu @ 2014-12-18 10:57 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, Wei Liu, ian.campbell, xen-devel

On Wed, Dec 17, 2014 at 08:34:07PM -0700, Chun Yan Liu wrote:
> 
> 
> >>> On 12/17/2014 at 08:17 PM, in message
> <20141217121750.GF1904@zion.uk.xensource.com>, Wei Liu <wei.liu2@citrix.com>
> wrote: 
> > On Tue, Dec 16, 2014 at 02:32:55PM +0800, Chunyan Liu wrote: 
> > > Changes to V8: 
> > >   * add an overview document, so that one can have an overall look
> > >     about the whole domain snapshot work, limits, requirements, 
> > >     how to do, etc. 
> > >  
> > > ===================================================================== 
> > > Domain snapshot overview 
> > >  
> > > 1. Purpose 
> > >  
> > > Domain snapshot is a system checkpoint of a domain. Later, one can 
> > > roll back the domain to that checkpoint. It's a very useful backup 
> > > function. A domain snapshot contains the memory status at the 
> > > checkpoint and the disk status (which we called disk snapshot). 
> > >  
> > > Domain snapshot functionality usually includes: 
> > > a) create a domain snapshot 
> > > b) roll back (or called "revert") to a domain snapshot 
> > > c) delete a domain snapshot 
> > > d) list all domain snapshots 
> > >  
> > > But following the existing xl idioms of managing storage and saved 
> > > VM images via existing CLI command (qemu-img, lvcreate, ls, mv, 
> > > cp etc), xl snapshot functionality would be kept as simple as 
> > > possible: 
> > > * xl will do a) and b), creating a snapshot and reverting a 
> > >   domain to a snapshot. 
> > > * xl will NOT do c) and d), xl won't manage snapshots, as xl 
> > >   doesn't maintain saved images created by 'xl save'. So xl 
> > >   will have no idea of the existence of domain snapshots and 
> > >   the chain relationship between snapshots. It will depend on
> > >   user to take care of the snapshots, know the snapshot chain 
> > >   info, and delete snapshots. 
> > >  
> > > Domain Snapshot Support and Not Support: 
> >  
> > I think this list applies to xl (last item and [1]). If so please state 
> > clearly to prevent confusion with other toolstack (say, libvirt) and 
> > functionalities of the library (libxl). 
> >  
> > > * support live snapshot 
> > > * support internal disk snapshot and external disk snapshot 
> > > * support different disk backend types. 
> > >   (Basic goal is to support 'raw' and 'qcow2' only). 
> > >  
> > > * not support snapshot when domain is shutting down or dying.
> > > * not support disk-only snapshot [1]. 
> > >  
> > >  [1] To xl, it only concerns active domains, and even when domain 
> > >  is paused, there is no data flush to disk operation. So, take 
> > >  a disk-only snapshot and then resume, it is as if the guest 
> > >  had crashed. For this reason, disk-only snapshot is meaningless 
> > >  to xl. Should not support. 
> > >  
> >  
> > I think I understand your reasoning, but it's a bit convoluted to me. 
> >  
> > Domain can be in both active and inactive state (libvirt term) when 
> > using xl.  When domain is active, we cannot guarantee in xl that domain 
> > is quiesced so a disk-only snapshot may contain inconsistent data.
> 
> That's right.
> 
> > When 
> > domain is inactive, there's no point in taking a disk-only snapshot 
> > because it would be the same as the base image.
> 
> xl doesn't have inactive domains. Libvirt has. (in libvirt, one can 'define'
> a domain but not 'start' it, like old xend which can 'new' a domain but not
> 'start' it.) xl only can 'create' a domain, when domain is shutdown, it's
> not visible to user.
> 

Per the definition in the first patch, inactive domain is a domain
"created but not started", so I thought the domain created by "xl create
-p dom.cfg" falls into this category. I was wrong.

I think the "created but not started" should be "defined but not
started" (using libvirt's terminology).

> For inactive domain, disk-only snapshot is useful. Since later user
> may run VM with base image and base image would change. Then the
> disk-only snapshot is a usable backup.
> 
> That's why, libvirt can support disk-only snapshot, xl won't support
> disk-only snapshot. Do I describe it clearly?
> 

Yes. I think the libvirt terminology is "defined", not "created".

http://wiki.libvirt.org/page/VM_lifecycle

Wei.

> > So the conclusion is 
> > that xl doesn't need to support disk-only snapshot. 
> >  
> > Does the above reasoning equals to yours? Is it clearer or more 
> > confusing? 
> >  
> > Wei. 
> >  
> > >  
> > > 2. Requirements 
> > >  
> > > General Requirements: 
> > > * ability to save/restore domain memory 
> > > * ability to create/delete/apply disk snapshot [2] 
> > > * ability to parse user config file 
> > >  
> > >   [2] Disk snapshot requirements: 
> > >   - external tools: qemu-img, lvcreate, vhd-util, etc. 
> > >   - for basic goal, we support 'raw' and 'qcow2' backend types 
> > >     only. Then it requires: 
> > >     libxl qmp command or "qemu-img" (when qemu process does not 
> > >     exist) 
> > >  
> > >  
> > > 3. Interaction with other operations: 
> > >  
> > > No. 
> > >  
> > >  
> > > 4. General workflow 
> > >  
> > > Create a snapshot: 
> > >   * parse user cfg file if passed in 
> > >   * check snapshot operation is allowed or not 
> > >   * save domain, saving memory status to file (refer to: save_domain) 
> > >   * take disk snapshot (e.g. call qmp command) 
> > >   * unpause domain 
> > >  
> > > Revert to snapshot: 
> > >   * parse user cfg file (xl doesn't manage snapshots, so it has no
> > >     idea of snapshot existence. User MUST supply configuration file) 
> > >   * destroy this domain 
> > >   * create a new domain from snapshot info 
> > >     - apply disk snapshot (e.g. call qemu-img) 
> > >     - a process like restore domain 
> >  
> >  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-18  3:23     ` Chun Yan Liu
@ 2014-12-18 11:02       ` Wei Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Liu @ 2014-12-18 11:02 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, Wei Liu, ian.campbell, xen-devel

On Wed, Dec 17, 2014 at 08:23:41PM -0700, Chun Yan Liu wrote:
[...]
> > >  
> > >  
> > > 2. cfgfile syntax 
> > >  
> > > #snapshot name. If user doesn't provide a VM snapshot name, xl will
> > > #generate a name automatically by the creation time.
> > > name="" 
> > >  
> > > #snapshot description. Default is NULL. 
> > > description="" 
> > >  
> > > #memory location. This field should be filled when memory=1.
> > > #Default is NULL.
> > > memory_path="" 
> > >  
> > > #disk snapshot information 
> > > #For easier parse config work, reuse disk configuration in xl.cfg, but 
> > > #with different meanings. 
> > > #disk syntax meaning: 'external path, external format, target device' 
> > >  
> > > #e.g. to specify external disk snapshot, like this:
> > > #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda', 
> > >         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',] 
> > >  
> > > #e.g. to specify internal disk snapshot, like this: 
> > > disks=[',,hda',',,hdb',] 
> > >  
> >  
> > How is snapshot chain represented with this syntax? Does xl not need to 
> > know about the chain? (Note, this is different than managing the chain) 
> 
> If only supply creating snapshot and restoring domain from a snapshot,
> xl doesn't need to know the chain.
> 
> For creating snapshot, it's very easy to understand, no matter from base
> or from a snapshot, saving memory and taking disk snapshot has no
> difference. 
> 
> For restoring domain from snapshot, restoring memory has no difference;
> applying disk snapshot, to those backend types we can expect:
> qcow2 internal snapshot: no need to know chain
> vhd, qcow2 external disk snapshot: both external disk snapshot, and 
> both using backing file chain to implement, so apply disk snapshot
> is very simple, just use the external snapshot file.
> lvm: doesn't support snapshot of snapshot, so no such problem.
> So, overall, it doesn't need to know the chain either.
> 

Thanks for the explanation. Makes sense.

Wei.

> > 
> > Wei. 
> >  
> >  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 1/4] domain snapshot terms
  2014-12-16  6:32 ` [RFC V9 1/4] domain snapshot terms Chunyan Liu
@ 2014-12-18 15:05   ` Ian Campbell
  2014-12-19  2:46     ` Chun Yan Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2014-12-18 15:05 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, xen-devel

On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:
> Changes to V8:
>   * add a document for domain snapshot related terms, they will be
>     referred in later documents.
> 
> =====================================================================
> Terms
> 
> * Active domain: domain created and started
> 
> * Inactive domain: domain created but not started

As Wei says I think you mean "defined" here, since created and started
are (essentially) synonyms for some toolstacks.

You'll probably want to define "defined" too for clarity.

> 
> * Domain snapshot:
> 
>   Domain snapshot is a system checkpoint of a domain. It contains
>   the memory status at the checkpoint and the disk status.
> 
> * Disk-only snapshot:
> 
>   Disk-only snapshot only keeps the status of disk, not saving
>   memory status.
> 
>   Contents of disks (whether a subset or all disks associated with
>   the domain) are saved at a given point of time, and can be restored
>   back to that state. On a running guest, a disk-only snapshot is
>   likely to be only crash-consistent rather than clean (that is, it
>   represents the state of the disk on a sudden power outage); on an
>   inactive guest, a disk-only snapshot is clean if the disks were
>   clean when the guest was last shut down.

There is the possibility of doing clean snapshots if a guest agent is
involved to quiesce the disks at the right moment (e.g. I believe qemu
has such a thing, or at least I've seen talks about it being developed
at conferences). Are you including this possibility or explicitly ruling
it out of scope?

> * Live Snapshot:
> 
>   Like live migration, it will increase size of the memory dump file,
>   but reduces downtime of the guest.
> 
> * Internal Disk Snapshot
> 
>   File formats such as qcow2 track both the snapshot and changes
>   since the snapshot in a single file.
> 
> * External Disk Snapshot
> 
>   The snapshot is one file, and the changes since the snapshot
>   are in another file.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-16  6:32 ` [RFC V9 2/4] domain snapshot overview Chunyan Liu
  2014-12-17 12:17   ` Wei Liu
@ 2014-12-18 15:10   ` Ian Campbell
  2014-12-19  5:45     ` Chun Yan Liu
  1 sibling, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2014-12-18 15:10 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, xen-devel

On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:
> Changes to V8:
>   * add an overview document, so that one can have an overall look
>     about the whole domain snapshot work, limits, requirements,
>     how to do, etc.
> 
> =====================================================================
> Domain snapshot overview

I don't see a similar section for disk snapshots, are you not
considering those here except as a part of a domain snapshot or is this
an oversight?

There are three main use cases (that I know of at least) for
snapshot-like behaviour.

One is as you've mentioned below for "backup", i.e. to preserve the VM
at a certain point in time in order to be able to roll back to it. Is
this the only use case you are considering?

A second use case is to support "gold image" type deployments, i.e.
where you create one baseline single disk image and then clone it
multiple times to deploy lots of guests. I think this is usually a "disk
snapshot" type thing, but maybe it can be implemented as restoring a
gold domain snapshot multiple times (e.g. for start of day performance
reasons).

The third case (which is similar to the first) is taking a disk
snapshot in order to be able to run your usual backup software on the
snapshot (which is now unchanging, which is handy) and then deleting the
disk snapshot (this differs from the first case in which disk is active
after the snapshot, and in the lack of the memory part).

Are you considering all three use cases here or are you explicitly
ruling out anything but the first? I think there might be some subtle
differences in the requirements, wrt which operations need to consider
the possibility of an active domain etc, depending on which cases are
considered. It would be good to be explicit about the use cases you are
not trying to address here so we are all on the same page.

If you are ruling these other use cases out then I think it would be
useful to briefly describe them and then note that they are out of scope
for this design, so that we have an agreed understanding of what is in
or out of scope and/or can debate to what extent such use cases ought to
be considered in the design if not the implementation.

> 1. Purpose
> 
> Domain snapshot is a system checkpoint of a domain. Later, one can
> roll back the domain to that checkpoint. It's a very useful backup
> function. A domain snapshot contains the memory status at the
> checkpoint and the disk status (which we called disk snapshot).


> Domain snapshot functionality usually includes:
> a) create a domain snapshot
> b) roll back (or called "revert") to a domain snapshot
> c) delete a domain snapshot
> d) list all domain snapshots
> 
> But following the existing xl idioms of managing storage and saved
> VM images via existing CLI command (qemu-img, lvcreate, ls, mv,
> cp etc), xl snapshot functionality would be kept as simple as
> possible:
> * xl will do a) and b), creating a snapshot and reverting a
>   domain to a snapshot.
> * xl will NOT do c) and d), xl won't manage snapshots, as xl
>   doesn't maintain saved images created by 'xl save'. So xl
>   will have no idea of the existence of domain snapshots and
>   the chain relationship between snapshots. It will depend on
>   user to take care of the snapshots, know the snapshot chain
>   info, and delete snapshots.

This is a case where the use cases being considered might apply. If the
third case I outlined above is in scope then xl may need to somehow
support deleting a snapshot from under the feet of an active domain etc
(which need not necessarily imply knowledge of snapshot chains or
snapshot management, but might involve a notification to the backend for
example).

> Domain Snapshot Support and Not Support:
> * support live snapshot
> * support internal disk snapshot and external disk snapshot
> * support different disk backend types.
>   (Basic goal is to support 'raw' and 'qcow2' only).
> 
> * not support snapshot when domain is shutting down or dying.
> * not support disk-only snapshot [1].
> 
>  [1] xl only concerns itself with active domains, and even when a
>  domain is paused, no data is flushed to disk. So taking a
>  disk-only snapshot and then resuming from it is as if the guest
>  had crashed. For this reason, a disk-only snapshot is meaningless
>  to xl and should not be supported.
> 
> 
> 2. Requirements
> 
> General Requirements:
> * ability to save/restore domain memory
> * ability to create/delete/apply disk snapshot [2]

Is "apply" the same as "revert to"? Worth adding to the terminology
section and using consistently.

> * ability to parse user config file
> 
>   [2] Disk snapshot requirements:
>   - external tools: qemu-img, lvcreate, vhd-util, etc.
>   - for basic goal, we support 'raw' and 'qcow2' backend types
>     only. Then it requires:
>     libxl qmp command or "qemu-img" (when qemu process does not
>     exist)
> 
> 
> 3. Interaction with other operations:
> 
> No.

What about shutdown/dying as you noted above? What about migration or
regular save/restore?

> 
> 4. General workflow
> 
> Create a snapshot:
>   * parse user cfg file if passed in
>   * check snapshot operation is allowed or not
>   * save domain, saving memory status to file (refer to: save_domain)
>   * take disk snapshot (e.g. call qmp command)
>   * unpause domain
> 
> Revert to snapshot:
>   * parse user cfg file (xl doesn't manage snapshots, so it has no
>     idea of snapshot existence. User MUST supply configuration file)
>   * destroy this domain
>   * create a new domain from snapshot info
>     - apply disk snapshot (e.g. call qemu-img)
>     - a process like restore domain
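
For illustration, the two workflows quoted above can be written as ordered
step tables. The following is a minimal C sketch under stated assumptions,
not xl code: `snapshot_step`, `run_snapshot_steps` and every stub callback
are hypothetical names, and the stubs only record the order in which they
run. Stopping at the first failed step is also an assumption; the RFC does
not say how errors are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical step callback: returns 0 on success, non-zero on failure. */
typedef int (*snapshot_step_fn)(void *ctx);

struct snapshot_step {
    const char *name;      /* label for error reporting */
    snapshot_step_fn run;  /* stand-in for the real operation */
};

/*
 * Run the steps in order and stop at the first failure (assumed policy).
 * Returns the number of steps that completed, so a result equal to
 * nr_steps means the whole workflow succeeded.
 */
size_t run_snapshot_steps(const struct snapshot_step *steps,
                          size_t nr_steps, void *ctx)
{
    size_t i;

    assert(steps != NULL || nr_steps == 0);
    for (i = 0; i < nr_steps; i++) {
        if (steps[i].run(ctx) != 0)
            break;
    }
    return i;
}

/* Stub steps: each one only appends a one-letter tag to a log. */
struct step_log {
    char buf[16];
    size_t len;
};

static int log_tag(void *ctx, char tag)
{
    struct step_log *log = ctx;

    if (log->len + 1 >= sizeof(log->buf))
        return -1;
    log->buf[log->len++] = tag;
    log->buf[log->len] = '\0';
    return 0;
}

static int parse_cfg(void *ctx)       { return log_tag(ctx, 'P'); }
static int check_allowed(void *ctx)   { return log_tag(ctx, 'C'); }
static int save_memory(void *ctx)     { return log_tag(ctx, 'M'); }
static int snapshot_disks(void *ctx)  { return log_tag(ctx, 'D'); }
static int unpause_domain(void *ctx)  { return log_tag(ctx, 'U'); }
static int destroy_domain(void *ctx)  { return log_tag(ctx, 'X'); }
static int apply_disk_snap(void *ctx) { return log_tag(ctx, 'A'); }
static int restore_memory(void *ctx)  { return log_tag(ctx, 'R'); }

/* "Create a snapshot", in the order listed in the RFC. */
const struct snapshot_step create_steps[] = {
    { "parse cfg file",         parse_cfg },
    { "check operation allowed", check_allowed },
    { "save domain memory",     save_memory },
    { "take disk snapshot",     snapshot_disks },
    { "unpause domain",         unpause_domain },
};
const size_t nr_create_steps = sizeof(create_steps) / sizeof(create_steps[0]);

/* "Revert to snapshot", in the order listed in the RFC. */
const struct snapshot_step revert_steps[] = {
    { "parse cfg file",         parse_cfg },
    { "destroy domain",         destroy_domain },
    { "apply disk snapshot",    apply_disk_snap },
    { "restore domain memory",  restore_memory },
};
const size_t nr_revert_steps = sizeof(revert_steps) / sizeof(revert_steps[0]);
```

With these stubs, running `create_steps` leaves the tags "PCMDU" in the
log and running `revert_steps` leaves "PXAR", which makes the ordering
easy to check when discussing where a failure should leave the domain.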


* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-16  6:32 ` [RFC V9 3/4] domain snapshot design: xl Chunyan Liu
  2014-12-17 12:28   ` Wei Liu
@ 2014-12-18 15:15   ` Ian Campbell
  2014-12-19  7:03     ` Chun Yan Liu
  1 sibling, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2014-12-18 15:15 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, xen-devel

On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:
> Changes to V8:
>   * xl won't manage snapshots, that means it won't maintain json files,
>     won't maintain snapshot chain relationship, and then as a result
>     won't take care of deleting snapshot and listing snapshots.
>   * remove snapshot-delete and snapshot-list interface
>   * update snapshot-revert interface
>   * update snapshot-create/revert implementation
> 
> ===========================================================================
> 
> XL Design
> 
> 1. User Interface
> 
> xl snapshot-create:
>   Create a snapshot (disk and RAM) of a domain.
> 
>   SYNOPSIS:
>     snapshot-create <domain> [<cfgfile>] [--name <string>] [--live]
> 
>   OPTIONS:
>     --name <string>  snapshot name
>     --live           take a live snapshot
> 
>     If --live is given, the domain is not paused while the snapshot is
>     created, as with live migration. This increases the size of the memory
>     dump file, but reduces downtime of the guest.
> 
>     If --name is not given, a default name will be generated
>     according to the creation time.
> 
>     If @cfgfile is specified, it takes precedence. (e.g. if --name
>     specifies a name and a name is also specified in cfgfile, the name
>     in cfgfile will be used.)

If you do not specify cfgfile then where do things go? Is --name perhaps
also serving as a basename for a path to save things to?

I wonder if we can simplify this a bit and therefore avoid the need for a
cfg file in the common case. e.g.

----8<-------
  xl snapshot-create [--live] [--internal|--external] <domain> <path>

<path> is a path to a directory, which must not exist. This path will be
created and the memory snapshot stored in it using some well defined
name.

If the snapshot is --external then the snapshot disks are created in
<path> with some well defined name based on the virtual device and
backend format in use.

If the snapshot is --internal then the snapshot disks are internal.

The default is <TBD, based on backends in use?>

--live is as you've already got it.
----8<-------

I'm not sure what name and description actually are here, so I've
omitted them.

Any of that could be overridden with e.g. more specific disk
configuration stanzas etc via the cfg file, if necessary.

This is just a suggestion, feel free to disagree.
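
To make the proposed command line concrete, here is a minimal C sketch of
how its arguments could be parsed. It is only an illustration of the
synopsis above, not xl code: `snap_create_args`, `snap_disk_mode` and
`parse_snapshot_create` are hypothetical names, and treating --internal
together with --external as a usage error is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum snap_disk_mode {
    SNAP_DISK_DEFAULT,   /* neither flag given; default is still TBD */
    SNAP_DISK_INTERNAL,
    SNAP_DISK_EXTERNAL
};

struct snap_create_args {
    bool live;
    enum snap_disk_mode mode;
    const char *domain;
    const char *path;    /* directory that must not exist yet */
};

/*
 * Parse: [--live] [--internal|--external] <domain> <path>
 * argv[0] is the first argument after the subcommand name.
 * Returns 0 on success, -1 on a usage error.
 */
int parse_snapshot_create(int argc, char *const argv[],
                          struct snap_create_args *out)
{
    int i;

    assert(out != NULL);
    out->live = false;
    out->mode = SNAP_DISK_DEFAULT;
    out->domain = NULL;
    out->path = NULL;

    /* Options come first; stop at the first non-option argument. */
    for (i = 0; i < argc && strncmp(argv[i], "--", 2) == 0; i++) {
        if (strcmp(argv[i], "--live") == 0) {
            out->live = true;
        } else if (strcmp(argv[i], "--internal") == 0 ||
                   strcmp(argv[i], "--external") == 0) {
            enum snap_disk_mode m = (argv[i][2] == 'i')
                ? SNAP_DISK_INTERNAL : SNAP_DISK_EXTERNAL;
            /* Assumed: the two flags are mutually exclusive. */
            if (out->mode != SNAP_DISK_DEFAULT && out->mode != m)
                return -1;
            out->mode = m;
        } else {
            return -1;   /* unknown option */
        }
    }

    /* Exactly two positional arguments must remain. */
    if (argc - i != 2)
        return -1;
    out->domain = argv[i];
    out->path = argv[i + 1];
    return 0;
}
```

The "must not exist" rule for <path> is not checked here; one way to
enforce it later is to create the directory with mkdir() and treat
EEXIST as an error.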

NB xl create can accept a series of cfg file lines on the command line
too, and treats them as appended to the cfgfile (if any was given). I
think xl snapshot-create should include this functionality too.
> xl snapshot-revert:
>   Revert domain to status of a snapshot.
> 
>   SYNOPSIS:
>       snapshot-revert <domain> <cfgfile> [--running] [--force]
> 
>   OPTIONS:
>     --running        after reverting, change state to running
>     --force          try harder on risky reverts
> 
>     Normally, the domain will revert to the same state the domain was in while
>     the snapshot was taken (whether running, or paused).
> 
>     If option includes --running, then overrides the snapshot state to
>     guarantee a running domain after the revert.
> 
> 
> 
> 2. cfgfile syntax
> 
> #snapshot name. If user doesn't provide a VM snapshot name, xl will generate
> #a name automatically by the creation time.
> name=""

What is name used for? (I guessed it might be output path above, but I'm
not sure).

> #snapshot description. Default is NULL.
> description=""

What is this used for? Is it embedded into e.g. qcow images or is it
just for the admin's own use (in which cases the existing support for
#comments ought to suffice).

> #memory location. This field should be filled when memory=1. Default is NULL.

The memory option isn't defined anywhere, and I think you've ruled
disk-only snapshots out of scope for xl, so I think this is just a
leftover from a previous revision.

> #e.g. to specify external disk snapshot, like this:
> #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda',
>         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',]
> 
> #e.g. to specify internal disk snapshot, like this:
> disks=[',,hda',',,hdb',]

Ideally one or the other of these behaviours would be possible without
needing to be quite so explicit.
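
For reference, the quoted `disks=` entries follow a "path,format,vdev"
layout, with an empty path and format meaning an internal snapshot. A
minimal C sketch of a parser for one such entry is below. It is an
illustration of the proposed syntax only: `disk_snap_spec` and
`parse_disk_snap_spec` are hypothetical names, the buffer sizes are
arbitrary, and paths containing commas are assumed not to occur.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct disk_snap_spec {
    char path[256];   /* empty for an internal snapshot */
    char format[16];  /* e.g. "qcow2"; may be empty */
    char vdev[16];    /* e.g. "hda"; required */
    bool external;    /* true when a path was given */
};

/* Copy the bytes in [start, end) into dst; fail if they do not fit. */
static int copy_field(char *dst, size_t dstsz,
                      const char *start, const char *end)
{
    size_t len = (size_t)(end - start);

    if (len >= dstsz)
        return -1;
    memcpy(dst, start, len);
    dst[len] = '\0';
    return 0;
}

/*
 * Parse one "path,format,vdev" entry.
 * Returns 0 on success, -1 if the entry is malformed.
 */
int parse_disk_snap_spec(const char *spec, struct disk_snap_spec *out)
{
    const char *c1, *c2, *end;

    assert(spec != NULL && out != NULL);

    c1 = strchr(spec, ',');
    if (c1 == NULL)
        return -1;
    c2 = strchr(c1 + 1, ',');
    if (c2 == NULL)
        return -1;
    if (strchr(c2 + 1, ',') != NULL)
        return -1;                      /* more than three fields */
    end = c2 + 1 + strlen(c2 + 1);

    if (copy_field(out->path, sizeof(out->path), spec, c1) != 0 ||
        copy_field(out->format, sizeof(out->format), c1 + 1, c2) != 0 ||
        copy_field(out->vdev, sizeof(out->vdev), c2 + 1, end) != 0)
        return -1;

    if (out->vdev[0] == '\0')
        return -1;                      /* the target disk is mandatory */

    out->external = (out->path[0] != '\0');
    return 0;
}
```

With this shape, ',,hda' parses as an internal snapshot of hda, which
shows how little information the internal case really carries and why a
shorter spelling might be preferable.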

Ian.


* Re: [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-16  6:32 ` [RFC V9 4/4] domain snapshot design: libxl/libxlu Chunyan Liu
  2014-12-17 14:09   ` Wei Liu
@ 2014-12-18 15:27   ` Ian Campbell
  2014-12-19  6:58     ` Chun Yan Liu
  1 sibling, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2014-12-18 15:27 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: ian.jackson, jfehlig, wei.liu2, xen-devel

On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:
> Changes to V8:
>   * remove libxl_domain_snapshot_create/delete/revert API
>   * export disk snapshot functionality for both xl and libvirt usage
> 
> ===========================================================================
> Libxl/libxlu Design
> 
> 1. New Structures
> 
> libxl_disk_snapshot = Struct("disk_snapshot",[
>     # target disk
>     ("disk",            libxl_device_disk),
> 
>     # disk snapshot name
>     ("name",            string),
> 
>     # internal/external disk snapshot?
>     ("external",        bool),
> 
>     # for external disk snapshot, specify the following two fields
>     ("external_format", string),
>     ("external_path",   string),

Should this be a KeyedUnion over a new LIBXL_DISK_SNAPSHOT_KIND enum
(with values INTERNAL and EXTERNAL)? This would automatically make the
binding between external==true and the fields which depend on that.

external_format should be of type libxl_disk_format, unless it is
referring to something else?

Is it possible for format to differ from the format of the underlying
disk? Perhaps taking a snapshot of a raw disk as a qcow? In any case
passing in UNKNOWN and letting libxl choose (probably by picking the
same as the underlying disk) should be supported.
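
As a rough picture of what the keyed-union shape would mean on the C side,
here is a minimal hand-written sketch. It is not generated libxl code and
none of these identifiers exist in libxl: the enums, `struct disk_snapshot`
and both helpers are hypothetical stand-ins. Resolving UNKNOWN to the
underlying disk's format follows the "probably" above and is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a disk format enum with an UNKNOWN value. */
enum disk_format {
    DISK_FORMAT_UNKNOWN,
    DISK_FORMAT_RAW,
    DISK_FORMAT_QCOW2
};

enum disk_snapshot_kind {
    DISK_SNAPSHOT_KIND_INTERNAL,
    DISK_SNAPSHOT_KIND_EXTERNAL
};

struct disk_snapshot {
    const char *name;
    enum disk_snapshot_kind kind;   /* key: selects the valid union member */
    union {
        struct {
            enum disk_format format;
            const char *path;
        } external;                 /* only meaningful for KIND_EXTERNAL */
        /* KIND_INTERNAL carries no extra data */
    } u;
};

/* A snapshot needs a name; an external one also needs a path. */
int disk_snapshot_is_valid(const struct disk_snapshot *s)
{
    if (s == NULL || s->name == NULL)
        return 0;
    if (s->kind == DISK_SNAPSHOT_KIND_EXTERNAL)
        return s->u.external.path != NULL;
    return s->kind == DISK_SNAPSHOT_KIND_INTERNAL;
}

/*
 * Pick the format for an external snapshot. UNKNOWN means "let the
 * library choose"; this sketch reuses the underlying disk's format.
 */
enum disk_format disk_snapshot_resolve_format(const struct disk_snapshot *s,
                                              enum disk_format underlying)
{
    assert(s != NULL && s->kind == DISK_SNAPSHOT_KIND_EXTERNAL);
    if (s->u.external.format != DISK_FORMAT_UNKNOWN)
        return s->u.external.format;
    return underlying;
}
```

The point of keying on `kind` is that the external-only fields cannot be
read without first checking which variant is in use.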

> /*  This API might not be used by xl, since xl won't take care of deleting
>  *  snapshots. But for libvirt, since libvirt manages snapshots and will
>  *  delete snapshot, this API will be used.
>  */
> int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid,
>                                libxl_disk_snapshot *snapshot, int nb);

The three usecases I mentioned in the previous mail are important here,
because depending on which usecases you are considering there may be a
many-to-one relationship between domains and a given snapshot (gold
image case). This interface cannot support that I think.

When we discussed this in previous iterations I suggested a libxl
command to tell a VM that it needed to reexamine its disks to see if any
of the chains had changed. I'm sure that's not the only potential answer
though.

> int libxl_disk_to_snapshot(libxl_ctx *ctx, uint32_t domid,
>                            libxl_disk_snapshot **snapshot, int *num);
> 
>     This is for domain snapshot create. If user doesn't specify disks,
>     then by default it will take internal disk snapshot to each domain
>     disk. This function will fill libxl_disk_snapshot according to domain
>     disks info.

Is this just a helper to produce an array to pass to
libxl_disk_snapshot_create? Or does it actually do stuff?

I think it's the former, but it could be clarified. I *think* this is
just a special case of libxl_device_disk_list which returns plausible
snapshot objects instead of the disks themselves.
> 
> For disk snapshot revert, no qmp command for that, it always calls
> external commands to finish the work, so put in libxlu (?):

I think rather than "no qmp" the issue is that "revert" is (at least as
far as libxl knows) essentially, destroy, rollback disks, restore from
RAM snapshot. So there is no qemu to speak to during the rollback. Is
that right?

Ian.


* Re: [RFC V9 1/4] domain snapshot terms
  2014-12-18 15:05   ` Ian Campbell
@ 2014-12-19  2:46     ` Chun Yan Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-19  2:46 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/18/2014 at 11:05 PM, in message <1418915119.11882.79.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * add a document for domain snapshot related terms, they will be 
> >     referred in later documents. 
> >  
> > ===================================================================== 
> > Terms 
> >  
> > * Active domain: domain created and started 
> >  
> > * Inactive domain: domain created but not started 
>  
> As Wei says I think you mean "defined" here, since created and started 
> are (essentially) synonyms for some toolstacks. 
>  
> You'll probably want to define "defined" too for clarity. 

OK. I'll update.

>  
> >  
> > * Domain snapshot: 
> >  
> >   Domain snapshot is a system checkpoint of a domain. It contains 
> >   the memory status at the checkpoint and the disk status. 
> >  
> > * Disk-only snapshot: 
> >  
> >   Disk-only snapshot only keeps the status of disk, not saving 
> >   memory status. 
> >  
> >   Contents of disks (whether a subset or all disks associated with 
> >   the domain) are saved at a given point of time, and can be restored 
> >   back to that state. On a running guest, a disk-only snapshot is 
> >   likely to be only crash-consistent rather than clean (that is, it 
> >   represents the state of the disk on a sudden power outage); on an 
> >   inactive guest, a disk-only snapshot is clean if the disks were 
> >   clean when the guest was last shut down. 
>  
> There is the possibility of doing clean snapshots if a guest agent is 
> involved to quiesce the disks at the right moment (e.g. I believe qemu 
> has such a thing, or at least I've seen talks about it being developed 
> at conferences).

Right. Qemu has it. Libvirt qemu driver supports that. 

> Are you including this possibility or explicitly ruling 
> it out of scope? 

It's just that libxl has no mechanism to quiesce the disks, even when the
domain is paused, so I didn't mention this scenario. I can add it.

Chunyan

> 
> > * Live Snapshot: 
> >  
> >   Like live migration, it will increase size of the memory dump file, 
> >   but reduces downtime of the guest. 
> >  
> > * Internal Disk Snapshot 
> >  
> >   File formats such as qcow2 track both the snapshot and changes 
> >   since the snapshot in a single file. 
> >  
> > * External Disk Snapshot 
> >  
> >   The snapshot is one file, and the changes since the snapshot 
> >   are in another file. 
>  
>  
>  
>  


* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-18 15:10   ` Ian Campbell
@ 2014-12-19  5:45     ` Chun Yan Liu
  2014-12-19 10:25       ` Ian Campbell
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-19  5:45 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/18/2014 at 11:10 PM, in message <1418915443.11882.86.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * add an overview document, so that one can have an overall look 
> >     at the whole domain snapshot work, limits, requirements, 
> >     how to do, etc. 
> >  
> > ===================================================================== 
> > Domain snapshot overview 
>  
> I don't see a similar section for disk snapshots, are you not 
> considering those here except as a part of a domain snapshot or is this 
> an oversight? 
>  
> There are three main use cases (that I know of at least) for 
> snapshotting like behaviour. 
>  
> One is as you've mentioned below for "backup", i.e. to preserve the VM 
> at a certain point in time in order to be able to roll back to it. Is 
> this the only usecase you are considering? 

Yes. I didn't take disk snapshots into scope.

>  
> A second use case is to support "gold image" type deployments, i.e. 
> where you create one baseline single disk image and then clone it 
> multiple times to deploy lots of guests. I think this is usually a "disk 
> snapshot" type thing, but maybe it can be implemented as restoring a 
> gold domain snapshot multiple times (e.g. for start of day performance 
> reasons). 

As we initially discussed, disk snapshots can be done directly
by existing tools like qemu-img and vhd-util.

>  
> The third case, (which is similar to the first), is taking a disk 
> snapshot in order to be able to run your usual backup software on the 
> snapshot (which is now unchanging, which is handy) and then deleting the 
> disk snapshot (this differs from the first case in which disk is active 
> after the snapshot, and due to the lack of the memory part). 

Sorry, I'm still not quite clear about what this use case wants to do.

>  
> Are you considering all three use cases here or are you explicitly 
> ruling out anything but the first? I think there might be some subtle 
> differences in the requirements, wrt which operations need to consider 
> the possibility of an active domain etc, depending on which cases are 
> considered. It would be good to be explicit about the use cases you are 
> not trying to address here so we are all on the same page. 
>  
> If you are ruling these other usecases out then I think it would be 
> useful to briefly describe them and then note that they are out of scope 
> for this design, so that we have an agreed understanding of what is in 
> or out of scope and/or can debate to what extent such use cases ought to 
> be considered in the design if not the implementation.

OK. I'll add this.

>  
> > 1. Purpose 
> >  
> > Domain snapshot is a system checkpoint of a domain. Later, one can 
> > roll back the domain to that checkpoint. It's a very useful backup 
> > function. A domain snapshot contains the memory status at the 
> > checkpoint and the disk status (which we called disk snapshot). 
>  
>  
> > Domain snapshot functionality usually includes: 
> > a) create a domain snapshot 
> > b) roll back (or called "revert") to a domain snapshot 
> > c) delete a domain snapshot 
> > d) list all domain snapshots 
> >  
> > But following the existing xl idioms of managing storage and saved 
> > VM images via existing CLI command (qemu-img, lvcreate, ls, mv, 
> > cp etc), xl snapshot functionality would be kept as simple as 
> > possible: 
> > * xl will do a) and b), creating a snapshot and reverting a 
> >   domain to a snapshot. 
> > * xl will NOT do c) and d), xl won't manage snapshots, as xl 
> >   doesn't maintain saved images created by 'xl save'. So xl 
> >   will have no idea of the existence of domain snapshots and 
> >   the chain relationship between snapshots. It will depend on the 
> >   user to take care of the snapshots, know the snapshot chain 
> >   info, and delete snapshots. 
>  
> This is a case where the usecases being considered might apply. If the 
> third case I outlined above is in scope then xl may need to somehow 
> support deleting a snapshot from under the feet of an active domain etc 
> (which need not necessarily imply knowledge of snapshot chains or 
> snapshot management, but might involve a notification to the backend for 
> example). 
>  
> > Domain Snapshot Support and Not Support: 
> > * support live snapshot 
> > * support internal disk snapshot and external disk snapshot 
> > * support different disk backend types. 
> >   (Basic goal is to support 'raw' and 'qcow2' only). 
> >  
> > * not support snapshot when domain is shutting down or dying. 
> > * not support disk-only snapshot [1]. 
> >  
> >  [1] To xl, it only concerns active domains, and even when domain 
> >  is paused, there is no data flush to disk operation. So, take 
> >  a disk-only snapshot and then resume, it is as if the guest 
> >  had crashed. For this reason, disk-only snapshot is meaningless 
> >  to xl. Should not support. 
> >  
> >  
> > 2. Requirements 
> >  
> > General Requirements: 
> > * ability to save/restore domain memory 
> > * ability to create/delete/apply disk snapshot [2] 
>  
> Is "apply" the same as "revert to"? Worth adding to the terminology 
> section and using consistently.

Yes. In qemu-img terms, it's 'apply'. In libvirt qemu driver domain snapshot
terms, it's 'revert'. I'll add to the terminology section.
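
To make the terminology mapping concrete for internal qcow2 snapshots,
here is a minimal C sketch that builds the argument vector for the
`qemu-img snapshot` subcommand, whose flags are -c (create), -a (apply,
i.e. revert), -d (delete) and -l (list). The helper and enum names are
hypothetical, and the snippet only builds the arguments; it does not run
anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum snap_op {
    SNAP_OP_CREATE,  /* qemu-img snapshot -c <name> <image> */
    SNAP_OP_APPLY,   /* qemu-img snapshot -a <name> <image>, i.e. "revert" */
    SNAP_OP_DELETE,  /* qemu-img snapshot -d <name> <image> */
    SNAP_OP_LIST     /* qemu-img snapshot -l <image> */
};

/* Maximum argv slots used below: 5 arguments plus the NULL terminator. */
#define SNAP_ARGV_MAX 6

/*
 * Fill argv for the requested operation.
 * Returns the argument count, or -1 if a required input is missing.
 */
int build_qemu_img_snapshot_argv(enum snap_op op, const char *name,
                                 const char *image,
                                 const char *argv[SNAP_ARGV_MAX])
{
    const char *flag;
    int n = 0;

    if (image == NULL)
        return -1;

    switch (op) {
    case SNAP_OP_CREATE: flag = "-c"; break;
    case SNAP_OP_APPLY:  flag = "-a"; break;
    case SNAP_OP_DELETE: flag = "-d"; break;
    case SNAP_OP_LIST:   flag = "-l"; break;
    default:             return -1;
    }

    /* Every operation except "list" needs a snapshot name. */
    if (op != SNAP_OP_LIST && name == NULL)
        return -1;

    argv[n++] = "qemu-img";
    argv[n++] = "snapshot";
    argv[n++] = flag;
    if (op != SNAP_OP_LIST)
        argv[n++] = name;
    argv[n++] = image;
    argv[n] = NULL;
    assert(n < SNAP_ARGV_MAX);
    return n;
}
```

Building an argv array rather than a shell string avoids quoting problems
with snapshot names or image paths that contain spaces.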

>  
> > * ability to parse user config file 
> >  
> >   [2] Disk snapshot requirements: 
> >   - external tools: qemu-img, lvcreate, vhd-util, etc. 
> >   - for basic goal, we support 'raw' and 'qcow2' backend types 
> >     only. Then it requires: 
> >     libxl qmp command or "qemu-img" (when qemu process does not 
> >     exist) 
> >  
> >  
> > 3. Interaction with other operations: 
> >  
> > No. 
>  
> What about shutdown/dying as you noted above? What about migration or 
> regular save/restore? 

Since xl now has no idea of the existence of snapshots, when writing this
document I relied on users to delete snapshots before or after
removing a domain (e.g. shutdown, destroy, save, migrate away). The user
should know where the memory is saved, and the disk snapshot related info.

Chunyan

>  
> >  
> > 4. General workflow 
> >  
> > Create a snapshot: 
> >   * parse user cfg file if passed in 
> >   * check snapshot operation is allowed or not 
> >   * save domain, saving memory status to file (refer to: save_domain) 
> >   * take disk snapshot (e.g. call qmp command) 
> >   * unpause domain 
> >  
> > Revert to snapshot: 
> >   * parse user cfg file (xl doesn't manage snapshots, so it has no 
> >     idea of snapshot existence. User MUST supply configuration file) 
> >   * destroy this domain 
> >   * create a new domain from snapshot info 
> >     - apply disk snapshot (e.g. call qemu-img) 
> >     - a process like restore domain 
>  
>  
>  
>  


* Re: [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-18 15:27   ` Ian Campbell
@ 2014-12-19  6:58     ` Chun Yan Liu
  2014-12-19 10:38       ` Ian Campbell
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-19  6:58 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/18/2014 at 11:27 PM, in message <1418916436.11882.101.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * remove libxl_domain_snapshot_create/delete/revert API 
> >   * export disk snapshot functionality for both xl and libvirt usage 
> >  
> > =========================================================================== 
> > Libxl/libxlu Design 
> >  
> > 1. New Structures 
> >  
> > libxl_disk_snapshot = Struct("disk_snapshot",[ 
> >     # target disk 
> >     ("disk",            libxl_device_disk), 
> >  
> >     # disk snapshot name 
> >     ("name",            string), 
> >  
> >     # internal/external disk snapshot? 
> >     ("external",        bool), 
> >  
> >     # for external disk snapshot, specify following two field 
> >     ("external_format", string), 
> >     ("external_path",   string), 
>  
> Should this be a KeyedUnion over a new LIBXL_DISK_SNAPSHOT_KIND enum 
> > (with values INTERNAL and EXTERNAL)?

The KeyedUnion seems unnecessary. Only EXTERNAL has data items,
INTERNAL doesn't, and there is no third type.

> This would automatically make the 
> binding between external==true and the fields which depend on that. 
>  
> external_format should be of type libxl_disk_format, unless it is 
> referring to something else? 

Yes. That's right. I'll update.

>  
> Is it possible for format to differ from the format of the underlying 
> disk? Perhaps taking a snapshot of a raw disk as a qcow?

This is related to implementation details. As I understand qemu's
implementation, taking an external disk snapshot works like this:
  original domain disk: a raw disk
  external = true, external_format: qcow2, external_path: test
a) create a qcow2 file (test.qcow2) with the raw disk as its backing file
b) switch the domain disk, so the domain now uses test.qcow2 (the raw
   disk effectively becomes the snapshot)

So, I think the external_format can only be those supporting backing file.
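
Step a) can be pictured with qemu-img when no qemu process exists, using
`qemu-img create` with -f (new file format), -b (backing file) and -F
(backing file format). The C sketch below only builds that argument
vector; `build_overlay_create_argv` is a hypothetical helper, the overlay
format is fixed to qcow2 as in the example above, and for a running domain
the RFC expects a qmp command instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maximum argv slots used below: 9 arguments plus the NULL terminator. */
#define OVERLAY_ARGV_MAX 10

/*
 * Fill argv with:
 *   qemu-img create -f qcow2 -b <backing_path> -F <backing_fmt> <overlay_path>
 * Returns the argument count, or -1 if an input is missing.
 */
int build_overlay_create_argv(const char *backing_path,
                              const char *backing_fmt,
                              const char *overlay_path,
                              const char *argv[OVERLAY_ARGV_MAX])
{
    int n = 0;

    if (backing_path == NULL || backing_fmt == NULL || overlay_path == NULL)
        return -1;

    argv[n++] = "qemu-img";
    argv[n++] = "create";
    argv[n++] = "-f";
    argv[n++] = "qcow2";        /* overlay format: must support backing files */
    argv[n++] = "-b";
    argv[n++] = backing_path;   /* the original disk, which becomes the snapshot */
    argv[n++] = "-F";
    argv[n++] = backing_fmt;    /* format of the original disk, e.g. "raw" */
    argv[n++] = overlay_path;   /* new file the domain would switch to */
    argv[n] = NULL;
    assert(n < OVERLAY_ARGV_MAX);
    return n;
}
```

For the raw disk in the example, this yields a qcow2 overlay named
test.qcow2 whose backing file is the raw image, which is why the external
format has to be one that can reference a backing file.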

> In any case 
> passing in UNKNOWN and letting libxl choose (probably by picking the 
> same as the underlying disk) should be supported.

If external_format is not passed (NULL), by default, we will use qcow2.

>  
> > /*  This API might not be used by xl, since xl won't take care of deleting 
> >  *  snapshots. But for libvirt, since libvirt manages snapshots and will 
> >  *  delete snapshot, this API will be used. 
> >  */ 
> > int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid, 
> >                                libxl_disk_snapshot *snapshot, int nb); 
>  
> The three usecases I mentioned in the previous mail are important here, 
> because depending on which usecases you are considering there may be a 
> many to one relationship between domains and a given snapshot (gold 
> image case). This interface cannot support that I think.

I'm not quite clear about the three use cases, especially the 3rd one,
so I'm really not sure what the requirements are for deleting a disk snapshot.
 
>  
> When we discussed this in previous iterations I suggested a libxl 
> command to tell a VM that it needed to reexamine its disks to see if any 
> of the chains had changed. I'm sure that's not the only potential answer 
> though.
 
Whether deleting a disk snapshot in a snapshot chain needs extra work to
avoid breaking data can be discussed:
a) External snapshots are usually based on a backing file chain; qemu
does this, and vhd-util does this. In this case, deleting a domain snapshot
doesn't require doing anything to the disk (no need to delete the disk
snapshot at all). The downside is that the backing chain might get long.
b) Internal snapshots, e.g. qcow2 and lvm. lvm doesn't support
snapshot of snapshot, so it is out of scope. For qcow2, deleting any disk
snapshot won't affect the others.

>  
> > int libxl_disk_to_snapshot(libxl_ctx *ctx, uint32_t domid, 
> >                            libxl_disk_snapshot **snapshot, int *num); 
> >  
> >     This is for domain snapshot create. If user doesn't specify disks, 
> >     then by default it will take internal disk snapshot to each domain 
> >     disk. This function will fill libxl_disk_snapshot according to domain 
> >     disks info. 
>  
> Is this just a helper to produce an array to pass to 
> libxl_disk_snapshot_create? Or does it actually do stuff? 
>  
> I think it's the former, but it could be clarified.

Yes, the former.

> I *think* this is 
> just a special case of libxl_device_disk_list which returns plausible 
> snapshot objects instead of the disks themselves. 

So we prefer adding code to libxl_device_disk_list rather than adding
a new API, right?

> >  
> > For disk snapshot revert, no qmp command for that, it always calls 
> > external commands to finish the work, so put in libxlu (?): 
>  
> I think rather than "no qmp" the issue is that "revert" is (at least as 
> far as libxl knows) essentially, destroy, rollback disks, restore from 
> RAM snapshot. So there is no qemu to speak to during the rollback. Is 
> that right? 

Yes, that's right.

>  
> Ian. 
>  
>  
>  


* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-18 15:15   ` Ian Campbell
@ 2014-12-19  7:03     ` Chun Yan Liu
  2014-12-19 10:27       ` Ian Campbell
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-19  7:03 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/18/2014 at 11:15 PM, in message <1418915759.11882.91.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
> > Changes to V8: 
> >   * xl won't manage snapshots, that means it won't maintain json files, 
> >     won't maintain snapshot chain relationship, and then as a result 
> >     won't take care of deleting snapshot and listing snapshots. 
> >   * remove snapshot-delete and snapshot-list interface 
> >   * update snapshot-revert interface 
> >   * update snapshot-create/revert implementation 
> >  
> > =========================================================================== 
> >  
> > XL Design 
> >  
> > 1. User Interface 
> >  
> > xl snapshot-create: 
> >   Create a snapshot (disk and RAM) of a domain. 
> >  
> >   SYNOPSIS: 
> >     snapshot-create <domain> [<cfgfile>] [--name <string>] [--live] 
> >  
> >   OPTIONS: 
> >     --name <string>  snapshot name 
> >     --live           take a live snapshot 
> >  
> >     If --live is given, the domain is not paused while the snapshot is 
> >     created, as with live migration. This increases the size of the memory 
> >     dump file, but reduces downtime of the guest. 
> >  
> >     If --name is not given, a default name will be generated 
> >     according to the creation time. 
> >  
> >     If @cfgfile is specified, it takes precedence. (e.g. if --name 
> >     specifies a name and a name is also specified in cfgfile, the name 
> >     in cfgfile will be used.) 
>  
> If you do not specify cfgfile then where do things go? Is --name perhaps 
> also serving as a basename for a path to save things to?

If the user doesn't specify a cfgfile, then by default it will save memory to
a default path and take an internal disk snapshot of each disk with a default
disk snapshot name.

'--name' is meant to give a meaningful name (like: newinstall, used as the
memory snapshot name and disk snapshot name). As for the save path,
we meant to have a default path, but now I think it's better to let the user
specify a path as you suggest here.
 
>  
> I wonder if we can simplify this a bit and therefore avoid the need for a 
> cfg file in the common case. e.g. 
>  
> ----8<------- 
>   xl snapshot-create [--live] [--internal|--external] <domain> <path> 
>  
> <path> is a path to a directory, which must not exist. This path will be 
> created and the memory snapshot stored in it using some well defined 
> name. 
>  
> If the snapshot is --external then the snapshot disks are created in 
> <path> with some well defined name based on the virtual device and 
> backend format in use. 
>  
> If the snapshot is --internal then the snapshot disks are internal. 

That's good. Then we need to add some description to tell users about
the auto-generated domain snapshot name, disk snapshot name,
memory state file and external disk snapshot files, etc.

>  
> The default is <TBD, based on backends in use?> 
>  
> --live is as you've already got it. 
> ----8<------- 
>  
> I'm not sure what name and description actually are here, so I've 
> omitted them. 
>  
> Any of that could be overridden with e.g. more specific disk 
> configuration stanzas etc via the cfg file, if necessary. 
>  
> This is just a suggestion, feel free to disagree.
>  
> NB xl create can accept a series of cfg file lines on the command line 
> too, and treats them as appended to the cfgfile (if any was given). I 
> think xl snapshot-create should include this functionality too. 
> > xl snapshot-revert: 
> >   Revert domain to status of a snapshot. 
> >  
> >   SYNOPSIS: 
> >       snapshot-revert <domain> <cfgfile> [--running] [--force] 
> >  
> >   OPTIONS: 
> >     --running        after reverting, change state to running 
> >     --force          try harder on risky reverts 
> >  
> >     Normally, the domain will revert to the same state the domain was in 
> >     while the snapshot was taken (whether running, or paused). 
> >  
> >     If option includes --running, then overrides the snapshot state to 
> >     guarantee a running domain after the revert. 
> >  
> >  
> >  
> > 2. cfgfile syntax 
> >  
> > #snapshot name. If user doesn't provide a VM snapshot name, xl will 
> > #generate a name automatically by the creation time. 
> > name="" 
>  
> What is name used for? (I guessed it might be output path above, but I'm 
> not sure). 

As above for '--name':
It is intended as a meaningful snapshot name. But as you suggest, if we let
the user specify a path, then that <path> can carry a meaningful name,
and everything in it can be auto-generated.

>  
> > #snapshot description. Default is NULL. 
> > description="" 
>  
> What is this used for? Is it embedded into e.g. qcow images or is it 
> just for the admin's own use (in which cases the existing support for 
> #comments ought to suffice). 

Oh, I forgot to delete it. It was originally used for managing snapshots, as a
text description of the snapshot.

>  
> > #memory location. This field should be filled when memory=1. Default is NULL. 
>  
> The memory option isn't defined anywhere, and I think you've ruled 
> disk-only snapshots out of scope for xl, so I think this is just a 
> leftover from a previous revision. 

Oh, yes, it's an old comment. I should delete the memory option part.
Sorry for that.

>  
> > #e.g. to specify external disk snapshot, like this: 
> > #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda', 
> >         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',] 
> >  
> > #e.g. to specify internal disk snapshot, like this: 
> > disks=[',,hda',',,hdb',] 
>  
> Ideally one or the other of these behaviours would be possible without 
> needing to be quite so explicit.

OK, I'll delete one.

Chunyan

>  
> Ian. 
>  
>  
>  


* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-19  5:45     ` Chun Yan Liu
@ 2014-12-19 10:25       ` Ian Campbell
  2014-12-23  3:42         ` Chun Yan Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2014-12-19 10:25 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: wei.liu2, Jim Fehlig, ian.jackson, xen-devel

On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:
> 
> >>> On 12/18/2014 at 11:10 PM, in message <1418915443.11882.86.camel@citrix.com>,
> Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
> > > Changes to V8: 
> > >   * add an overview document, so that one can get an overall view 
> > >     of the whole domain snapshot work: limits, requirements, 
> > >     how it is done, etc. 
> > >  
> > > ===================================================================== 
> > > Domain snapshot overview 
> >  
> > I don't see a similar section for disk snapshots, are you not 
> > considering those here except as a part of a domain snapshot or is this 
> > an oversight? 
> >  
> > There are three main use cases (that I know of at least) for 
> > snapshotting like behaviour. 
> >  
> > One is as you've mentioned below for "backup", i.e. to preserve the VM 
> > at a certain point in time in order to be able to roll back to it. Is 
> > this the only usecase you are considering? 
> 
> Yes. I didn't take disk snapshots into scope.
> 
> >  
> > A second use case is to support "gold image" type deployments, i.e. 
> > where you create one baseline single disk image and then clone it 
> > multiple times to deploy lots of guests. I think this is usually a "disk 
> > snapshot" type thing, but maybe it can be implemented as restoring a 
> > gold domain snapshot multiple times (e.g. for start of day performance 
> > reasons). 
> 
> As we initially discussed, disk snapshots can be done directly with
> existing tools such as qemu-img and vhd-util.

I was reading this section as a more generic overview of snapshotting,
without reference to where/how things might ultimately be implemented.

From a design point of view it would be useful to cover the various use
cases, even if the solution is that the user implements them using CLI
tools by hand (xl) or the toolstack does it for them internally
(libvirt).

This way we can more clearly see the full picture, which allows us to
validate that we are making the right choices about what goes where.

> > The third case, (which is similar to the first), is taking a disk 
> > snapshot in order to be able to run your usual backup software on the 
> > snapshot (which is now unchanging, which is handy) and then deleting the 
> > disk snapshot (this differs from the first case in which disk is active 
> > after the snapshot, and due to the lack of the memory part). 
> 
> Sorry, I'm still not quite clear about what this use case wants to do.

The user has an active domain which they want to backup, but backup
software often does not cope well if the data is changing under its
feet.

So the user wants to take a snapshot of the domain's disks while leaving
the domain running, so they can backup that static version of the disk
out of band from the VM itself (e.g. by attaching it to a separate
backup VM).

This may require a guest agent to quiesce the disks.

> >  
> > > * ability to parse user config file 
> > >  
> > >   [2] Disk snapshot requirements: 
> > >   - external tools: qemu-img, lvcreate, vhd-util, etc. 
> > >   - for basic goal, we support 'raw' and 'qcow2' backend types 
> > >     only. Then it requires: 
> > >     libxl qmp command or "qemu-img" (when qemu process does not 
> > >     exist) 
> > >  
> > >  
> > > 3. Interaction with other operations: 
> > >  
> > > No. 
> >  
> > What about shutdown/dying as you noted above? What about migration or 
> > regular save/restore? 
> 
> Since xl now has no idea of the existence of snapshot,

what about libvirt? This section is an overview, so making toolstack
specific assumptions is confusing.

>  so when writing this
> document I relied on users to delete snapshots before or after
> deleting a domain (e.g. shutdown, destroy, save, migrate away). The user
> should know where the memory is saved, and the disk snapshot related info.

What I meant was what happens if you try to snapshot a domain while it
is being shutdown or being migrated? There clearly has to be some sort
of interaction, even if it is "there is a global toolstack lock" or "the
user is advised not to do this".
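
As a rough illustration of the "toolstack lock" option, here is a minimal
Python sketch. The DomainOps class and its begin/end methods are hypothetical
names invented for this example, not an existing libxl or xl interface. It
only shows the idea of refusing a snapshot while another long-running
operation (such as a migration) is in progress on the same domain.

```python
import threading


class DomainOps:
    """Hypothetical guard: one long-running operation per domain at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = {}  # domid -> name of the operation in progress

    def begin(self, domid, op):
        # Refuse to start `op` if the domain is already shutting down,
        # migrating, being snapshotted, and so on.
        with self._lock:
            if domid in self._busy:
                raise RuntimeError(
                    f"domain {domid} is busy with {self._busy[domid]}")
            self._busy[domid] = op

    def end(self, domid):
        # Mark the domain as idle again.
        with self._lock:
            self._busy.pop(domid, None)
```

A caller would wrap each operation in begin()/end(), so that a snapshot
request arriving during a migration fails cleanly instead of racing with it.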

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-19  7:03     ` Chun Yan Liu
@ 2014-12-19 10:27       ` Ian Campbell
  2014-12-22  8:52         ` Chun Yan Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2014-12-19 10:27 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel

On Fri, 2014-12-19 at 00:03 -0700, Chun Yan Liu wrote:

> '--name' meant to give a meaningful name (like: newinstall. Used as the
> memory snapshot name and disk snapshot name).

Where is this name stored and when and where would it be presented to
the user?

> That's good. Then we need to add some description to tell users about
> the auto-generated domain snapshot name, disk snapshot name,
> memory state file and external disk snapshot files, etc.

We will need user docs and manpage updates, yes.

> > > #e.g. to specify external disk snapshot, like this: 
> > > #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda', 
> > >         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',] 
> > >  
> > > #e.g. to specify internal disk snapshot, like this: 
> > > disks=[',,hda',',,hdb',] 
> >  
> > Ideally one or the other of these behaviours would be possible without 
> > needing to be quite so explicit.
> 
> OK, I'll delete one.

I don't object to having this more capable syntax as an option, so the
user can override things if they wish, all I was suggesting is that the
default ought to be something useful so the user doesn't need to say
anything if they just want the toolstack to "do something sensible".

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-19  6:58     ` Chun Yan Liu
@ 2014-12-19 10:38       ` Ian Campbell
  2014-12-22  9:36         ` Chun Yan Liu
                           ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Ian Campbell @ 2014-12-19 10:38 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel

On Thu, 2014-12-18 at 23:58 -0700, Chun Yan Liu wrote:
> 
> >>> On 12/18/2014 at 11:27 PM, in message <1418916436.11882.101.camel@citrix.com>,
> Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote: 
> > > Changes to V8: 
> > >   * remove libxl_domain_snapshot_create/delete/revert API 
> > >   * export disk snapshot functionality for both xl and libvirt usage 
> > >  
> > > =========================================================================== 
> > > Libxl/libxlu Design 
> > >  
> > > 1. New Structures 
> > >  
> > > libxl_disk_snapshot = Struct("disk_snapshot",[ 
> > >     # target disk 
> > >     ("disk",            libxl_device_disk), 
> > >  
> > >     # disk snapshot name 
> > >     ("name",            string), 
> > >  
> > >     # internal/external disk snapshot? 
> > >     ("external",        bool), 
> > >  
> > >     # for external disk snapshot, specify following two field 
> > >     ("external_format", string), 
> > >     ("external_path",   string), 
> >  
> > Should this be a KeyedUnion over a new LIBXL_DISK_SNAPSHOT_KIND enum 
> > (with values INTERNAL and EXTERNAL)?
> The KeyedUnion seems to be unnecessary. Only EXTERNAL has data items,
> INTERNAL doesn't, and no third types.
> 
> > This would automatically make the 
> > binding between external==true and the fields which depend on that. 
> >  
> > external_format should be of type libxl_disk_format, unless it is 
> > referring to something else? 
> 
> Yes. That's right. I'll update.
> 
> >  
> > Is it possible for format to differ from the format of the underlying 
> > disk? Perhaps taking a snapshot of a raw disk as a qcow?
> 
> This is related to implementation details. As I understand qemu's
> implementation, taking an external disk snapshot is actually a way:
> origin domain disk: a raw disk
> external= true, external_format: qcow2, external_path: test
> a), create a qcow2 file (test.qcow2) with  backing file (the raw disk)
> b), replace domain disk, now domain uses test.qcow2 (the raw disk
>      is actually to be the snapshot)
> 
> So, I think the external_format can only be those supporting backing file.

Not sure what you mean here.

What about a phy snapshot via lvm snapshotting?

> > In any case 
> > passing in UNKNOWN and letting libxl choose (probably by picking the 
> > same as the underlying disk) should be supported.
> 
> If external_format is not passed (NULL), by default, we will use qcow2.

I think you need to base this on the type of the original disk, if it is
e.g. vhd then making a qcow snapshot seems a bit odd.

> 
> >  
> > > /*  This API might not be used by xl, since xl won't take care of deleting 
> > >  *  snapshots. But for libvirt, since libvirt manages snapshots and will 
> > >  *  delete snapshot, this API will be used. 
> > >  */ 
> > > int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid, 
> > >                                libxl_disk_snapshot *snapshot, int nb); 
> >  
> > The three usecases I mentioned in the previous mail are important here, 
> > because depending on which usecases you are considering there maybe a 
> > many to one relationship between domains and a given snapshot (gold 
> > image case). This interface cannot support that I think.
> 
> I'm not quite clear about the three use cases, especially the 3rd use case,
> so I'm really not sure what the requirement for deleting disk snapshots is.

I hope my reply to the previous mail helped clear this up a bit. The
reason deleting a disk snapshot is interesting is because that is what you
would do after the backup was finished.

> > When we discussed this in previous iterations I suggested a libxl 
> > command to tell a VM that it needed to reexamine its disks to see if any 
> > of the chains had changed. I'm sure that's not the only potential answer 
> > though.
>  
> About delete disk snapshot in a snapshot chain, whether we need to do
> extra work to avoid data break, it can be discussed:
> a). For external snapshots, usually it's based on backing file chain, qemu
> does this, vhd-util does this. In this case, to delete a domain snapshot,
> one doesn't need to do anything to disk (no need to delete disk snapshot
> at all). Downside is, there might be a long backing chain.

I'm not sure what you mean here I'm afraid. If you are deleting a domain
snapshot why do you not want to delete the disk snapshots associated
with it?

> b). For internal snapshot, like qcow2, lvm too. For lvm, it doesn't support
> snapshot of snapshot, so out of scope. For qcow2, delete any disk snapshot
> won't affect others.

For either internal or external if you are removing a snapshot from the
middle of a chain which ends in one or more active disks, then surely
the disk backend associated with those domains need to get some sort of
notification, otherwise they would need to be written *very* carefully
in order to be able to cope with disk metadata changing under their
feet.

Are you saying that the qemu/qcow implementation has indeed been written
with this in mind and can cope with arbitrary other processes modifying
the qcow metadata under their feet?

e.g. 
BASE---SNAPSHOT A---SNAPSHOT B --- domain 1
                 `--SNAPSHOT C --- domain 2

If SNAPSHOT B and C are in active use then I would expect the deletion
of SNAPSHOT A would need to notify the backends associated with domain 1
and domain 2 somehow, so they don't get very confused.

It's possible that this relates to a use case which you aren't intending
to address (e.g. the gold image use case), in which case it might be out
of scope here.
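
To make the dependency question concrete, here is a small Python sketch that
works out which active domains sit on top of a given snapshot. The PARENT and
ACTIVE tables simply transcribe the diagram above; the function names are made
up for this example and do not correspond to any libxl or qemu interface.

```python
# Toy model of the chain in the diagram above; all names are illustrative.
# Each node maps to the node it is built on (None for the base image).
PARENT = {
    "BASE": None,
    "SNAPSHOT A": "BASE",
    "SNAPSHOT B": "SNAPSHOT A",
    "SNAPSHOT C": "SNAPSHOT A",
}

# Which node each active domain's disk currently sits on.
ACTIVE = {"domain 1": "SNAPSHOT B", "domain 2": "SNAPSHOT C"}


def ancestors(node, parent=PARENT):
    """Yield every node that `node` is built on, nearest first."""
    node = parent[node]
    while node is not None:
        yield node
        node = parent[node]


def domains_to_notify(deleted, active=ACTIVE, parent=PARENT):
    """Return the active domains whose chain passes through `deleted`."""
    return sorted(
        dom for dom, top in active.items()
        if deleted == top or deleted in ancestors(top, parent)
    )
```

With this model, deleting SNAPSHOT A touches the chains of both domain 1 and
domain 2, while deleting SNAPSHOT B touches only domain 1.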
  
Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-19 10:27       ` Ian Campbell
@ 2014-12-22  8:52         ` Chun Yan Liu
  2015-01-08 11:59           ` Ian Campbell
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-22  8:52 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/19/2014 at 06:27 PM, in message <1418984856.20028.17.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Fri, 2014-12-19 at 00:03 -0700, Chun Yan Liu wrote: 
>  
> > '--name' meant to give a meaningful name (like: newinstall. Used as the 
> > memory snapshot name and disk snapshot name). 
>  
> Where is this name stored and when and where would it be presented to 
> the user?
For example, for a qcow2 internal disk snapshot, the name is stored within
the disk image. When the user wants to delete the internal disk snapshot,
they run:
#qemu-img snapshot -d name disk

>  
> > That's good. Then we need to add some description to tell users about 
> > the auto-generated domain snapshot name, disk snapshot name, 
> > memory state file and external disk snapshot files, etc. 
>  
> We will need user docs and manpage updates, yes. 
>  
> > > > #e.g. to specify external disk snapshot, like this:  
> > > > #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda',  
> > > >         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',]  
> > > >   
> > > > #e.g. to specify internal disk snapshot, like this:  
> > > > disks=[',,hda',',,hdb',]  
> > >   
> > > Ideally one or the other of these behaviours would be possible without  
> > > needing to be quite so explicit. 
> >  
> > OK, I'll delete one. 
>  
> I don't object to having this more capable syntax as an option, so the 
> user can override things if they wish, all I was suggesting is that the 
> default ought to be something useful so the user doesn't need to say 
> anything if they just want the toolstack to "do something sensible". 

I see. By default the user doesn't need to specify 'disks' at all; xl will
then take an internal disk snapshot of each domain disk.
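
A minimal Python sketch of that defaulting rule, assuming the
'path,format,vdev' entry syntax from the cfgfile examples quoted above. The
function name parse_snapshot_disks and the dictionary layout are hypothetical
and only illustrate the behaviour; this is not how xl parses its config.

```python
def parse_snapshot_disks(entries, domain_vdevs):
    """Turn 'path,format,vdev' strings into per-disk snapshot requests.

    An entry with an empty path (',,hda') requests an internal snapshot;
    an entry with a path requests an external one. With no entries at all,
    every domain disk gets an internal snapshot.
    """
    if not entries:
        return [{"vdev": v, "external": False} for v in domain_vdevs]

    requests = []
    for entry in entries:
        # Exactly three comma-separated fields are expected per entry.
        path, fmt, vdev = (field.strip() for field in entry.split(","))
        if vdev not in domain_vdevs:
            raise ValueError(f"unknown disk {vdev!r}")
        if path:
            requests.append({"vdev": vdev, "external": True,
                             "path": path, "format": fmt or None})
        else:
            requests.append({"vdev": vdev, "external": False})
    return requests
```

The explicit syntax stays available as an override, while an empty or missing
'disks' list falls back to the "do something sensible" default.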

>  
> Ian. 
>  
>  
>  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-19 10:38       ` Ian Campbell
  2014-12-22  9:36         ` Chun Yan Liu
@ 2014-12-22  9:36         ` Chun Yan Liu
  2015-01-08 12:11           ` Ian Campbell
  2015-01-08 12:11           ` Ian Campbell
  2014-12-22  9:53         ` Chun Yan Liu
  2 siblings, 2 replies; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-22  9:36 UTC (permalink / raw)
  To: Ian Campbell
  Cc: kwolf, wei.liu2, ian.jackson, qemu-devel, xen-devel, Jim Fehlig,
	stefanha



>>> On 12/19/2014 at 06:38 PM, in message <1418985490.20028.27.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Thu, 2014-12-18 at 23:58 -0700, Chun Yan Liu wrote: 
> >  
> > >>> On 12/18/2014 at 11:27 PM, in message  
> <1418916436.11882.101.camel@citrix.com>, 
> > Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:  
> > > > Changes to V8:  
> > > >   * remove libxl_domain_snapshot_create/delete/revert API  
> > > >   * export disk snapshot functionality for both xl and libvirt usage  
> > > >   
> > > >  
> ===========================================================================  
> > > > Libxl/libxlu Design  
> > > >   
> > > > 1. New Structures  
> > > >   
> > > > libxl_disk_snapshot = Struct("disk_snapshot",[  
> > > >     # target disk  
> > > >     ("disk",            libxl_device_disk),  
> > > >   
> > > >     # disk snapshot name  
> > > >     ("name",            string),  
> > > >   
> > > >     # internal/external disk snapshot?  
> > > >     ("external",        bool),  
> > > >   
> > > >     # for external disk snapshot, specify following two field  
> > > >     ("external_format", string),  
> > > >     ("external_path",   string),  
> > >   
> > > Should this be a KeyedUnion over a new LIBXL_DISK_SNAPSHOT_KIND enum  
> > > (with values INTERNAL and EXTERNAL)? 
> > The KeyedUnion seems to be unnecessary. Only EXTERNAL has data items, 
> > INTERNAL doesn't, and no third types. 
> >  
> > > This would automatically make the  
> > > binding between external==true and the fields which depend on that.  
> > >   
> > > external_format should be of type libxl_disk_format, unless it is  
> > > referring to something else?  
> >  
> > Yes. That's right. I'll update. 
> >  
> > >   
> > > Is it possible for format to differ from the format of the underlying  
> > > disk? Perhaps taking a snapshot of a raw disk as a qcow? 
> >  
> > This is related to implementation details. As I understand qemu's 
> > implementation, taking an external disk snapshot is actually a way: 
> > origin domain disk: a raw disk 
> > external= true, external_format: qcow2, external_path: test 
> > a), create a qcow2 file (test.qcow2) with  backing file (the raw disk) 
> > b), replace domain disk, now domain uses test.qcow2 (the raw disk 
> >      is actually to be the snapshot) 
> >  
> > So, I think the external_format can only be those supporting backing file. 
>  
> Not sure what you mean here. 
>  
> What about a phy snapshot via lvm snapshotting? 
>  
> > > In any case  
> > > passing in UNKNOWN and letting libxl choose (probably by picking the  
> > > same as the underlying disk) should be supported. 
> >  
> > If external_format is not passed (NULL), by default, we will use qcow2. 
>  
> I think you need to base this on the type of the original disk, if it is 
> e.g. vhd then making a qcow snapshot seems a bit odd. 
>  
> >  
> > >   
> > > > /*  This API might not be used by xl, since xl won't take care of  
> deleting  
> > > >  *  snapshots. But for libvirt, since libvirt manages snapshots and will  
> > > >  *  delete snapshot, this API will be used.  
> > > >  */  
> > > > int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid,  
> > > >                                libxl_disk_snapshot *snapshot, int nb);  
> > >   
> > > The three usecases I mentioned in the previous mail are important here,  
> > > because depending on which usecases you are considering there maybe a  
> > > many to one relationship between domains and a given snapshot (gold  
> > > image case). This interface cannot support that I think. 
> >  
> > I'm not quite clear about the three usecases, especially the 3rd usercase, 
> > so really not sure what's the requirement towards deleting disk snapshot. 
>  
> I hope my reply to the previous mail helped clear this up a bit. The 
> reason deleting a disk is interesting is because that is what you would 
> do after the backup was finished. 
>  
> > > When we discussed this in previous iterations I suggested a libxl  
> > > command to tell a VM that it needed to reexamine its disks to see if any  
> > > of the chains had changed. I'm sure that's not the only potential answer  
> > > though. 
> >   
> > About delete disk snapshot in a snapshot chain, whether we need to do 
> > extra work to avoid data break, it can be discussed: 
> > a). For external snapshots, usually it's based on backing file chain, qemu 
> > does this, vhd-util does this. In this case, to delete a domain snapshot, 
> > one doesn't need to do anything to disk (no need to delete disk snapshot 
> > at all). Downside is, there might be a long backing chain. 
>  
> I'm not sure what you mean here I'm afraid. If you are deleting a domain 
> snapshot why do you not want to delete the disk snapshots associated 
> with it? 
>  
> > b). For internal snapshot, like qcow2, lvm too. For lvm, it doesn't support 
> > snapshot of snapshot, so out of scope. For qcow2, delete any disk snapshot 
> > won't affect others. 
>  
> For either internal or external if you are removing a snapshot from the 
> middle of a chain which ends in one or more active disks, then surely 
> the disk backend associated with those domains need to get some sort of 
> notification, otherwise they would need to be written *very* carefully 
> in order to be able to cope with disk metadata changing under their 
> feet. 
>  
> Are you saying that the qemu/qcow implementation has indeed been written 
> with this in mind and can cope with arbitrary other processes modifying 
> the qcow metadata under their feet? 

Yes.

I'm adding Kevin and Stefan from qemu-devel to this thread in case my
understanding is wrong somewhere.

Kevin & Stefan,

About the qcow2 snapshot implementation: in the snapshot chain case below,
if we delete SNAPSHOT A, will it affect domain 1 and domain 2, which use
SNAPSHOT B and SNAPSHOT C?

From my understanding, creating a snapshot increases the refcount of the
original data, and deleting a snapshot only decreases the refcount (the data
is not deleted until the refcount becomes 0), so I think it won't affect
domain 1 and domain 2.
Is that right?
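
The refcount idea can be sketched with a toy Python model. This is only an
illustration of the counting argument: the RefcountedImage class is invented
for this example and is not qemu's implementation. It also does not model a
second process modifying the file while qemu has it open, which is the
concern Ian raises.

```python
class RefcountedImage:
    """Toy model: clusters are freed only when their refcount reaches 0."""

    def __init__(self, clusters):
        # The active image holds one reference to each of its clusters.
        self.active = set(clusters)
        self.refcount = {c: 1 for c in clusters}
        self.snapshots = {}

    def create_snapshot(self, name):
        # A snapshot shares the current clusters, so bump each refcount.
        self.snapshots[name] = set(self.active)
        for c in self.active:
            self.refcount[c] += 1

    def delete_snapshot(self, name):
        # Drop the snapshot's references; free clusters nobody uses.
        for c in self.snapshots.pop(name):
            self.refcount[c] -= 1
            if self.refcount[c] == 0:
                del self.refcount[c]
```

In this model, deleting one snapshot leaves every cluster that is still
referenced by the active image or by another snapshot in place.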

Thanks,
Chunyan

>  
> e.g.  
> BASE---SNAPSHOT A---SNAPSHOT B --- domain 1 
>                  `--SNAPSHOT C --- domain 2 
>  
> If SNAPSHOT B and C are in active use then I would expect the deletion 
> of SNAPSHOT A would need to notify the backends associated with domain 1 
> and domain 2 somehow, so they don't get very confused. 
>  
> It's possible that this relates to a use case which you aren't intending 
> to address (e.g. the gold image use case), in which case it might be out 
> of scope here. 
>    
> Ian. 
>  
>  
>  

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-19 10:38       ` Ian Campbell
  2014-12-22  9:36         ` Chun Yan Liu
  2014-12-22  9:36         ` [Qemu-devel] " Chun Yan Liu
@ 2014-12-22  9:53         ` Chun Yan Liu
  2 siblings, 0 replies; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-22  9:53 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/19/2014 at 06:38 PM, in message <1418985490.20028.27.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Thu, 2014-12-18 at 23:58 -0700, Chun Yan Liu wrote: 
> >  
> > >>> On 12/18/2014 at 11:27 PM, in message  
> <1418916436.11882.101.camel@citrix.com>, 
> > Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:  
> > > > Changes to V8:  
> > > >   * remove libxl_domain_snapshot_create/delete/revert API  
> > > >   * export disk snapshot functionality for both xl and libvirt usage  
> > > >   
> > > >  
> ===========================================================================  
> > > > Libxl/libxlu Design  
> > > >   
> > > > 1. New Structures  
> > > >   
> > > > libxl_disk_snapshot = Struct("disk_snapshot",[  
> > > >     # target disk  
> > > >     ("disk",            libxl_device_disk),  
> > > >   
> > > >     # disk snapshot name  
> > > >     ("name",            string),  
> > > >   
> > > >     # internal/external disk snapshot?  
> > > >     ("external",        bool),  
> > > >   
> > > >     # for external disk snapshot, specify following two field  
> > > >     ("external_format", string),  
> > > >     ("external_path",   string),  
> > >   
> > > Should this be a KeyedUnion over a new LIBXL_DISK_SNAPSHOT_KIND enum  
> > > (with values INTERNAL and EXTERNAL)? 
> > The KeyedUnion seems to be unnecessary. Only EXTERNAL has data items, 
> > INTERNAL doesn't, and no third types. 
> >  
> > > This would automatically make the  
> > > binding between external==true and the fields which depend on that.  
> > >   
> > > external_format should be of type libxl_disk_format, unless it is  
> > > referring to something else?  
> >  
> > Yes. That's right. I'll update. 
> >  
> > >   
> > > Is it possible for format to differ from the format of the underlying  
> > > disk? Perhaps taking a snapshot of a raw disk as a qcow? 
> >  
> > This is related to implementation details. As I understand qemu's 
> > implementation, taking an external disk snapshot is actually a way: 
> > origin domain disk: a raw disk 
> > external= true, external_format: qcow2, external_path: test 
> > a), create a qcow2 file (test.qcow2) with  backing file (the raw disk) 
> > b), replace domain disk, now domain uses test.qcow2 (the raw disk 
> >      is actually to be the snapshot) 
> >  
> > So, I think the external_format can only be those supporting backing file.

Well, yes, I should correct that. This is only valid when the external snapshot
is created with the 'qemu-img' tool. It does not apply to lvm and vhd, which
have their own snapshot tools.

For lvm, the snapshot can be done by 'lvcreate --snapshot'; the snapshot is
also in 'lvm' format.

For vhd, the snapshot can be done by 'vhd-util snapshot', in which case the
snapshot file is still in vhd format. The snapshot can also be done with
'qemu-img', in which case the external format should be a format that qemu-img
supports and that supports a backing file.
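For the qemu-img case, the external snapshot amounts to creating an overlay
image whose backing file is the existing disk. A minimal sketch of that step,
which only builds the command line and does not run it; the function name and
argument names are hypothetical and not part of libxl or xl:

```python
def qemu_img_overlay_argv(backing_path, backing_format, overlay_path,
                          overlay_format="qcow2"):
    """Return the qemu-img argv that creates an overlay on top of backing_path."""
    return [
        "qemu-img", "create",
        "-f", overlay_format,   # format of the new overlay image
        "-b", backing_path,     # the existing disk becomes the backing file
        "-F", backing_format,   # format of the backing file
        overlay_path,
    ]


argv = qemu_img_overlay_argv("/images/disk.raw", "raw", "/images/test.qcow2")
print(" ".join(argv))
```

The domain would then have to be switched to the overlay; that part is not
shown here.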

>  
> Not sure what you mean here. 
>  
> What about a phy snapshot via lvm snapshotting? 
>  
> > > In any case  
> > > passing in UNKNOWN and letting libxl choose (probably by picking the  
> > > same as the underlying disk) should be supported. 
> >  
> > If external_format is not passed (NULL), by default, we will use qcow2. 
>  
> I think you need to base this on the type of the original disk, if it is 
> e.g. vhd then making a qcow snapshot seems a bit odd. 

Agreed. vhd and lvm, which have their own tools other than 'qemu-img', should
be treated differently. For snapshots created by 'qemu-img', I think using
'qcow2' by default is reasonable given qemu's implementation.
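A sketch of the default selection discussed here, assuming the caller may
leave the format unset. The function name and the format strings are
illustrative only; this is not an existing libxl interface:

```python
def default_snapshot_format(disk_format, requested=None):
    """Pick the external snapshot format when the caller did not force one."""
    if requested is not None:
        # An explicit choice from the caller always wins.
        return requested
    if disk_format in ("lvm", "vhd"):
        # These have their own snapshot tools and keep their own format.
        return disk_format
    # Everything else is assumed to go through qemu-img, so default to qcow2.
    return "qcow2"
```

With this, a raw disk defaults to a qcow2 snapshot, while a vhd disk stays vhd
unless the caller asks for something else.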

- Chunyan

>  
> > >   
> > > > /*  This API might not be used by xl, since xl won't take care of deleting
> > > >  *  snapshots. But for libvirt, since libvirt manages snapshots and will
> > > >  *  delete snapshot, this API will be used.
> > > >  */
> > > > int libxl_disk_snapshot_delete(libxl_ctx *ctx, uint32_t domid,  
> > > >                                libxl_disk_snapshot *snapshot, int nb);  
> > >   
> > > The three usecases I mentioned in the previous mail are important here,  
> > > because depending on which usecases you are considering there maybe a  
> > > many to one relationship between domains and a given snapshot (gold  
> > > image case). This interface cannot support that I think. 
> >  
> > I'm not quite clear about the three usecases, especially the 3rd usercase, 
> > so really not sure what's the requirement towards deleting disk snapshot. 
>  
> I hope my reply to the previous mail helped clear this up a bit. The 
> reason deleting a disk is interesting is because that is what you would 
> do after the backup was finished. 
>  
> > > When we discussed this in previous iterations I suggested a libxl  
> > > command to tell a VM that it needed to reexamine its disks to see if any  
> > > of the chains had changed. I'm sure that's not the only potential answer  
> > > though. 
> >   
> > About delete disk snapshot in a snapshot chain, whether we need to do 
> > extra work to avoid data break, it can be discussed: 
> > a). For external snapshots, usually it's based on backing file chain, qemu 
> > does this, vhd-util does this. In this case, to delete a domain snapshot, 
> > one doesn't need to do anything to disk (no need to delete disk snapshot 
> > at all). Downside is, there might be a long backing chain. 
>  
> I'm not sure what you mean here I'm afraid. If you are deleting a domain 
> snapshot why do you not want to delete the disk snapshots associated 
> with it? 
>  
> > b). For internal snapshot, like qcow2, lvm too. For lvm, it doesn't support 
> > snapshot of snapshot, so out of scope. For qcow2, delete any disk snapshot 
> > won't affect others. 
>  
> For either internal or external if you are removing a snapshot from the 
> middle of a chain which ends in one or more active disks, then surely 
> the disk backend associated with those domains need to get some sort of 
> notification, otherwise they would need to be written *very* carefully 
> in order to be able to cope with disk metadata changing under their 
> feet. 
>  
> Are you saying that the qemu/qcow implementation has indeed been written 
> with this in mind and can cope with arbitrary other processes modifying 
> the qcow metadata under their feet? 
>  
> e.g.  
> BASE---SNAPSHOT A---SNAPSHOT B --- domain 1 
>                  `--SNAPSHOT C --- domain 2 
>  
> If SNAPSHOT B and C are in active use then I would expect the deletion 
> of SNAPSHOT A would need to notify the backends associated with domain 1 
> and domain 2 somehow, so they don't get very confused. 
>  
> It's possible that this relates to a use case which you aren't intending 
> to address (e.g. the gold image use case), in which case it might be out 
> of scope here. 
>    
> Ian. 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-19 10:25       ` Ian Campbell
@ 2014-12-23  3:42         ` Chun Yan Liu
  2015-01-08 12:26           ` Ian Campbell
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2014-12-23  3:42 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 12/19/2014 at 06:25 PM, in message <1418984720.20028.15.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote: 
> >  
> > >>> On 12/18/2014 at 11:10 PM, in message <1418915443.11882.86.camel@citrix.com>,
> > Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:  
> > > > Changes to V8:  
> > > >   * add an overview document, so that one can has a overall look  
> > > >     about the whole domain snapshot work, limits, requirements,  
> > > >     how to do, etc.  
> > > >   
> > > > =====================================================================  
> > > > Domain snapshot overview  
> > >   
> > > I don't see a similar section for disk snapshots, are you not  
> > > considering those here except as a part of a domain snapshot or is this  
> > > an oversight?  
> > >   
> > > There are three main use cases (that I know of at least) for  
> > > snapshotting like behaviour.  
> > >   
> > > One is as you've mentioned below for "backup", i.e. to preserve the VM  
> > > at a certain point in time in order to be able to roll back to it. Is  
> > > this the only usecase you are considering?  
> >  
> > Yes. I didn't take disk snapshot thing into the scope. 
> >  
> > >   
> > > A second use case is to support "gold image" type deployments, i.e.  
> > > where you create one baseline single disk image and then clone it  
> > > multiple times to deploy lots of guests. I think this is usually a "disk  
> > > snapshot" type thing, but maybe it can be implemented as restoring a  
> > > gold domain snapshot multiple times (e.g. for start of day performance  
> > > reasons).  
> >  
> > As we initially discussed, the disk snapshot part can be done
> > by existing tools directly, like qemu-img and vhd-util.
>  
> I was reading this section as a more generic overview of snapshotting, 
> without reference to where/how things might ultimately be implemented. 
>  
> From a design point of view it would be useful to cover the various use 
> cases, even if the solution is that the user implements them using CLI 
> tools by hand (xl) or the toolstack does it for them internally 
> (libvirt). 
>  
> This way we can more clearly see the full picture, which allows us to 
> validate that we are making the right choices about what goes where. 

OK, I see. I think this use case is more about how the snapshot is used than
about how the snapshot is implemented. Right?
For 'gold image' or 'gold domain', the work needed is more like cloning disks.

>  
> > > The third case, (which is similar to the first), is taking a disk  
> > > snapshot in order to be able to run you usual backup software on the  
> > > snapshot (which is now unchanging, which is handy) and then deleting the  
> > > disk snapshot (this differs from the first case in which disk is active  
> > > after the snapshot, and due to the lack of the memory part).  
> >  
> > Sorry, I'm still not quite clear about what this user case wants to do. 
>  
> The user has an active domain which they want to backup, but backup 
> software often does not cope well if the data is changing under its 
> feet. 
>  
> So the users wants to take a snapshot of the domains disks while leaving 
> the domain running, so they can backup that static version of the disk 
> out of band from the VM itself (e.g. by attaching it to a separate 
> backup VM). 

Got it. So that's simply a disk-only snapshot taken while the domain is active.
As you mention below, that needs a guest agent to quiesce the disks. But
currently the Xen hypervisor can't support that, right?

>  
> This may require a guest agent to quiesce the disks. 
>  
> > >   
> > > > * ability to parse user config file  
> > > >   
> > > >   [2] Disk snapshot requirements:  
> > > >   - external tools: qemu-img, lvcreate, vhd-util, etc.  
> > > >   - for basic goal, we support 'raw' and 'qcow2' backend types  
> > > >     only. Then it requires:  
> > > >     libxl qmp command or "qemu-img" (when qemu process does not  
> > > >     exist)  
> > > >   
> > > >   
> > > > 3. Interaction with other operations:  
> > > >   
> > > > No.  
> > >   
> > > What about shutdown/dying as you noted above? What about migration or  
> > > regular save/restore?  
> >  
> > Since xl now has no idea of the existence of snapshot, 
>  
> what about libvirt? This section is an overview, so making toolstack 
> specific assumptions is confusing.

Understood. I think most of the questions here come down to a general overview
versus an xl-specific view. What I provided is xl specific; what you suggest is
a general overview. I'll update.
 
>  
> >  so when writing this 
> > document I turned to depends on users to delete snapshots before or after 
> > deleting a domain (like shutdown, destroy, save, migrate away). User should 
> > know where memory is saved, and disk snapshot related info. 
>  
> What I meant was what happens if you try to snapshot a domain while it 
> is being shutdown or being migrated?

Ah, I see. I should add some words here. As described above, taking a snapshot
is not supported while the domain is being shut down or is dying.

Thanks very much for your precious time before the holiday.
Merry Christmas! 

-Chunyan

> There clearly has to be some sort 
> of interaction, even if it is "there is a global toolstack lock" or "the 
> user is advised not to do this". 
>  
> Ian. 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 3/4] domain snapshot design: xl
  2014-12-22  8:52         ` Chun Yan Liu
@ 2015-01-08 11:59           ` Ian Campbell
  0 siblings, 0 replies; 38+ messages in thread
From: Ian Campbell @ 2015-01-08 11:59 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel

On Mon, 2014-12-22 at 01:52 -0700, Chun Yan Liu wrote:
> 
> >>> On 12/19/2014 at 06:27 PM, in message <1418984856.20028.17.camel@citrix.com>,
> Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> > On Fri, 2014-12-19 at 00:03 -0700, Chun Yan Liu wrote: 
> >  
> > > '--name' meant to give a meaningful name (like: newinstall. Used as the 
> > > memory snapshot name and disk snapshot name). 
> >  
> > Where is this name stored and when and where would it be presented to 
> > the user?
> e.g. For qcow2 internal disk snapshot, this name is stored within the disk.
> When user wants to delete internal disk snapshot, it will be:
> #qemu-img snapshot -d name disk

Makes sense, thanks. Can you clarify in the doc with something like
"name is used as an identifier in the underlying storage backend".

Does it have to be unique? Sounds like it does.

> > > That's good. Then we need to add some description to tell users about 
> > > the auto-generated domain snapshot name, disk snapshot name, 
> > > memory state file and external disk snapshot files, etc. 
> >  
> > We will need user docs and manpage updates, yes. 
> >  
> > > > > #e.g. to specify exernal disk snapshot, like this:  
> > > > > #disks=['/tmp/hda_snapshot.qcow2,qcow2,hda',  
> > > > >         '/tmp/hdb_snapshot.qcow2,qcow2,hdb',]  
> > > > >   
> > > > > #e.g. to specify internal disk snapshot, like this:  
> > > > > disks=[',,hda',',,hdb',]  
> > > >   
> > > > Ideally one or the other of these behaviours would be possible without  
> > > > needing to be quite so explicit. 
> > >  
> > > OK, I'll delete one. 
> >  
> > I don't object to having this more capable syntax as an option, so the 
> > user can override things if they wish, all I was suggesting is that the 
> > default ought to be something useful so the user doesn't need to say 
> > anything if they just want the toolstack to "do something sensible". 
> 
> I see. By default user doesn't need to specify 'disks' at all, then xl
> will do internal disk snapshot to each domain disk.

Right, or it could do an external snapshot to the directory it has been
told the snapshot should live in.
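To make the defaulting concrete, here is a sketch that parses the
'path,format,vdev' entries quoted above and falls back to an internal snapshot
of every disk when 'disks' is not given. The function names and the dict
layout are hypothetical; the real xl parser may look quite different:

```python
def parse_snapshot_disk(spec):
    """Parse one 'path,format,vdev' entry; an empty path means internal."""
    path, fmt, vdev = (field.strip() for field in spec.split(","))
    if not vdev:
        raise ValueError("vdev is required: %r" % spec)
    return {
        "vdev": vdev,
        "external": bool(path),
        "path": path or None,
        "format": fmt or None,
    }


def snapshot_plan(domain_vdevs, disks=None):
    """With no 'disks' given, default to an internal snapshot of every disk."""
    if disks is None:
        disks = [",,%s" % vdev for vdev in domain_vdevs]
    return [parse_snapshot_disk(spec) for spec in disks]
```

The alternative default mentioned above (external snapshots placed in the
snapshot directory) would only change the fallback branch in snapshot_plan.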

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2014-12-22  9:36         ` [Qemu-devel] " Chun Yan Liu
@ 2015-01-08 12:11           ` Ian Campbell
  2015-01-09  9:59               ` Chun Yan Liu
  2015-01-08 12:11           ` Ian Campbell
  1 sibling, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2015-01-08 12:11 UTC (permalink / raw)
  To: Chun Yan Liu
  Cc: kwolf, wei.liu2, ian.jackson, qemu-devel, xen-devel, Jim Fehlig,
	stefanha

On Mon, 2014-12-22 at 02:36 -0700, Chun Yan Liu wrote:
> > > b). For internal snapshot, like qcow2, lvm too. For lvm, it doesn't support 
> > > snapshot of snapshot, so out of scope. For qcow2, delete any disk snapshot 
> > > won't affect others. 
> >  
> > For either internal or external if you are removing a snapshot from the 
> > middle of a chain which ends in one or more active disks, then surely 
> > the disk backend associated with those domains need to get some sort of 
> > notification, otherwise they would need to be written *very* carefully 
> > in order to be able to cope with disk metadata changing under their 
> > feet. 
> >  
> > Are you saying that the qemu/qcow implementation has indeed been written 
> > with this in mind and can cope with arbitrary other processes modifying 
> > the qcow metadata under their feet? 
> 
> Yes.
> 
> I add qemu-devel Kevin and Stefan in this thread in case my understanding
> has somewhere wrong.
> 
> Kevin & Stefan,
> 
> About the qcow2 snapshot implementation,  in following snapshot chain case,
> if we delete SNAPSHOT A, will it affect domain 1 and domain 2 which uses
> SNAPSHOT B and SNAPSHOT C?
> 
> From my understanding, creating a snapshot increases the refcount of the original data,
> deleting a snapshot only decreases the refcount (won't delete data until the refcount
> becomes 0), so I think it won't affect domain 1 and domain 2.
> Is that right?

I'm not worried about the data being deleted (I'm sure qcow2 will get
that right), but rather about the snapshot chain being collapsed and
therefore the metadata (e.g. the tables of which block is in which
backing file, and perhaps the location of the data itself) changing
while a domain is running, e.g.

BASE---SNAPSHOT A---SNAPSHOT B --- domain 1 
                 `--SNAPSHOT C --- domain 2 

becoming 

BASE----------------SNAPSHOT B --- domain 1 
                 `--SNAPSHOT C --- domain 2 

(essentially changing B and C's tables to accommodate the lack of A)

For an internal snapshot I can see that it would be sensible (and easy)
to keep A around as a "ghost", to avoid this case, and the need to
perhaps move data around or duplicate it.

If A were external though an admin might think they could delete the
file...
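The concern with the chain above can be shown with a toy model in which each
image only records its parent. Everything here is illustrative (the dict
layout and function name are made up) and it says nothing about how qemu
actually stores this:

```python
# Each image points at the image it is based on (toy model of the chain).
parents = {"A": "BASE", "B": "A", "C": "A"}


def remove_snapshot(parents, victim):
    """Drop victim from the chain; return (new mapping, re-pointed children)."""
    new_parent = parents[victim]
    affected = sorted(child for child, p in parents.items() if p == victim)
    updated = {
        child: (new_parent if p == victim else p)
        for child, p in parents.items()
        if child != victim
    }
    return updated, affected


updated, affected = remove_snapshot(parents, "A")
print(updated)   # B and C now sit directly on BASE
print(affected)  # the images whose metadata changed
```

The images in the second return value are exactly the ones whose backends
would need to be told about the change while their domains are running.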

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 2/4] domain snapshot overview
  2014-12-23  3:42         ` Chun Yan Liu
@ 2015-01-08 12:26           ` Ian Campbell
  2015-01-12  7:01             ` Chun Yan Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2015-01-08 12:26 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel

On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote:
> 
> >>> On 12/19/2014 at 06:25 PM, in message <1418984720.20028.15.camel@citrix.com>,
> Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> > On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote: 
> > >  
> > > >>> On 12/18/2014 at 11:10 PM, in message <1418915443.11882.86.camel@citrix.com>,
> > > Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:  
> > > > > Changes to V8:  
> > > > >   * add an overview document, so that one can has a overall look  
> > > > >     about the whole domain snapshot work, limits, requirements,  
> > > > >     how to do, etc.  
> > > > >   
> > > > > =====================================================================  
> > > > > Domain snapshot overview  
> > > >   
> > > > I don't see a similar section for disk snapshots, are you not  
> > > > considering those here except as a part of a domain snapshot or is this  
> > > > an oversight?  
> > > >   
> > > > There are three main use cases (that I know of at least) for  
> > > > snapshotting like behaviour.  
> > > >   
> > > > One is as you've mentioned below for "backup", i.e. to preserve the VM  
> > > > at a certain point in time in order to be able to roll back to it. Is  
> > > > this the only usecase you are considering?  
> > >  
> > > Yes. I didn't take disk snapshot thing into the scope. 
> > >  
> > > >   
> > > > A second use case is to support "gold image" type deployments, i.e.  
> > > > where you create one baseline single disk image and then clone it  
> > > > multiple times to deploy lots of guests. I think this is usually a "disk  
> > > > snapshot" type thing, but maybe it can be implemented as restoring a  
> > > > gold domain snapshot multiple times (e.g. for start of day performance  
> > > > reasons).  
> > >  
> > > As we initially discussed, the disk snapshot part can be done
> > > by existing tools directly, like qemu-img and vhd-util.
> >  
> > I was reading this section as a more generic overview of snapshotting, 
> > without reference to where/how things might ultimately be implemented. 
> >  
> > From a design point of view it would be useful to cover the various use 
> > cases, even if the solution is that the user implements them using CLI 
> > tools by hand (xl) or the toolstack does it for them internally 
> > (libvirt). 
> >  
> > This way we can more clearly see the full picture, which allows us to 
> > validate that we are making the right choices about what goes where. 
> 
> OK. I see. I think this user case is more like how to use the snapshot, rather
> than how to implement snapshot. Right?

Correct, what the user is actually trying to achieve with the
functionality.

> 'Gold image' or 'Gold domain', the needed work is more like cloning disks.

Yes, or resuming multiple times.

> > > > The third case, (which is similar to the first), is taking a disk  
> > > > snapshot in order to be able to run you usual backup software on the  
> > > > snapshot (which is now unchanging, which is handy) and then deleting the  
> > > > disk snapshot (this differs from the first case in which disk is active  
> > > > after the snapshot, and due to the lack of the memory part).  
> > >  
> > > Sorry, I'm still not quite clear about what this user case wants to do. 
> >  
> > The user has an active domain which they want to backup, but backup 
> > software often does not cope well if the data is changing under its 
> > feet. 
> >  
> > So the users wants to take a snapshot of the domains disks while leaving 
> > the domain running, so they can backup that static version of the disk 
> > out of band from the VM itself (e.g. by attaching it to a separate 
> > backup VM). 
> 
> Got it. So that's simply disk-only snapshot when domian is active. As you
> mentioned below, that needs guest agent to quiesce the disks. But currently
> xen hypervisor can't support that, right?

I don't think that's relevant right now, let me explain:

I think it's important to consider all the use cases for snapshotting,
not because I think they need to be implemented now but to make sure
that we don't make any design decisions now which would make it
*impossible* to implement it in the future (at least without API
changes).

As a random example, we would want to avoid designing a libxl API where
it is impossible to send the quiesce request at the right point for some
reason.

So we need to consider these use cases now and have the design, but not
necessarily the implementation, be able to deal with them, or at least
to convince ourselves we most likely aren't tying our hands for future
work.
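One way to keep that door open is to give the snapshot operation optional
hooks that run before and after the disks are snapshotted. The sketch below is
purely hypothetical (no such libxl interface exists in this proposal) and only
shows the ordering:

```python
def snapshot_disks(disks, take_snapshot, quiesce=None, thaw=None):
    """Snapshot each disk, optionally bracketed by guest quiesce/thaw hooks."""
    if quiesce is not None:
        quiesce()          # e.g. ask a guest agent to flush and freeze I/O
    try:
        return [take_snapshot(disk) for disk in disks]
    finally:
        if thaw is not None:
            thaw()         # always resume guest I/O, even on failure


events = []
snapshot_disks(
    ["hda", "hdb"],
    take_snapshot=lambda disk: events.append("snap:" + disk),
    quiesce=lambda: events.append("quiesce"),
    thaw=lambda: events.append("thaw"),
)
print(events)
```

An implementation that never passes the hooks behaves as today, while a later
guest agent could be plugged in without changing the call shape.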

> Merry Christmas! 

And (retrospectively) to you!

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC V9 4/4] domain snapshot design: libxl/libxlu
  2015-01-08 12:11           ` Ian Campbell
@ 2015-01-09  9:59               ` Chun Yan Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Chun Yan Liu @ 2015-01-09  9:59 UTC (permalink / raw)
  To: Ian Campbell
  Cc: kwolf, wei.liu2, ian.jackson, qemu-devel, xen-devel, Jim Fehlig,
	stefanha



>>> On 1/8/2015 at 08:11 PM, in message <1420719107.19787.53.camel@citrix.com>, Ian
Campbell <Ian.Campbell@citrix.com> wrote: 
> On Mon, 2014-12-22 at 02:36 -0700, Chun Yan Liu wrote: 
> > > > b). For internal snapshot, like qcow2, lvm too. For lvm, it doesn't support
> > > > snapshot of snapshot, so out of scope. For qcow2, delete any disk snapshot
> > > > won't affect others.
> > >   
> > > For either internal or external if you are removing a snapshot from the  
> > > middle of a chain which ends in one or more active disks, then surely  
> > > the disk backend associated with those domains need to get some sort of  
> > > notification, otherwise they would need to be written *very* carefully  
> > > in order to be able to cope with disk metadata changing under their  
> > > feet.  
> > >   
> > > Are you saying that the qemu/qcow implementation has indeed been written  
> > > with this in mind and can cope with arbitrary other processes modifying  
> > > the qcow metadata under their feet?  
> >  
> > Yes. 
> >  
> > I add qemu-devel Kevin and Stefan in this thread in case my understanding 
> > has somewhere wrong. 
> >  
> > Kevin & Stefan, 
> >  
> > About the qcow2 snapshot implementation,  in following snapshot chain case, 
> > if we delete SNAPSHOT A, will it affect domain 1 and domain 2 which uses 
> > SNAPSHOT B and SNAPSHOT C? 
> >  
> > From my understanding, creating a snapshot increases the refcount of the
> > original data, deleting a snapshot only decreases the refcount (won't delete
> > data until the refcount becomes 0), so I think it won't affect domain 1 and
> > domain 2.
> > Is that right?
>  
> I'm not worried about the data being deleted (I'm sure qcow2 will get 
> that right), but rather about the snapshot chain being collapsed and 
> therefore the metadata (e.g. the tables of which block is in which 
> backing file, and perhaps the location of the data itself) changing 
> while a domain is running, e.g. 
>  
> BASE---SNAPSHOT A---SNAPSHOT B --- domain 1  
>                  `--SNAPSHOT C --- domain 2  
>  
> becoming  
>  
> BASE----------------SNAPSHOT B --- domain 1  
>                  `--SNAPSHOT C --- domain 2  
>  
> (essentially changing B and C's tables to accommodate the lack of A) 
>  
> For an internal snapshot I can see that it would be sensible (and easy) 
> to keep A around as a "ghost", to avoid this case, and the need to 
> perhaps move data around or duplicate it. 

This is what we talked about a lot :-)

If I understand correctly, the qcow2 internal snapshot implementation
works like this:
Taking a snapshot adds a new table which records the index of the data
clusters in this snapshot, and at the same time increases the refcount
of those data clusters;
Deleting a snapshot only decreases the refcount (the data is not deleted
until the refcount reaches 0) and removes the map table.

So I think deleting a snapshot, even one within a snapshot chain, won't
affect other snapshots. (Otherwise, if that's not the case, how could
we move internal data around or duplicate it?)
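The refcount behaviour described above can be modelled in a few lines. This is
a toy, written under the assumption that my understanding is right; the class
and method names are made up and it does not reflect the real qcow2 on-disk
layout:

```python
class ToyImage:
    """Toy model: per-snapshot cluster tables plus per-cluster refcounts."""

    def __init__(self, active):
        self.active = dict(active)   # virtual block -> cluster id
        self.snapshots = {}          # snapshot name -> table
        self.refcount = {}
        for cluster in self.active.values():
            self._ref(cluster)

    def _ref(self, cluster):
        self.refcount[cluster] = self.refcount.get(cluster, 0) + 1

    def _unref(self, cluster):
        self.refcount[cluster] -= 1
        if self.refcount[cluster] == 0:
            del self.refcount[cluster]   # the cluster is free again

    def create_snapshot(self, name):
        table = dict(self.active)        # new table for this snapshot
        for cluster in table.values():
            self._ref(cluster)
        self.snapshots[name] = table

    def delete_snapshot(self, name):
        for cluster in self.snapshots.pop(name).values():
            self._unref(cluster)         # data survives while still referenced

    def write(self, block, new_cluster):
        old = self.active.get(block)
        if old is not None:
            self._unref(old)
        self.active[block] = new_cluster
        self._ref(new_cluster)


img = ToyImage({0: "c0", 1: "c1"})
img.create_snapshot("A")
img.write(0, "c2")
img.create_snapshot("B")
img.delete_snapshot("A")
print(img.snapshots["B"], img.refcount)
```

After deleting A, snapshot B's table is untouched and only the cluster that
nothing else references has been freed.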

>  
> If A were external though an admin might think they could delete the 
> file...

About external: generally, external disk snapshots are made with backing files;
qemu, vhd-util and lvm all do it that way. Like:
    snapshot A -> snapshot B -> Current image
A is B's backing file and B is the current image's backing file. A backing file
should not be changed: if A is changed or deleted, B is corrupted.
So a delete operation is actually a merge process, like 'virsh blockcommit':
it shortens the disk image chain by live merging the current active disk content.

For external disk snapshots there are revert (which may need to clone the
snapshot out) and delete (in fact a merge, as mentioned above). The complexity
is in the mixed internal/external case (take internal snapshot, revert, take
external snapshot, ...)
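A toy model of "delete is really a merge" for a linear backing chain. Layers
are plain dicts of block -> data, ordered from base to top; the function names
are hypothetical and nothing here models qemu's or vhd-util's real formats:

```python
def read_block(chain, block):
    """Resolve a block by searching from the top image down to the base."""
    for layer in reversed(chain):
        if block in layer:
            return layer[block]
    return None


def commit_down(chain, index):
    """Merge chain[index] into its backing layer and drop it from the chain."""
    if index <= 0 or index >= len(chain):
        raise IndexError("layer %d has no backing layer to merge into" % index)
    new_chain = [dict(layer) for layer in chain]    # leave the input untouched
    new_chain[index - 1].update(new_chain[index])   # overlay data wins
    del new_chain[index]
    return new_chain


# snapshot A -> snapshot B -> current image
chain = [{0: "a0", 1: "a1"}, {1: "b1"}, {0: "cur0"}]
shorter = commit_down(chain, 1)
print(len(shorter), [read_block(shorter, b) for b in (0, 1, 2)])
```

The chain gets shorter, but what the current image reads stays the same, which
is the property a delete of an external snapshot has to preserve. A chain with
several children sharing one backing file would need more care than this.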

Chunyan

>
> Ian. 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC V9 2/4] domain snapshot overview
  2015-01-08 12:26           ` Ian Campbell
@ 2015-01-12  7:01             ` Chun Yan Liu
  2015-01-12 13:54               ` Ian Campbell
  0 siblings, 1 reply; 38+ messages in thread
From: Chun Yan Liu @ 2015-01-12  7:01 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 1/8/2015 at 08:26 PM, in message <1420719995.19787.62.camel@citrix.com>, Ian
Campbell <Ian.Campbell@citrix.com> wrote: 
> On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote: 
> >  
> > >>> On 12/19/2014 at 06:25 PM, in message  
> <1418984720.20028.15.camel@citrix.com>, 
> > Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > > On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:  
> > > >   
> > > > >>> On 12/18/2014 at 11:10 PM, in message   
> > > <1418915443.11882.86.camel@citrix.com>,  
> > > > Ian Campbell <Ian.Campbell@citrix.com> wrote:   
> > > > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:   
> > > > > > Changes to V8:   
> > > > > >   * add an overview document, so that one can has a overall look   
> > > > > >     about the whole domain snapshot work, limits, requirements,   
> > > > > >     how to do, etc.   
> > > > > >    
> > > > > > =====================================================================   
> > > > > > Domain snapshot overview   
> > > > >    
> > > > > I don't see a similar section for disk snapshots, are you not   
> > > > > considering those here except as a part of a domain snapshot or is this   
>  
> > > > > an oversight?   
> > > > >    
> > > > > There are three main use cases (that I know of at least) for   
> > > > > snapshotting like behaviour.   
> > > > >    
> > > > > One is as you've mentioned below for "backup", i.e. to preserve the VM   
> > > > > at a certain point in time in order to be able to roll back to it. Is   
> > > > > this the only usecase you are considering?   
> > > >   
> > > > Yes. I didn't take disk snapshot thing into the scope.  
> > > >   
> > > > >    
> > > > > A second use case is to support "gold image" type deployments, i.e.   
> > > > > where you create one baseline single disk image and then clone it   
> > > > > multiple times to deploy lots of guests. I think this is usually a "disk  
>   
> > > > > snapshot" type thing, but maybe it can be implemented as restoring a   
> > > > > gold domain snapshot multiple times (e.g. for start of day performance   
> > > > > reasons).   
> > > >   
> > > > As we initially discussed about the thing, disk snapshot thing can be  
> done  
> > > > be existing tools directly like qemu-img, vhd-util.  
> > >   
> > > I was reading this section as a more generic overview of snapshotting,  
> > > without reference to where/how things might ultimately be implemented.  
> > >   
> > > From a design point of view it would be useful to cover the various use  
> > > cases, even if the solution is that the user implements them using CLI  
> > > tools by hand (xl) or the toolstack does it for them internally  
> > > (libvirt).  
> > >   
> > > This way we can more clearly see the full picture, which allows us to  
> > > validate that we are making the right choices about what goes where.  
> >  
> > OK. I see. I think this user case is more like how to use the snapshot,  
> rather 
> > than how to implement snapshot. Right? 
>  
> Correct, what the user is actually trying to achieve with the 
> functionality. 
>  
> > 'Gold image' or 'Gold domain', the needed work is more like cloning disks. 
>  
> Yes, or resuming multiple times. 

I see. But IMO it doesn't need any change in the snapshot design and
implementation. Even when resuming multiple times, the guests couldn't
share the same image; the image would have to be duplicated for each one.

>  
> > > > > The third case, (which is similar to the first), is taking a disk   
> > > > > snapshot in order to be able to run you usual backup software on the   
> > > > > snapshot (which is now unchanging, which is handy) and then deleting the  
>   
> > > > > disk snapshot (this differs from the first case in which disk is active   
>  
> > > > > after the snapshot, and due to the lack of the memory part).   
> > > >   
> > > > Sorry, I'm still not quite clear about what this user case wants to do.  
> > >   
> > > The user has an active domain which they want to backup, but backup  
> > > software often does not cope well if the data is changing under its  
> > > feet.  
> > >   
> > > So the users wants to take a snapshot of the domains disks while leaving  
> > > the domain running, so they can backup that static version of the disk  
> > > out of band from the VM itself (e.g. by attaching it to a separate  
> > > backup VM).  
> >  
> > Got it. So that's simply disk-only snapshot when domian is active. As you 
> > mentioned below, that needs guest agent to quiesce the disks. But currently 
> > xen hypervisor can't support that, right? 
>  
> I don't think that's relevant right now, let me explain: 
>  
> I think it's important to consider all the use cases for snapshotting, 
> not because I think they need to be implemented now but to make sure 
> that we don't make any design decisions now which would make it 
> *impossible* to implement it in the future (at least without API 
> changes). 
>  
> As a random example, we would want to avoid designing a libxl API where 
> it is impossible to send the quiesce request at the right point for some 
> reason. 
>  
> So we need to consider these use cases now and have the design, but not 
> necessarily the implementation, be able to deal with them, or at least 
> to convince ourselves we most likely aren't tying our hands for future 
> work.

Understood. If this use case is included in the design, then I think
libxl_disk_snapshot_create is not enough and it is better to have
libxl_domain_snapshot_create. In future, if a guest agent is implemented,
the process would be:
if 'disk-only':
   pause domain;
   drain cache data to disk;
   take disk snapshot;
   resume domain;
else:
   save memory;
   take disk snapshot;
   resume domain.

And in the 'xl snapshot-create' snap.cfg, reserve 'memory' and 'memory_path'
for future extension.

Will update. Thanks.

>  
> > Merry Christmas!  
>  
> And (retrospectively) to you! 
>  
> Ian. 
>  
>  
>  


* Re: [RFC V9 2/4] domain snapshot overview
  2015-01-12  7:01             ` Chun Yan Liu
@ 2015-01-12 13:54               ` Ian Campbell
  2015-01-14  3:12                 ` Chun Yan Liu
  0 siblings, 1 reply; 38+ messages in thread
From: Ian Campbell @ 2015-01-12 13:54 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel

On Mon, 2015-01-12 at 00:01 -0700, Chun Yan Liu wrote:
> 
> >>> On 1/8/2015 at 08:26 PM, in message <1420719995.19787.62.camel@citrix.com>, Ian
> Campbell <Ian.Campbell@citrix.com> wrote: 
> > On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote: 
> > >  
> > > >>> On 12/19/2014 at 06:25 PM, in message  
> > <1418984720.20028.15.camel@citrix.com>, 
> > > Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > > > On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:  
> > > > >   
> > > > > >>> On 12/18/2014 at 11:10 PM, in message   
> > > > <1418915443.11882.86.camel@citrix.com>,  
> > > > > Ian Campbell <Ian.Campbell@citrix.com> wrote:   
> > > > > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:   
> > > > > > > Changes to V8:   
> > > > > > >   * add an overview document, so that one can has a overall look   
> > > > > > >     about the whole domain snapshot work, limits, requirements,   
> > > > > > >     how to do, etc.   
> > > > > > >    
> > > > > > > =====================================================================   
> > > > > > > Domain snapshot overview   
> > > > > >    
> > > > > > I don't see a similar section for disk snapshots, are you not   
> > > > > > considering those here except as a part of a domain snapshot or is this   
> >  
> > > > > > an oversight?   
> > > > > >    
> > > > > > There are three main use cases (that I know of at least) for   
> > > > > > snapshotting like behaviour.   
> > > > > >    
> > > > > > One is as you've mentioned below for "backup", i.e. to preserve the VM   
> > > > > > at a certain point in time in order to be able to roll back to it. Is   
> > > > > > this the only usecase you are considering?   
> > > > >   
> > > > > Yes. I didn't take disk snapshot thing into the scope.  
> > > > >   
> > > > > >    
> > > > > > A second use case is to support "gold image" type deployments, i.e.   
> > > > > > where you create one baseline single disk image and then clone it   
> > > > > > multiple times to deploy lots of guests. I think this is usually a "disk  
> >   
> > > > > > snapshot" type thing, but maybe it can be implemented as restoring a   
> > > > > > gold domain snapshot multiple times (e.g. for start of day performance   
> > > > > > reasons).   
> > > > >   
> > > > > As we initially discussed about the thing, disk snapshot thing can be  
> > done  
> > > > > be existing tools directly like qemu-img, vhd-util.  
> > > >   
> > > > I was reading this section as a more generic overview of snapshotting,  
> > > > without reference to where/how things might ultimately be implemented.  
> > > >   
> > > > From a design point of view it would be useful to cover the various use  
> > > > cases, even if the solution is that the user implements them using CLI  
> > > > tools by hand (xl) or the toolstack does it for them internally  
> > > > (libvirt).  
> > > >   
> > > > This way we can more clearly see the full picture, which allows us to  
> > > > validate that we are making the right choices about what goes where.  
> > >  
> > > OK. I see. I think this user case is more like how to use the snapshot,  
> > rather 
> > > than how to implement snapshot. Right? 
> >  
> > Correct, what the user is actually trying to achieve with the 
> > functionality. 
> >  
> > > 'Gold image' or 'Gold domain', the needed work is more like cloning disks. 
> >  
> > Yes, or resuming multiple times. 
> 
> I see. But IMO it doesn't need change in snapshot design and implementation.
> Even resuming multiple times, they couldn't use the same image but duplicate
> the image multiple times.

Perhaps, but the use case should be included so that this rationale for
not worrying about it can be written down (so that people like me don't
keep asking...) 

> 
> >  
> > > > > > The third case, (which is similar to the first), is taking a disk   
> > > > > > snapshot in order to be able to run you usual backup software on the   
> > > > > > snapshot (which is now unchanging, which is handy) and then deleting the  
> >   
> > > > > > disk snapshot (this differs from the first case in which disk is active   
> >  
> > > > > > after the snapshot, and due to the lack of the memory part).   
> > > > >   
> > > > > Sorry, I'm still not quite clear about what this user case wants to do.  
> > > >   
> > > > The user has an active domain which they want to backup, but backup  
> > > > software often does not cope well if the data is changing under its  
> > > > feet.  
> > > >   
> > > > So the users wants to take a snapshot of the domains disks while leaving  
> > > > the domain running, so they can backup that static version of the disk  
> > > > out of band from the VM itself (e.g. by attaching it to a separate  
> > > > backup VM).  
> > >  
> > > Got it. So that's simply disk-only snapshot when domian is active. As you 
> > > mentioned below, that needs guest agent to quiesce the disks. But currently 
> > > xen hypervisor can't support that, right? 
> >  
> > I don't think that's relevant right now, let me explain: 
> >  
> > I think it's important to consider all the use cases for snapshotting, 
> > not because I think they need to be implemented now but to make sure 
> > that we don't make any design decisions now which would make it 
> > *impossible* to implement it in the future (at least without API 
> > changes). 
> >  
> > As a random example, we would want to avoid designing a libxl API where 
> > it is impossible to send the quiesce request at the right point for some 
> > reason. 
> >  
> > So we need to consider these use cases now and have the design, but not 
> > necessarily the implementation, be able to deal with them, or at least 
> > to convince ourselves we most likely aren't tying our hands for future 
> > work.
> 
> Understand. If this user case is included in design, then I think
> libxl_disk_snapshot_create is not enough, better to have
> libxl_domain_snapshot_create, in future if guest agent is implemented,
> the process would be:
> if 'disk-only':
>    pause domain;
>    drain cache data to disk;
>    take disk snapshot;
>    resume domain;
> else:
>    save memory;
>    take disk snapshot;
>    resume domain.

You don't mention the poking of the agent here, I think it comes before
the if, or maybe just after in the disk-only case (it can't come after
the pause)?

It might be that a mechanism for quiescing all disks + pausing would be
sufficient, meaning you could leave libxl_disk_snapshot_create as it is
and implement the snapshot for backup similarly as quiesce+pause +
libxl_disk_snapshot_create.
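
As a rough illustration of that ordering (quiesce while the guest is
still running, then pause, then snapshot, then undo both), here is a
Python sketch. Everything in it is hypothetical: the agent and domain
objects, their quiesce/thaw/pause/unpause methods and the
disk_snapshot_create callable are stand-ins, not libxl or guest agent
APIs.

```python
from contextlib import contextmanager


@contextmanager
def quiesced_and_paused(agent, domain):
    """Quiesce guest disks, then pause; undo both in reverse order on exit."""
    agent.quiesce()  # needs a running guest, so it cannot come after the pause
    try:
        domain.pause()
        try:
            yield
        finally:
            domain.unpause()
    finally:
        agent.thaw()  # the guest is running again by the time this is sent


def snapshot_for_backup(agent, domain, disk_snapshot_create):
    """quiesce + pause + disk snapshot, leaving the snapshot call unchanged."""
    with quiesced_and_paused(agent, domain):
        disk_snapshot_create()


class StubAgent:
    """Records calls instead of talking to a real guest agent."""

    def __init__(self, events):
        self.events = events

    def quiesce(self):
        self.events.append("quiesce")

    def thaw(self):
        self.events.append("thaw")


class StubDomain:
    """Records calls instead of pausing a real domain."""

    def __init__(self, events):
        self.events = events

    def pause(self):
        self.events.append("pause")

    def unpause(self):
        self.events.append("unpause")


events = []
snapshot_for_backup(StubAgent(events), StubDomain(events),
                    lambda: events.append("disk snapshot"))
assert events == ["quiesce", "pause", "disk snapshot", "unpause", "thaw"]
```

The point of the sketch is only that the disk snapshot call itself does
not need to know about quiescing if the caller can wrap it like this.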

Either way by considering this usecase now we can decide whether
libxl_disk_snapshot_create is sufficient or whether we should go with
libxl_domain_snapshot_create from day one.

Ian.


* Re: [RFC V9 2/4] domain snapshot overview
  2015-01-12 13:54               ` Ian Campbell
@ 2015-01-14  3:12                 ` Chun Yan Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Chun Yan Liu @ 2015-01-14  3:12 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Jim Fehlig, wei.liu2, xen-devel



>>> On 1/12/2015 at 09:54 PM, in message <1421070890.26317.69.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Mon, 2015-01-12 at 00:01 -0700, Chun Yan Liu wrote: 
> >  
> > >>> On 1/8/2015 at 08:26 PM, in message <1420719995.19787.62.camel@citrix.com>,  
> Ian 
> > Campbell <Ian.Campbell@citrix.com> wrote:  
> > > On Mon, 2014-12-22 at 20:42 -0700, Chun Yan Liu wrote:  
> > > >   
> > > > >>> On 12/19/2014 at 06:25 PM, in message   
> > > <1418984720.20028.15.camel@citrix.com>,  
> > > > Ian Campbell <Ian.Campbell@citrix.com> wrote:   
> > > > > On Thu, 2014-12-18 at 22:45 -0700, Chun Yan Liu wrote:   
> > > > > >    
> > > > > > >>> On 12/18/2014 at 11:10 PM, in message    
> > > > > <1418915443.11882.86.camel@citrix.com>,   
> > > > > > Ian Campbell <Ian.Campbell@citrix.com> wrote:    
> > > > > > > On Tue, 2014-12-16 at 14:32 +0800, Chunyan Liu wrote:    
> > > > > > > > Changes to V8:    
> > > > > > > >   * add an overview document, so that one can has a overall look    
> > > > > > > >     about the whole domain snapshot work, limits, requirements,    
> > > > > > > >     how to do, etc.    
> > > > > > > >     
> > > > > > > > =====================================================================  
>    
> > > > > > > > Domain snapshot overview    
> > > > > > >     
> > > > > > > I don't see a similar section for disk snapshots, are you not    
> > > > > > > considering those here except as a part of a domain snapshot or is  
> this    
> > >   
> > > > > > > an oversight?    
> > > > > > >     
> > > > > > > There are three main use cases (that I know of at least) for    
> > > > > > > snapshotting like behaviour.    
> > > > > > >     
> > > > > > > One is as you've mentioned below for "backup", i.e. to preserve the VM  
>    
> > > > > > > at a certain point in time in order to be able to roll back to it. Is   
>   
> > > > > > > this the only usecase you are considering?    
> > > > > >    
> > > > > > Yes. I didn't take disk snapshot thing into the scope.   
> > > > > >    
> > > > > > >     
> > > > > > > A second use case is to support "gold image" type deployments, i.e.    
> > > > > > > where you create one baseline single disk image and then clone it    
> > > > > > > multiple times to deploy lots of guests. I think this is usually a  
> "disk   
> > >    
> > > > > > > snapshot" type thing, but maybe it can be implemented as restoring a    
>  
> > > > > > > gold domain snapshot multiple times (e.g. for start of day performance  
>    
> > > > > > > reasons).    
> > > > > >    
> > > > > > As we initially discussed about the thing, disk snapshot thing can be   
> > > done   
> > > > > > be existing tools directly like qemu-img, vhd-util.   
> > > > >    
> > > > > I was reading this section as a more generic overview of snapshotting,   
> > > > > without reference to where/how things might ultimately be implemented.   
> > > > >    
> > > > > From a design point of view it would be useful to cover the various use   
>  
> > > > > cases, even if the solution is that the user implements them using CLI   
> > > > > tools by hand (xl) or the toolstack does it for them internally   
> > > > > (libvirt).   
> > > > >    
> > > > > This way we can more clearly see the full picture, which allows us to   
> > > > > validate that we are making the right choices about what goes where.   
> > > >   
> > > > OK. I see. I think this user case is more like how to use the snapshot,   
> > > rather  
> > > > than how to implement snapshot. Right?  
> > >   
> > > Correct, what the user is actually trying to achieve with the  
> > > functionality.  
> > >   
> > > > 'Gold image' or 'Gold domain', the needed work is more like cloning  
> disks.  
> > >   
> > > Yes, or resuming multiple times.  
> >  
> > I see. But IMO it doesn't need change in snapshot design and  
> implementation. 
> > Even resuming multiple times, they couldn't use the same image but  
> duplicate 
> > the image multiple times. 
>  
> Perhaps, but the use case should be included so that this rationale for 
> not worrying about it can be written down (so that people like me don't 
> keep asking...)  

Got it. Thanks!

>  
> >  
> > >   
> > > > > > > The third case, (which is similar to the first), is taking a disk    
> > > > > > > snapshot in order to be able to run you usual backup software on the    
>  
> > > > > > > snapshot (which is now unchanging, which is handy) and then deleting  
> the   
> > >    
> > > > > > > disk snapshot (this differs from the first case in which disk is  
> active    
> > >   
> > > > > > > after the snapshot, and due to the lack of the memory part).    
> > > > > >    
> > > > > > Sorry, I'm still not quite clear about what this user case wants to do.  
>   
> > > > >    
> > > > > The user has an active domain which they want to backup, but backup   
> > > > > software often does not cope well if the data is changing under its   
> > > > > feet.   
> > > > >    
> > > > > So the users wants to take a snapshot of the domains disks while leaving  
>   
> > > > > the domain running, so they can backup that static version of the disk   
> > > > > out of band from the VM itself (e.g. by attaching it to a separate   
> > > > > backup VM).   
> > > >   
> > > > Got it. So that's simply disk-only snapshot when domian is active. As you  
>  
> > > > mentioned below, that needs guest agent to quiesce the disks. But  
> currently  
> > > > xen hypervisor can't support that, right?  
> > >   
> > > I don't think that's relevant right now, let me explain:  
> > >   
> > > I think it's important to consider all the use cases for snapshotting,  
> > > not because I think they need to be implemented now but to make sure  
> > > that we don't make any design decisions now which would make it  
> > > *impossible* to implement it in the future (at least without API  
> > > changes).  
> > >   
> > > As a random example, we would want to avoid designing a libxl API where  
> > > it is impossible to send the quiesce request at the right point for some  
> > > reason.  
> > >   
> > > So we need to consider these use cases now and have the design, but not  
> > > necessarily the implementation, be able to deal with them, or at least  
> > > to convince ourselves we most likely aren't tying our hands for future  
> > > work. 
> >  
> > Understand. If this user case is included in design, then I think 
> > libxl_disk_snapshot_create is not enough, better to have 
> > libxl_domain_snapshot_create, in future if guest agent is implemented, 
> > the process would be: 
> > if 'disk-only': 
> >    pause domain; 
> >    drain cache data to disk; 
> >    take disk snapshot; 
> >    resume domain; 
> > else: 
> >    save memory; 
> >    take disk snapshot; 
> >    resume domain. 
>  
> You don't mention the poking of the agent here, I think it comes before 
> the if, or maybe just after in the disk-only case (it can't come after 
> the pause)? 
>  
> It might be that a mechanism for quiescing all disks + pausing would be 
> sufficient, meaning you could leave libxl_disk_snapshot_create as it is 
> and implement the snapshot for backup similarly as quiesce+pause + 
> libxl_disk_snapshot create. 

Yeah, that's sufficient, if a mechanism for quiescing disks could be
provided. Then we can keep libxl_disk_snapshot_create.

Will update all docs.

Thanks a lot!

- Chunyan

>  
> Either way by considering this usecase now we can decide whether 
> libxl_disk_snapshot_create is sufficient or whether we should go with 
> lixbxl_domain_snapshot_create from day one. 
>  
> Ian. 
>  
>  
>  


end of thread, other threads:[~2015-01-14  3:12 UTC | newest]

Thread overview: 38+ messages
2014-12-16  6:32 [RFC V9 0/4] domain snapshot document Chunyan Liu
2014-12-16  6:32 ` [RFC V9 1/4] domain snapshot terms Chunyan Liu
2014-12-18 15:05   ` Ian Campbell
2014-12-19  2:46     ` Chun Yan Liu
2014-12-16  6:32 ` [RFC V9 2/4] domain snapshot overview Chunyan Liu
2014-12-17 12:17   ` Wei Liu
2014-12-18  3:34     ` Chun Yan Liu
2014-12-18 10:57       ` Wei Liu
2014-12-18 15:10   ` Ian Campbell
2014-12-19  5:45     ` Chun Yan Liu
2014-12-19 10:25       ` Ian Campbell
2014-12-23  3:42         ` Chun Yan Liu
2015-01-08 12:26           ` Ian Campbell
2015-01-12  7:01             ` Chun Yan Liu
2015-01-12 13:54               ` Ian Campbell
2015-01-14  3:12                 ` Chun Yan Liu
2014-12-16  6:32 ` [RFC V9 3/4] domain snapshot design: xl Chunyan Liu
2014-12-17 12:28   ` Wei Liu
2014-12-18  3:23     ` Chun Yan Liu
2014-12-18 11:02       ` Wei Liu
2014-12-18 15:15   ` Ian Campbell
2014-12-19  7:03     ` Chun Yan Liu
2014-12-19 10:27       ` Ian Campbell
2014-12-22  8:52         ` Chun Yan Liu
2015-01-08 11:59           ` Ian Campbell
2014-12-16  6:32 ` [RFC V9 4/4] domain snapshot design: libxl/libxlu Chunyan Liu
2014-12-17 14:09   ` Wei Liu
2014-12-18  3:01     ` Chun Yan Liu
2014-12-18 15:27   ` Ian Campbell
2014-12-19  6:58     ` Chun Yan Liu
2014-12-19 10:38       ` Ian Campbell
2014-12-22  9:36         ` Chun Yan Liu
2014-12-22  9:36         ` [Qemu-devel] " Chun Yan Liu
2015-01-08 12:11           ` Ian Campbell
2015-01-09  9:59             ` Chun Yan Liu
2015-01-09  9:59               ` Chun Yan Liu
2015-01-08 12:11           ` Ian Campbell
2014-12-22  9:53         ` Chun Yan Liu
