* [RFC V7 0/3] domain snapshot document
@ 2014-10-10  8:48 Chunyan Liu
  2014-10-10  8:48 ` [RFC V7 1/3] libxl domain snapshot introduction Chunyan Liu
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Chunyan Liu @ 2014-10-10  8:48 UTC (permalink / raw)
  To: xen-devel; +Cc: jfehlig, ian.campbell, Chunyan Liu

This version follows the discussion with Ian and redefines the work
of libxl and xl, making it much clearer.

Changes to V6:
  - add simple introduction
  - modify libxl API design according to Ian's comment
  - add xl interface design so that one can get a full picture of
    the work.

  V6:
  http://lists.xen.org/archives/html/xen-devel/2014-09/msg00862.html


* [RFC V7 1/3] libxl domain snapshot introduction
  2014-10-10  8:48 [RFC V7 0/3] domain snapshot document Chunyan Liu
@ 2014-10-10  8:48 ` Chunyan Liu
  2014-10-20 15:59   ` Ian Campbell
  2014-10-10  8:48 ` [RFC V7 2/3] libxl domain snapshot API design Chunyan Liu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Chunyan Liu @ 2014-10-10  8:48 UTC (permalink / raw)
  To: xen-devel; +Cc: jfehlig, ian.campbell, Chunyan Liu

A domain snapshot includes disk snapshots and saved domain state. The domain
can later be resumed in exactly the state it was in when the snapshot was
created. This kind of snapshot is also referred to as a domain checkpoint or
system checkpoint.

A disk snapshot is only crash-consistent if the domain is running. In libxl,
even when the domain is paused, no data is flushed to disk. So, for an
active domain (one that has been started), taking a disk-only snapshot and
then resuming is as if the guest had crashed. For this reason, we only
support disk-only snapshots when the domain is inactive (for example, libvirt
has defined the domain but not started it).

A disk snapshot can be "internal" (as in the qcow2 format, where snapshot
and delta are both in one image file) or "external" (snapshot in one file,
delta in another).

Four types of operations are expected:

"domain snapshot create":
    save the domain state (if not disk-only) and take disk snapshots.

"domain snapshot revert":
    roll back to the state of the indicated snapshot.

"domain snapshot delete":
    delete the indicated domain snapshot.

"domain snapshot list":
    list domain snapshot information.


* [RFC V7 2/3] libxl domain snapshot API design
  2014-10-10  8:48 [RFC V7 0/3] domain snapshot document Chunyan Liu
  2014-10-10  8:48 ` [RFC V7 1/3] libxl domain snapshot introduction Chunyan Liu
@ 2014-10-10  8:48 ` Chunyan Liu
  2014-10-20 16:11   ` Ian Campbell
  2014-10-10  8:48 ` [RFC V7 3/3] xl snapshot-xxx Design Chunyan Liu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Chunyan Liu @ 2014-10-10  8:48 UTC (permalink / raw)
  To: xen-devel; +Cc: jfehlig, ian.campbell, Chunyan Liu

changes to V6:
  * redefine libxl and xl work separately
  * modified according to Ian's suggestion

===========================================================================
Libxl Domain Snapshot API

1. New Structures

libxl_disk_snapshot = Struct("disk_snapshot",[
    # target disk
    ("disk",            libxl_device_disk),

    # disk snapshot name
    ("name",            string),

    # internal/external disk snapshot?
    ("external",        bool),

    # for external disk snapshot, specify the following two fields
    ("external_format", string),
    ("external_path",   string),
    ])

libxl_domain_snapshot_args = Struct("domain_snapshot_args",[
    # save memory or not. "false" means disk-only snapshot
    ("memory",        bool),

    # memory state path when snapshot is external
    ("memory_path",   string),

    # array to store disk snapshot info
    ("disks",         Array(libxl_disk_snapshot, "num_disks")),
    ])

2. New Functions

int libxl_domain_snapshot_create(libxl_ctx *ctx, int domid,
                                 libxl_domain_snapshot_args *snapshot,
                                 bool live);

    Creates a new snapshot of a domain based on the snapshot config contained
    in @snapshot. Save domain and do disk snapshot.

    ctx (INPUT): context
    domid (INPUT):  domain id
    snapshot (INPUT): configuration of domain snapshot
    live (INPUT):   live snapshot or not
    Returns: 0 on success, -1 on failure

    ctx:
       context.

    domid:
       If the domain is active, this is the domid of the domain.
       If the domain is inactive, set domid=-1. In that case only a disk-only
       snapshot can be taken, and libxl_domain_snapshot_args:memory should be
       'false'.

    live:
       true or false.
       When live is 'true', the domain is not paused while the snapshot is
       created, as in live migration. This increases the size of the memory
       dump file but reduces guest downtime. This flag is only supported for
       external checkpoints.

    snapshot:
       memory:
           true or false.
           'false' means disk-only, won't save memory state.
           'true' means saving memory state. Memory would be saved in
           'memory_path'.
       memory_path:
           path to save memory file. NULL when 'memory' is false.
       num_disks:
           number of disks to snapshot.
       disks:
           array of disk snapshot configuration. Has num_disks members.
           libxl_device_disk:
               structure identifying the disk.
           name:
               snapshot name.
           external:
               true or false.
               'false' means internal disk snapshot. external_format and
               external_path will be ignored.
               'true' means external disk snapshot, then external_format and
               external_path should be provided.
           external_format:
               should be provided when 'external' is true. If not provided,
               the default 'qcow2' is used.
               ignored when 'external' is false.
           external_path:
               must be provided when 'external' is true.
               ignored when 'external' is false.
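
As a rough illustration of the argument rules above, the checks can be
modelled in Python. This is only a sketch and not part of the proposed C
API: DiskSnapshot, DomainSnapshotArgs, check_args and effective_format are
hypothetical names, and the disk is represented by a plain string instead of
libxl_device_disk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiskSnapshot:
    disk: str                            # stand-in for libxl_device_disk
    name: str                            # disk snapshot name
    external: bool = False
    external_format: Optional[str] = None
    external_path: Optional[str] = None


@dataclass
class DomainSnapshotArgs:
    memory: bool = False                 # False means disk-only snapshot
    memory_path: Optional[str] = None
    disks: List[DiskSnapshot] = field(default_factory=list)


def effective_format(d: DiskSnapshot) -> Optional[str]:
    """Format used for an external snapshot; 'qcow2' is the stated default."""
    if not d.external:
        return None                      # ignored for internal snapshots
    return d.external_format or "qcow2"


def check_args(domid: int, args: DomainSnapshotArgs) -> List[str]:
    """Return the list of rule violations; an empty list means valid."""
    errors = []
    if domid == -1 and args.memory:
        errors.append("inactive domain: only a disk-only snapshot is possible")
    if args.memory and not args.memory_path:
        errors.append("memory=true requires memory_path")
    for d in args.disks:
        if d.external and not d.external_path:
            errors.append(f"{d.name}: external snapshot requires external_path")
    return errors
```

Returning a list of violations instead of failing on the first one is a
choice made for the sketch; the proposed API only reports 0 or -1.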


int libxl_domain_snapshot_delete(libxl_ctx *ctx, int domid,
                                 libxl_domain_snapshot_args *snapshot);

    Delete a snapshot.
    This will delete the related domain and related disk snapshots.

    ctx (INPUT): context
    domid (INPUT): domain id
    snapshot (INPUT): domain snapshot related info
    Returns: 0 on success, -1 on error.

    Each input is as described for libxl_domain_snapshot_create.

int libxl_domain_snapshot_revert(libxl_ctx *ctx, int domid,
                                 libxl_domain_snapshot_args *snapshot);

    Revert the domain to a given snapshot.

    Normally, the domain will revert to the same state the domain was in while
    the snapshot was taken (whether inactive, running, or paused).

    ctx (INPUT): context
    domid (INPUT): domain id
    snapshot (INPUT): snapshot
    Returns: 0 on success, -1 on error.

    Each input is as described for libxl_domain_snapshot_create.

3. Function Implementation

   libxl_domain_snapshot_create:
       1). validate args
       2). if it is not disk-only, save domain memory through save-domain
       3). take disk snapshot by qmp command (if domain is active) or qemu-img
           command (if domain is inactive).

   libxl_domain_snapshot_delete:
       1). validate args
       2). remove memory state file if it's not disk-only.
       3). delete disk snapshot. (for internal disk snapshot, through qmp
           command or qemu-img command)

   libxl_domain_snapshot_revert:
       This may require changes to current libxl code. Could be (?):
       1). pause domain
       2). reload memory
       3). apply disk snapshot.
       4). restore domain config file
       5). resume.
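
The create path above can be sketched as a function that only produces the
ordered list of steps, without running anything. This is a Python model under
stated assumptions: create_plan is a hypothetical name, only internal disk
snapshots are covered, and the QMP step is left as a description because this
document does not name the QMP command. The inactive case uses the existing
"qemu-img snapshot -c" form.

```python
from typing import List, Tuple


def create_plan(active: bool, memory: bool,
                disks: List[Tuple[str, str]]) -> List[str]:
    """Return the ordered steps for a snapshot create.

    disks holds (image_path, snapshot_name) pairs for internal snapshots.
    """
    # 1) argument validation: memory can only be saved for an active domain
    if not active and memory:
        raise ValueError("inactive domain: memory cannot be saved")
    steps = []
    # 2) save domain memory unless this is a disk-only snapshot
    if memory:
        steps.append("save-domain")
    # 3) disk snapshots: QMP for an active domain, qemu-img otherwise
    for image, name in disks:
        if active:
            steps.append(f"qmp: internal snapshot '{name}' of {image}")
        else:
            steps.append(f"qemu-img snapshot -c {name} {image}")
    return steps
```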


* [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-10  8:48 [RFC V7 0/3] domain snapshot document Chunyan Liu
  2014-10-10  8:48 ` [RFC V7 1/3] libxl domain snapshot introduction Chunyan Liu
  2014-10-10  8:48 ` [RFC V7 2/3] libxl domain snapshot API design Chunyan Liu
@ 2014-10-10  8:48 ` Chunyan Liu
  2014-10-20 16:39   ` Ian Campbell
  2014-10-17  6:04 ` [RFC V7 0/3] domain snapshot document Chun Yan Liu
  2014-10-20 16:12 ` Ian Campbell
  4 siblings, 1 reply; 21+ messages in thread
From: Chunyan Liu @ 2014-10-10  8:48 UTC (permalink / raw)
  To: xen-devel; +Cc: jfehlig, ian.campbell, Chunyan Liu

1. xl commandline interface design

xl snapshot-create:
  Create a snapshot (disk and RAM) of a domain.

  SYNOPSIS:
    snapshot-create <domain> [<cfgfile>] [--disk-only] [--reuse-external]
    [--live]

  OPTIONS:
    [--domain] <string>  domain name, id or uuid
    [--cfgfile] <string>  domain snapshot configuration
    --disk-only      capture disk state but not vm state
    --reuse-external  reuse any existing external files
    --live           take a live snapshot

    If option includes --live, then the domain is not paused while creating
    the snapshot, like live migration. This increases the size of the memory
    dump file, but reduces downtime of the guest. This flag is only supported
    for external checkpoints.

    If option includes --disk-only, then the snapshot will be limited to
    the disks, and no VM state will be saved. For an active guest, this is
    not supported.

    If @cfgfile is specified, cfgfile is prioritized.

xl snapshot-delete:
  Delete a snapshot of a domain.

  SYNOPSIS:
    snapshot-delete <domain> <snapshotname> [--children] [--children-only]

    By default, just this snapshot is deleted, and changes from this
    snapshot are automatically merged into children snapshots.

  OPTIONS:
    [--domain] <string>  domain name, id or uuid
    [--snapshotname] <string>  snapshot name
    --children       delete snapshot and all children
    --children-only  delete children but not snapshot

    If option includes --children, then this snapshot and any descendant
    snapshots are deleted.

    If option includes --children-only, only descendant snapshots are
    deleted; this snapshot is not deleted.

xl snapshot-revert:
  Revert domain to status of a snapshot.

  SYNOPSIS:
    snapshot-revert <domain> <snapshotname> [--running] [--paused] [--force]

  OPTIONS:
    [--domain] <string>  domain name, id or uuid
    [--snapshotname] <string>  snapshot name
    --running        after reverting, change state to running
    --paused         after reverting, change state to paused
    --force          try harder on risky reverts

    Normally, the domain will revert to the same state the domain was in while
    the snapshot was taken (whether inactive, running, or paused).

    If option includes --running, the snapshot state is overridden to
    guarantee a running domain after the revert.

    If option includes --paused, a paused domain is guaranteed after
    the revert.
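
The state rules for snapshot-revert can be written down as a small Python
function. This is a sketch only: state_after_revert is a hypothetical name,
and rejecting --running together with --paused is an assumption, since the
text above does not say how that combination is handled.

```python
def state_after_revert(snapshot_state: str,
                       running: bool = False,
                       paused: bool = False) -> str:
    """Domain state after a revert, given the state stored in the snapshot."""
    if running and paused:
        # Assumption: the two overrides are treated as mutually exclusive.
        raise ValueError("--running and --paused cannot be combined")
    if running:
        return "running"
    if paused:
        return "paused"
    # Default: same state as when the snapshot was taken.
    return snapshot_state
```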

xl snapshot-list:
  List snapshots for a domain.

  SYNOPSIS:
    snapshot-list <domain> [--parent] [--disk-only] [--internal] [--external]
    [--tree] [--name]

  OPTIONS:
    --disk-only      filter by disk-only snapshots
    --internal       filter by internal snapshots
    --external       filter by external snapshots
    --tree           list snapshots in a tree
    --parent         add a column showing parent snapshot
    --name           list snapshot names only

2. cfgfile syntax

"xl snapshot-create" supports creating a VM snapshot from a user-provided
configuration file. The configuration file syntax is as follows:

#snapshot name. If user doesn't provide a VM snapshot name, xl will generate
#a name automatically by the creation time.
name=""

#snapshot description. Default is NULL.
description=""

#save memory or not. 1 (true) or 0 (false). Default is 0.
memory=0

#memory location. This field should be filled when memory=1. Default is NULL.
memory_path=""

#disk snapshot information
disks=['sda,1,qcow2,/tmp/sda_snapshot.qcow2','sdb,1,qcow2,/tmp/sdb_snapshot.qcow2']
or
disks=['sda,0','sdb,0']

disk syntax:
'target device, external disk snapshot?, external format, external path'
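
A parser for one entry of the disks list could look like the following
Python sketch. DiskSpec and parse_disk_spec are hypothetical names; the
'qcow2' fallback for an empty format field mirrors the default described for
external_format, and the simple comma split assumes that paths contain no
commas.

```python
from typing import NamedTuple, Optional


class DiskSpec(NamedTuple):
    target: str                      # target device, e.g. 'sda'
    external: bool                   # external disk snapshot?
    external_format: Optional[str]
    external_path: Optional[str]


def parse_disk_spec(spec: str) -> DiskSpec:
    """Parse 'target,external[,format,path]' into a DiskSpec."""
    fields = [f.strip() for f in spec.split(",")]
    if len(fields) < 2 or not fields[0] or fields[1] not in ("0", "1"):
        raise ValueError(f"bad disk spec: {spec!r}")
    target, external = fields[0], fields[1] == "1"
    if not external:
        # Internal snapshot: format and path are ignored.
        return DiskSpec(target, False, None, None)
    if len(fields) != 4 or not fields[3]:
        raise ValueError(f"external snapshot needs format and path: {spec!r}")
    return DiskSpec(target, True, fields[2] or "qcow2", fields[3])
```

For example, parse_disk_spec('sdb,0') yields an internal snapshot entry for
sdb with no format or path.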

3. xl structure to maintain VM snapshot info

libxl_disk_snapshot = Struct("disk_snapshot",[
    # target disk
    ("disk",            libxl_device_disk),

    # disk snapshot name
    ("name",            string),

    # internal/external disk snapshot?
    ("external",        bool),

    # for external disk snapshot, specify the following two fields
    ("external_format", string),
    ("external_path",   string),
    ])

libxl_domain_snapshot_info = Struct("domain_snapshot_info",[
    # snapshot name
    ("name",          string),

    ("create_time",   string),
    ("description",   string),

    # save memory or not. "false" means disk-only snapshot
    ("memory",        bool),

    # memory state path when snapshot is external
    ("memory_path",   string),

    # array to store disk snapshot info
    ("disks",         Array(libxl_disk_snapshot, "num_disks")),

    # parent snapshot name
    ("parent",        string),

    # array to store all children snapshot name
    ("children",      Array(string, "num_children")),
    ])

According to libxl_domain_snapshot_info, a json file will be saved on disk.
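
To make the on-disk form concrete, here is a Python sketch that writes one
snapshot record as JSON and reads it back. The field names follow the struct
above and the file name follows the "snapshotdata-<snapshot_name>.libxl-json"
pattern used in this document; SnapshotInfo, snapshot_file, save_info and
load_info are hypothetical names, and the exact JSON layout is an assumption.

```python
import json
import os
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SnapshotInfo:
    name: str
    create_time: str
    description: str = ""
    memory: bool = False                 # False means disk-only snapshot
    memory_path: Optional[str] = None
    disks: List[dict] = field(default_factory=list)
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)


def snapshot_file(base_dir: str, domain_uuid: str, name: str) -> str:
    """Path of the JSON file holding one snapshot's metadata."""
    return os.path.join(base_dir, domain_uuid,
                        f"snapshotdata-{name}.libxl-json")


def save_info(path: str, info: SnapshotInfo) -> None:
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(asdict(info), f, indent=2)


def load_info(path: str) -> SnapshotInfo:
    with open(path) as f:
        return SnapshotInfo(**json.load(f))
```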

4. xl snapshot-xxx implementation details

"xl snapshot-create"

    1), parse args or domain snapshot configuration file.
    2), fill info in libxl_domain_snapshot_args struct according to
        options or config file.
    3), call libxl_domain_snapshot_create()
    4), fill info in libxl_domain_snapshot_info.
    5), save snapshot info in json file under
        "/var/lib/xen/snapshots/domain_uuid"

"xl snapshot-list"

    1), read all domain snapshot related json files under
        "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in a
        libxl_domain_snapshot_info struct.
    2), display information from those libxl_domain_snapshot_info(s)

"xl snapshot-delete"

    1), read snapshot json file from
        "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\
        .libxl-json", parse the file and fill in libxl_domain_snapshot_info
    2), according to parent/children info in libxl_domain_snapshot_info
        and commandline options, decide which domain snapshots to delete.
        To delete each domain snapshot, fill in
        libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().
    3), refresh parent/children relationship, delete the json files of the
        deleted snapshots.
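
Steps 2) and 3) of snapshot-delete amount to a walk over the parent/children
metadata. The Python sketch below models that with plain dictionaries; the
function names are hypothetical, and treating --children and --children-only
as mutually exclusive is an assumption that the text does not state.

```python
from typing import Dict, List, Optional


def descendants(children: Dict[str, List[str]], name: str) -> List[str]:
    """All snapshots below `name` in the snapshot tree."""
    out, stack = [], list(children.get(name, []))
    while stack:
        cur = stack.pop()
        out.append(cur)
        stack.extend(children.get(cur, []))
    return out


def select_for_delete(children: Dict[str, List[str]], name: str,
                      with_children: bool = False,
                      children_only: bool = False) -> List[str]:
    """Names to delete for the default, --children, --children-only cases."""
    if with_children and children_only:
        # Assumption: the two options cannot be combined.
        raise ValueError("--children and --children-only cannot be combined")
    if children_only:
        return descendants(children, name)
    if with_children:
        return [name] + descendants(children, name)
    return [name]


def reparent(parent: Dict[str, Optional[str]],
             children: Dict[str, List[str]], name: str) -> None:
    """Default case: drop `name` and attach its children to its parent."""
    up = parent.pop(name)
    kids = children.pop(name, [])
    for k in kids:
        parent[k] = up
    if up is not None:
        children[up].remove(name)
        children[up].extend(kids)
```

Each selected name would then be turned into a libxl_domain_snapshot_args
and its json file removed, as described in the steps above.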

"xl snapshot-revert"

    1), read snapshot json file from
        "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\
        .libxl-json", parse the file and fill in libxl_domain_snapshot_info.
    2), fill in libxl_domain_snapshot_args
    3), call libxl_domain_snapshot_revert().


* Re: [RFC V7 0/3] domain snapshot document
  2014-10-10  8:48 [RFC V7 0/3] domain snapshot document Chunyan Liu
                   ` (2 preceding siblings ...)
  2014-10-10  8:48 ` [RFC V7 3/3] xl snapshot-xxx Design Chunyan Liu
@ 2014-10-17  6:04 ` Chun Yan Liu
  2014-10-17  9:50   ` Ian Campbell
  2014-10-20 16:12 ` Ian Campbell
  4 siblings, 1 reply; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-17  6:04 UTC (permalink / raw)
  To: xen-devel, Chun Yan Liu; +Cc: Jim Fehlig, ian.campbell

Hi, Ian,

Could you have a look at the document?
I'm trying to reorganize code based on this document and
Bamvor's original code. Some detail things might be different
in final implementation (document will be updated), but now
I hope the general direction (API design and framework) is
correct.

Thanks,
Chunyan

>>> On 10/10/2014 at 04:48 PM, in message
<1412930928-14309-1-git-send-email-cyliu@suse.com>, Chunyan Liu
<cyliu@suse.com> wrote: 
> This version follows the discussion with Ian, redefines the work 
> of libxl and xl, make it much clearer. 
>  
> Changes to V6: 
>   - add simple introduction 
>   - modify libxl API design according to Ian's comment  
>   - add xl interface desgin so that one can get a full picture of 
>     the work. 
>  
>   V6: 
>   http://lists.xen.org/archives/html/xen-devel/2014-09/msg00862.html 
>  


* Re: [RFC V7 0/3] domain snapshot document
  2014-10-17  6:04 ` [RFC V7 0/3] domain snapshot document Chun Yan Liu
@ 2014-10-17  9:50   ` Ian Campbell
  0 siblings, 0 replies; 21+ messages in thread
From: Ian Campbell @ 2014-10-17  9:50 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: Jim Fehlig, xen-devel

Sorry, I've been at LinuxCon+Linux Plumbers all week so I've not managed
to give this the attention it needs. I'll try and get to it ASAP next
week.

On Fri, 2014-10-17 at 00:04 -0600, Chun Yan Liu wrote:
> Hi, Ian,
> 
> Could you have a look at the document?
> I'm trying to reorganize code based on this document and
> Bamvor's original code. Some detail things might be different
> in final implementation (document will be updated), but now
> I hope the general direction (API design and framework) is
> correct.
> 
> Thanks,
> Chunyan
> 
> >>> On 10/10/2014 at 04:48 PM, in message
> <1412930928-14309-1-git-send-email-cyliu@suse.com>, Chunyan Liu
> <cyliu@suse.com> wrote: 
> > This version follows the discussion with Ian, redefines the work 
> > of libxl and xl, make it much clearer. 
> >  
> > Changes to V6: 
> >   - add simple introduction 
> >   - modify libxl API design according to Ian's comment  
> >   - add xl interface desgin so that one can get a full picture of 
> >     the work. 
> >  
> >   V6: 
> >   http://lists.xen.org/archives/html/xen-devel/2014-09/msg00862.html 
> >  


* Re: [RFC V7 1/3] libxl domain snapshot introduction
  2014-10-10  8:48 ` [RFC V7 1/3] libxl domain snapshot introduction Chunyan Liu
@ 2014-10-20 15:59   ` Ian Campbell
  2014-10-21  3:25     ` Chun Yan Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Campbell @ 2014-10-20 15:59 UTC (permalink / raw)
  To: Chunyan Liu, Ian.Jackson; +Cc: jfehlig, xen-devel

On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote:
> Domain snapshot includes disk snapshots and domain state saving. domain
> could be resumed to the very state when the snapshot was created. This
> kind of snapshot is also referred to as a domain checkpoint or system
> checkpoint.

Are these checkpoints considered consistent?

> 
> Disk snapshot is crash-consistent if the domain is running.

Did you mean inconsistent here?

>  To libxl,
> even domain is paused, there is no data flush to disk operation. So, for
> active domain (domain is started), take a disk-only snapshot and then
> resume, it is as if the guest had crashed. For this reason, we only
> support disk-only snapshot when domain is inactive (like: libvirt defines
> but not start the domain.)
> 
> Disk snapshot itself could be "internal" (like in qcow2 format, snapshot
> and delta are both in one image file), or "external" (snapshot in one file,
> delta in another).
> 
> Expected 4 types of operations:
> 
> "domain snapshot create":
>     means saving domain state (if not disk-only) and doing disk snapshots.
> 
> "domain snapshot revert":
>     means rolling back to the state of indicated snapshot.

This operation is only possible on what you call a "checkpoint" in the
first paragraph. Is that right?

> 
> "domain snapshot delete":
>     delete indicated domain snapshot.
> 
> "domain snapshot list":
>     list domain snapshot information.


* Re: [RFC V7 2/3] libxl domain snapshot API design
  2014-10-10  8:48 ` [RFC V7 2/3] libxl domain snapshot API design Chunyan Liu
@ 2014-10-20 16:11   ` Ian Campbell
  2014-10-21  4:18     ` Chun Yan Liu
  2014-10-22  3:59     ` Chun Yan Liu
  0 siblings, 2 replies; 21+ messages in thread
From: Ian Campbell @ 2014-10-20 16:11 UTC (permalink / raw)
  To: Chunyan Liu, Ian.Jackson; +Cc: jfehlig, xen-devel

On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote:

> int libxl_domain_snapshot_create(libxl_ctx *ctx, int domid,
>                                  libxl_domain_snapshot_args *snapshot,
>                                  bool live)
> 
>     Creates a new snapshot of a domain based on the snapshot config contained
>     in @snapshot. Save domain and do disk snapshot.
> 
>     ctx (INPUT): context
>     domid (INPUT):  domain id
>     snapshot (INPUT): configuration of domain snapshot
>     live (INPUT):   live snapshot or not
>     Returns: 0 on success, -1 on failure
> 
>     ctx:
>        context.
> 
>     domid:
>        If domain is active, this is the domid of the domain.
>        If domain is inactive, set domid=-1. Only disk-only snapshot can be
>        done. libxl_domain_snapshot_args:memory should be 'false'.

I think we discussed last time that if the domain is inactive then libxl
doesn't know anything about it and cannot be expected to snapshot it. In
this case I think the toolstack's (e.g. libvirt's) storage management is
responsible for taking a disk snapshot, libxl is not involved.

>     live:
>        true or false.
>        when live is 'true', domain is not paused while creating the snapshot,
>        like live migration. This increases size of the memory dump file, but
>        reducess downtime of the guest.

>  Only support this flag during external checkpoints.

Why?

Even if valid for the planned implementation I don't think it belongs in
this sort of high level design. There should be an error value
indicating that a live checkpoint is not possible, which is the right
place to encode this behaviour.

>     snapshot:
>        memory:
>            true or false.
>            'false' means disk-only, won't save memory state.
>            'true' means saving memory state. Memory would be saved in
>            'memory_path'.
>        memory_path:
>            path to save memory file. NULL when 'memory' is false.
>        num_disks:
>            number of disks that need to take disk snapshot.
>        disks:
>            array of disk snapshot configuration. Has num_disks members.
>            libxl_device_disk:
>                structure to represent which disk.
>            name:
>                snapshot name.

How is this used? Does it get stored somewhere by libxl?

>            external:
>                true or flase.
>                'false' means internal disk snapshot. external_format and
>                external_path will be ignored.
>                'true' means external disk snapshot, then external_format and
>                external_path should be provided.
>           external_format:
>               should be provided when 'external' is true. If not provided, will
>               use default 'qcow2'.

I think this should say: will use a default appropriate to the disk
backend and format of the underlying disk image in use.

>               ignored when 'external' is false.
>           external_path:
>               must be provided when 'external' is true.
>               ignored when 'external' is false.
> 
> 
> int libxl_domain_snapshot_delete(libxl_ctx *ctx, int domid,
>                                  libxl_domain_snapshot_args *snapshot);
> 
>     Delete a snapshot.
>     This will delete the related domain and related disk snapshots.

I think last time we agreed that this operation could not "delete the
related domain" because it mustn't be active, and therefore libxl
doesn't know about it and that the management of the snapshot storage
was a matter for the toolstack's storage management layer, not libxl.

I think we ended up proposing a scheme where there was an API which the
toolstack could use to tell libxl that a snapshot in an active domain's
snapshot chain was to be changed/has changed, so that it could rescan
and make any necessary adjustments.

I think this is what we were discussing here:
http://lists.xen.org/archives/html/xen-devel/2014-09/msg01541.html

> 
>     ctx (INPUT): context
>     domid (INPUT): domain id
>     snapshot (INPUT): domain snapshot related info
>     Returns: 0 on success, -1 on error.
> 
>     About each input, explanation is the same as libxl_domain_snapshot_create.
> 
> int libxl_domain_snapshot_revert(libxl_ctx *ctx, int domid,
>                                libxl_domain_snapshot_args *snapshot);
> 
>     Revert the domain to a given snapshot.
> 
>     Normally, the domain will revert to the same state the domain was in while
>     the snapshot was taken (whether inactive, running, or paused).

I don't think inactive makes sense in this interface, there should be no
way to create a libxl snapshot of an inactive domain, therefore any
reversion to that state will not involve libxl.

Is this operation any different to destroying the domain and using
libxl_domain_restore to start a new domain based on the snapshot? Is
this operation just a convenience layer over that operation?

> 
>     ctx (INPUT): context
>     domid (INPUT): domain id
>     snapshot (INPUT): snapshot
>     Returns: 0 on success, -1 on error.
> 
>     About each input, explanation is the same as libxl_domain_snapshot_create.
> 
> 3. Function Implementation
> 
>    libxl_domain_snapshot_create:
>        1). check args validation
>        2). if it is not disk-only, save domain memory through save-domain
>        3). take disk snapshot by qmp command (if domian is active) or qemu-img
>            command (if domain is inactive).
> 
>    libxl_domain_snapshot_delete:
>        1). check args validation
>        2). remove memory state file if it's not disk-only.
>        3). delete disk snapshot. (for internal disk snapshot, through qmp
>            command or qemu-img command)
> 
>    libxl_domain_snapshot_revert:
>        This may need to hack current libxl code. Could be (?):
>        1). pause domain
>        2). reload memory
>        3). apply disk snapshot.
>        4). restore domain config file
>        5). resume.


* Re: [RFC V7 0/3] domain snapshot document
  2014-10-10  8:48 [RFC V7 0/3] domain snapshot document Chunyan Liu
                   ` (3 preceding siblings ...)
  2014-10-17  6:04 ` [RFC V7 0/3] domain snapshot document Chun Yan Liu
@ 2014-10-20 16:12 ` Ian Campbell
  4 siblings, 0 replies; 21+ messages in thread
From: Ian Campbell @ 2014-10-20 16:12 UTC (permalink / raw)
  To: Chunyan Liu; +Cc: jfehlig, xen-devel

Please could you also CC the other libxl maintainers on future
iterations, that is Ian Jackson and Wei Liu.

On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote:
> This version follows the discussion with Ian, redefines the work
> of libxl and xl, make it much clearer.
> 
> Changes to V6:
>   - add simple introduction
>   - modify libxl API design according to Ian's comment
>   - add xl interface desgin so that one can get a full picture of
>     the work.
> 
>   V6:
>   http://lists.xen.org/archives/html/xen-devel/2014-09/msg00862.html


* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-10  8:48 ` [RFC V7 3/3] xl snapshot-xxx Design Chunyan Liu
@ 2014-10-20 16:39   ` Ian Campbell
  2014-10-21  5:37     ` Chun Yan Liu
                       ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Ian Campbell @ 2014-10-20 16:39 UTC (permalink / raw)
  To: Chunyan Liu, Ian.Jackson; +Cc: jfehlig, xen-devel

On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote:
> 1. xl commandline interface design
> 
> xl snapshot-create:
>   Create a snapshot (disk and RAM) of a domain.
> 
>   SYNOPSIS:
>     snapshot-create <domain> [<cfgfile>] [--disk-only] [--reuse-external]
>     [--live]
> 
>   OPTIONS:
>     [--domain] <string>  domain name, id or uuid
>     [--cfgfile] <string>  domain snapshot configuration

Is this separate to or the same as the <cfgfile> argument? Is this an
xl.cfg(5) config file or something specific to snapshot cfg?

>     --disk-only      capture disk state but not vm state
>     --reuse-external  reuse any existing external files
>     --live           take a live snapshot
> 
>     If option includes --live, then the domain is not paused while creating
>     the snapshot, like live migration. This increases size of the memory
>     dump file, but reducess downtime of the guest. Only support this flag
>     during external checkpoints.
> 
>     If option includes --disk-only, then the snapshot will be limited to
>     the disks, and no VM state will be saved. For an active guest, this is
>     not supported.
> 
>     If specify @cfgfile, cfgfile is prioritized.

What does "prioritized" mean in this context?

> xl snapshot-delete:
>   Delete a snapshot of a domain.

So what's not clear yet (but I see it is discussed below) is the manner
in which xl is going to manage snapshots.

Typically in the past users have been expected to manage disk and save
images with rm(1) and/or various format specific tools (qemu-img,
vhd-image, lvcreate etc).

I think you are proposing that there should be some path full of
snapshots, is that right? That is adding a lot of complexity to xl which
could potentially be avoided by sticking to the "user takes care of it"
path.

> 3. xl structure to maintain VM snapshot info

These are repeating the libxl ones? They look subtly different. If they
are xl specific then they should be in the xl_foo namespace, and of
course they should incorporate public libxl API structs where necessary.

Having the structs be named libxl_* makes it hard for me to see if you
have gotten the layering right in much of the below. I'll try and point
out the ones I think should be xl_* below, if you really meant libxl_*
then that probably means I disagree with the layering.

> According to libxl_domain_snapshot_info, a json file will be saved on disk.

You mean that libxl_domain_snapshot_info (really
xl_domain_snapshot_info) can be serialised to disk as json, right?

> 
> 4. xl snapshot-xxx implementation details

How do these interact with xl create/destroy/shutdown/save/restore?

e.g. does destroying a domain remove any snapshots?

A bunch of these are going to require some care wrt the possibility of
multiple xl invocations and the possibility of an xl crash.

> "xl snapshot-create"
> 
>     1), parse args or domain snapshot configuration file.
>     2), fill info in libxl_domain_snapshot_args struct according to
>         options or config file.
>     3), call libxl_domain_snapshot_create()
>     4), fill info in libxl_domain_snapshot_info.

xl_domain_snapshot_info?

>     5), save snapshot info in json file under
> "/var/lib/xen/snapshots/domain_uuid"

Do the disk images go here too?

> "xl snapshot-list"
> 
>     1), read all domain snapshot related json file under
>         "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in
>         libxl_domain_snapshot_info struct.

xl_domain_snapshot_info?

>     2), display information from those libxl_domain_snapshot_info(s)
> 
> "xl snapshot-delete"
> 
>     1), read snapshot json file from
>         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\
>         .libxl-json", parse the file and fill in libxl_domain_snapshot_info

.xl-json and xl_domain_snapshot_info, I think?

>     2), according to parent/children info in libxl_domain_snapshot_info

xl_domain_snapshot_info.

>         and commandline options, decide which domain snapshot to be deleted.
>         To delete each domain snapshot, fill in
>         libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().

Depending on the state of the domain, much of this can be done with
unlink and/or calling out to external tools.

>     3), refresh parent/children relationship, delete json file for those
>         already deleted snapshot.
> 
> "xl snapshot-revert"
> 
>     1), read snapshot json file from
>         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\
>         .libxl-json", parse the file and fill in libxl_domain_snapshot_info.

.xl-json and xl_domain_snapshot_info

>     2), fill in libxl_domain_snapshot_args
>     3). call libxl_domain_snapshot_revert().

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 1/3] libxl domain snapshot introduction
  2014-10-20 15:59   ` Ian Campbell
@ 2014-10-21  3:25     ` Chun Yan Liu
  0 siblings, 0 replies; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-21  3:25 UTC (permalink / raw)
  To: Ian Campbell, Wei Liu, Ian.Jackson; +Cc: Jim Fehlig, xen-devel



>>> On 10/20/2014 at 11:59 PM, in message <1413820757.29506.11.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote: 
> > Domain snapshot includes disk snapshots and domain state saving. domain 
> > could be resumed to the very state when the snapshot was created. This 
> > kind of snapshot is also referred to as a domain checkpoint or system 
> > checkpoint. 
>  
> Are these checkpoints considered consistent? 

Right.

>  
> >  
> > Disk snapshot is crash-consistent if the domain is running. 
>  
> Did you mean inconsistent here?

Right.
 
>  
> >  To libxl, 
> > even domain is paused, there is no data flush to disk operation. So, for 
> > active domain (domain is started), take a disk-only snapshot and then 
> > resume, it is as if the guest had crashed. For this reason, we only 
> > support disk-only snapshot when domain is inactive (like: libvirt defines 
> > but not start the domain.) 
> >  
> > Disk snapshot itself could be "internal" (like in qcow2 format, snapshot 
> > and delta are both in one image file), or "external" (snapshot in one file, 
> > delta in another). 
> >  
> > Expected 4 types of operations: 
> >  
> > "domain snapshot create": 
> >     means saving domain state (if not disk-only) and doing disk snapshots. 
> >  
> > "domain snapshot revert": 
> >     means rolling back to the state of indicated snapshot. 
>  
> This operation is only possible on what you call a "checkpoint" in the 
> first paragraph. Is that right?

Right. Basically we won't support inconsistent checkpoints; they are meaningless.

As discussed for the next patch, disk-only snapshots would not be supported in xl,
since an xl domain is either active or doesn't exist at all. They could be supported
in libvirt, since a libvirt domain can be defined but not started. So, to keep it
simple, I agree that we won't support disk-only snapshots in libxl and xl. Let
libvirt call qemu-img to do disk-only snapshots.

>  
> >  
> > "domain snapshot delete": 
> >     delete indicated domain snapshot. 
> >  
> > "domain snapshot list": 
> >     list domain snapshot information. 
>  
>  
>  
>  

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 2/3] libxl domain snapshot API design
  2014-10-20 16:11   ` Ian Campbell
@ 2014-10-21  4:18     ` Chun Yan Liu
  2014-10-22  3:59     ` Chun Yan Liu
  1 sibling, 0 replies; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-21  4:18 UTC (permalink / raw)
  To: Ian Campbell, Ian.Jackson; +Cc: Jim Fehlig, xen-devel



>>> On 10/21/2014 at 12:11 AM, in message <1413821501.29506.13.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote: 
>  
> > int libxl_domain_snapshot_create(libxl_ctx *ctx, int domid, 
> >                                  libxl_domain_snapshot_args *snapshot, 
> >                                  bool live) 
> >  
> >     Creates a new snapshot of a domain based on the snapshot config  
> contained 
> >     in @snapshot. Save domain and do disk snapshot. 
> >  
> >     ctx (INPUT): context 
> >     domid (INPUT):  domain id 
> >     snapshot (INPUT): configuration of domain snapshot 
> >     live (INPUT):   live snapshot or not 
> >     Returns: 0 on success, -1 on failure 
> >  
> >     ctx: 
> >        context. 
> >  
> >     domid: 
> >        If domain is active, this is the domid of the domain. 
> >        If domain is inactive, set domid=-1. Only disk-only snapshot can be 
> >        done. libxl_domain_snapshot_args:memory should be 'false'. 
>  
> I think we discussed last time that if the domain is inactive then libxl 
> doesn't know anything about it and cannot be expected to snapshot it. In 
> this case I think the toolstack's (e.g. libvirt's) storage management is 
> responsible for taking a disk snapshot, libxl is not involved. 

OK. To keep it simple, we won't support disk-only snapshots in libxl and xl.

An xl domain is always an active (started) domain, and a disk-only snapshot of
an active domain can't keep data consistent, so we won't allow that.
Let libvirt call qemu-img to do disk-only snapshots.

>  
> >     live: 
> >        true or false. 
> >        when live is 'true', domain is not paused while creating the  
> snapshot, 
> >        like live migration. This increases size of the memory dump file,  
> but 
> >        reduces downtime of the guest.
>  
> >  Only support this flag during external checkpoints. 
>  
> Why? 
>  
> Even if valid for the planned implementation I don't think it belongs in 
> this sort of high level design. There should be an error value 
> indicating that a live checkpoint is not possible, which is the right 
> place to encode this behaviour. 
>  
> >     snapshot: 
> >        memory: 
> >            true or false. 
> >            'false' means disk-only, won't save memory state. 
> >            'true' means saving memory state. Memory would be saved in 
> >            'memory_path'. 

Since we decided to not support disk-only snapshot in libxl, this 'memory'
parameter is not needed. It's always 'true'.

> >        memory_path: 
> >            path to save memory file. NULL when 'memory' is false.
> >        num_disks: 
> >            number of disks that need to take disk snapshot. 
> >        disks: 
> >            array of disk snapshot configuration. Has num_disks members. 
> >            libxl_device_disk: 
> >                structure to represent which disk. 
> >            name: 
> >                snapshot name. 
>  
> How is this used? Does it get stored somewhere by libxl?

For an internal disk snapshot, the snapshot name will be stored in the disk image.
Libxl won't store anything once the API call returns.

>  
> >            external: 
> >                true or false.
> >                'false' means internal disk snapshot. external_format and 
> >                external_path will be ignored.
> >                'true' means external disk snapshot, then external_format  
> and 
> >                external_path should be provided. 
> >           external_format: 
> >               should be provided when 'external' is true. If not provided,  
> will 
> >               use default 'qcow2'. 
>  
> I think this should say: will use a default appropriate to the disk 
> backend and format of the underlying disk image in use.

Yes, that is a better description for a high-level design. But in the
implementation, following the libvirt qemu driver code, it actually uses 'qcow2'.
An external snapshot treats the original disk image file as a backing file and
creates a new qcow2 file on top of it. Of course we can do it in
different ways.
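
To make the external case concrete, here is a minimal sketch of the offline
qemu-img invocation that matches the description above: the original image
becomes the backing file of a new qcow2 overlay. The helper name
build_external_snapshot_argv and the example paths are hypothetical and not
part of libxl; only the qemu-img "create -f ... -b ... -F ..." options are
existing qemu-img options. For a running domain the same result would be
requested from qemu over qmp instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libxl: fill argv[] with a qemu-img
 * command line that creates a new qcow2 file using the original image
 * as its backing file. Returns the number of arguments written
 * (excluding the NULL terminator), or -1 if argv[] is too small. */
static int build_external_snapshot_argv(const char *backing_path,
                                        const char *backing_format,
                                        const char *overlay_path,
                                        const char **argv, size_t argv_len)
{
    const char *cmd[9];
    const size_t n = sizeof(cmd) / sizeof(cmd[0]);
    size_t i;

    assert(backing_path && backing_format && overlay_path && argv);

    cmd[0] = "qemu-img";
    cmd[1] = "create";
    cmd[2] = "-f";            /* format of the new (overlay) image */
    cmd[3] = "qcow2";
    cmd[4] = "-b";            /* original image becomes the backing file */
    cmd[5] = backing_path;
    cmd[6] = "-F";            /* format of the backing file */
    cmd[7] = backing_format;
    cmd[8] = overlay_path;

    if (argv_len < n + 1)
        return -1;
    for (i = 0; i < n; i++)
        argv[i] = cmd[i];
    argv[n] = NULL;           /* terminator expected by exec-style callers */
    return (int)n;
}
```

A caller would pass the resulting argv to an exec-style function; building
the vector separately keeps the sketch free of any process handling.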
 
>  
> >               ignored when 'external' is false. 
> >           external_path: 
> >               must be provided when 'external' is true. 
> >               ignored when 'external' is false. 
> >  
> >  
> > int libxl_domain_snapshot_delete(libxl_ctx *ctx, int domid, 
> >                                  libxl_domain_snapshot_args *snapshot); 
> >  
> >     Delete a snapshot. 
> >     This will delete the related domain and related disk snapshots. 
>  
> I think last time we agreed that this operation could not "delete the 
> related domain" because it mustn't be active, and therefore libxl 

Sorry, I missed some words here. I mean it deletes the related domain
memory state and the related disk snapshots.

> doesn't know about it and that the management of the snapshot storage 
> was a matter for the toolstack's storage management layer, not libxl. 
>  
> I think we ended up proposing a scheme where there was an API which the 
> toolstack could use to tell libxl that a snapshot in an active domain's 
> snapshot chain was to be changed/has changed, so that it could rescan 
> and make any necessary adjustments. 
>  
> I think this is what we were discussing here: 
> http://lists.xen.org/archives/html/xen-devel/2014-09/msg01541.html 
>  
> >  
> >     ctx (INPUT): context 
> >     domid (INPUT): domain id 
> >     snapshot (INPUT): domain snapshot related info 
> >     Returns: 0 on success, -1 on error. 
> >  
> >     About each input, explanation is the same as  
> libxl_domain_snapshot_create. 
> >  
> > int libxl_domain_snapshot_revert(libxl_ctx *ctx, int domid, 
> >                                libxl_domain_snapshot_args *snapshot); 
> >  
> >     Revert the domain to a given snapshot. 
> >  
> >     Normally, the domain will revert to the same state the domain was in  
> while 
> >     the snapshot was taken (whether inactive, running, or paused). 
>  
> I don't think inactive makes sense in this interface, there should be no 
> way to create a libxl snapshot of an inactive domain, therefore any 
> reversion to that state will not involve libxl. 
>  
> Is this operation any different to destroying the domain and using 
> libxl_domain_restore to start a new domain based on the snapshot? Is 
> this operation just a convenience layer over that operation? 
>  
> >  
> >     ctx (INPUT): context 
> >     domid (INPUT): domain id 
> >     snapshot (INPUT): snapshot 
> >     Returns: 0 on success, -1 on error. 
> >  
> >     About each input, explanation is the same as  
> libxl_domain_snapshot_create. 
> >  
> > 3. Function Implementation 
> >  
> >    libxl_domain_snapshot_create: 
> >        1). check args validation 
> >        2). if it is not disk-only, save domain memory through save-domain 
> >        3). take disk snapshot by qmp command (if domain is active) or
> qemu-img 
> >            command (if domain is inactive). 
> >  
> >    libxl_domain_snapshot_delete: 
> >        1). check args validation 
> >        2). remove memory state file if it's not disk-only. 
> >        3). delete disk snapshot. (for internal disk snapshot, through qmp 
> >            command or qemu-img command) 
> >  
> >    libxl_domain_snapshot_revert: 
> >        This may need to hack current libxl code. Could be (?): 
> >        1). pause domain 
> >        2). reload memory 
> >        3). apply disk snapshot. 
> >        4). restore domain config file 
> >        5). resume. 
>  
>  
>  
>  

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-20 16:39   ` Ian Campbell
@ 2014-10-21  5:37     ` Chun Yan Liu
  2014-10-22  4:10     ` Chun Yan Liu
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-21  5:37 UTC (permalink / raw)
  To: Ian Campbell, Ian.Jackson, Wei Liu; +Cc: Jim Fehlig, xen-devel



>>> On 10/21/2014 at 12:39 AM, in message <1413823182.29506.16.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote: 
> > 1. xl commandline interface design 
> >  
> > xl snapshot-create: 
> >   Create a snapshot (disk and RAM) of a domain. 
> >  
> >   SYNOPSIS: 
> >     snapshot-create <domain> [<cfgfile>] [--disk-only] [--reuse-external] 
> >     [--live] 
> >  
> >   OPTIONS: 
> >     [--domain] <string>  domain name, id or uuid 
> >     [--cfgfile] <string>  domain snapshot configuration 
>  
> Is this separate to or the same as the <cfgfile> argument? 

Same. Will remove this option.
Add an option --name (snapshot name).

> Is this a 
> xl.cfg(5) config file or something specific to snapshot cfg?

specific to snapshot cfg.

>  
> >     --disk-only      capture disk state but not vm state

Will remove this option.
As replied in previous patch, xl won't support disk-only snapshot.

> >     --reuse-external  reuse any existing external files 
> >     --live           take a live snapshot 
> >  
> >     If option includes --live, then the domain is not paused while creating 
> >     the snapshot, like live migration. This increases size of the memory 
> >     dump file, but reduces downtime of the guest. Only support this flag
> >     during external checkpoints. 
> >  
> >     If option includes --disk-only, then the snapshot will be limited to 
> >     the disks, and no VM state will be saved. For an active guest, this is 
> >     not supported. 
> >  
> >     If specify @cfgfile, cfgfile is prioritized. 
>  
> What does "prioritized" mean in this context? 

If a cfgfile is specified together with an option such as '--name snapshotname',
the cfgfile info is used (e.g. the 'name' taken from the config file).
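
As a small illustration of that precedence rule, a sketch might look like
the following. The function name xl_snapshot_pick_name is hypothetical, and
treating an empty string as "not given" is my assumption, not something
decided in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: choose the snapshot name when it may come both
 * from the snapshot config file and from the --name option. The value
 * from the config file wins; an empty string counts as "not given"
 * (an assumption of this sketch). Returns NULL if neither is set. */
static const char *xl_snapshot_pick_name(const char *cfg_name,
                                         const char *opt_name)
{
    if (cfg_name != NULL && cfg_name[0] != '\0')
        return cfg_name;
    if (opt_name != NULL && opt_name[0] != '\0')
        return opt_name;
    return NULL;
}
```

The same pattern would apply to any other field that can be given both on
the command line and in the config file.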

>  
> > xl snapshot-delete: 
> >   Delete a snapshot of a domain. 
>  
> So what's not clear yet (but I see it is discussed below) is the manner 
> in which xl is going to manage snapshots. 
>  
> Typically in the past users have been expected to manage disk and save 
> images with rm(1) and/or various format specific tools (qemu-img, 
> vhd-image, lvcreate etc). 
>  
> I think you are proposing that there should be some path full of 
> snapshots, is that right?


> That is adding a lot of complexity to xl which 
> could potentially be avoided by sticking to the "user takes care of it" 
> path. 
>  
> > 3. xl structure to maintain VM snapshot info 
>  
> These are repeating the libxl ones? They look subtly different. If they
> are xl specific then they should be in the xl_foo namespace, and of
> course they should incorporate public libxl API structs where necessary.

You are right. It's an xl-specific one, not the same as the libxl one. My mistake.
Will change it to the xl_foo namespace.
There is a path under which each snapshot_info is stored as a json file. From that
file, one can get the memory state file location and the disk snapshot info. Libxl
can help delete the memory state file and delete the disk snapshot
(internal or external).
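
For illustration, the location of that json file could be derived as in the
sketch below. The directory and the "snapshotdata-<snapshot_name>" pattern
are the ones proposed in this series, with the ".xl-json" suffix from the
review comments; the function name xl_snapshot_json_path and the macro are
hypothetical, and the sketch does no validation of the name itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Base directory as proposed in this series. */
#define XL_SNAPSHOT_DIR "/var/lib/xen/snapshots"

/* Hypothetical helper: build the path of the per-snapshot json file,
 *   <dir>/<domain_uuid>/snapshotdata-<snapshot_name>.xl-json
 * Returns 0 on success, -1 if the result does not fit into buf. */
static int xl_snapshot_json_path(const char *domain_uuid,
                                 const char *snapshot_name,
                                 char *buf, size_t buflen)
{
    int n;

    assert(domain_uuid && snapshot_name && buf);

    n = snprintf(buf, buflen, "%s/%s/snapshotdata-%s.xl-json",
                 XL_SNAPSHOT_DIR, domain_uuid, snapshot_name);
    if (n < 0 || (size_t)n >= buflen)
        return -1;            /* encoding error or truncated output */
    return 0;
}
```

Keeping the path construction in one place would make it easier to change
the suffix or the directory later without touching each snapshot-xxx command.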

>  
> Having the structs be named libxl_* makes it hard for me to see if you 
> have gotten the layering right in much of the below. I'll try and point 
> out the ones I think should be xl_* below, if you really meant libxl_* 
> then that probably means I disagree with the layering. 
>  
> > According to libxl_domain_snapshot_info, a json file will be saved on disk. 
>  
> You mean that libxl_domain_snapshot_info (really 
> xl_domain_snapshot_info) can be serialised to disk as json, right? 

Right.

>  
> >  
> > 4. xl snapshot-xxx implementation details 
>  
> How do these interact with xl create/destroy/shutdown/save/restore? 
>  
> e.g. does destroying a domain remove any snapshots? 
>  
> A bunch of these are going to require some care wrt the possibility of
> multiple xl invocations and the possibility of an xl crash. 
>  
> > "xl snapshot-create" 
> >  
> >     1), parse args or domain snapshot configuration file. 
> >     2), fill info in libxl_domain_snapshot_args struct according to 
> >         options or config file. 
> >     3), call libxl_domain_snapshot_create() 
> >     4), fill info in libxl_domain_snapshot_info. 
>  
> xl_domain_snapshot_info? 

Right.

>  
> >     5), save snapshot info in json file under 
> > "/var/lib/xen/snapshots/domain_uuid" 
>  
> Do the disk images go here too? 
>  
> > "xl snapshot-list" 
> >  
> >     1), read all domain snapshot related json file under 
> >         "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in 
> >         libxl_domain_snapshot_info struct. 
>  
> xl_domain_snapshot_info?

Right.
 
>  
> >     2), display information from those libxl_domain_snapshot_info(s)
> >
> > "xl snapshot-delete"
> >
> >     1), read snapshot json file from
> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\
> >         .libxl-json", parse the file and fill in libxl_domain_snapshot_info
>
> .xl-json and xl_domain_snapshot_info, I think?

Right.

>
> >     2), according to parent/children info in libxl_domain_snapshot_info
>
> xl_domain_snapshot_info.

Right.

>
> >         and commandline options, decide which domain snapshot to be
> deleted.
> >         To delete each domain snapshot, fill in
> >         libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().
>  
> Depending on the state of the domain, much of this can be done with 
> unlink and/or calling out to external tools. 

Yes, xl or a libvirt application can delete the memory state file and delete the
disk snapshot (e.g. call qemu-img to delete an internal disk snapshot instead
of using a qmp command, or delete an external snapshot directly).
It is just that xl and libvirt would both repeat the same work. So, I propose the
libxl_domain_snapshot_delete API. Keep it or not?

>  
> >     3), refresh parent/children relationship, delete json file for those 
> >         already deleted snapshot. 
> >  
> > "xl snapshot-revert" 
> >  
> >     1), read snapshot json file from 
> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\ 
> >         .libxl-json", parse the file and fill in  
> libxl_domain_snapshot_info. 
>  
> .xl-json and xl_domain_snapshot_info 

Right.

>  
> >     2), fill in libxl_domain_snapshot_args 
> >     3). call libxl_domain_snapshot_revert(). 
>  
>  
>  
>  

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 2/3] libxl domain snapshot API design
  2014-10-20 16:11   ` Ian Campbell
  2014-10-21  4:18     ` Chun Yan Liu
@ 2014-10-22  3:59     ` Chun Yan Liu
  2014-10-29 10:26       ` Ian Campbell
  1 sibling, 1 reply; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-22  3:59 UTC (permalink / raw)
  To: Ian Campbell, Ian.Jackson; +Cc: Jim Fehlig, xen-devel



>>> On 10/21/2014 at 12:11 AM, in message <1413821501.29506.13.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote: 
>  
> > int libxl_domain_snapshot_create(libxl_ctx *ctx, int domid, 
> >                                  libxl_domain_snapshot_args *snapshot, 
> >                                  bool live) 
> >  
> >     Creates a new snapshot of a domain based on the snapshot config  
> contained 
> >     in @snapshot. Save domain and do disk snapshot. 
> >  
> >     ctx (INPUT): context 
> >     domid (INPUT):  domain id 
> >     snapshot (INPUT): configuration of domain snapshot 
> >     live (INPUT):   live snapshot or not 
> >     Returns: 0 on success, -1 on failure 
> >  
> >     ctx: 
> >        context. 
> >  
> >     domid: 
> >        If domain is active, this is the domid of the domain. 
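> >        If domain is inactive, set domid=-1. Only disk-only snapshot can be
> >        done. libxl_domain_snapshot_args:memory should be 'false'.
>
> I think we discussed last time that if the domain is inactive then libxl
> doesn't know anything about it and cannot be expected to snapshot it. In
> this case I think the toolstack's (e.g. libvirt's) storage management is
> responsible for taking a disk snapshot, libxl is not involved.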
>  
> >     live: 
> >        true or false. 
> >        when live is 'true', domain is not paused while creating the  
> snapshot, 
> >        like live migration. This increases size of the memory dump file,  
> but 
> >        reduces downtime of the guest.
>  
> >  Only support this flag during external checkpoints. 
>  
> Why?

This follows the libvirt qemu implementation. I think the reason is time.
With an external snapshot, it only needs to create a qcow2 image that
references the original image as its backing file, then switch to the new
qcow2 image and start the VM again. That is much quicker than taking an internal
disk snapshot. For a live snapshot, one certainly hopes the VM downtime
is short. Well, this is my guess; please point out if there are any different
ideas.
 
>  
> Even if valid for the planned implementation I don't think it belongs in 
> this sort of high level design. There should be an error value 
> indicating that a live checkpoint is not possible, which is the right 
> place to encode this behaviour. 
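
A minimal sketch of such a check, under the rules stated in this draft
(external_path is required when 'external' is true, and live is only
possible when every disk snapshot is external). All names here, including
the error values, are hypothetical placeholders and not existing libxl
definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch only. */
enum snapshot_check {
    SNAPSHOT_CHECK_OK = 0,
    SNAPSHOT_CHECK_ERR_INVAL = -1,            /* bad arguments */
    SNAPSHOT_CHECK_ERR_NO_PATH = -2,          /* external without a path */
    SNAPSHOT_CHECK_ERR_LIVE_UNSUPPORTED = -3  /* live with internal disk */
};

/* Reduced, hypothetical view of one per-disk snapshot request. */
struct disk_snapshot_req {
    bool external;
    const char *external_path;
};

/* Validate a request before any snapshot work is started, so that an
 * impossible live checkpoint is reported as a distinct error value. */
static enum snapshot_check
check_snapshot_request(const struct disk_snapshot_req *disks,
                       int num_disks, bool live)
{
    int i;

    if (num_disks < 0 || (num_disks > 0 && disks == NULL))
        return SNAPSHOT_CHECK_ERR_INVAL;

    for (i = 0; i < num_disks; i++) {
        if (disks[i].external) {
            if (disks[i].external_path == NULL ||
                disks[i].external_path[0] == '\0')
                return SNAPSHOT_CHECK_ERR_NO_PATH;
        } else if (live) {
            return SNAPSHOT_CHECK_ERR_LIVE_UNSUPPORTED;
        }
    }
    return SNAPSHOT_CHECK_OK;
}
```

With a dedicated error value, the restriction stays an implementation detail
that can be lifted later without changing the interface description.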
>  
> >     snapshot: 
> >        memory: 
> >            true or false. 
> >            'false' means disk-only, won't save memory state. 
> >            'true' means saving memory state. Memory would be saved in 
> >            'memory_path'. 
> >        memory_path: 
> >            path to save memory file. NULL when 'memory' is false. 
> >        num_disks: 
> >            number of disks that need to take disk snapshot. 
> >        disks: 
> >            array of disk snapshot configuration. Has num_disks members. 
> >            libxl_device_disk: 
> >                structure to represent which disk. 
> >            name: 
> >                snapshot name. 
>  
> How is this used? Does it get stored somewhere by libxl? 
>  
> >            external: 
> >                true or false.
> >                'false' means internal disk snapshot. external_format and 
> >                external_path will be ignored. 
> >                'true' means external disk snapshot, then external_format  
> and 
> >                external_path should be provided. 
> >           external_format: 
> >               should be provided when 'external' is true. If not provided,  
> will 
> >               use default 'qcow2'. 
>  
> I think this should say: will use a default appropriate to the disk 
> backend and format of the underlying disk image in use. 
>  
> >               ignored when 'external' is false. 
> >           external_path: 
> >               must be provided when 'external' is true. 
> >               ignored when 'external' is false. 
> >  
> >  
> > int libxl_domain_snapshot_delete(libxl_ctx *ctx, int domid, 
> >                                  libxl_domain_snapshot_args *snapshot); 
> >  
> >     Delete a snapshot. 
> >     This will delete the related domain and related disk snapshots. 
>  
> I think last time we agreed that this operation could not "delete the 
> related domain" because it mustn't be active, and therefore libxl 
> doesn't know about it and that the management of the snapshot storage 
> was a matter for the toolstack's storage management layer, not libxl. 
>  
> I think we ended up proposing a scheme where there was an API which the 
> toolstack could use to tell libxl that a snapshot in an active domain's 
> snapshot chain was to be changed/has changed, so that it could rescan 
> and make any necessary adjustments. 
>  
> I think this is what we were discussing here: 
> http://lists.xen.org/archives/html/xen-devel/2014-09/msg01541.html 
>  
> >  
> >     ctx (INPUT): context 
> >     domid (INPUT): domain id 
> >     snapshot (INPUT): domain snapshot related info 
> >     Returns: 0 on success, -1 on error. 
> >  
> >     About each input, explanation is the same as  
> libxl_domain_snapshot_create. 
> >  
> > int libxl_domain_snapshot_revert(libxl_ctx *ctx, int domid,
> >                                libxl_domain_snapshot_args *snapshot); 
> >  
> >     Revert the domain to a given snapshot. 
> >  
> >     Normally, the domain will revert to the same state the domain was in  
> while 
> >     the snapshot was taken (whether inactive, running, or paused). 
>  
> I don't think inactive makes sense in this interface, there should be no 
> way to create a libxl snapshot of an inactive domain, therefore any 
> reversion to that state will not involve libxl. 

One case is in libvirt. It creates a snapshot, then destroys the domain, but
the domain still exists (inactive). In this case, one can still do snapshot-revert.
But maybe we shouldn't include this in libxl, and let libvirt handle this case itself.

>  
> Is this operation any different to destroying the domain and using 
> libxl_domain_restore to start a new domain based on the snapshot? Is 
> this operation just a convenience layer over that operation? 

It depends on the implementation. The simple way is to destroy the domain
first, then start a new domain based on the snapshot. But destroying the
domain may not be good for the user (after xl snapshot-revert, the domid is
changed) and may cause some problems in libvirt (may affect its
event handling?).

Another way is not to destroy the domain, but to go through a process
like: pause domain, reload memory, reload disk snapshot, reload config
file, resume domain. Complex, but maybe better.
In a previous talk with Jim, he personally suggested that it should not destroy
the domain.

>  
> >  
> >     ctx (INPUT): context 
> >     domid (INPUT): domain id 
> >     snapshot (INPUT): snapshot 
> >     Returns: 0 on success, -1 on error. 
> >  
> >     About each input, explanation is the same as  
> libxl_domain_snapshot_create. 
> >  
> > 3. Function Implementation 
> >  
> >    libxl_domain_snapshot_create: 
> >        1). check args validation
> >        2). if it is not disk-only, save domain memory through save-domain 
> >        3). take disk snapshot by qmp command (if domain is active) or
> qemu-img 
> >            command (if domain is inactive). 
> >  
> >    libxl_domain_snapshot_delete: 
> >        1). check args validation 
> >        2). remove memory state file if it's not disk-only. 
> >        3). delete disk snapshot. (for internal disk snapshot, through qmp 
> >            command or qemu-img command) 
> >  
> >    libxl_domain_snapshot_revert: 
> >        This may need to hack current libxl code. Could be (?): 
> >        1). pause domain 
> >        2). reload memory 
> >        3). apply disk snapshot. 
> >        4). restore domain config file 
> >        5). resume. 
>  
>  
>  
>  

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-20 16:39   ` Ian Campbell
  2014-10-21  5:37     ` Chun Yan Liu
@ 2014-10-22  4:10     ` Chun Yan Liu
  2014-10-29  8:32     ` Chun Yan Liu
  2014-10-29  8:34     ` Chun Yan Liu
  3 siblings, 0 replies; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-22  4:10 UTC (permalink / raw)
  To: Ian Campbell, Ian.Jackson; +Cc: Jim Fehlig, xen-devel



>>> On 10/21/2014 at 12:39 AM, in message <1413823182.29506.16.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote: 
> > 1. xl commandline interface design 
> >  
> > xl snapshot-create: 
> >   Create a snapshot (disk and RAM) of a domain. 
> >  
> >   SYNOPSIS: 
> >     snapshot-create <domain> [<cfgfile>] [--disk-only] [--reuse-external] 
> >     [--live] 
> >  
> >   OPTIONS: 
> >     [--domain] <string>  domain name, id or uuid 
> >     [--cfgfile] <string>  domain snapshot configuration 
>  
> Is this separate to or the same as the <cfgfile> argument? Is this a 
> xl.cfg(5) config file or something specific to snapshot cfg? 
>  
> >     --disk-only      capture disk state but not vm state 
> >     --reuse-external  reuse any existing external files 
> >     --live           take a live snapshot 
> >  
> >     If option includes --live, then the domain is not paused while creating 
> >     the snapshot, like live migration. This increases size of the memory 
> >     dump file, but reduces downtime of the guest. Only support this flag
> >     during external checkpoints. 
> >  
> >     If option includes --disk-only, then the snapshot will be limited to 
> >     the disks, and no VM state will be saved. For an active guest, this is 
> >     not supported. 
> >  
> >     If specify @cfgfile, cfgfile is prioritized. 
>  
> What does "prioritized" mean in this context? 
>  
> > xl snapshot-delete: 
> >   Delete a snapshot of a domain. 
>  
> So what's not clear yet (but I see it is discussed below) is the manner 
> in which xl is going to manage snapshots. 
>  
> Typically in the past users have been expected to manage disk and save 
> images with rm(1) and/or various format specific tools (qemu-img, 
> vhd-image, lvcreate etc). 
>  
> I think you are proposing that there should be some path full of 
> snapshots, is that right? That is adding a lot of complexity to xl which 
> could potentially be avoided by sticking to the "user takes care of it" 
> path. 
>  
> > 3. xl structure to maintain VM snapshot info 
>  
> These are repeating the libxl ones? They look subtly different. If they
> are xl specific then they should be in the xl_foo namespace, and of
> course they should incorporate public libxl API structs where necessary.
>  
> Having the structs be named libxl_* makes it hard for me to see if you 
> have gotten the layering right in much of the below. I'll try and point 
> out the ones I think should be xl_* below, if you really meant libxl_* 
> then that probably means I disagree with the layering. 
>  
> > According to libxl_domain_snapshot_info, a json file will be saved on disk. 
>  
> You mean that libxl_domain_snapshot_info (really 
> xl_domain_snapshot_info) can be serialised to disk as json, right? 
>  
> >  
> > 4. xl snapshot-xxx implementation details 
>  
> How do these interact with xl create/destroy/shutdown/save/restore? 
>  
> e.g. does destroying a domain remove any snapshots? 

In xl, a domain is deleted when it is destroyed/shut down/saved; there is no state
like the one previously in xend, where a domain was created but not started and was
deleted only on an explicit 'delete' command.

I haven't thought clearly yet about how to handle snapshots when a domain is
destroyed/shut down/saved/migrated.
In theory, if a domain is deleted, the snapshots should all be deleted.
But I don't know how much value deleting them would bring to users.
Those snapshots are still usable. One can start a new domain from a snapshot.
There is the memory state (can restore memory, restore config) and the disk
snapshot (can restore disk status).

Do you have any suggestions?

>  
> A bunch of these are going to require some care wrt the possibility of
> multiple xl invocations and the possibility of an xl crash.
>  
> > "xl snapshot-create" 
> >  
> >     1), parse args or domain snapshot configuration file. 
> >     2), fill info in libxl_domain_snapshot_args struct according to 
> >         options or config file. 
> >     3), call libxl_domain_snapshot_create() 
> >     4), fill info in libxl_domain_snapshot_info. 
>  
> xl_domain_snapshot_info? 
>  
> >     5), save snapshot info in json file under 
> > "/var/lib/xen/snapshots/domain_uuid" 
>  
> Do the disk images go here too? 
>  
> > "xl snapshot-list" 
> >  
> >     2), display information from those libxl_domain_snapshot_info(s) 
> >  
> > "xl snapshot-delete" 
> >  
> >     1), read snapshot json file from 
> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\ 
> >         .libxl-json", parse the file and fill in libxl_domain_snapshot_info 
>  
> .xl-json and xl_domain_snapshot_info, I think? 
>  
> >     2), according to parent/children info in libxl_domain_snapshot_info 
>  
> xl_domain_snapshot_info. 
>  
> >         and commandline options, decide which domain snapshot to be  
> deleted. 
> >         To delete each domain snapshot, fill in 
> >         libxl_domain_snapshot_args and call libxl_domain_snapshot_delete(). 
>  
> Depending on the state of the domain, much of this can be done with 
> unlink and/or calling out to external tools. 
>  
> >     3), refresh parent/children relationship, delete json file for those 
> >         already deleted snapshot. 
> >  
> > "xl snapshot-revert" 
> >  
> >     1), read snapshot json file from 
> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\ 
> >         .libxl-json", parse the file and fill in  
> libxl_domain_snapshot_info. 
>  
> .xl-json and xl_domain_snapshot_info 
>  
> >     2), fill in libxl_domain_snapshot_args 
> >     3). call libxl_domain_snapshot_revert(). 
>  
>  
>  
>  
> >     1), read all domain snapshot related json file under 
> >         "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in 
> >         libxl_domain_snapshot_info struct. 
>  
> xl_domain_snapshot_info? 
>  
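
For illustration, step 3 of "xl snapshot-delete" above (refresh the parent/children relationship) can be sketched as follows. This is a minimal Python model, not xl code: the dictionary layout with "parent" and "children" keys and the function name delete_snapshot are assumptions made for the example, and the choice to reattach children to the deleted snapshot's parent is one possible policy that the thread does not settle.

```python
def delete_snapshot(tree, name):
    """Remove `name` from the tree and reattach its children to its parent.

    `tree` maps a snapshot name to {"parent": name or None, "children": [names]}.
    This layout is hypothetical, chosen only for the sketch.
    """
    node = tree.pop(name)
    parent = node["parent"]
    # Every child of the deleted snapshot now hangs off the deleted
    # snapshot's own parent (or becomes a root if there was none).
    for child in node["children"]:
        tree[child]["parent"] = parent
    if parent is not None:
        siblings = tree[parent]["children"]
        siblings.remove(name)
        siblings.extend(node["children"])
    return tree


snapshots = {
    "base": {"parent": None, "children": ["mid"]},
    "mid": {"parent": "base", "children": ["leaf1", "leaf2"]},
    "leaf1": {"parent": "mid", "children": []},
    "leaf2": {"parent": "mid", "children": []},
}
delete_snapshot(snapshots, "mid")
```

After the call, "leaf1" and "leaf2" are children of "base". In a real toolstack the json file of each touched snapshot would have to be rewritten as well, which is where the concern about concurrent xl invocations applies.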

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-20 16:39   ` Ian Campbell
  2014-10-21  5:37     ` Chun Yan Liu
  2014-10-22  4:10     ` Chun Yan Liu
@ 2014-10-29  8:32     ` Chun Yan Liu
  2014-10-29 10:19       ` Ian Campbell
  2014-10-29  8:34     ` Chun Yan Liu
  3 siblings, 1 reply; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-29  8:32 UTC (permalink / raw)
  To: Ian Campbell, Ian.Jackson; +Cc: Jim Fehlig, xen-devel



>>> On 10/22/2014 at 12:10 PM, in message <54472E36.F08 : 102 : 21807>, Chun Yan
Liu wrote: 

>  
>>>> On 10/21/2014 at 12:39 AM, in message <1413823182.29506.16.camel@citrix.com>, 
> Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote:  
>> > 1. xl commandline interface design  
>> >   
>> > xl snapshot-create:  
>> >   Create a snapshot (disk and RAM) of a domain.  
>> >   
>> >   SYNOPSIS:  
>> >     snapshot-create <domain> [<cfgfile>] [--disk-only] [--reuse-external]  
>> >     [--live]  
>> >   
>> >   OPTIONS:  
>> >     [--domain] <string>  domain name, id or uuid  
>> >     [--cfgfile] <string>  domain snapshot configuration  
>>   
>> Is this separate to or the same as the <cfgfile> argument? Is this a  
>> xl.cfg(5) config file or something specific to snapshot cfg?  
>>   
>> >     --disk-only      capture disk state but not vm state  
>> >     --reuse-external  reuse any existing external files  
>> >     --live           take a live snapshot  
>> >   
>> >     If option includes --live, then the domain is not paused while creating  
>> >     the snapshot, like live migration. This increases size of the memory  
>> >     dump file, but reduces downtime of the guest. Only support this flag
>> >     during external checkpoints.  
>> >   
>> >     If option includes --disk-only, then the snapshot will be limited to  
>> >     the disks, and no VM state will be saved. For an active guest, this is  
>> >     not supported.  
>> >   
>> >     If specify @cfgfile, cfgfile is prioritized.  
>>   
>> What does "prioritized" mean in this context?  
>>   
>> > xl snapshot-delete:  
>> >   Delete a snapshot of a domain.  
>>   
>> So what's not clear yet (but I see it is discussed below) is the manner  
>> in which xl is going to manage snapshots.  
>>   
>> Typically in the past users have been expected to manage disk and save  
>> images with rm(1) and/or various format specific tools (qemu-img,  
>> vhd-image, lvcreate etc).  
>>   
>> I think you are proposing that there should be some path full of  
>> snapshots, is that right? That is adding a lot of complexity to xl which  
>> could potentially be avoided by sticking to the "user takes care of it"  
>> path.  
>>   
>> > 3. xl structure to maintain VM snapshot info  
>>   
>> These are repeating the libxl ones? The look subtly different. If they  
>> are xl specific then they should be in the xl_foo namespace, and of  
>> course the should incorporate public libxl API structs where necessary.  
>>   
>> Having the structs be named libxl_* makes it hard for me to see if you  
>> have gotten the layering right in much of the below. I'll try and point  
>> out the ones I think should be xl_* below, if you really meant libxl_*  
>> then that probably means I disagree with the layering.  
>>   
>> > According to libxl_domain_snapshot_info, a json file will be saved on disk.  
>>   
>> You mean that libxl_domain_snapshot_info (really  
>> xl_domain_snapshot_info) can be serialised to disk as json, right?  
>>   
>> >   
>> > 4. xl snapshot-xxx implementation details  
>>   
>> How do these interact with xl create/destroy/shutdown/save/restore?  
>>   
>> e.g. does destroying a domain remove any snapshots?  
>  
> Since in xl domain is deleted when destroyed/shutdown/save, no state 
> like previously in xend, created but not started, deleted only when issue 
> 'delete' command. 
>  
> I don't think clearly now about: 
> how to handle snapshots when destroy/shutdown/save/migrate a domain. 
> In theory, if a domain is deleted, the snapshots should all be deleted. 
> But this way, I don't know how much value this operation can bring to users. 
> Those snapshot are still usable. One can start a new domain from a snapshot. 
> There is memory state (can restore memory, restore config) and disk 
> snapshot (can restore disk status). 
>  
> Do you have any suggestions? 

As described above, are there any suggestions?

>  
>>   
>> A bunch of these are going to require some care wrt the possibility of
>> multiple xl invocations and the possibility of an xl crash.
>>   
>> > "xl snapshot-create"  
>> >   
>> >     1), parse args or domain snapshot configuration file.  
>> >     2), fill info in libxl_domain_snapshot_args struct according to  
>> >         options or config file.  
>> >     3), call libxl_domain_snapshot_create()  
>> >     4), fill info in libxl_domain_snapshot_info.  
>>   
>> xl_domain_snapshot_info?  
>>   
>> >     5), save snapshot info in json file under  
>> > "/var/lib/xen/snapshots/domain_uuid"  
>>   
>> Do the disk images go here too?  
>>   
>> > "xl snapshot-list"  
>> >   
>> >     2), display information from those libxl_domain_snapshot_info(s)  
>> >   
>> > "xl snapshot-delete"  
>> >   
>> >     1), read snapshot json file from  
>> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\  
>> >         .libxl-json", parse the file and fill in libxl_domain_snapshot_info  
>>   
>> .xl-json and xl_domain_snapshot_info, I think?  
>>   
>> >     2), according to parent/children info in libxl_domain_snapshot_info  
>>   
>> xl_domain_snapshot_info.  
>>   
>> >         and commandline options, decide which domain snapshot to be   
>> deleted.  
>> >         To delete each domain snapshot, fill in  
>> >         libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().  
>>   
>> Depending on the state of the domain, much of this can be done with  
>> unlink and/or calling out to external tools.  
>>   
>> >     3), refresh parent/children relationship, delete json file for those  
>> >         already deleted snapshot.  
>> >   
>> > "xl snapshot-revert"  
>> >   
>> >     1), read snapshot json file from  
>> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\  
>> >         .libxl-json", parse the file and fill in   
>> libxl_domain_snapshot_info.  
>>   
>> .xl-json and xl_domain_snapshot_info  
>>   
>> >     2), fill in libxl_domain_snapshot_args  
>> >     3). call libxl_domain_snapshot_revert().  
>>   
>>   
>>   
>>   
>> >     1), read all domain snapshot related json file under  
>> >         "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in  
>> >         libxl_domain_snapshot_info struct.  
>>   
>> xl_domain_snapshot_info?  
>>   
>  
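
As a rough illustration of step 5 of "xl snapshot-create" and step 1 of "xl snapshot-list" quoted above, the per-domain json files could be modelled as below. This is a Python sketch under stated assumptions, not xl code: the ".xl-json" suffix follows Ian's suggestion, while the helper names, the "name" and "parent" fields and the write-then-rename step (one possible answer to the xl crash concern) are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def snapshot_json_path(root, domain_uuid, snapshot_name):
    # <root>/<domain_uuid>/snapshotdata-<snapshot_name>.xl-json
    return Path(root) / domain_uuid / f"snapshotdata-{snapshot_name}.xl-json"


def save_snapshot_info(root, domain_uuid, info):
    path = snapshot_json_path(root, domain_uuid, info["name"])
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first, then rename, so a crash in the
    # middle of the write never leaves a truncated .xl-json file behind.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(info, indent=2))
    os.replace(tmp, path)
    return path


def list_snapshot_infos(root, domain_uuid):
    directory = Path(root) / domain_uuid
    files = sorted(directory.glob("snapshotdata-*.xl-json"))
    return [json.loads(f.read_text()) for f in files]


# A temporary directory stands in for /var/lib/xen/snapshots here.
with tempfile.TemporaryDirectory() as root:
    uuid = "00000000-0000-0000-0000-000000000001"
    save_snapshot_info(root, uuid, {"name": "snap1", "parent": None})
    save_snapshot_info(root, uuid, {"name": "snap2", "parent": "snap1"})
    names = [info["name"] for info in list_snapshot_infos(root, uuid)]
```

Keeping one file per snapshot means "xl snapshot-list" only needs a directory scan, at the cost of having to update several files when the parent/children relationship changes.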

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-20 16:39   ` Ian Campbell
                       ` (2 preceding siblings ...)
  2014-10-29  8:32     ` Chun Yan Liu
@ 2014-10-29  8:34     ` Chun Yan Liu
  2014-10-29 10:22       ` Ian Campbell
  3 siblings, 1 reply; 21+ messages in thread
From: Chun Yan Liu @ 2014-10-29  8:34 UTC (permalink / raw)
  To: Ian Campbell, Ian.Jackson, Wei Liu; +Cc: Jim Fehlig, xen-devel



>>> On 10/21/2014 at 01:37 PM, in message <5445F137.5EA : 102 : 21807>, Chun Yan
Liu wrote: 

>  
>>>> On 10/21/2014 at 12:39 AM, in message <1413823182.29506.16.camel@citrix.com>, 
> Ian Campbell <Ian.Campbell@citrix.com> wrote:  
> > On Fri, 2014-10-10 at 16:48 +0800, Chunyan Liu wrote:  
>> > 1. xl commandline interface design  
>> >   
>> > xl snapshot-create:  
>> >   Create a snapshot (disk and RAM) of a domain.  
>> >   
>> >   SYNOPSIS:  
>> >     snapshot-create <domain> [<cfgfile>] [--disk-only] [--reuse-external]  
>> >     [--live]  
>> >   
>> >   OPTIONS:  
>> >     [--domain] <string>  domain name, id or uuid  
>> >     [--cfgfile] <string>  domain snapshot configuration  
>>   
>> Is this separate to or the same as the <cfgfile> argument?  
>  
> Same. Will remove this option. 
> Add an option --name (snapshot name). 
>  
>> Is this a  
>> xl.cfg(5) config file or something specific to snapshot cfg? 
>  
> specific to snapshot cfg. 
>  
>>   
>> >     --disk-only      capture disk state but not vm state 
>  
> Will remove this option. 
> As replied in previous patch, xl won't support disk-only snapshot. 
>  
>> >     --reuse-external  reuse any existing external files  
>> >     --live           take a live snapshot  
>> >   
>> >     If option includes --live, then the domain is not paused while creating  
>> >     the snapshot, like live migration. This increases size of the memory  
>> >     dump file, but reduces downtime of the guest. Only support this flag
>> >     during external checkpoints.  
>> >   
>> >     If option includes --disk-only, then the snapshot will be limited to  
>> >     the disks, and no VM state will be saved. For an active guest, this is  
>> >     not supported.  
>> >   
>> >     If specify @cfgfile, cfgfile is prioritized.  
>>   
>> What does "prioritized" mean in this context?  
>  
> If a cfgfile is specified together with the option '--name snapshotname',
> the cfgfile info is used (e.g. the 'name' taken from the config file).
>  
>>   
>> > xl snapshot-delete:  
>> >   Delete a snapshot of a domain.  
>>   
>> So what's not clear yet (but I see it is discussed below) is the manner  
>> in which xl is going to manage snapshots.  
>>   
>> Typically in the past users have been expected to manage disk and save  
>> images with rm(1) and/or various format specific tools (qemu-img,  
>> vhd-image, lvcreate etc).  
>>   
>> I think you are proposing that there should be some path full of  
>> snapshots, is that right? 
>  
>  
>> That is adding a lot of complexity to xl which  
>> could potentially be avoided by sticking to the "user takes care of it"  
>> path.  
>>   
>> > 3. xl structure to maintain VM snapshot info  
>>   
>> These are repeating the libxl ones? The look subtly different. If they  
>> are xl specific then they should be in the xl_foo namespace, and of  
>> course the should incorporate public libxl API structs where necessary.  
>  
> You are right. It's xl specific one, not the same as libxl one. My mistake. 
> Will change into xl_foo namespace. 
> There is a path to store each snapshot_info in json file. From that file, 
> can get memory state file location and disk snapshot info. Libxl 
> can help delete memory state file and delete disk snapshot 
> (internal or external). 
>  
>>   
>> Having the structs be named libxl_* makes it hard for me to see if you  
>> have gotten the layering right in much of the below. I'll try and point  
>> out the ones I think should be xl_* below, if you really meant libxl_*  
>> then that probably means I disagree with the layering.  
>>   
>> > According to libxl_domain_snapshot_info, a json file will be saved on disk.  
>>   
>> You mean that libxl_domain_snapshot_info (really  
>> xl_domain_snapshot_info) can be serialised to disk as json, right?  
>  
> Right. 
>  
>>   
>> >   
>> > 4. xl snapshot-xxx implementation details  
>>   
>> How do these interact with xl create/destroy/shutdown/save/restore?  
>>   
>> e.g. does destroying a domain remove any snapshots?  
>>   
>> A bunch of these are going to require some care wrt the possibility of
>> multiple xl invocations and the possibility of an xl crash.
>>   
>> > "xl snapshot-create"  
>> >   
>> >     1), parse args or domain snapshot configuration file.  
>> >     2), fill info in libxl_domain_snapshot_args struct according to  
>> >         options or config file.  
>> >     3), call libxl_domain_snapshot_create()  
>> >     4), fill info in libxl_domain_snapshot_info.  
>>   
>> xl_domain_snapshot_info?  
>  
> Right. 
>  
>>   
>> >     5), save snapshot info in json file under  
>> > "/var/lib/xen/snapshots/domain_uuid"  
>>   
>> Do the disk images go here too?  
>>   
>> > "xl snapshot-list"  
>> >   
>> >     1), read all domain snapshot related json file under  
>> >         "/var/lib/xen/snapshots/domain_uuid". Parse each file and fill in  
>> >         libxl_domain_snapshot_info struct.  
>>   
>> xl_domain_snapshot_info? 
>  
> Right. 
>   
>>   
>> >     2), display information from those libxl_domain_snapshot_info(s)  
>> .xl-json and xl_domain_snapshot_info, I think?  
>  
> Right. 
>  
>>   
>> >     2), according to parent/children info in libxl_domain_snapshot_info  
>>   
>> xl_domain_snapshot_info. 
>  
> Right. 
>  
>> 
>> >   
>> > "xl snapshot-delete"  
>> >   
>> >     1), read snapshot json file from  
>> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\  
>> >         .libxl-json", parse the file and fill in libxl_domain_snapshot_info  
>>   
>> >         and commandline options, decide which domain snapshot to be   
>> deleted.  
>> >         To delete each domain snapshot, fill in  
>> >         libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().  
>>   
>> Depending on the state of the domain, much of this can be done with  
>> unlink and/or calling out to external tools.  
>  
> Yes, xl or libvirt application can delete memory state file and delete 
> disk snapshot (eg. call qemu-img to delete internal disk snapshot instead 
> of qmp command, or delete external snapshot directly). 
> Just both xl and libvirt do the same work repeatly. So, I propose 
> libxl_domain_snapshot_delete API. Keep it or not? 

And here, do you have any suggestions?

>  
>>   
>> >     3), refresh parent/children relationship, delete json file for those  
>> >         already deleted snapshot.  
>> >   
>> > "xl snapshot-revert"  
>> >   
>> >     1), read snapshot json file from  
>> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\  
>> >         .libxl-json", parse the file and fill in   
>> libxl_domain_snapshot_info.  
>>   
>> .xl-json and xl_domain_snapshot_info  
>  
> Right. 
>  
>>   
>> >     2), fill in libxl_domain_snapshot_args  
>> >     3). call libxl_domain_snapshot_revert().  
>>   
>>   
>>   
>>   
>  
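
To make the "cfgfile is prioritized" answer quoted above concrete, here is a small Python sketch of the precedence rule: values from the snapshot config file override the same options given on the command line. The function name and the option keys are assumptions for illustration only; nothing here is actual xl code.

```python
def resolve_snapshot_options(cli_options, cfg_options):
    """Merge command-line options with cfgfile options; the cfgfile wins."""
    merged = dict(cli_options)
    # Only settings actually present in the cfgfile override the command
    # line; anything the cfgfile leaves unset keeps its CLI value.
    merged.update({k: v for k, v in cfg_options.items() if v is not None})
    return merged


resolved = resolve_snapshot_options(
    {"name": "from-cli", "live": True},
    {"name": "from-cfgfile"},
)
```

Here the snapshot name comes from the cfgfile while --live, which the cfgfile does not mention, is kept from the command line.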

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-29  8:32     ` Chun Yan Liu
@ 2014-10-29 10:19       ` Ian Campbell
  0 siblings, 0 replies; 21+ messages in thread
From: Ian Campbell @ 2014-10-29 10:19 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: Ian.Jackson, Jim Fehlig, xen-devel

On Wed, 2014-10-29 at 02:32 -0600, Chun Yan Liu wrote:
> >> >   
> >> > 4. xl snapshot-xxx implementation details  
> >>   
> >> How do these interact with xl create/destroy/shutdown/save/restore?  
> >>   
> >> e.g. does destroying a domain remove any snapshots?  
> >  
> > Since in xl domain is deleted when destroyed/shutdown/save, no state 
> > like previously in xend, created but not started, deleted only when issue 
> > 'delete' command. 
> >  
> > I don't think clearly now about: 
> > how to handle snapshots when destroy/shutdown/save/migrate a domain. 
> > In theory, if a domain is deleted, the snapshots should all be deleted. 
> > But this way, I don't know how much value this operation can bring to users. 
> > Those snapshot are still usable. One can start a new domain from a snapshot. 
> > There is memory state (can restore memory, restore config) and disk 
> > snapshot (can restore disk status). 
> >  
> > Do you have any suggestions? 
> 
> As described above, is there any suggestions?

I'm afraid not; it all depends on the use cases, I suppose.

Ian.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 3/3] xl snapshot-xxx Design
  2014-10-29  8:34     ` Chun Yan Liu
@ 2014-10-29 10:22       ` Ian Campbell
  0 siblings, 0 replies; 21+ messages in thread
From: Ian Campbell @ 2014-10-29 10:22 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: Ian.Jackson, Jim Fehlig, Wei Liu, xen-devel

On Wed, 2014-10-29 at 02:34 -0600, Chun Yan Liu wrote:
> >> > "xl snapshot-delete"  
> >> >   
> >> >     1), read snapshot json file from  
> >> >         "/var/lib/xen/snapshots/domain_uuid/snapshotdata-<snapshot_name>\  
> >> >         .libxl-json", parse the file and fill in libxl_domain_snapshot_info  
> >>   
> >> >         and commandline options, decide which domain snapshot to be   
> >> deleted.  
> >> >         To delete each domain snapshot, fill in  
> >> >         libxl_domain_snapshot_args and call libxl_domain_snapshot_delete().  
> >>   
> >> Depending on the state of the domain, much of this can be done with  
> >> unlink and/or calling out to external tools.  
> >  
> > Yes, xl or libvirt application can delete memory state file and delete 
> > disk snapshot (eg. call qemu-img to delete internal disk snapshot instead 
> > of qmp command, or delete external snapshot directly). 
> > Just both xl and libvirt do the same work repeatly. So, I propose 
> > libxl_domain_snapshot_delete API. Keep it or not? 
> 
> And here, any suggestion?

If it weren't for the need to do "storage management" in libxl (i.e.
call qemu-img) this would seem like a harmless enough helper. However
the need for it to do storage mgmt is concerning, since it means libxl
needs to learn more about the details of each container format it might
support using as a backend.

Perhaps it is a candidate for libxlu rather than libxl proper?
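
To show how much of the delete path is plain storage management, here is a hedged Python sketch that only builds the list of actions for an inactive domain and does not run anything. The layout of the info dictionary (the "disks", "type", "image", "snapshot_file" and "memory_state_file" keys) is hypothetical, as is the choice of which file is unlinked for an external snapshot; `qemu-img snapshot -d` is the existing qemu-img subcommand for deleting an internal snapshot.

```python
def plan_snapshot_delete(info):
    """Return the actions needed to delete one snapshot of an inactive domain."""
    steps = []
    for disk in info["disks"]:
        if disk["type"] == "internal":
            # Internal snapshot (e.g. qcow2): removed with qemu-img.
            argv = ["qemu-img", "snapshot", "-d", info["name"], disk["image"]]
            steps.append(("exec", argv))
        else:
            # External snapshot: the snapshot file is deleted directly.
            steps.append(("unlink", disk["snapshot_file"]))
    if info.get("memory_state_file"):
        steps.append(("unlink", info["memory_state_file"]))
    return steps


plan = plan_snapshot_delete({
    "name": "snap1",
    "memory_state_file": "/tmp/snap1.save",
    "disks": [
        {"type": "internal", "image": "/tmp/disk0.qcow2"},
        {"type": "external", "snapshot_file": "/tmp/disk1-snap1.img"},
    ],
})
```

Every branch in the loop is knowledge about a container format, which is the concern raised above about putting this into libxl proper.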

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 2/3] libxl domain snapshot API design
  2014-10-22  3:59     ` Chun Yan Liu
@ 2014-10-29 10:26       ` Ian Campbell
  2014-11-03  7:37         ` Chun Yan Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Ian Campbell @ 2014-10-29 10:26 UTC (permalink / raw)
  To: Chun Yan Liu; +Cc: Jim Fehlig, Ian.Jackson, xen-devel

On Tue, 2014-10-21 at 21:59 -0600, Chun Yan Liu wrote:
> > Is this operation any different to destroying the domain and using 
> > libxl_domain_restore to start a new domain based on the snapshot? Is 
> > this operation just a convenience layer over that operation? 
> 
> It depends on implementation. It's a simple way to destroy the domain
> first, then start new domain based on snapshot. But destroying the
> domain may be not good to user (after xl snapshot-revert, domid is
> changed.)  and may cause some problem in libvirt (may affect its
> event handling ?).

I would hope that as part of the implementation libvirt would learn to
cope with this if it can't already, but it can surely already cope with
migration and reverting to a snapshot is not so very different.

> Or another way is: not destroying the domain, but through a process
> like pause domain, reload memory, reload disk snapshot, reload config
> file, resume domain. Complex but maybe better.

I don't think the complexity of resetting an already existing domain's
memory and i/o state to an earlier incarnation rather than starting from
a clean slate should be underestimated either (TBH it never occurred to
me that you might try this). AFAICT you'd need to effectively tear
everything down to a blank slate and then do all the same things that
you would do in the destroy case.
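
For comparison, the destroy-and-restore variant can be written down as a short command plan. This Python sketch only assembles the commands and does not execute them; the function name and the (image, snapshot) pair layout are assumptions, and it covers internal disk snapshots only. `xl destroy`, `xl restore` and `qemu-img snapshot -a` are existing commands.

```python
def plan_snapshot_revert(domain, memory_state_file, internal_disks):
    """Return the commands for reverting by destroying and restoring.

    `internal_disks` is a list of (image_path, snapshot_name) pairs.
    """
    steps = [["xl", "destroy", domain]]
    for image, snapshot in internal_disks:
        # Roll each disk image back to the named internal snapshot while
        # no domain is using it.
        steps.append(["qemu-img", "snapshot", "-a", snapshot, image])
    # The restored domain is a new domain, so its domid will differ.
    steps.append(["xl", "restore", memory_state_file])
    return steps


revert_plan = plan_snapshot_revert(
    "guest1", "/tmp/snap1.save", [("/tmp/disk0.qcow2", "snap1")]
)
```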

Ian.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC V7 2/3] libxl domain snapshot API design
  2014-10-29 10:26       ` Ian Campbell
@ 2014-11-03  7:37         ` Chun Yan Liu
  0 siblings, 0 replies; 21+ messages in thread
From: Chun Yan Liu @ 2014-11-03  7:37 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jim Fehlig, Ian.Jackson, xen-devel



>>> On 10/29/2014 at 06:26 PM, in message <1414578409.29975.9.camel@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com> wrote: 
> On Tue, 2014-10-21 at 21:59 -0600, Chun Yan Liu wrote: 
> > > Is this operation any different to destroying the domain and using  
> > > libxl_domain_restore to start a new domain based on the snapshot? Is  
> > > this operation just a convenience layer over that operation?  
> >  
> > It depends on implementation. It's a simple way to destroy the domain 
> > first, then start new domain based on snapshot. But destroying the 
> > domain may be not good to user (after xl snapshot-revert, domid is 
> > changed.)  and may cause some problem in libvirt (may affect its 
> > event handling ?). 
>  
> I would hope that as part of the implementation libvirt would learn to 
> cope with this if it can't already, but it can surely already cope with 
> migration and reverting to a snapshot is not so very different. 

I see. If it is implemented this way (destroy the domain, then restore it
from the snapshot), it is better not to supply a libxl API and to let xl and
libvirt handle it themselves. (If libxl destroys the domain and starts a new
one, that will cause problems in libvirt, because libvirt has no idea of the
new domain created internally by libxl.)

Thanks Ian. I'll update the doc.

- Chunyan

>  
> > Or another way is: not destroying the domain, but through a process 
> > like pause domain, reload memory, reload disk snapshot, reload config 
> > file, resume domain. Complex but maybe better. 
>  
> I don't think the complexity of resetting an already existing domain's 
> memory and i/o state to an earlier incarnation rather than starting from 
> a clean slate should be underestimated either (TBH it never occurred to 
> me that you might try this). AFAICT you'd need to effectively tear 
> everything down to a blank slate and then do all the same things that 
> you would do in the destroy case. 
>  
> Ian. 
>  
>  
>  

^ permalink raw reply	[flat|nested] 21+ messages in thread

Thread overview: 21+ messages
2014-10-10  8:48 [RFC V7 0/3] domain snapshot document Chunyan Liu
2014-10-10  8:48 ` [RFC V7 1/3] libxl domain snapshot introduction Chunyan Liu
2014-10-20 15:59   ` Ian Campbell
2014-10-21  3:25     ` Chun Yan Liu
2014-10-10  8:48 ` [RFC V7 2/3] libxl domain snapshot API design Chunyan Liu
2014-10-20 16:11   ` Ian Campbell
2014-10-21  4:18     ` Chun Yan Liu
2014-10-22  3:59     ` Chun Yan Liu
2014-10-29 10:26       ` Ian Campbell
2014-11-03  7:37         ` Chun Yan Liu
2014-10-10  8:48 ` [RFC V7 3/3] xl snapshot-xxx Design Chunyan Liu
2014-10-20 16:39   ` Ian Campbell
2014-10-21  5:37     ` Chun Yan Liu
2014-10-22  4:10     ` Chun Yan Liu
2014-10-29  8:32     ` Chun Yan Liu
2014-10-29 10:19       ` Ian Campbell
2014-10-29  8:34     ` Chun Yan Liu
2014-10-29 10:22       ` Ian Campbell
2014-10-17  6:04 ` [RFC V7 0/3] domain snapshot document Chun Yan Liu
2014-10-17  9:50   ` Ian Campbell
2014-10-20 16:12 ` Ian Campbell
