Re: [Qemu-devel] [PATCH 2/8] qmp: create qmp_savevm command

From: "Denis V. Lunev" <den@openvz.org>
To: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Peter Krempa <pkrempa@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org, Amit Shah <amit.shah@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 2/8] qmp: create qmp_savevm command
Date: Fri, 22 Jan 2016 11:01:03 +0300	[thread overview]
Message-ID: <56A1E1BF.7070209@openvz.org> (raw)
In-Reply-To: <87lh7l783q.fsf@blackfin.pond.sub.org>

On 01/19/2016 09:11 PM, Markus Armbruster wrote:
> "Denis V. Lunev" <den@openvz.org> writes:
>
>> On 01/18/2016 06:58 PM, Markus Armbruster wrote:
>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>
>>>> 'name' attribute is made mandatory in distinction with HMP command.
>>>>
>>>> The patch also moves hmp_savevm implementation into hmp.c. This function
>>>> is just a simple wrapper now and does not have knowledge about
>>>> migration internals.
> [...]
>>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>>> index 2e31733..09d1a1a 100644
>>>> --- a/qapi-schema.json
>>>> +++ b/qapi-schema.json
>>>> @@ -4054,3 +4054,16 @@
>>>>    ##
>>>>    { 'enum': 'ReplayMode',
>>>>      'data': [ 'none', 'record', 'play' ] }
>>>> +
>>>> +##
>>>> +# @savevm
>>>> +#
>>>> +# Save a VM snapshot. Old snapshot with the same name will be deleted if exists.
>>>> +#
>>>> +# @name: identifier of a snapshot to be created
>>>> +#
>>>> +# Returns: Nothing on success
>>>> +#
>>>> +# Since 2.6
>>>> +##
>>>> +{ 'command': 'savevm', 'data': {'name': 'str'} }
>>>> diff --git a/qmp-commands.hx b/qmp-commands.hx
>>>> index db072a6..b7851e1 100644
>>>> --- a/qmp-commands.hx
>>>> +++ b/qmp-commands.hx
>>>> @@ -4795,3 +4795,28 @@ Example:
>>>>                     {"type": 0, "out-pport": 0, "pport": 0, "vlan-id": 3840,
>>>>                      "pop-vlan": 1, "id": 251658240}
>>>>       ]}
>>>> +
>>>> +EQMP
>>>> +
>>>> +SQMP
>>>> +savevm
>>>> +------
>>>> +
>>>> +Save a VM snapshot. Old snapshot with the same name will be deleted if exists.
>>>> +
>>>> +Arguments:
>>>> +
>>>> +- "name": snapshot name
>>>> +
>>>> +Example:
>>>> +
>>>> +-> { "execute": "savevm", "arguments": { "name": "snapshot1" } }
>>>> +<- { "return": {} }
>>>> +
>>>> +EQMP
>>>> +
>>>> +    {
>>>> +        .name       = "savevm",
>>>> +        .args_type  = "name:s",
>>>> +        .mhandler.cmd_new = qmp_marshal_savevm,
>>>> +    },
>>> A snapshot has a tag (QEMUSnapshotInfo member name) and an ID
>>> (QEMUSnapshotInfo member id_str).
>>>
>>> HMP's name arguments are overloaded: they're matched both against tag
>>> and ID.  Unwisely chosen tags can create ambiguity.  Example:
>>>
>>>       (qemu) savevm 2
>>>       (qemu) savevm
>>>       (qemu) info snapshots
>>>       ID        TAG                 VM SIZE                DATE       VM CLOCK
>>>       1         2                      1.7M 2016-01-18 16:56:31   00:00:00.000
>>>       2         vm-20160118165641      1.7M 2016-01-18 16:56:41   00:00:00.000
>>>
>>> Care to guess which one we get when we ask for "2"?
>>>
>>> I think we want separate, unoverloaded arguments for QMP.
>> I think there is no need to. Name is now absolutely mandatory.
>> Thus for new snapshots we will have 'name' specified and we
>> will be bound to name only.
>>
>> 'id' will be used for old VMs and this is convenience
>> layer to make old 'id' only snaphosts accessible
>> through new interface in the same way as old.
> The overloaded interface you propose is more complex than it seems.  You
> hide the complexity by not documenting its workings.  Not even to the
> (insufficient!) degree the HMP interface documents how its overloaded
> name parameters work.
>
> Merely copying over the HMP documentation won't cut it.  The bar for new
> QMP interfaces is a fair bit higher than "no worse than HMP".  The new
> interface must reasonably sane for *QMP*, and sufficiently documented.
>
> If we can't make a sane QMP interface, I'd rather have no QMP interface.
> However, I believe we *can* make a sane QMP interface if we put in the
> design work.
>
> The design work must start with a review of what we're trying to
> accomplish, and how to fit it into the rest of the system.  Here's my
> attempt.  Since my knowledge on snapshots is rather superficial, I'm
> cc'ing Kevin for additional snapshot expertise.  Kevin, please correct
> me when I talk nonsense.  I'm further cc'ing Eric and Peter for the
> management layer perspective.
>
> A point-in-time snapshot of a system consists of a snapshot of its
> machine state and snapshots of its storage.  All the snapshots need to
> be made at the same point in time for the result to be consistent.
>
> Snapshots of read-only storage carry no information and are commonly
> omitted.
>
> Isolated storage snapshots can make sense, but snapshotting the machine
> state without also snapshotting the machine's storage doesn't sound
> useful to me.
>
> Both storage and machine state snapshots come in two flavours: internal
> and external.
>
> External ones can be made with any block backend, but internal storage
> snapshots work only with certain formats, notably qcow2.  QMP supports
> both kinds of storage snapshots.
>
> Both kinds of storage snapshots need exclusive access while they work.
> They're relatively quick, but the delay could be noticable for large
> internal snapshots, and perhaps for external snapshots on really slow
> storage.
>
> Internal machine state snapshots are currently only available via HMP's
> savevm, which integrates internal machine state and storage snapshots.
> This is non-live, i.e. the guest is stopped while the snapshot gets
> saved.  I figure we could make it live if we really wanted to.  Another
> instance of the emerging background job concept.
>
> On the implementation level, QCOW2 can't currently store a machine state
> snapshot without also storing a storage snapshot.  I guess we could
> change this if we really wanted to.
>
> External machine state snapshots are basically migrate to file.
> Supported by QMP.
>
> Live migration to file is possible, but currently wastes space, because
> memory dirtied during migration gets saved multiple times.  Could be
> fixed either by making migration update previously saved memory instead
> of appending (beware, random I/O), or by compacting the file afterwards.
>
> Non-live migration to file doesn't waste space that way.
>
> To take multiple *consistent* snapshots, you have to bundle them up in a
> transaction.  Transactions currently support only *storage* snapshots,
> though.
>
> Work-around for external machine state snapshot: migrate to file
> (possibly live), leaving the guest sopped on completion, take storage
> snapshots, resume guest.
>
> You can combine internal and external storage snapshots with an external
> machine state snapshot to get a mixed system snapshot.
>
> You currently can't do that with an internal machine state snapshot: the
> only way to take one is HMP savevm, which insists on internally
> snapshotting all writable storage, and doesn't transact together with
> external storage snapshots.
>
> Except for the case "purely internal snapshot with just one writable
> storage device", a system snapshot consists of multiple parts stored in
> separate files.  Tying the parts together is a management problem.  QEMU
> provides rudimentary management for purely internal snapshots, but it's
> flawed: missing storage isn't detected, and additional storage can creep
> in if snapshot tags or IDs happen to match.  I guess managing the parts
> is better left to the management layer.
>
> I figure a fully general QMP interface would let you perform a system
> snapshot by combining storage snapshots of either kind with either kind
> of machine state snapshot.
>
> We already have most of the building blocks: we can take external and
> internal storage snapshots, and combine them in transactions.
>
> What's missing is transactionable machine state snapshots.
>
> We know how to work around it for external machine state snapshots (see
> above), but a transaction-based solution might be nicer.
>
> Any solution for internal machine state snapshots in QMP should at least
> try to fit into this.  Some restrictions are okay.  For instance, we
> probably need to restrict internal machine state snapshots to piggyback
> on an internal storage snapshot for now, so we don't have to dig up
> QCOW2 just to get QMP support.
>
> We can talk about more convenient interfaces for common special cases,
> but I feel we need to design for the general case.  We don't have to
> implement the completely general case right away, though.  As long as we
> know where we want to go, incremental steps towards that goal are fine.
>
> Can we create a transactionable QMP command to take an internal machine
> state snapshot?
>
> This would be like HMP savevm with the following differences:
>
> * Separate parameters for tag and ID.  I'll have none of this
>    overloading nonsense in QMP.
why do we need both 'name' and 'id' values in the future at all?
Why single descriptive parameter is not enough? From my point
of view all this overloading is non-sense and in future one
name parameter should be kept.

> * Specify the destination block device.  I'll have none of this "pick a
>    device in some magic, undocumented way" in QMP.
agreed

> * Leave alone the other block devices.  Adding transaction actions to
>    snapshot all the writable block devices to get a full system snapshot
>    is the user's responsibility.
this requires clarification:
- do you mean that the list of the devices should come from the 
management layer?
If so this is error prone at the restoration stage. From my point of 
view restoration
must check that all RW devices are restored properly and there are no missed
ones. Do you propose to put this into libvirt? From my point of view 
this flexibility
is not that necessary.

There is one important thing missed in your description.
Machine snapshot MUST handle configuration change. This means
that at the moment of snapshot VM can have different amount of
disks and other peripherals and this should be properly handled.

> Limitations:
>
> * No live internal machine snapshot, yet.
>
> * The storage device taking the internal snapshot must also be
>    internally snapshot for now.  In fact, the command does both
>    (tolerable wart).
>
> Open questions:
>
> * Do we want the QMP command to delete existing snapshots with
>    conflicting tag / ID, like HMP savevm does?  Or do we want it to fail
>    the transaction?
>
> * Do we need transactions for switching to a system snapshot, too?
>
> Opinions?
>
> Denis, I understand doing this right will be more work than simply
> rebadging HMP's savevm etc for QMP, and probably a good deal more than
> you anticipated.  Sorry about that.  I'm trying to accomodate you as far
> as possible without screwing up QMP.  Not screwing up QMP too much is my
> job as maintainer.
this is not a problem if we will be able to do things with less
latency. From my side I am a bit frustrated that this discussion
comes with v5 of patches after two months of waiting to be
merged assuming that most of previous discussions were about
some mistakes in the code.