* [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP
@ 2020-08-27 11:15 Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 1/7] migration: improve error reporting of block driver state name Daniel P. Berrangé
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

 v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
 v2: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg07523.html

When QMP was first introduced more than 10 years ago, the snapshot
related commands (savevm/loadvm/delvm) were not converted. This was
primarily because their implementation blocks the thread running the
monitor commands. This was (and still is) considered undesirable
behaviour both in HMP and QMP.

In theory someone was supposed to fix this flaw at some point in the
past 10 years and bring them into the QMP world. Sadly, thus far it
hasn't happened as people always had more important things to work
on. Enterprise apps were much more interested in external snapshots
than internal snapshots, as external snapshots offer many more features.

Meanwhile users still want to use internal snapshots as there is
a certain simplicity in having everything self-contained in one
image, even though it has limitations. Thus apps end up executing
savevm/loadvm/delvm via the "human-monitor-command" QMP command.

IOW, the problematic blocking behaviour that was one of the reasons
for not having savevm/loadvm/delvm in QMP is experienced by applications
regardless. By not porting the commands to QMP due to one design flaw,
we've forced apps and users to suffer from the other design flaws of HMP
(bad error reporting, no type checking of args, no introspection) for
an additional 10 years. This feels rather sub-optimal :-(

In practice users don't appear to care strongly about the fact that these
commands block the VM while they run. I might have seen one bug report
about it, but it certainly isn't something that comes up as a frequent
topic except among us QEMU maintainers. Users do care about having
access to the snapshot feature.

Where I am seeing frequent complaints is wrt the use of OVMF combined
with snapshots, which has some serious pain points. This is getting worse
as the push to ditch legacy BIOS in favour of UEFI gains momentum both
across OS vendors and mgmt apps. Solving it requires new parameters to
the commands, but doing this in HMP is super unappealing.

After 10 years, I think it is time for us to be a little pragmatic about
our handling of snapshot commands. My desire is that libvirt should never
use "human-monitor-command" under any circumstances, because of the
inherent flaws in HMP as a protocol for machine consumption.

Thus in this series I'm proposing a fairly direct mapping of the existing
HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
not solve the blocking thread problem, but it does put in place a
design using the jobs framework which can facilitate solving it later.
It does also solve the error reporting, type checking and introspection
problems inherent to HMP. So we're winning on 3 out of the 4 problems,
and pushing apps to a QMP design that will let us solve the last
remaining problem.

With a QMP variant, we can reasonably deal with the problems related to OVMF:

 - The logic to pick which disk to store the vmstate in is not
   satisfactory.

   The first block driver state cannot be assumed to be the root disk
   image; it might be the OVMF varstore and we don't want to store
   vmstate in there.

 - The logic to decide which disks must be snapshotted is hardwired
   to all disks which are writable.

   Again with OVMF there might be a writable varstore, but this can be
   raw rather than qcow2 format, and thus unable to be snapshotted.
   While users might wish to snapshot their varstore, in some/many/most
   cases it is entirely unnecessary. Yet this varstore currently blocks
   users from snapshotting their VM at all.

These are solved by adding two parameters to the commands. The first is
a block device node name that identifies the image to store vmstate in,
and the second is a list of node names to include for the snapshots.
If the list of nodes isn't given, it falls back to the historical
behaviour of using all disks matching some undocumented criteria.
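
For illustration, usage is intended to look roughly like this; the node
names ("disk0", "disk1") are made up, and the exact parameter and field
names are defined by the QAPI schema in the final patch:

  -> { "execute": "snapshot-save",
       "arguments": { "job-id": "save0",
                      "tag": "snap0",
                      "vmstate": "disk0",
                      "devices": [ "disk0", "disk1" ] } }
  <- { "return": {} }
  <- { "event": "JOB_STATUS_CHANGE",
       "data": { "id": "save0", "status": "concluded" } }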

In the block code I've only dealt with node names for block devices, as
IIUC, this is all that libvirt should need in the -blockdev world it now
lives in. IOW, I've made no attempt to cope with people wanting to use
these QMP commands in combination with -drive args, as libvirt will
never use -drive with a QEMU new enough to have these new commands.
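
For example, with -blockdev every node gets an explicit node name that
the new commands can reference (illustrative config):

  -blockdev driver=file,filename=disk0.qcow2,node-name=disk0-file \
  -blockdev driver=qcow2,file=disk0-file,node-name=disk0 \
  -device virtio-blk,drive=disk0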

The main limitations of the current implementation are:

 - The snapshot process runs serialized in the main thread, i.e. QEMU
   guest execution is blocked for the duration. The job framework
   lets us fix this in future without changing the QMP semantics
   exposed to the apps.

 - Most vmstate loading errors just go to stderr, as they are not
   using Error **errp reporting. Thus the job framework just
   reports a fairly generic message

     "Error -22 while loading VM state"

   Again this can be fixed later without changing the QMP semantics
   exposed to apps (an illustrative query-jobs reply is shown below).
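
For example, the per-job error surfaces to apps in a query-jobs reply
along these lines (ids and output are illustrative):

  -> { "execute": "query-jobs" }
  <- { "return": [ { "id": "load0",
                     "type": "snapshot-load",
                     "status": "concluded",
                     "error": "Error -22 while loading VM state" } ] }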

I've done some minimal work in libvirt to start to make use of the new
commands to validate their functionality, but this isn't finished yet.

My ultimate goal is to make the GNOME Boxes maintainer happy again by
having internal snapshots work with OVMF:

  https://gitlab.gnome.org/GNOME/gnome-boxes/-/commit/c486da262f6566326fbcb5ef45c5f64048f16a6e

Changed in v3:

 - Schedule a bottom half to escape from coroutine context in
   the jobs. This is needed because the locking in the snapshot
   code goes horribly wrong when run from a background coroutine
   instead of the main event thread.

 - Refactor the way we iterate over devices, so that we correctly
   report non-existent devices passed by the user over QMP.

 - Add QAPI docs notes about limitations wrt vmstate error
   reporting (it all goes to stderr not an Error **errp)
   so QMP only gets a fairly generic error message currently.

 - Add I/O test to validate many usage scenarios / errors

 - Add I/O test helpers to handle QMP events with a deterministic
   ordering

 - Ensure 'snapshot-delete' reports an error if requesting
   delete from devices that don't support snapshots, instead of
   silently succeeding with no error.

Changed in v2:

 - Use new command names "snapshot-{load,save,delete}" to make it
   clear that these are different from the "savevm|loadvm|delvm"
   as they use the Job framework

 - Use an include list for block devs, not an exclude list

Daniel P. Berrangé (7):
  migration: improve error reporting of block driver state name
  block: push error reporting into bdrv_all_*_snapshot functions
  migration: stop returning errno from load_snapshot()
  block: add ability to specify list of blockdevs during snapshot
  block: allow specifying name of block device for vmstate storage
  iotests: add support for capturing and matching QMP events
  migration: introduce snapshot-{save,load,delete} QMP commands

 block/monitor/block-hmp-cmds.c |   7 +-
 block/snapshot.c               | 233 ++++++++++++++-------
 include/block/snapshot.h       |  19 +-
 include/migration/snapshot.h   |  10 +-
 migration/savevm.c             | 258 +++++++++++++++++++----
 monitor/hmp-cmds.c             |  11 +-
 qapi/job.json                  |   9 +-
 qapi/migration.json            | 135 ++++++++++++
 replay/replay-snapshot.c       |   4 +-
 softmmu/vl.c                   |   2 +-
 tests/qemu-iotests/267.out     |  14 +-
 tests/qemu-iotests/310         | 255 +++++++++++++++++++++++
 tests/qemu-iotests/310.out     | 369 +++++++++++++++++++++++++++++++++
 tests/qemu-iotests/common.qemu | 107 +++++++++-
 tests/qemu-iotests/group       |   1 +
 15 files changed, 1289 insertions(+), 145 deletions(-)
 create mode 100755 tests/qemu-iotests/310
 create mode 100644 tests/qemu-iotests/310.out

-- 
2.26.2





* [PATCH v3 1/7] migration: improve error reporting of block driver state name
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-08-27 17:17   ` Dr. David Alan Gilbert
  2020-08-27 11:16 ` [PATCH v3 2/7] block: push error reporting into bdrv_all_*_snapshot functions Daniel P. Berrangé
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

With blockdev, a BlockDriverState may not have a device name,
so using a node name is required as an alternative.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c         | 12 ++++++------
 tests/qemu-iotests/267.out |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index a843d202b5..304d98ff78 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2682,7 +2682,7 @@ int save_snapshot(const char *name, Error **errp)
 
     if (!bdrv_all_can_snapshot(&bs)) {
         error_setg(errp, "Device '%s' is writable but does not support "
-                   "snapshots", bdrv_get_device_name(bs));
+                   "snapshots", bdrv_get_device_or_node_name(bs));
         return ret;
     }
 
@@ -2691,7 +2691,7 @@ int save_snapshot(const char *name, Error **errp)
         ret = bdrv_all_delete_snapshot(name, &bs1, errp);
         if (ret < 0) {
             error_prepend(errp, "Error while deleting snapshot on device "
-                          "'%s': ", bdrv_get_device_name(bs1));
+                          "'%s': ", bdrv_get_device_or_node_name(bs1));
             return ret;
         }
     }
@@ -2766,7 +2766,7 @@ int save_snapshot(const char *name, Error **errp)
     ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
     if (ret < 0) {
         error_setg(errp, "Error while creating snapshot on '%s'",
-                   bdrv_get_device_name(bs));
+                   bdrv_get_device_or_node_name(bs));
         goto the_end;
     }
 
@@ -2884,14 +2884,14 @@ int load_snapshot(const char *name, Error **errp)
     if (!bdrv_all_can_snapshot(&bs)) {
         error_setg(errp,
                    "Device '%s' is writable but does not support snapshots",
-                   bdrv_get_device_name(bs));
+                   bdrv_get_device_or_node_name(bs));
         return -ENOTSUP;
     }
     ret = bdrv_all_find_snapshot(name, &bs);
     if (ret < 0) {
         error_setg(errp,
                    "Device '%s' does not have the requested snapshot '%s'",
-                   bdrv_get_device_name(bs), name);
+                   bdrv_get_device_or_node_name(bs), name);
         return ret;
     }
 
@@ -2920,7 +2920,7 @@ int load_snapshot(const char *name, Error **errp)
     ret = bdrv_all_goto_snapshot(name, &bs, errp);
     if (ret < 0) {
         error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
-                      name, bdrv_get_device_name(bs));
+                      name, bdrv_get_device_or_node_name(bs));
         goto err_drain;
     }
 
diff --git a/tests/qemu-iotests/267.out b/tests/qemu-iotests/267.out
index d6d80c099f..215902b3ad 100644
--- a/tests/qemu-iotests/267.out
+++ b/tests/qemu-iotests/267.out
@@ -81,11 +81,11 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm snap0
-Error: Device '' is writable but does not support snapshots
+Error: Device 'file' is writable but does not support snapshots
 (qemu) info snapshots
 No available block device supports snapshots
 (qemu) loadvm snap0
-Error: Device '' is writable but does not support snapshots
+Error: Device 'file' is writable but does not support snapshots
 (qemu) quit
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
-- 
2.26.2




* [PATCH v3 2/7] block: push error reporting into bdrv_all_*_snapshot functions
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 1/7] migration: improve error reporting of block driver state name Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 3/7] migration: stop returning errno from load_snapshot() Daniel P. Berrangé
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine-grained error message reporting. So it is
helpful to push error reporting down a level.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 block/monitor/block-hmp-cmds.c |  7 ++--
 block/snapshot.c               | 77 +++++++++++++++++-----------------
 include/block/snapshot.h       | 14 +++----
 migration/savevm.c             | 37 +++++-----------
 monitor/hmp-cmds.c             |  7 +---
 tests/qemu-iotests/267.out     | 10 ++---
 6 files changed, 65 insertions(+), 87 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 4c8c375172..9df11494d6 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -898,10 +898,11 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 
     ImageEntry *image_entry, *next_ie;
     SnapshotEntry *snapshot_entry;
+    Error *err = NULL;
 
-    bs = bdrv_all_find_vmstate_bs();
+    bs = bdrv_all_find_vmstate_bs(&err);
     if (!bs) {
-        monitor_printf(mon, "No available block device supports snapshots\n");
+        error_report_err(err);
         return;
     }
     aio_context = bdrv_get_aio_context(bs);
@@ -951,7 +952,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
     total = 0;
     for (i = 0; i < nb_sns; i++) {
         SnapshotEntry *next_sn;
-        if (bdrv_all_find_snapshot(sn_tab[i].name, &bs1) == 0) {
+        if (bdrv_all_find_snapshot(sn_tab[i].name, NULL) == 0) {
             global_snapshots[total] = i;
             total++;
             QTAILQ_FOREACH(image_entry, &image_list, next) {
diff --git a/block/snapshot.c b/block/snapshot.c
index bd9fb01817..6839060622 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -400,14 +400,14 @@ static bool bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers) */
 
-bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
+bool bdrv_all_can_snapshot(Error **errp)
 {
-    bool ok = true;
     BlockDriverState *bs;
     BdrvNextIterator it;
 
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *ctx = bdrv_get_aio_context(bs);
+        bool ok = true;
 
         aio_context_acquire(ctx);
         if (bdrv_all_snapshots_includes_bs(bs)) {
@@ -415,26 +415,25 @@ bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
         }
         aio_context_release(ctx);
         if (!ok) {
+            error_setg(errp, "Device '%s' is writable but does not support "
+                       "snapshots", bdrv_get_device_or_node_name(bs));
             bdrv_next_cleanup(&it);
-            goto fail;
+            return false;
         }
     }
 
-fail:
-    *first_bad_bs = bs;
-    return ok;
+    return true;
 }
 
-int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
-                             Error **errp)
+int bdrv_all_delete_snapshot(const char *name, Error **errp)
 {
-    int ret = 0;
     BlockDriverState *bs;
     BdrvNextIterator it;
     QEMUSnapshotInfo sn1, *snapshot = &sn1;
 
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *ctx = bdrv_get_aio_context(bs);
+        int ret;
 
         aio_context_acquire(ctx);
         if (bdrv_all_snapshots_includes_bs(bs) &&
@@ -445,26 +444,25 @@ int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
         }
         aio_context_release(ctx);
         if (ret < 0) {
+            error_prepend(errp, "Could not delete snapshot '%s' on '%s': ",
+                          name, bdrv_get_device_or_node_name(bs));
             bdrv_next_cleanup(&it);
-            goto fail;
+            return -1;
         }
     }
 
-fail:
-    *first_bad_bs = bs;
-    return ret;
+    return 0;
 }
 
 
-int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
-                           Error **errp)
+int bdrv_all_goto_snapshot(const char *name, Error **errp)
 {
-    int ret = 0;
     BlockDriverState *bs;
     BdrvNextIterator it;
 
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *ctx = bdrv_get_aio_context(bs);
+        int ret;
 
         aio_context_acquire(ctx);
         if (bdrv_all_snapshots_includes_bs(bs)) {
@@ -472,75 +470,75 @@ int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
         }
         aio_context_release(ctx);
         if (ret < 0) {
+            error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
+                          name, bdrv_get_device_or_node_name(bs));
             bdrv_next_cleanup(&it);
-            goto fail;
+            return -1;
         }
     }
 
-fail:
-    *first_bad_bs = bs;
-    return ret;
+    return 0;
 }
 
-int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
+int bdrv_all_find_snapshot(const char *name, Error **errp)
 {
     QEMUSnapshotInfo sn;
-    int err = 0;
     BlockDriverState *bs;
     BdrvNextIterator it;
 
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *ctx = bdrv_get_aio_context(bs);
+        int ret;
 
         aio_context_acquire(ctx);
         if (bdrv_all_snapshots_includes_bs(bs)) {
-            err = bdrv_snapshot_find(bs, &sn, name);
+            ret = bdrv_snapshot_find(bs, &sn, name);
         }
         aio_context_release(ctx);
-        if (err < 0) {
+        if (ret < 0) {
+            error_setg(errp, "Could not find snapshot '%s' on '%s'",
+                       name, bdrv_get_device_or_node_name(bs));
             bdrv_next_cleanup(&it);
-            goto fail;
+            return -1;
         }
     }
 
-fail:
-    *first_bad_bs = bs;
-    return err;
+    return 0;
 }
 
 int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
                              BlockDriverState *vm_state_bs,
                              uint64_t vm_state_size,
-                             BlockDriverState **first_bad_bs)
+                             Error **errp)
 {
-    int err = 0;
     BlockDriverState *bs;
     BdrvNextIterator it;
 
     for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
         AioContext *ctx = bdrv_get_aio_context(bs);
+        int ret;
 
         aio_context_acquire(ctx);
         if (bs == vm_state_bs) {
             sn->vm_state_size = vm_state_size;
-            err = bdrv_snapshot_create(bs, sn);
+            ret = bdrv_snapshot_create(bs, sn);
         } else if (bdrv_all_snapshots_includes_bs(bs)) {
             sn->vm_state_size = 0;
-            err = bdrv_snapshot_create(bs, sn);
+            ret = bdrv_snapshot_create(bs, sn);
         }
         aio_context_release(ctx);
-        if (err < 0) {
+        if (ret < 0) {
+            error_setg(errp, "Could not create snapshot '%s' on '%s'",
+                       sn->name, bdrv_get_device_or_node_name(bs));
             bdrv_next_cleanup(&it);
-            goto fail;
+            return -1;
         }
     }
 
-fail:
-    *first_bad_bs = bs;
-    return err;
+    return 0;
 }
 
-BlockDriverState *bdrv_all_find_vmstate_bs(void)
+BlockDriverState *bdrv_all_find_vmstate_bs(Error **errp)
 {
     BlockDriverState *bs;
     BdrvNextIterator it;
@@ -558,5 +556,8 @@ BlockDriverState *bdrv_all_find_vmstate_bs(void)
             break;
         }
     }
+    if (!bs) {
+        error_setg(errp, "No block device supports snapshots");
+    }
     return bs;
 }
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index 2bfcd57578..ba1528eee0 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -76,17 +76,15 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers */
 
-bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs);
-int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bsd_bs,
-                             Error **errp);
-int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
-                           Error **errp);
-int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs);
+bool bdrv_all_can_snapshot(Error **errp);
+int bdrv_all_delete_snapshot(const char *name, Error **errp);
+int bdrv_all_goto_snapshot(const char *name, Error **errp);
+int bdrv_all_find_snapshot(const char *name, Error **errp);
 int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
                              BlockDriverState *vm_state_bs,
                              uint64_t vm_state_size,
-                             BlockDriverState **first_bad_bs);
+                             Error **errp);
 
-BlockDriverState *bdrv_all_find_vmstate_bs(void);
+BlockDriverState *bdrv_all_find_vmstate_bs(Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 304d98ff78..3826c437cc 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2660,7 +2660,7 @@ int qemu_load_device_state(QEMUFile *f)
 
 int save_snapshot(const char *name, Error **errp)
 {
-    BlockDriverState *bs, *bs1;
+    BlockDriverState *bs;
     QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
     int ret = -1, ret2;
     QEMUFile *f;
@@ -2680,25 +2680,19 @@ int save_snapshot(const char *name, Error **errp)
         return ret;
     }
 
-    if (!bdrv_all_can_snapshot(&bs)) {
-        error_setg(errp, "Device '%s' is writable but does not support "
-                   "snapshots", bdrv_get_device_or_node_name(bs));
+    if (!bdrv_all_can_snapshot(errp)) {
         return ret;
     }
 
     /* Delete old snapshots of the same name */
     if (name) {
-        ret = bdrv_all_delete_snapshot(name, &bs1, errp);
-        if (ret < 0) {
-            error_prepend(errp, "Error while deleting snapshot on device "
-                          "'%s': ", bdrv_get_device_or_node_name(bs1));
+        if (bdrv_all_delete_snapshot(name, errp) < 0) {
             return ret;
         }
     }
 
-    bs = bdrv_all_find_vmstate_bs();
+    bs = bdrv_all_find_vmstate_bs(errp);
     if (bs == NULL) {
-        error_setg(errp, "No block device can accept snapshots");
         return ret;
     }
     aio_context = bdrv_get_aio_context(bs);
@@ -2763,10 +2757,8 @@ int save_snapshot(const char *name, Error **errp)
     aio_context_release(aio_context);
     aio_context = NULL;
 
-    ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
+    ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, errp);
     if (ret < 0) {
-        error_setg(errp, "Error while creating snapshot on '%s'",
-                   bdrv_get_device_or_node_name(bs));
         goto the_end;
     }
 
@@ -2868,7 +2860,7 @@ void qmp_xen_load_devices_state(const char *filename, Error **errp)
 
 int load_snapshot(const char *name, Error **errp)
 {
-    BlockDriverState *bs, *bs_vm_state;
+    BlockDriverState *bs_vm_state;
     QEMUSnapshotInfo sn;
     QEMUFile *f;
     int ret;
@@ -2881,23 +2873,16 @@ int load_snapshot(const char *name, Error **errp)
         return -EINVAL;
     }
 
-    if (!bdrv_all_can_snapshot(&bs)) {
-        error_setg(errp,
-                   "Device '%s' is writable but does not support snapshots",
-                   bdrv_get_device_or_node_name(bs));
+    if (!bdrv_all_can_snapshot(errp)) {
         return -ENOTSUP;
     }
-    ret = bdrv_all_find_snapshot(name, &bs);
+    ret = bdrv_all_find_snapshot(name, errp);
     if (ret < 0) {
-        error_setg(errp,
-                   "Device '%s' does not have the requested snapshot '%s'",
-                   bdrv_get_device_or_node_name(bs), name);
         return ret;
     }
 
-    bs_vm_state = bdrv_all_find_vmstate_bs();
+    bs_vm_state = bdrv_all_find_vmstate_bs(errp);
     if (!bs_vm_state) {
-        error_setg(errp, "No block device supports snapshots");
         return -ENOTSUP;
     }
     aio_context = bdrv_get_aio_context(bs_vm_state);
@@ -2917,10 +2902,8 @@ int load_snapshot(const char *name, Error **errp)
     /* Flush all IO requests so they don't interfere with the new state.  */
     bdrv_drain_all_begin();
 
-    ret = bdrv_all_goto_snapshot(name, &bs, errp);
+    ret = bdrv_all_goto_snapshot(name, errp);
     if (ret < 0) {
-        error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
-                      name, bdrv_get_device_or_node_name(bs));
         goto err_drain;
     }
 
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 7711726fd2..9bb50b9abf 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1137,15 +1137,10 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
 
 void hmp_delvm(Monitor *mon, const QDict *qdict)
 {
-    BlockDriverState *bs;
     Error *err = NULL;
     const char *name = qdict_get_str(qdict, "name");
 
-    if (bdrv_all_delete_snapshot(name, &bs, &err) < 0) {
-        error_prepend(&err,
-                      "deleting snapshot on device '%s': ",
-                      bdrv_get_device_name(bs));
-    }
+    bdrv_all_delete_snapshot(name, &err);
     hmp_handle_error(mon, err);
 }
 
diff --git a/tests/qemu-iotests/267.out b/tests/qemu-iotests/267.out
index 215902b3ad..c65cce893a 100644
--- a/tests/qemu-iotests/267.out
+++ b/tests/qemu-iotests/267.out
@@ -6,9 +6,9 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 Testing:
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm snap0
-Error: No block device can accept snapshots
+Error: No block device supports snapshots
 (qemu) info snapshots
-No available block device supports snapshots
+No block device supports snapshots
 (qemu) loadvm snap0
 Error: No block device supports snapshots
 (qemu) quit
@@ -22,7 +22,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm snap0
 Error: Device 'none0' is writable but does not support snapshots
 (qemu) info snapshots
-No available block device supports snapshots
+No block device supports snapshots
 (qemu) loadvm snap0
 Error: Device 'none0' is writable but does not support snapshots
 (qemu) quit
@@ -58,7 +58,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm snap0
 Error: Device 'virtio0' is writable but does not support snapshots
 (qemu) info snapshots
-No available block device supports snapshots
+No block device supports snapshots
 (qemu) loadvm snap0
 Error: Device 'virtio0' is writable but does not support snapshots
 (qemu) quit
@@ -83,7 +83,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm snap0
 Error: Device 'file' is writable but does not support snapshots
 (qemu) info snapshots
-No available block device supports snapshots
+No block device supports snapshots
 (qemu) loadvm snap0
 Error: Device 'file' is writable but does not support snapshots
 (qemu) quit
-- 
2.26.2




* [PATCH v3 3/7] migration: stop returning errno from load_snapshot()
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 1/7] migration: improve error reporting of block driver state name Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 2/7] block: push error reporting into bdrv_all_*_snapshot functions Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 4/7] block: add ability to specify list of blockdevs during snapshot Daniel P. Berrangé
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

None of the callers care about the errno value since there is a full
Error object populated. This gives consistency with save_snapshot()
which already just returns -1.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 3826c437cc..711137bcbe 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2870,20 +2870,20 @@ int load_snapshot(const char *name, Error **errp)
     if (!replay_can_snapshot()) {
         error_setg(errp, "Record/replay does not allow loading snapshot "
                    "right now. Try once more later.");
-        return -EINVAL;
+        return -1;
     }
 
     if (!bdrv_all_can_snapshot(errp)) {
-        return -ENOTSUP;
+        return -1;
     }
     ret = bdrv_all_find_snapshot(name, errp);
     if (ret < 0) {
-        return ret;
+        return -1;
     }
 
     bs_vm_state = bdrv_all_find_vmstate_bs(errp);
     if (!bs_vm_state) {
-        return -ENOTSUP;
+        return -1;
     }
     aio_context = bdrv_get_aio_context(bs_vm_state);
 
@@ -2892,11 +2892,11 @@ int load_snapshot(const char *name, Error **errp)
     ret = bdrv_snapshot_find(bs_vm_state, &sn, name);
     aio_context_release(aio_context);
     if (ret < 0) {
-        return ret;
+        return -1;
     } else if (sn.vm_state_size == 0) {
         error_setg(errp, "This is a disk-only snapshot. Revert to it "
                    " offline using qemu-img");
-        return -EINVAL;
+        return -1;
     }
 
     /* Flush all IO requests so they don't interfere with the new state.  */
@@ -2911,7 +2911,6 @@ int load_snapshot(const char *name, Error **errp)
     f = qemu_fopen_bdrv(bs_vm_state, 0);
     if (!f) {
         error_setg(errp, "Could not open VM state file");
-        ret = -EINVAL;
         goto err_drain;
     }
 
@@ -2927,14 +2926,14 @@ int load_snapshot(const char *name, Error **errp)
 
     if (ret < 0) {
         error_setg(errp, "Error %d while loading VM state", ret);
-        return ret;
+        return -1;
     }
 
     return 0;
 
 err_drain:
     bdrv_drain_all_end();
-    return ret;
+    return -1;
 }
 
 void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
-- 
2.26.2




* [PATCH v3 4/7] block: add ability to specify list of blockdevs during snapshot
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
                   ` (2 preceding siblings ...)
  2020-08-27 11:16 ` [PATCH v3 3/7] migration: stop returning errno from load_snapshot() Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 5/7] block: allow specifying name of block device for vmstate storage Daniel P. Berrangé
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

When running snapshot operations, there are various rules for which
blockdevs are included/excluded. While this provides reasonable default
behaviour, there are scenarios that are not well handled by the default
logic. Some of the conditions do not have a single correct answer.

Thus there needs to be a way for the mgmt app to provide an explicit
list of blockdevs to perform snapshots across. This can be achieved
by passing a list of node names that should be used.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
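As an illustration (not part of this patch), a caller with an explicit
device list would do something along these lines; "disk0", "snap0" and
the function name are made up for the example:

    static int delete_snap_on_disk0(Error **errp)
    {
        strList *devices = g_new0(strList, 1);
        int ret = 0;

        devices->value = g_strdup("disk0");

        /* Only the listed nodes are considered, rather than the
         * historical "all writable disks" rule */
        if (bdrv_all_delete_snapshot("snap0", devices, errp) < 0) {
            ret = -1;
        }

        qapi_free_strList(devices);
        return ret;
    }
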
 block/monitor/block-hmp-cmds.c |   4 +-
 block/snapshot.c               | 167 +++++++++++++++++++++++----------
 include/block/snapshot.h       |  13 +--
 migration/savevm.c             |  16 ++--
 monitor/hmp-cmds.c             |   2 +-
 5 files changed, 138 insertions(+), 64 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 9df11494d6..db76c43cc2 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -900,7 +900,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
     SnapshotEntry *snapshot_entry;
     Error *err = NULL;
 
-    bs = bdrv_all_find_vmstate_bs(&err);
+    bs = bdrv_all_find_vmstate_bs(NULL, &err);
     if (!bs) {
         error_report_err(err);
         return;
@@ -952,7 +952,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
     total = 0;
     for (i = 0; i < nb_sns; i++) {
         SnapshotEntry *next_sn;
-        if (bdrv_all_find_snapshot(sn_tab[i].name, NULL) == 0) {
+        if (bdrv_all_find_snapshot(sn_tab[i].name, NULL, NULL) == 0) {
             global_snapshots[total] = i;
             total++;
             QTAILQ_FOREACH(image_entry, &image_list, next) {
diff --git a/block/snapshot.c b/block/snapshot.c
index 6839060622..5691cdc6cb 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -385,6 +385,36 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
     return ret;
 }
 
+
+static int bdrv_all_get_snapshot_devices(strList *devices,
+                                         GList **all_bdrvs,
+                                         Error **errp)
+{
+    g_autoptr(GList) bdrvs = NULL;
+
+    if (devices) {
+        while (devices) {
+            BlockDriverState *bs = bdrv_find_node(devices->value);
+            if (!bs) {
+                error_setg(errp, "No block device node '%s'", devices->value);
+                return -1;
+            }
+            bdrvs = g_list_append(bdrvs, bs);
+            devices = devices->next;
+        }
+    } else {
+        BlockDriverState *bs;
+        BdrvNextIterator it;
+        for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+            bdrvs = g_list_append(bdrvs, bs);
+        }
+    }
+
+    *all_bdrvs = g_steal_pointer(&bdrvs);
+    return 0;
+}
+
+
 static bool bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
 {
     if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
@@ -400,43 +430,56 @@ static bool bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers) */
 
-bool bdrv_all_can_snapshot(Error **errp)
+bool bdrv_all_can_snapshot(strList *devices, Error **errp)
 {
-    BlockDriverState *bs;
-    BdrvNextIterator it;
+    g_autoptr(GList) bdrvs = NULL;
+    GList *iterbdrvs;
 
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+    if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+        return false;
+    }
+
+    iterbdrvs = bdrvs;
+    while (iterbdrvs) {
+        BlockDriverState *bs = iterbdrvs->data;
         AioContext *ctx = bdrv_get_aio_context(bs);
         bool ok = true;
 
         aio_context_acquire(ctx);
-        if (bdrv_all_snapshots_includes_bs(bs)) {
+        if (devices || bdrv_all_snapshots_includes_bs(bs)) {
             ok = bdrv_can_snapshot(bs);
         }
         aio_context_release(ctx);
         if (!ok) {
             error_setg(errp, "Device '%s' is writable but does not support "
                        "snapshots", bdrv_get_device_or_node_name(bs));
-            bdrv_next_cleanup(&it);
             return false;
         }
+
+        iterbdrvs = iterbdrvs->next;
     }
 
     return true;
 }
 
-int bdrv_all_delete_snapshot(const char *name, Error **errp)
+int bdrv_all_delete_snapshot(const char *name, strList *devices, Error **errp)
 {
-    BlockDriverState *bs;
-    BdrvNextIterator it;
-    QEMUSnapshotInfo sn1, *snapshot = &sn1;
+    g_autoptr(GList) bdrvs = NULL;
+    GList *iterbdrvs;
 
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+    if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+        return -1;
+    }
+
+    iterbdrvs = bdrvs;
+    while (iterbdrvs) {
+        BlockDriverState *bs = iterbdrvs->data;
         AioContext *ctx = bdrv_get_aio_context(bs);
-        int ret;
+        QEMUSnapshotInfo sn1, *snapshot = &sn1;
+        int ret = 0;
 
         aio_context_acquire(ctx);
-        if (bdrv_all_snapshots_includes_bs(bs) &&
+        if ((devices || bdrv_all_snapshots_includes_bs(bs)) &&
             bdrv_snapshot_find(bs, snapshot, name) >= 0)
         {
             ret = bdrv_snapshot_delete(bs, snapshot->id_str,
@@ -446,61 +489,76 @@ int bdrv_all_delete_snapshot(const char *name, Error **errp)
         if (ret < 0) {
             error_prepend(errp, "Could not delete snapshot '%s' on '%s': ",
                           name, bdrv_get_device_or_node_name(bs));
-            bdrv_next_cleanup(&it);
             return -1;
         }
+
+        iterbdrvs = iterbdrvs->next;
     }
 
     return 0;
 }
 
 
-int bdrv_all_goto_snapshot(const char *name, Error **errp)
+int bdrv_all_goto_snapshot(const char *name, strList *devices, Error **errp)
 {
-    BlockDriverState *bs;
-    BdrvNextIterator it;
+    g_autoptr(GList) bdrvs = NULL;
+    GList *iterbdrvs;
 
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+    if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+        return -1;
+    }
+
+    iterbdrvs = bdrvs;
+    while (iterbdrvs) {
+        BlockDriverState *bs = iterbdrvs->data;
         AioContext *ctx = bdrv_get_aio_context(bs);
-        int ret;
+        int ret = 0;
 
         aio_context_acquire(ctx);
-        if (bdrv_all_snapshots_includes_bs(bs)) {
+        if (devices || bdrv_all_snapshots_includes_bs(bs)) {
             ret = bdrv_snapshot_goto(bs, name, errp);
         }
         aio_context_release(ctx);
         if (ret < 0) {
             error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
                           name, bdrv_get_device_or_node_name(bs));
-            bdrv_next_cleanup(&it);
             return -1;
         }
+
+        iterbdrvs = iterbdrvs->next;
     }
 
     return 0;
 }
 
-int bdrv_all_find_snapshot(const char *name, Error **errp)
+int bdrv_all_find_snapshot(const char *name, strList *devices, Error **errp)
 {
-    QEMUSnapshotInfo sn;
-    BlockDriverState *bs;
-    BdrvNextIterator it;
+    g_autoptr(GList) bdrvs = NULL;
+    GList *iterbdrvs;
 
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+    if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+        return -1;
+    }
+
+    iterbdrvs = bdrvs;
+    while (iterbdrvs) {
+        BlockDriverState *bs = iterbdrvs->data;
         AioContext *ctx = bdrv_get_aio_context(bs);
-        int ret;
+        QEMUSnapshotInfo sn;
+        int ret = 0;
 
         aio_context_acquire(ctx);
-        if (bdrv_all_snapshots_includes_bs(bs)) {
+        if (devices || bdrv_all_snapshots_includes_bs(bs)) {
             ret = bdrv_snapshot_find(bs, &sn, name);
         }
         aio_context_release(ctx);
         if (ret < 0) {
             error_setg(errp, "Could not find snapshot '%s' on '%s'",
                        name, bdrv_get_device_or_node_name(bs));
-            bdrv_next_cleanup(&it);
             return -1;
         }
+
+        iterbdrvs = iterbdrvs->next;
     }
 
     return 0;
@@ -509,20 +567,27 @@ int bdrv_all_find_snapshot(const char *name, Error **errp)
 int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
                              BlockDriverState *vm_state_bs,
                              uint64_t vm_state_size,
+                             strList *devices,
                              Error **errp)
 {
-    BlockDriverState *bs;
-    BdrvNextIterator it;
+    g_autoptr(GList) bdrvs = NULL;
+    GList *iterbdrvs;
 
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+    if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+        return -1;
+    }
+
+    iterbdrvs = bdrvs;
+    while (iterbdrvs) {
+        BlockDriverState *bs = iterbdrvs->data;
         AioContext *ctx = bdrv_get_aio_context(bs);
-        int ret;
+        int ret = 0;
 
         aio_context_acquire(ctx);
         if (bs == vm_state_bs) {
             sn->vm_state_size = vm_state_size;
             ret = bdrv_snapshot_create(bs, sn);
-        } else if (bdrv_all_snapshots_includes_bs(bs)) {
+        } else if (devices || bdrv_all_snapshots_includes_bs(bs)) {
             sn->vm_state_size = 0;
             ret = bdrv_snapshot_create(bs, sn);
         }
@@ -530,34 +595,42 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
         if (ret < 0) {
             error_setg(errp, "Could not create snapshot '%s' on '%s'",
                        sn->name, bdrv_get_device_or_node_name(bs));
-            bdrv_next_cleanup(&it);
             return -1;
         }
+
+        iterbdrvs = iterbdrvs->next;
     }
 
     return 0;
 }
 
-BlockDriverState *bdrv_all_find_vmstate_bs(Error **errp)
+BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp)
 {
-    BlockDriverState *bs;
-    BdrvNextIterator it;
+    g_autoptr(GList) bdrvs = NULL;
+    GList *iterbdrvs;
+
+    if (bdrv_all_get_snapshot_devices(devices, &bdrvs, errp) < 0) {
+        return NULL;
+    }
 
-    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+    iterbdrvs = bdrvs;
+    while (iterbdrvs) {
+        BlockDriverState *bs = iterbdrvs->data;
         AioContext *ctx = bdrv_get_aio_context(bs);
-        bool found;
+        bool found = false;
 
         aio_context_acquire(ctx);
-        found = bdrv_all_snapshots_includes_bs(bs) && bdrv_can_snapshot(bs);
+        found = (devices || bdrv_all_snapshots_includes_bs(bs)) &&
+            bdrv_can_snapshot(bs);
         aio_context_release(ctx);
 
         if (found) {
-            bdrv_next_cleanup(&it);
-            break;
+            return bs;
         }
+
+        iterbdrvs = iterbdrvs->next;
     }
-    if (!bs) {
-        error_setg(errp, "No block device supports snapshots");
-    }
-    return bs;
+
+    error_setg(errp, "No block device supports snapshots");
+    return NULL;
 }
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index ba1528eee0..1c5b0705a9 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -25,7 +25,7 @@
 #ifndef SNAPSHOT_H
 #define SNAPSHOT_H
 
-
+#include "qapi/qapi-builtin-types.h"
 
 #define SNAPSHOT_OPT_BASE       "snapshot."
 #define SNAPSHOT_OPT_ID         "snapshot.id"
@@ -76,15 +76,16 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers */
 
-bool bdrv_all_can_snapshot(Error **errp);
-int bdrv_all_delete_snapshot(const char *name, Error **errp);
-int bdrv_all_goto_snapshot(const char *name, Error **errp);
-int bdrv_all_find_snapshot(const char *name, Error **errp);
+bool bdrv_all_can_snapshot(strList *devices, Error **errp);
+int bdrv_all_delete_snapshot(const char *name, strList *devices, Error **errp);
+int bdrv_all_goto_snapshot(const char *name, strList *devices, Error **errp);
+int bdrv_all_find_snapshot(const char *name, strList *devices, Error **errp);
 int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
                              BlockDriverState *vm_state_bs,
                              uint64_t vm_state_size,
+                             strList *devices,
                              Error **errp);
 
-BlockDriverState *bdrv_all_find_vmstate_bs(Error **errp);
+BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 711137bcbe..ae56de1a85 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2680,18 +2680,18 @@ int save_snapshot(const char *name, Error **errp)
         return ret;
     }
 
-    if (!bdrv_all_can_snapshot(errp)) {
+    if (!bdrv_all_can_snapshot(NULL, errp)) {
         return ret;
     }
 
     /* Delete old snapshots of the same name */
     if (name) {
-        if (bdrv_all_delete_snapshot(name, errp) < 0) {
+        if (bdrv_all_delete_snapshot(name, NULL, errp) < 0) {
             return ret;
         }
     }
 
-    bs = bdrv_all_find_vmstate_bs(errp);
+    bs = bdrv_all_find_vmstate_bs(NULL, errp);
     if (bs == NULL) {
         return ret;
     }
@@ -2757,7 +2757,7 @@ int save_snapshot(const char *name, Error **errp)
     aio_context_release(aio_context);
     aio_context = NULL;
 
-    ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, errp);
+    ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, NULL, errp);
     if (ret < 0) {
         goto the_end;
     }
@@ -2873,15 +2873,15 @@ int load_snapshot(const char *name, Error **errp)
         return -1;
     }
 
-    if (!bdrv_all_can_snapshot(errp)) {
+    if (!bdrv_all_can_snapshot(NULL, errp)) {
         return -1;
     }
-    ret = bdrv_all_find_snapshot(name, errp);
+    ret = bdrv_all_find_snapshot(name, NULL, errp);
     if (ret < 0) {
         return -1;
     }
 
-    bs_vm_state = bdrv_all_find_vmstate_bs(errp);
+    bs_vm_state = bdrv_all_find_vmstate_bs(NULL, errp);
     if (!bs_vm_state) {
         return -1;
     }
@@ -2902,7 +2902,7 @@ int load_snapshot(const char *name, Error **errp)
     /* Flush all IO requests so they don't interfere with the new state.  */
     bdrv_drain_all_begin();
 
-    ret = bdrv_all_goto_snapshot(name, errp);
+    ret = bdrv_all_goto_snapshot(name, NULL, errp);
     if (ret < 0) {
         goto err_drain;
     }
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 9bb50b9abf..51f624b5ea 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1140,7 +1140,7 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
     Error *err = NULL;
     const char *name = qdict_get_str(qdict, "name");
 
-    bdrv_all_delete_snapshot(name, &err);
+    bdrv_all_delete_snapshot(name, NULL, &err);
     hmp_handle_error(mon, err);
 }
 
-- 
2.26.2




* [PATCH v3 5/7] block: allow specifying name of block device for vmstate storage
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
                   ` (3 preceding siblings ...)
  2020-08-27 11:16 ` [PATCH v3 4/7] block: add ability to specify list of blockdevs during snapshot Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 6/7] iotests: add support for capturing and matching QMP events Daniel P. Berrangé
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

Currently the vmstate will be stored in the first block device that
supports snapshots. Historically this would have usually been the
root device, but with UEFI it might be the variable store. There
needs to be a way to override the choice of block device to store
the state in.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
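As an illustration (not part of this patch), a caller that wants the
vmstate written to one specific node, rather than the first device that
happens to support snapshots, would do roughly this ("disk0" is a
made-up node name; the NULL device list keeps the default selection
rules):

    BlockDriverState *vmstate_bs;

    vmstate_bs = bdrv_all_find_vmstate_bs("disk0", NULL, errp);
    if (!vmstate_bs) {
        return -1;
    }
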
 block/monitor/block-hmp-cmds.c |  2 +-
 block/snapshot.c               | 17 +++++++++++++++--
 include/block/snapshot.h       |  4 +++-
 migration/savevm.c             |  4 ++--
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index db76c43cc2..81d1b52262 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -900,7 +900,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
     SnapshotEntry *snapshot_entry;
     Error *err = NULL;
 
-    bs = bdrv_all_find_vmstate_bs(NULL, &err);
+    bs = bdrv_all_find_vmstate_bs(NULL, NULL, &err);
     if (!bs) {
         error_report_err(err);
         return;
diff --git a/block/snapshot.c b/block/snapshot.c
index 5691cdc6cb..1f7b9a5146 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -604,7 +604,9 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
     return 0;
 }
 
-BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp)
+BlockDriverState *bdrv_all_find_vmstate_bs(const char *vmstate_bs,
+                                           strList *devices,
+                                           Error **errp)
 {
     g_autoptr(GList) bdrvs = NULL;
     GList *iterbdrvs;
@@ -624,6 +626,13 @@ BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp)
             bdrv_can_snapshot(bs);
         aio_context_release(ctx);
 
+        if (vmstate_bs && g_str_equal(vmstate_bs,
+                                      bdrv_get_node_name(bs))) {
+            error_setg(errp, "block device '%s' does not support snapshots",
+                       vmstate_bs);
+            return NULL;
+        }
+
         if (found) {
             return bs;
         }
@@ -631,6 +640,10 @@ BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp)
         iterbdrvs = iterbdrvs->next;
     }
 
-    error_setg(errp, "No block device supports snapshots");
+    if (vmstate_bs) {
+        error_setg(errp, "Block device '%s' does not exist", vmstate_bs);
+    } else {
+        error_setg(errp, "No block device supports snapshots");
+    }
     return NULL;
 }
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index 1c5b0705a9..05550e5da1 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -86,6 +86,8 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
                              strList *devices,
                              Error **errp);
 
-BlockDriverState *bdrv_all_find_vmstate_bs(strList *devices, Error **errp);
+BlockDriverState *bdrv_all_find_vmstate_bs(const char *vmstate_bs,
+                                           strList *devices,
+                                           Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index ae56de1a85..4a52704132 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2691,7 +2691,7 @@ int save_snapshot(const char *name, Error **errp)
         }
     }
 
-    bs = bdrv_all_find_vmstate_bs(NULL, errp);
+    bs = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
     if (bs == NULL) {
         return ret;
     }
@@ -2881,7 +2881,7 @@ int load_snapshot(const char *name, Error **errp)
         return -1;
     }
 
-    bs_vm_state = bdrv_all_find_vmstate_bs(NULL, errp);
+    bs_vm_state = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
     if (!bs_vm_state) {
         return -1;
     }
-- 
2.26.2




* [PATCH v3 6/7] iotests: add support for capturing and matching QMP events
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
                   ` (4 preceding siblings ...)
  2020-08-27 11:16 ` [PATCH v3 5/7] block: allow specifying name of block device for vmstate storage Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-08-27 11:16 ` [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands Daniel P. Berrangé
  2020-09-11 11:52 ` [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Dr. David Alan Gilbert
  7 siblings, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

When using the _launch_qemu and _send_qemu_cmd functions from
common.qemu, any QMP events get mixed in with the output from
the commands and responses.

This makes it difficult to write a test case as the ordering
of events in the output is not stable.

This introduces a variable 'capture_events' which can be set
to a list of event names. Any events listed in this variable
will not be printed, but instead collected in the $QEMU_EVENTS
environment variable.

A new '_wait_event' function can be invoked to collect events
at a fixed point in time. The function will first pull events
cached in the $QEMU_EVENTS variable and, if none are found, will
then read more from QMP.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
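As an illustration (not part of this patch), a test would use the new
machinery roughly as follows; the command, job id and node name are
only examples:

    capture_events="JOB_STATUS_CHANGE" _send_qemu_cmd $QEMU_HANDLE \
        '{ "execute": "snapshot-save",
           "arguments": { "job-id": "save0", "tag": "snap0",
                          "vmstate": "disk0", "devices": ["disk0"] } }' \
        "return"

    # the events held back above are now consumed in a fixed order
    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
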
 tests/qemu-iotests/common.qemu | 107 ++++++++++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index de680cf1c7..87d7a54001 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -53,6 +53,15 @@ _in_fd=4
 # If $mismatch_only is set, only non-matching responses will
 # be echoed.
 #
+# If $capture_events is non-empty, then any QMP event names it lists
+# will not be echoed out, but instead collected in the $QEMU_EVENTS
+# variable. The _wait_event function can later be used to receive
+# the cached events.
+#
+# If $only_capture_events is set to anything but an empty string,
+# then an error will be raised if a QMP message is seen which is
+# not an event listed in $capture_events.
+#
 # If $success_or_failure is set, the meaning of the arguments is
 # changed as follows:
 # $2: A string to search for in the response; if found, this indicates
@@ -78,6 +87,32 @@ _timed_wait_for()
     QEMU_STATUS[$h]=0
     while IFS= read -t ${QEMU_COMM_TIMEOUT} resp <&${QEMU_OUT[$h]}
     do
+        if [ -n "$capture_events" ]; then
+            capture=0
+            local evname
+            for evname in $capture_events
+            do
+                grep -q "\"event\": \"${evname}\"" < <(echo "${resp}")
+                if [ $? -eq 0 ]; then
+                    capture=1
+                fi
+            done
+            if [ $capture = 1 ];
+            then
+                ev=$(echo "${resp}" | tr -d '\r' | tr % .)
+                QEMU_EVENTS="${QEMU_EVENTS:+${QEMU_EVENTS}%}${ev}"
+                if [ -n "$only_capture_events" ]; then
+                    return
+                else
+                    continue
+                fi
+            fi
+        fi
+        if [ -n "$only_capture_events" ]; then
+            echo "Only expected $capture_events but got ${resp}"
+            exit 1
+        fi
+
         if [ -z "${silent}" ] && [ -z "${mismatch_only}" ]; then
             echo "${resp}" | _filter_testdir | _filter_qemu \
                            | _filter_qemu_io | _filter_qmp | _filter_hmp
@@ -177,12 +212,82 @@ _send_qemu_cmd()
         let count--;
     done
     if [ ${QEMU_STATUS[$h]} -ne 0 ] && [ -z "${qemu_error_no_exit}" ]; then
-        echo "Timeout waiting for ${1} on handle ${h}"
+        echo "Timeout waiting for command ${1} response on handle ${h}"
         exit 1 #Timeout means the test failed
     fi
 }
 
 
+# Check event cache for a named QMP event
+#
+# Input parameters:
+# $1:       Name of the QMP event to check for
+#
+# Checks whether the named QMP event was previously captured
+# into $QEMU_EVENTS. When matched, the QMP event will be echoed
+# and the $matched variable set to 1.
+#
+# _wait_event is more suitable for test usage in most cases
+_check_cached_events()
+{
+    local evname=${1}
+
+    local match="\"event\": \"$evname\""
+
+    matched=0
+    if [ -n "$QEMU_EVENTS" ]; then
+        CURRENT_QEMU_EVENTS=$QEMU_EVENTS
+        QEMU_EVENTS=
+        old_IFS=$IFS
+        IFS="%"
+        for ev in $CURRENT_QEMU_EVENTS
+        do
+            grep -q "$match" < <(echo "${ev}")
+            if [ $? -eq 0 -a $matched = 0 ]; then
+                echo "${ev}" | _filter_testdir | _filter_qemu \
+                           | _filter_qemu_io | _filter_qmp | _filter_hmp
+                matched=1
+            else
+                QEMU_EVENTS="${QEMU_EVENTS:+${QEMU_EVENTS}%}${ev}"
+            fi
+        done
+        IFS=$old_IFS
+    fi
+}
+
+# Wait for a named QMP event
+#
+# Input parameters:
+# $1:       QEMU handle to use
+# $2:       Name of the QMP event to wait for
+#
+# Checks whether the named QMP event was previously captured
+# into $QEMU_EVENTS. If none are present, then waits for the
+# event to arrive on the QMP channel. When matched, the QMP
+# event will be echoed
+_wait_event()
+{
+    local h=${1}
+    local evname=${2}
+
+    while true
+    do
+        _check_cached_events $evname
+
+        if [ $matched = 1 ];
+        then
+            return
+        fi
+
+        only_capture_events=1 qemu_error_no_exit=1 _timed_wait_for ${h}
+
+        if [ ${QEMU_STATUS[$h]} -ne 0 ] ; then
+            echo "Timeout waiting for event ${evname} on handle ${h}"
+            exit 1 #Timeout means the test failed
+        fi
+    done
+}
+
 # Launch a QEMU process.
 #
 # Input parameters:
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
                   ` (5 preceding siblings ...)
  2020-08-27 11:16 ` [PATCH v3 6/7] iotests: add support for capturing and matching QMP events Daniel P. Berrangé
@ 2020-08-27 11:16 ` Daniel P. Berrangé
  2020-09-01 14:20   ` Markus Armbruster
  2020-09-11 11:52 ` [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Dr. David Alan Gilbert
  7 siblings, 1 reply; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-08-27 11:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, John Snow,
	Markus Armbruster, Dr. David Alan Gilbert, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz

savevm, loadvm and delvm are some of the few HMP commands that have never
been converted to use QMP. The primary reason for this lack of conversion
is that they block execution of the thread for as long as they run.

Despite this downside, however, libvirt and applications using libvirt
have used these commands for as long as QMP has existed, via the
"human-monitor-command" passthrough command. IOW, while it is clearly
desirable to be able to fix the blocking problem, this is not an
immediate obstacle to real world usage.

Meanwhile there is a need for other features which involve adding new
parameters to the commands. This is possible with HMP passthrough, but
it provides no reliable way for apps to introspect features, so using
QAPI modelling is highly desirable.

This patch thus introduces new snapshot-{load,save,delete} commands to
QMP that are intended to replace the old HMP counterparts. The new
commands are given different names, because they will be using the new
QEMU job framework and thus will have diverging behaviour from the HMP
originals. It would thus be misleading to keep the same name.

While this design uses the generic job framework, the current
implementation is still blocking. The intention is that the blocking
problem will be fixed later. Nonetheless, applications using these new
commands should assume that they are asynchronous and thus wait for the
job status change event to indicate completion.
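
For example, the expected interaction looks roughly like this (an
illustrative transcript only; the intermediate 'created', 'running',
'waiting' and 'pending' status change events and the event timestamps
are omitted):

  -> { "execute": "snapshot-save",
       "arguments": { "job-id": "snapsave0", "tag": "my-snap" } }
  <- { "return": {} }
  <- { "event": "JOB_STATUS_CHANGE",
       "data": { "id": "snapsave0", "status": "concluded" } }
  -> { "execute": "query-jobs" }
  <- { "return": [ { "id": "snapsave0", "type": "snapshot-save",
                     "status": "concluded", "current-progress": 1,
                     "total-progress": 1 } ] }
  -> { "execute": "job-dismiss", "arguments": { "id": "snapsave0" } }
  <- { "return": {} }

Had the job failed, the 'query-jobs' reply would carry an "error"
string describing the failure.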

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 include/migration/snapshot.h |  10 +-
 migration/savevm.c           | 220 ++++++++++++++++++++-
 monitor/hmp-cmds.c           |   4 +-
 qapi/job.json                |   9 +-
 qapi/migration.json          | 135 +++++++++++++
 replay/replay-snapshot.c     |   4 +-
 softmmu/vl.c                 |   2 +-
 tests/qemu-iotests/310       | 255 ++++++++++++++++++++++++
 tests/qemu-iotests/310.out   | 369 +++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group     |   1 +
 10 files changed, 991 insertions(+), 18 deletions(-)
 create mode 100755 tests/qemu-iotests/310
 create mode 100644 tests/qemu-iotests/310.out

diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index c85b6ec75b..f2ed9d1e43 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -15,7 +15,13 @@
 #ifndef QEMU_MIGRATION_SNAPSHOT_H
 #define QEMU_MIGRATION_SNAPSHOT_H
 
-int save_snapshot(const char *name, Error **errp);
-int load_snapshot(const char *name, Error **errp);
+#include "qapi/qapi-builtin-types.h"
+
+int save_snapshot(const char *name,
+                  const char *vmstate, strList *devices,
+                  Error **errp);
+int load_snapshot(const char *name,
+                  const char *vmstate, strList *devices,
+                  Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 4a52704132..c5d8131d82 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -43,6 +43,8 @@
 #include "qapi/error.h"
 #include "qapi/qapi-commands-migration.h"
 #include "qapi/qapi-commands-misc.h"
+#include "qapi/clone-visitor.h"
+#include "qapi/qapi-builtin-visit.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/error-report.h"
 #include "sysemu/cpus.h"
@@ -2658,7 +2660,8 @@ int qemu_load_device_state(QEMUFile *f)
     return 0;
 }
 
-int save_snapshot(const char *name, Error **errp)
+int save_snapshot(const char *name, const char *vmstate,
+                  strList *devices, Error **errp)
 {
     BlockDriverState *bs;
     QEMUSnapshotInfo sn1, *sn = &sn1, old_sn1, *old_sn = &old_sn1;
@@ -2680,18 +2683,18 @@ int save_snapshot(const char *name, Error **errp)
         return ret;
     }
 
-    if (!bdrv_all_can_snapshot(NULL, errp)) {
+    if (!bdrv_all_can_snapshot(devices, errp)) {
         return ret;
     }
 
     /* Delete old snapshots of the same name */
     if (name) {
-        if (bdrv_all_delete_snapshot(name, NULL, errp) < 0) {
+        if (bdrv_all_delete_snapshot(name, devices, errp) < 0) {
             return ret;
         }
     }
 
-    bs = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
+    bs = bdrv_all_find_vmstate_bs(vmstate, devices, errp);
     if (bs == NULL) {
         return ret;
     }
@@ -2757,7 +2760,7 @@ int save_snapshot(const char *name, Error **errp)
     aio_context_release(aio_context);
     aio_context = NULL;
 
-    ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, NULL, errp);
+    ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, devices, errp);
     if (ret < 0) {
         goto the_end;
     }
@@ -2858,7 +2861,8 @@ void qmp_xen_load_devices_state(const char *filename, Error **errp)
     migration_incoming_state_destroy();
 }
 
-int load_snapshot(const char *name, Error **errp)
+int load_snapshot(const char *name, const char *vmstate,
+                  strList *devices, Error **errp)
 {
     BlockDriverState *bs_vm_state;
     QEMUSnapshotInfo sn;
@@ -2873,15 +2877,15 @@ int load_snapshot(const char *name, Error **errp)
         return -1;
     }
 
-    if (!bdrv_all_can_snapshot(NULL, errp)) {
+    if (!bdrv_all_can_snapshot(devices, errp)) {
         return -1;
     }
-    ret = bdrv_all_find_snapshot(name, NULL, errp);
+    ret = bdrv_all_find_snapshot(name, devices, errp);
     if (ret < 0) {
         return -1;
     }
 
-    bs_vm_state = bdrv_all_find_vmstate_bs(NULL, NULL, errp);
+    bs_vm_state = bdrv_all_find_vmstate_bs(vmstate, devices, errp);
     if (!bs_vm_state) {
         return -1;
     }
@@ -2902,7 +2906,7 @@ int load_snapshot(const char *name, Error **errp)
     /* Flush all IO requests so they don't interfere with the new state.  */
     bdrv_drain_all_begin();
 
-    ret = bdrv_all_goto_snapshot(name, NULL, errp);
+    ret = bdrv_all_goto_snapshot(name, devices, errp);
     if (ret < 0) {
         goto err_drain;
     }
@@ -2936,6 +2940,19 @@ err_drain:
     return -1;
 }
 
+static int delete_snapshot(const char *name, strList *devices, Error **errp)
+{
+    if (!bdrv_all_can_snapshot(devices, errp)) {
+        return -1;
+    }
+
+    if (bdrv_all_delete_snapshot(name, devices, errp) < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
 void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
 {
     qemu_ram_set_idstr(mr->ram_block,
@@ -2963,3 +2980,186 @@ bool vmstate_check_only_migratable(const VMStateDescription *vmsd)
 
     return !(vmsd && vmsd->unmigratable);
 }
+
+typedef struct SnapshotJob {
+    Job common;
+    char *tag;
+    char *vmstate;
+    strList *devices;
+    Coroutine *co;
+    Error **errp;
+    int ret;
+} SnapshotJob;
+
+static void qmp_snapshot_job_free(SnapshotJob *s)
+{
+    g_free(s->tag);
+    g_free(s->vmstate);
+    qapi_free_strList(s->devices);
+}
+
+
+static void snapshot_load_job_bh(void *opaque)
+{
+    Job *job = opaque;
+    SnapshotJob *s = container_of(job, SnapshotJob, common);
+    int saved_vm_running;
+
+    job_progress_set_remaining(&s->common, 1);
+
+    saved_vm_running = runstate_is_running();
+    vm_stop(RUN_STATE_RESTORE_VM);
+
+    s->ret = load_snapshot(s->tag, s->vmstate, s->devices, s->errp);
+    if (s->ret == 0 && saved_vm_running) {
+        vm_start();
+    }
+
+    job_progress_update(&s->common, 1);
+
+    qmp_snapshot_job_free(s);
+    aio_co_wake(s->co);
+}
+
+static void snapshot_save_job_bh(void *opaque)
+{
+    Job *job = opaque;
+    SnapshotJob *s = container_of(job, SnapshotJob, common);
+
+    job_progress_set_remaining(&s->common, 1);
+    s->ret = save_snapshot(s->tag, s->vmstate, s->devices, s->errp);
+    job_progress_update(&s->common, 1);
+
+    qmp_snapshot_job_free(s);
+    aio_co_wake(s->co);
+}
+
+static void snapshot_delete_job_bh(void *opaque)
+{
+    Job *job = opaque;
+    SnapshotJob *s = container_of(job, SnapshotJob, common);
+
+    job_progress_set_remaining(&s->common, 1);
+    s->ret = delete_snapshot(s->tag, s->devices, s->errp);
+    job_progress_update(&s->common, 1);
+
+    qmp_snapshot_job_free(s);
+    aio_co_wake(s->co);
+}
+
+static int coroutine_fn snapshot_save_job_run(Job *job, Error **errp)
+{
+    SnapshotJob *s = container_of(job, SnapshotJob, common);
+    s->errp = errp;
+    s->co = qemu_coroutine_self();
+    aio_bh_schedule_oneshot(qemu_get_aio_context(),
+                            snapshot_save_job_bh, job);
+    qemu_coroutine_yield();
+    return s->ret;
+}
+
+static int coroutine_fn snapshot_load_job_run(Job *job, Error **errp)
+{
+    SnapshotJob *s = container_of(job, SnapshotJob, common);
+    s->errp = errp;
+    s->co = qemu_coroutine_self();
+    aio_bh_schedule_oneshot(qemu_get_aio_context(),
+                            snapshot_load_job_bh, job);
+    qemu_coroutine_yield();
+    return s->ret;
+}
+
+static int coroutine_fn snapshot_delete_job_run(Job *job, Error **errp)
+{
+    SnapshotJob *s = container_of(job, SnapshotJob, common);
+    s->errp = errp;
+    s->co = qemu_coroutine_self();
+    aio_bh_schedule_oneshot(qemu_get_aio_context(),
+                            snapshot_delete_job_bh, job);
+    qemu_coroutine_yield();
+    return s->ret;
+}
+
+
+static const JobDriver snapshot_load_job_driver = {
+    .instance_size = sizeof(SnapshotJob),
+    .job_type      = JOB_TYPE_SNAPSHOT_LOAD,
+    .run           = snapshot_load_job_run,
+};
+
+static const JobDriver snapshot_save_job_driver = {
+    .instance_size = sizeof(SnapshotJob),
+    .job_type      = JOB_TYPE_SNAPSHOT_SAVE,
+    .run           = snapshot_save_job_run,
+};
+
+static const JobDriver snapshot_delete_job_driver = {
+    .instance_size = sizeof(SnapshotJob),
+    .job_type      = JOB_TYPE_SNAPSHOT_DELETE,
+    .run           = snapshot_delete_job_run,
+};
+
+
+void qmp_snapshot_save(const char *job_id,
+                       const char *tag,
+                       bool has_vmstate, const char *vmstate,
+                       bool has_devices, strList *devices,
+                       Error **errp)
+{
+    SnapshotJob *s;
+
+    s = job_create(job_id, &snapshot_save_job_driver, NULL,
+                   qemu_get_aio_context(), JOB_MANUAL_DISMISS,
+                   NULL, NULL, errp);
+    if (!s) {
+        return;
+    }
+
+    s->tag = g_strdup(tag);
+    s->vmstate = has_vmstate ? g_strdup(vmstate) : NULL;
+    s->devices = has_devices ? QAPI_CLONE(strList, devices) : NULL;
+
+    job_start(&s->common);
+}
+
+void qmp_snapshot_load(const char *job_id,
+                       const char *tag,
+                       bool has_vmstate, const char *vmstate,
+                       bool has_devices, strList *devices,
+                       Error **errp)
+{
+    SnapshotJob *s;
+
+    s = job_create(job_id, &snapshot_load_job_driver, NULL,
+                   qemu_get_aio_context(), JOB_MANUAL_DISMISS,
+                   NULL, NULL, errp);
+    if (!s) {
+        return;
+    }
+
+    s->tag = g_strdup(tag);
+    s->vmstate = has_vmstate ? g_strdup(vmstate) : NULL;
+    s->devices = has_devices ? QAPI_CLONE(strList, devices) : NULL;
+
+    job_start(&s->common);
+}
+
+void qmp_snapshot_delete(const char *job_id,
+                         const char *tag,
+                         bool has_devices, strList *devices,
+                         Error **errp)
+{
+    SnapshotJob *s;
+
+    s = job_create(job_id, &snapshot_delete_job_driver, NULL,
+                   qemu_get_aio_context(), JOB_MANUAL_DISMISS,
+                   NULL, NULL, errp);
+    if (!s) {
+        return;
+    }
+
+    s->tag = g_strdup(tag);
+    s->devices = has_devices ? QAPI_CLONE(strList, devices) : NULL;
+
+    job_start(&s->common);
+}
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 51f624b5ea..2ac2c109db 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1121,7 +1121,7 @@ void hmp_loadvm(Monitor *mon, const QDict *qdict)
 
     vm_stop(RUN_STATE_RESTORE_VM);
 
-    if (load_snapshot(name, &err) == 0 && saved_vm_running) {
+    if (load_snapshot(name, NULL, NULL, &err) == 0 && saved_vm_running) {
         vm_start();
     }
     hmp_handle_error(mon, err);
@@ -1131,7 +1131,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
 {
     Error *err = NULL;
 
-    save_snapshot(qdict_get_try_str(qdict, "name"), &err);
+    save_snapshot(qdict_get_try_str(qdict, "name"), NULL, NULL, &err);
     hmp_handle_error(mon, err);
 }
 
diff --git a/qapi/job.json b/qapi/job.json
index 280c2f76f1..51bee470f0 100644
--- a/qapi/job.json
+++ b/qapi/job.json
@@ -22,10 +22,17 @@
 #
 # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
 #
+# @snapshot-load: snapshot load job type, see "loadvm" (since 5.2)
+#
+# @snapshot-save: snapshot save job type, see "savevm" (since 5.2)
+#
+# @snapshot-delete: snapshot delete job type, see "delvm" (since 5.2)
+#
 # Since: 1.7
 ##
 { 'enum': 'JobType',
-  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
+  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
+           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
 
 ##
 # @JobStatus:
diff --git a/qapi/migration.json b/qapi/migration.json
index 5f6b06172c..d70f627b77 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1720,3 +1720,138 @@
 ##
 { 'event': 'UNPLUG_PRIMARY',
   'data': { 'device-id': 'str' } }
+
+##
+# @snapshot-save:
+#
+# Save a VM snapshot
+#
+# @job-id: identifier for the newly created job
+# @tag: name of the snapshot to create. If it already
+# exists it will be replaced.
+# @devices: list of block device node names to save a snapshot to
+# @vmstate: block device node name to save vmstate to
+#
+# Applications should not assume that the snapshot save is complete
+# when this command returns. Completion is indicated by the job
+# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
+# job aborts, errors can be obtained via the 'query-jobs' command.
+# Note that at this time most vmstate processing errors only
+# get printed to stderr. This limitation will be fixed at a future
+# date.
+#
+# Note that the VM CPUs will be paused during the time it takes to
+# save the snapshot.
+#
+# If @devices is not specified, or is an empty list, then the
+# historical default logic for picking devices will be used.
+#
+# If @vmstate is not specified, then the first valid block
+# device will be used for vmstate.
+#
+# Returns: nothing
+#
+# Example:
+#
+# -> { "execute": "snapshot-save",
+#      "arguments": {
+#         "job-id": "snapsave0",
+#         "tag": "my-snap",
+#         "vmstate": "disk0",
+#         "devices": ["disk0", "disk1"]
+#      }
+#    }
+# <- { "return": { } }
+#
+# Since: 5.2
+##
+{ 'command': 'snapshot-save',
+  'data': { 'job-id': 'str',
+            'tag': 'str',
+            '*vmstate': 'str',
+            '*devices': ['str'] } }
+
+##
+# @snapshot-load:
+#
+# Load a VM snapshot
+#
+# @job-id: identifier for the newly created job
+# @tag: name of the snapshot to load.
+# @devices: list of block device node names to load a snapshot from
+# @vmstate: block device node name to load vmstate from
+#
+# Applications should not assume that the snapshot load is complete
+# when this command returns. Completion is indicated by the job
+# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
+# job aborts, errors can be obtained via the 'query-jobs' command.
+# Note that at this time most vmstate processing errors only
+# get printed to stderr. This limitation will be fixed at a future
+# date.
+#
+# If @devices is not specified, or is an empty list, then the
+# historical default logic for picking devices will be used.
+#
+# If @vmstate is not specified, then the first valid block
+# device will be used for vmstate.
+#
+# Returns: nothing
+#
+# Example:
+#
+# -> { "execute": "snapshot-load",
+#      "arguments": {
+#         "job-id": "snapload0",
+#         "tag": "my-snap",
+#         "vmstate": "disk0",
+#         "devices": ["disk0", "disk1"]
+#      }
+#    }
+# <- { "return": { } }
+#
+# Since: 5.2
+##
+{ 'command': 'snapshot-load',
+  'data': { 'job-id': 'str',
+            'tag': 'str',
+            '*vmstate': 'str',
+            '*devices': ['str'] } }
+
+##
+# @snapshot-delete:
+#
+# Delete a VM snapshot
+#
+# @job-id: identifier for the newly created job
+# @tag: name of the snapshot to delete.
+# @devices: list of block device node names to delete a snapshot from
+#
+# Applications should not assume that the snapshot delete is complete
+# when this command returns. Completion is indicated by the job
+# status. Clients can wait for the JOB_STATUS_CHANGE event.
+#
+# Note that the VM CPUs will be paused during the time it takes to
+# delete the snapshot.
+#
+# If @devices is not specified, or is an empty list, then the
+# historical default logic for picking devices will be used.
+#
+# Returns: nothing
+#
+# Example:
+#
+# -> { "execute": "snapshot-delete",
+#      "arguments": {
+#         "job-id": "snapdelete0",
+#         "tag": "my-snap",
+#         "devices": ["disk0", "disk1"]
+#      }
+#    }
+# <- { "return": { } }
+#
+# Since: 5.2
+##
+{ 'command': 'snapshot-delete',
+  'data': { 'job-id': 'str',
+            'tag': 'str',
+            '*devices': ['str'] } }
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index e26fa4c892..f0f45a4f24 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -77,13 +77,13 @@ void replay_vmstate_init(void)
 
     if (replay_snapshot) {
         if (replay_mode == REPLAY_MODE_RECORD) {
-            if (save_snapshot(replay_snapshot, &err) != 0) {
+            if (save_snapshot(replay_snapshot, NULL, NULL, &err) != 0) {
                 error_report_err(err);
                 error_report("Could not create snapshot for icount record");
                 exit(1);
             }
         } else if (replay_mode == REPLAY_MODE_PLAY) {
-            if (load_snapshot(replay_snapshot, &err) != 0) {
+            if (load_snapshot(replay_snapshot, NULL, NULL, &err) != 0) {
                 error_report_err(err);
                 error_report("Could not load snapshot for icount replay");
                 exit(1);
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 0cc86b0766..8dbc2fe638 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -4459,7 +4459,7 @@ void qemu_init(int argc, char **argv, char **envp)
     register_global_state();
     if (loadvm) {
         Error *local_err = NULL;
-        if (load_snapshot(loadvm, &local_err) < 0) {
+        if (load_snapshot(loadvm, NULL, NULL, &local_err) < 0) {
             error_report_err(local_err);
             autostart = 0;
             exit(1);
diff --git a/tests/qemu-iotests/310 b/tests/qemu-iotests/310
new file mode 100755
index 0000000000..66cb58d2c8
--- /dev/null
+++ b/tests/qemu-iotests/310
@@ -0,0 +1,255 @@
+#!/usr/bin/env bash
+#
+# Test which nodes are involved in internal snapshots
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=berrange@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_qemu
+    _cleanup_test_img
+    TEST_IMG="$TEST_IMG.alt1" _cleanup_test_img
+    TEST_IMG="$TEST_IMG.alt2" _cleanup_test_img
+    rm -f "$SOCK_DIR/nbd"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+_require_drivers copy-on-read
+
+# Internal snapshots are (currently) impossible with refcount_bits=1,
+# and generally impossible with external data files
+_unsupported_imgopts 'refcount_bits=1[^0-9]' data_file
+
+_require_devices virtio-blk
+
+
+size=128M
+
+if [ -n "$BACKING_FILE" ]; then
+    _make_test_img -b "$BACKING_FILE" -F $IMGFMT $size
+else
+    _make_test_img $size
+fi
+
+    _make_test_img $size
+
+echo "Single disk test"
+
+export capture_events="JOB_STATUS_CHANGE STOP RESUME"
+
+run_snapshot()
+{
+    local op=$1
+
+    _send_qemu_cmd $QEMU_HANDLE "{\"execute\": \"snapshot-${op}\", \"arguments\": {\"job-id\": \"snap${op}0\", \"tag\": \"snap0\", \"devices\": $SNAP_DEVICES}}" "return"
+    # created
+    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+    # running
+    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+    if [ $op = "delete" ]; then
+	if [ $SNAP_DELETE_FAIL = 1 ]; then
+	    # aborting
+	    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+	else
+	    # waiting
+	    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+	    # pending
+	    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+	fi
+    else
+	if [ $SNAP_FAIL = 0 ]; then
+	    _wait_event $QEMU_HANDLE "STOP"
+	    _wait_event $QEMU_HANDLE "RESUME"
+	    # waiting
+	    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+	    # pending
+	    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+	else
+	    if [ $op = "load" ]; then
+		_wait_event $QEMU_HANDLE "STOP"
+	    fi
+	    # aborting
+	    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+	fi
+    fi
+    # concluded
+    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+    _send_qemu_cmd $QEMU_HANDLE "{\"execute\": \"query-jobs\"}" "return"
+    _send_qemu_cmd $QEMU_HANDLE "{\"execute\": \"job-dismiss\", \"arguments\": {\"id\": \"snap${op}0\"}}" "return"
+    # null
+    _wait_event $QEMU_HANDLE "JOB_STATUS_CHANGE"
+}
+
+run_test()
+{
+    keep_stderr=y
+    _launch_qemu -nodefaults -nographic "$@"
+
+    _send_qemu_cmd $QEMU_HANDLE '{"execute": "qmp_capabilities"}' 'return'
+
+    run_snapshot "save"
+    run_snapshot "load"
+    run_snapshot "delete"
+
+    _send_qemu_cmd $QEMU_HANDLE '{"execute": "quit"}' 'return'
+
+    wait=1 _cleanup_qemu
+}
+
+
+echo
+echo "=====  Snapshot single qcow2 image ====="
+echo
+
+SNAP_DEVICES='["diskfmt0"]' SNAP_FAIL=0 SNAP_DELETE_FAIL=0 \
+  run_test \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}"
+
+echo
+echo "=====  Snapshot no image ====="
+echo
+
+TEST_IMG="$TEST_IMG.alt1" _make_test_img $size
+
+SNAP_DEVICES="[]" SNAP_FAIL=1 SNAP_DELETE_FAIL=0 \
+  run_test
+
+
+echo
+echo "=====  Snapshot missing image ====="
+echo
+
+SNAP_DEVICES='["diskfmt1729"]' SNAP_FAIL=1 SNAP_DELETE_FAIL=1 \
+  run_test \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}"
+
+
+echo
+echo "=====  Snapshot protocol instead of format ====="
+echo
+
+SNAP_DEVICES='["disk0"]' SNAP_FAIL=1 SNAP_DELETE_FAIL=1 \
+  run_test \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}"
+
+
+echo
+echo "=====  Snapshot dual qcow2 image ====="
+echo
+
+SNAP_DEVICES='["diskfmt0", "diskfmt1"]' SNAP_FAIL=0 SNAP_DELETE_FAIL=0 \
+  run_test \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}" \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG.alt1','node-name':'disk1'}" \
+    -blockdev "{'driver':'qcow2','file':'disk1','node-name':'diskfmt1'}"
+
+
+echo
+echo "=====  Snapshot error with raw image ====="
+echo
+
+IMGOPTS= IMGFMT=raw TEST_IMG="$TEST_IMG.alt2" _make_test_img $size
+
+SNAP_DEVICES='["diskfmt0", "diskfmt1", "diskfmt2"]' SNAP_FAIL=1 SNAP_DELETE_FAIL=1 \
+  run_test \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}" \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG.alt1','node-name':'disk1'}" \
+    -blockdev "{'driver':'qcow2','file':'disk1','node-name':'diskfmt1'}" \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG.alt2','node-name':'disk2'}" \
+    -blockdev "{'driver':'raw','file':'disk2','node-name':'diskfmt2'}"
+
+
+echo
+echo "=====  Snapshot with raw image excluded ====="
+echo
+
+SNAP_DEVICES='["diskfmt0", "diskfmt1"]' SNAP_FAIL=0 SNAP_DELETE_FAIL=0 \
+  run_test \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}" \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG.alt1','node-name':'disk1'}" \
+    -blockdev "{'driver':'qcow2','file':'disk1','node-name':'diskfmt1'}" \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG.alt2','node-name':'disk2'}" \
+    -blockdev "{'driver':'raw','file':'disk2','node-name':'diskfmt2'}"
+
+echo
+echo "=====  Snapshot bad error reporting to stderr ====="
+echo
+
+# This demonstrates that we're not capturing vmstate loading failures
+# into QMP errors, they're ending up in stderr instead. vmstate needs
+# to report errors via Error object but that is a major piece of work
+# for the future. This test case's expected output log will need
+# adjusting when that is done.
+
+SNAP_DEVICES='["diskfmt0"]'
+SNAP_FAIL=1
+SNAP_DELETE_FAIL=0
+
+keep_stderr=y
+_launch_qemu -nodefaults -nographic \
+    -device virtio-rng \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}"
+
+_send_qemu_cmd $QEMU_HANDLE '{"execute": "qmp_capabilities"}' 'return'
+
+run_snapshot "save"
+
+_send_qemu_cmd $QEMU_HANDLE '{"execute": "quit"}' 'return'
+
+wait=1 _cleanup_qemu
+
+# leave off virtio-rng to provoke vmstate failure
+_launch_qemu -nodefaults -nographic \
+    -blockdev "{'driver':'file','filename':'$TEST_IMG','node-name':'disk0'}" \
+    -blockdev "{'driver':'qcow2','file':'disk0','node-name':'diskfmt0'}"
+
+_send_qemu_cmd $QEMU_HANDLE '{"execute": "qmp_capabilities"}' 'return'
+
+run_snapshot "load"
+run_snapshot "delete"
+
+_send_qemu_cmd $QEMU_HANDLE '{"execute": "quit"}' 'return'
+
+wait=1 _cleanup_qemu
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/310.out b/tests/qemu-iotests/310.out
new file mode 100644
index 0000000000..32c797ab1e
--- /dev/null
+++ b/tests/qemu-iotests/310.out
@@ -0,0 +1,369 @@
+QA output created by 310
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
+Single disk test
+
+=====  Snapshot single qcow2 image =====
+
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["diskfmt0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "RESUME"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["diskfmt0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "RESUME"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["diskfmt0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot no image =====
+
+Formatting 'TEST_DIR/t.IMGFMT.alt1', fmt=IMGFMT size=134217728
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": []}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0", "error": "No block device supports snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": []}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0", "error": "No block device supports snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": []}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot missing image =====
+
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["diskfmt1729"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0", "error": "No block device node 'diskfmt1729'"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["diskfmt1729"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0", "error": "No block device node 'diskfmt1729'"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["diskfmt1729"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0", "error": "No block device node 'diskfmt1729'"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot protocol instead of format =====
+
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["disk0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0", "error": "Device 'disk0' is writable but does not support snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["disk0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0", "error": "Device 'disk0' is writable but does not support snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["disk0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0", "error": "Device 'disk0' is writable but does not support snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot dual qcow2 image =====
+
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "RESUME"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "RESUME"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot error with raw image =====
+
+Formatting 'TEST_DIR/t.qcow2.alt2', fmt=IMGFMT size=134217728
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1", "diskfmt2"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0", "error": "Device 'diskfmt2' is writable but does not support snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1", "diskfmt2"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0", "error": "Device 'diskfmt2' is writable but does not support snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1", "diskfmt2"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0", "error": "Device 'diskfmt2' is writable but does not support snapshots"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot with raw image excluded =====
+
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "RESUME"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "RESUME"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["diskfmt0", "diskfmt1"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+
+=====  Snapshot bad error reporting to stderr =====
+
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-save", "arguments": {"job-id": "snapsave0", "tag": "snap0", "devices": ["diskfmt0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapsave0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-save", "id": "snapsave0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapsave0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+{"execute": "qmp_capabilities"}
+{"return": {}}
+{"execute": "snapshot-load", "arguments": {"job-id": "snapload0", "tag": "snap0", "devices": ["diskfmt0"]}}
+qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapsave0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "snapload0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "snapload0", "error": "Error -22 while loading VM state"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapload0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapload0"}}
+{"execute": "snapshot-delete", "arguments": {"job-id": "snapdelete0", "tag": "snap0", "devices": ["diskfmt0"]}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "snapload0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "snapdelete0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "snapdelete0"}}
+{"execute": "query-jobs"}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-delete", "id": "snapdelete0"}]}
+{"execute": "job-dismiss", "arguments": {"id": "snapdelete0"}}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "snapdelete0"}}
+{"execute": "quit"}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index a53ea7f78b..b2a5dfbc5a 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -290,6 +290,7 @@
 277 rw quick
 279 rw backing quick
 280 rw migration quick
+310 rw quick
 281 rw quick
 282 rw img quick
 283 auto quick
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/7] migration: improve error reporting of block driver state name
  2020-08-27 11:16 ` [PATCH v3 1/7] migration: improve error reporting of block driver state name Daniel P. Berrangé
@ 2020-08-27 17:17   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2020-08-27 17:17 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, John Snow, qemu-devel, Markus Armbruster,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> With blockdev, a BlockDriverState may not have a device name,
> so using a node name is required as an alternative.
> 
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

Queuing this one by itself since it's useful anyway.

Dave

> ---
>  migration/savevm.c         | 12 ++++++------
>  tests/qemu-iotests/267.out |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index a843d202b5..304d98ff78 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2682,7 +2682,7 @@ int save_snapshot(const char *name, Error **errp)
>  
>      if (!bdrv_all_can_snapshot(&bs)) {
>          error_setg(errp, "Device '%s' is writable but does not support "
> -                   "snapshots", bdrv_get_device_name(bs));
> +                   "snapshots", bdrv_get_device_or_node_name(bs));
>          return ret;
>      }
>  
> @@ -2691,7 +2691,7 @@ int save_snapshot(const char *name, Error **errp)
>          ret = bdrv_all_delete_snapshot(name, &bs1, errp);
>          if (ret < 0) {
>              error_prepend(errp, "Error while deleting snapshot on device "
> -                          "'%s': ", bdrv_get_device_name(bs1));
> +                          "'%s': ", bdrv_get_device_or_node_name(bs1));
>              return ret;
>          }
>      }
> @@ -2766,7 +2766,7 @@ int save_snapshot(const char *name, Error **errp)
>      ret = bdrv_all_create_snapshot(sn, bs, vm_state_size, &bs);
>      if (ret < 0) {
>          error_setg(errp, "Error while creating snapshot on '%s'",
> -                   bdrv_get_device_name(bs));
> +                   bdrv_get_device_or_node_name(bs));
>          goto the_end;
>      }
>  
> @@ -2884,14 +2884,14 @@ int load_snapshot(const char *name, Error **errp)
>      if (!bdrv_all_can_snapshot(&bs)) {
>          error_setg(errp,
>                     "Device '%s' is writable but does not support snapshots",
> -                   bdrv_get_device_name(bs));
> +                   bdrv_get_device_or_node_name(bs));
>          return -ENOTSUP;
>      }
>      ret = bdrv_all_find_snapshot(name, &bs);
>      if (ret < 0) {
>          error_setg(errp,
>                     "Device '%s' does not have the requested snapshot '%s'",
> -                   bdrv_get_device_name(bs), name);
> +                   bdrv_get_device_or_node_name(bs), name);
>          return ret;
>      }
>  
> @@ -2920,7 +2920,7 @@ int load_snapshot(const char *name, Error **errp)
>      ret = bdrv_all_goto_snapshot(name, &bs, errp);
>      if (ret < 0) {
>          error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
> -                      name, bdrv_get_device_name(bs));
> +                      name, bdrv_get_device_or_node_name(bs));
>          goto err_drain;
>      }
>  
> diff --git a/tests/qemu-iotests/267.out b/tests/qemu-iotests/267.out
> index d6d80c099f..215902b3ad 100644
> --- a/tests/qemu-iotests/267.out
> +++ b/tests/qemu-iotests/267.out
> @@ -81,11 +81,11 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
>  Testing: -blockdev driver=file,filename=TEST_DIR/t.IMGFMT,node-name=file
>  QEMU X.Y.Z monitor - type 'help' for more information
>  (qemu) savevm snap0
> -Error: Device '' is writable but does not support snapshots
> +Error: Device 'file' is writable but does not support snapshots
>  (qemu) info snapshots
>  No available block device supports snapshots
>  (qemu) loadvm snap0
> -Error: Device '' is writable but does not support snapshots
> +Error: Device 'file' is writable but does not support snapshots
>  (qemu) quit
>  
>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-08-27 11:16 ` [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands Daniel P. Berrangé
@ 2020-09-01 14:20   ` Markus Armbruster
  2020-09-01 16:47     ` Daniel P. Berrangé
  0 siblings, 1 reply; 19+ messages in thread
From: Markus Armbruster @ 2020-09-01 14:20 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Dr. David Alan Gilbert,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz, John Snow

Daniel P. Berrangé <berrange@redhat.com> writes:

> savevm, loadvm and delvm are some of the few HMP commands that have never
> been converted to use QMP. The primary reason for this lack of conversion
> is that they block execution of the thread for as long as they run.

Nope.  The primary reason is that the HMP interface is bonkers.

> Despite this downside, however, libvirt and applications using libvirt
> have used these commands for as long as QMP has existed, via the
> "human-monitor-command" passthrough command. IOW, while it is clearly
> desirable to be able to fix the blocking problem, this is not an
> immediate obstacle to real world usage.
>
> Meanwhile there is a need for other features which involve adding new
> parameters to the commands. This is possible with HMP passthrough, but
> it provides no reliable way for apps to introspect features, so using
> QAPI modelling is highly desirable.
>
> This patch thus introduces new snapshot-{load,save,delete} commands to
> QMP that are intended to replace the old HMP counterparts. The new
> commands are given different names, because they will be using the new
> QEMU job framework and thus will have diverging behaviour from the HMP
> originals. It would thus be misleading to keep the same name.
>
> While this design uses the generic job framework, the current impl is
> still blocking. The intention that the blocking problem is fixed later.
> None the less applications using these new commands should assume that
> they are asynchronous and thus wait for the job status change event to
> indicate completion.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[...]
> diff --git a/qapi/job.json b/qapi/job.json
> index 280c2f76f1..51bee470f0 100644
> --- a/qapi/job.json
> +++ b/qapi/job.json
> @@ -22,10 +22,17 @@
>  #
>  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
>  #
> +# @snapshot-load: snapshot load job type, see "loadvm" (since 5.2)

Do you mean 'see command @snapshot-load'?

> +#
> +# @snapshot-save: snapshot save job type, see "savevm" (since 5.2)

@snapshot-save?

> +#
> +# @snapshot-delete: snapshot delete job type, see "delvm" (since 5.2)

@snapshot-delete?

> +#
>  # Since: 1.7
>  ##
>  { 'enum': 'JobType',
> -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
> +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
> +           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
>  
>  ##
>  # @JobStatus:
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 5f6b06172c..d70f627b77 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1720,3 +1720,138 @@
>  ##
>  { 'event': 'UNPLUG_PRIMARY',
>    'data': { 'device-id': 'str' } }
> +
> +##
> +# @snapshot-save:
> +#
> +# Save a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to create. If it already
> +# exists it will be replaced.

Sounds a bit dangerous.  Require a force flag for such an overwrite?
Not sure.

> +# @devices: list of block device node names to save a snapshot to
> +# @vmstate: block device node name to save vmstate to

Worth mentioning that omitting writable block devices is probably a bad
idea?

> +#
> +# Applications should not assume that the snapshot save is complete
> +# when this command returns.

Is it complete then with the current code?  I'm asking because such
properties have a way to sneakily become de facto ABI.  We may not be
able to do anything about that now, other than documenting "don't do
that" like you did, but I'd like to understand the state of affairs all
the same.

> +#                            Completion is indicated by the job
> +# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
> +# job aborts, errors can be obtained via the 'query-jobs' command,
> +# though.

Sure we want to document these job basics here?

> +#         Note that at this time most vmstate procssing errors only

Typo: processing

Whatever a "vmstate processing error" is...

> +# get printed to stderr. This limitation will be fixed at a future
> +# date.

Is that a promise?  ;)

> +#
> +# Note that the VM CPUs will be paused during the time it takes to
> +# save the snapshot

End the sentence with a period, please.

> +#
> +# If @devices is not specified, or is an empty list, then the
> +# historical default logic for picking devices will be used.

Why is this useful for QMP?

> +#
> +# If @vmstate is not specified, then the first valid block
> +# device will be used for vmstate.

Why is this useful for QMP?

> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-save",
> +#      "data": {
> +#         "job-id": "snapsave0",
> +#         "tag": "my-snap",
> +#         "vmstate": "disk0",
> +#         "devices": ["disk0", "disk1"]
> +#      }
> +#    }
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'snapshot-save',
> +  'data': { 'job-id': 'str',
> +            'tag': 'str',
> +            '*vmstate': 'str',
> +            '*devices': ['str'] } }
> +
> +##
> +# @snapshot-load:
> +#
> +# Load a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to load.
> +# @devices: list of block device node names to load a snapshot from
> +# @vmstate: block device node name to load vmstate from

Worth mentioning that omitting block devices that may have changed since
the save is probably a bad idea?

> +#
> +# Applications should not assume that the snapshot load is complete
> +# when this command returns. Completion is indicated by the job
> +# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
> +# job aborts, errors can be obtained via the 'query-jobs' command,
> +# though. Note that at this time most vmstate procssing errors only
> +# get printed to stderr. This limitation will be fixed at a future
> +# date.

Comments on snapshot-load apply.

> +#
> +# If @devices is not specified, or is an empty list, then the
> +# historical default logic for picking devices will be used.

Why is this useful for QMP?

> +#
> +# If @vmstate is not specified, then the first valid block
> +# device will be used for vmstate.

Why is this useful for QMP?

A more useful default could be "if exactly one of the block devices being
restored contains a vmstate, use that".

> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-load",
> +#      "data": {
> +#         "job-id": "snapload0",
> +#         "tag": "my-snap",
> +#         "vmstate": "disk0",
> +#         "devices": ["disk0", "disk1"]
> +#      }
> +#    }
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'snapshot-load',
> +  'data': { 'job-id': 'str',
> +            'tag': 'str',
> +            '*vmstate': 'str',
> +            '*devices': ['str'] } }
> +
> +##
> +# @snapshot-delete:
> +#
> +# Delete a VM snapshot
> +#
> +# @job-id: identifier for the newly created job
> +# @tag: name of the snapshot to delete.
> +# @devices: list of block device node names to delete a snapshot from
> +#
> +# Applications should not assume that the snapshot load is complete
> +# when this command returns. Completion is indicated by the job
> +# status. Clients can wait for the JOB_STATUS_CHANGE event.

Comments on snapshot-load apply.

One difference: no "If the job aborts, ..."  Intentional?

> +#
> +# Note that the VM CPUs will be paused during the time it takes to
> +# delete the snapshot
> +#
> +# If @devices is not specified, or is an empty list, then the
> +# historical default logic for picking devices will be used.

Why is this useful for QMP?

> +#
> +# Returns: nothing
> +#
> +# Example:
> +#
> +# -> { "execute": "snapshot-delete",
> +#      "data": {
> +#         "job-id": "snapdelete0",
> +#         "tag": "my-snap",
> +#         "devices": ["disk0", "disk1"]
> +#      }
> +#    }
> +# <- { "return": { } }
> +#
> +# Since: 5.2
> +##
> +{ 'command': 'snapshot-delete',
> +  'data': { 'job-id': 'str',
> +            'tag': 'str',
> +            '*devices': ['str'] } }
[...]



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-01 14:20   ` Markus Armbruster
@ 2020-09-01 16:47     ` Daniel P. Berrangé
  2020-09-02  9:27       ` Markus Armbruster
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-09-01 16:47 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Dr. David Alan Gilbert,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz, John Snow

On Tue, Sep 01, 2020 at 04:20:47PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > savevm, loadvm and delvm are some of the few HMP commands that have never
> > been converted to use QMP. The primary reason for this lack of conversion
> > is that they block execution of the thread for as long as they run.
> 
> Nope.  The primary reason is that the HMP interface is bonkers.

I don't think that's a very helpful description. The HMP interface has
some limitations, but it isn't bonkers - it just doesn't cope with
all the use cases we want. Many people use it successfully without
issue.

> > Despite this downside, however, libvirt and applications using libvirt
> > have used these commands for as long as QMP has existed, via the
> > "human-monitor-command" passthrough command. IOW, while it is clearly
> > desirable to be able to fix the blocking problem, this is not an
> > immediate obstacle to real world usage.
> >
> > Meanwhile there is a need for other features which involve adding new
> > parameters to the commands. This is possible with HMP passthrough, but
> > it provides no reliable way for apps to introspect features, so using
> > QAPI modelling is highly desirable.
> >
> > This patch thus introduces new snapshot-{load,save,delete} commands to
> > QMP that are intended to replace the old HMP counterparts. The new
> > commands are given different names, because they will be using the new
> > QEMU job framework and thus will have diverging behaviour from the HMP
> > originals. It would thus be misleading to keep the same name.
> >
> > While this design uses the generic job framework, the current impl is
> > still blocking. The intention that the blocking problem is fixed later.
> > None the less applications using these new commands should assume that
> > they are asynchronous and thus wait for the job status change event to
> > indicate completion.
> >
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> [...]
> > diff --git a/qapi/job.json b/qapi/job.json
> > index 280c2f76f1..51bee470f0 100644
> > --- a/qapi/job.json
> > +++ b/qapi/job.json
> > @@ -22,10 +22,17 @@
> >  #
> >  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
> >  #
> > +# @snapshot-load: snapshot load job type, see "loadvm" (since 5.2)
> 
> Do you mean 'see command @snapshot-load?

Yes, I guess so.

> 
> > +#
> > +# @snapshot-save: snapshot save job type, see "savevm" (since 5.2)
> 
> @snapshot-save?
> 
> > +#
> > +# @snapshot-delete: snapshot delete job type, see "delvm" (since 5.2)
> 
> @snapshot-delete?
> 
> > +#
> >  # Since: 1.7
> >  ##
> >  { 'enum': 'JobType',
> > -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
> > +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
> > +           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
> >  
> >  ##
> >  # @JobStatus:
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 5f6b06172c..d70f627b77 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -1720,3 +1720,138 @@
> >  ##
> >  { 'event': 'UNPLUG_PRIMARY',
> >    'data': { 'device-id': 'str' } }
> > +
> > +##
> > +# @snapshot-save:
> > +#
> > +# Save a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to create. If it already
> > +# exists it will be replaced.
> 
> Sounds a bit dangerous.  Require a force flag for such an overwrite?
> Not sure.

Yes, replacing is quite likely to be a mistake.

"@force" could mean many things, so "replace-existing: bool" is
probably a clearer name.
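
For illustration, saving with such a flag might look like this (a sketch
only - "replace-existing" is just the suggestion above, not something in
this series):

  {"execute": "snapshot-save",
   "arguments": {"job-id": "snapsave0", "tag": "snap0",
                 "vmstate": "disk0", "devices": ["disk0", "disk1"],
                 "replace-existing": false}}
  {"return": {}}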

> 
> > +# @devices: list of block device node names to save a snapshot to
> > +# @vmstate: block device node name to save vmstate to
> 
> Worth mentioning that omitting writable block devices is probably a bad
> idea?

Sure

> > +#
> > +# Applications should not assume that the snapshot save is complete
> > +# when this command returns.
> 
> Is it complete then with the current code?  I'm asking because such
> properties have a way to sneakily become de facto ABI.  We may not be
> able to do anything about that now, other than documenting "don't do
> that" like you did, but I'd like to understand the state of affairs all
> the same.

Yes, the actual snapshot is synchronous with return of the command.

> 
> > +#                            Completion is indicated by the job
> > +# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
> > +# job aborts, errors can be obtained via the 'query-jobs' command,
> > +# though.
> 
> Sure we want to these job basics here?

This ties in with the previous point. I feel if we don't document
the use of events here, then people are likely to blindly assume
synchronous completion. By explicitly telling them to wait for the
JOB_STATUS_CHANGE they are nudged towards a correct solution that
won't break if it becomes async later.
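
As a sketch of that flow, modelled on the reference output of the new
iotest earlier in this series (the job-id and node names are examples):

  {"execute": "snapshot-save",
   "arguments": {"job-id": "snapsave0", "tag": "snap0",
                 "vmstate": "disk0", "devices": ["disk0"]}}
  {"return": {}}
  ... wait for JOB_STATUS_CHANGE events until "status": "concluded" ...
  {"execute": "query-jobs"}
  {"return": [{"id": "snapsave0", "type": "snapshot-save",
               "status": "concluded", "current-progress": 1,
               "total-progress": 1}]}
  {"execute": "job-dismiss", "arguments": {"id": "snapsave0"}}
  {"return": {}}

If the job failed, the corresponding entry in the query-jobs reply
carries an "error" member, as the test output shows.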

> 
> > +#         Note that at this time most vmstate procssing errors only
> 
> Typo: processing
> 
> Whatever a "vmstate processing error" is...
> 
> > +# get printed to stderr. This limitation will be fixed at a future
> > +# date.
> 
> Is that a promise?  ;)

I don't know when I'll have time, as I've not looked at just how
complex the conversion is. It is *highly* desirable to fix this
otherwise debugging failures is an exercise in extreme pain through
lack of useful information.

> 
> > +#
> > +# Note that the VM CPUs will be paused during the time it takes to
> > +# save the snapshot
> 
> End the sentence with a period, please.
> 
> > +#
> > +# If @devices is not specified, or is an empty list, then the
> > +# historical default logic for picking devices will be used.
> 
> Why is this useful for QMP?
> 
> > +#
> > +# If @vmstate is not specified, then the first valid block
> > +# device will be used for vmstate.
> 
> Why is this useful for QMP?

Both of these make QEMU just "do the right thing" with the majority
of QEMU guest configurations, with no special knowledge needed by
the mgmt app.

It makes it possible for all existing apps to immediately stop using
the loadvm/savevm commands via HMP passthrough, and convert to the
QMP commands.

Without this, applications will need to first convert to use -blockdev
before they can use the load-snapshot/save-snapshot commands, because
the devices are specified exclusively using blockdev node names, not
the legacy drive IDs. I didn't want to make blockdev a mandatory
dependency unless apps want to opt in to the fine-grained control
over disk choices.
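
For example, an app that has already opted in to -blockdev knows its own
node names and can pass them straight through (node names and file name
here are illustrative):

  -blockdev driver=file,filename=disk0.qcow2,node-name=disk0-file \
  -blockdev driver=qcow2,file=disk0-file,node-name=disk0

  {"execute": "snapshot-save",
   "arguments": {"job-id": "snapsave0", "tag": "snap0",
                 "vmstate": "disk0", "devices": ["disk0"]}}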


> > +##
> > +# @snapshot-load:
> > +#
> > +# Load a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to load.
> > +# @devices: list of block device node names to load a snapshot from
> > +# @vmstate: block device node name to load vmstate from
> 
> Worth mentioning that omitting block devices that may have changed since
> the save is probably a bad idea?

Yep.

> 
> > +#
> > +# Applications should not assume that the snapshot load is complete
> > +# when this command returns. Completion is indicated by the job
> > +# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
> > +# job aborts, errors can be obtained via the 'query-jobs' command,
> > +# though. Note that at this time most vmstate procssing errors only
> > +# get printed to stderr. This limitation will be fixed at a future
> > +# date.
> 
> Comments on snapshot-load apply.
> 
> > +#
> > +# If @devices is not specified, or is an empty list, then the
> > +# historical default logic for picking devices will be used.
> 
> Why is this useful for QMP?
> 
> > +#
> > +# If @vmstate is not specified, then the first valid block
> > +# device will be used for vmstate.
> 
> Why is this useful for QMP?
> 
> A more useful default could be "if exactly one the block devices being
> restored contains a vmstate, use that".

I feel it is more important to be symmetric with save-snapshot, i.e. if you
supply or omit the same args for save-snapshot and load-snapshot, you
know both will work, or neither will work. You don't get into a situation
where you can successfully save the snapshot, but not restore it.


> > +##
> > +# @snapshot-delete:
> > +#
> > +# Delete a VM snapshot
> > +#
> > +# @job-id: identifier for the newly created job
> > +# @tag: name of the snapshot to delete.
> > +# @devices: list of block device node names to delete a snapshot from
> > +#
> > +# Applications should not assume that the snapshot load is complete
> > +# when this command returns. Completion is indicated by the job
> > +# status. Clients can wait for the JOB_STATUS_CHANGE event.
> 
> Comments on snapshot-load apply.
> 
> One difference: no "If the job aborts, ..."  Intentional?

I guess it can abort if the file is corrupt perhaps. Generally
though, if the named snapshot doesn't exist in the block device, it
is considered success, not an error.
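
As an illustration of that behaviour (not taken from the test output),
deleting a tag that is absent would be expected to conclude without an
"error" member in query-jobs:

  {"execute": "snapshot-delete",
   "arguments": {"job-id": "snapdelete0", "tag": "no-such-snap",
                 "devices": ["disk0"]}}
  {"return": {}}
  {"execute": "query-jobs"}
  {"return": [{"id": "snapdelete0", "type": "snapshot-delete",
               "status": "concluded", "current-progress": 1,
               "total-progress": 1}]}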


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-01 16:47     ` Daniel P. Berrangé
@ 2020-09-02  9:27       ` Markus Armbruster
  2020-09-02 11:05         ` Daniel P. Berrangé
  0 siblings, 1 reply; 19+ messages in thread
From: Markus Armbruster @ 2020-09-02  9:27 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Dr. David Alan Gilbert,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz, John Snow

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Tue, Sep 01, 2020 at 04:20:47PM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > savevm, loadvm and delvm are some of the few HMP commands that have never
>> > been converted to use QMP. The primary reason for this lack of conversion
>> > is that they block execution of the thread for as long as they run.
>> 
>> Nope.  The primary reason is that the HMP interface is bonkers.
>
> I don't think that's very helpful description. The HMP interface has
> some limitations, but it isn't bonkers - it just doesn't cope with
> all the use cases we want. Many people use it succesfully without
> issue

It's non-bonkers for the case it was designed for: one disk backed by
QCOW2, plus maybe a CD-ROM.  The user is responsible for ensuring the
CD-ROM's media at loadvm time matches the one at savevm time.  The user
is further responsible for ensuring the guest-visible configuration
matches the one at savevm time.  No worse than migration.

It becomes useless as soon as you have writable non-QCOW2 block devices.
"Stop machine, take external snapshots, savevm, restart machine" should
work, but doesn't.

It becomes bonkers as soon as you have more than one QCOW2: which one
receives the system state?  It depends on the order in which they got
configured or some craziness like that.  Undocumented of course.

But "bonkers" is opinion, and we don't have to agree on opinions, only
on facts.

Fact: I did not reject attempts to bring savevm & friends to QMP because
they block.  I'm not rejecting your attempt because they block.

Fact: I did reject attempts because I considered their QMP interface
bonkers.  I do not consider your proposed interface bonkers (some
defaults I find dubious, but I'm sure we'll figure out what to do
there).

Therefore, the commit message's claim "the primary reason for this lack
of conversion is that they block execution of the thread for as long as
they run" is factually wrong.

If you'd rather not write "because they are bonkers", that's fine; I'm
confident we can find phrasing you like that is also correct.

>> > Despite this downside, however, libvirt and applications using libvirt
>> > have used these commands for as long as QMP has existed, via the
>> > "human-monitor-command" passthrough command. IOW, while it is clearly
>> > desirable to be able to fix the blocking problem, this is not an
>> > immediate obstacle to real world usage.
>> >
>> > Meanwhile there is a need for other features which involve adding new
>> > parameters to the commands. This is possible with HMP passthrough, but
>> > it provides no reliable way for apps to introspect features, so using
>> > QAPI modelling is highly desirable.
>> >
>> > This patch thus introduces new snapshot-{load,save,delete} commands to
>> > QMP that are intended to replace the old HMP counterparts. The new
>> > commands are given different names, because they will be using the new
>> > QEMU job framework and thus will have diverging behaviour from the HMP
>> > originals. It would thus be misleading to keep the same name.
>> >
>> > While this design uses the generic job framework, the current impl is
>> > still blocking. The intention that the blocking problem is fixed later.
>> > None the less applications using these new commands should assume that
>> > they are asynchronous and thus wait for the job status change event to
>> > indicate completion.
>> >
>> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>> [...]
>> > diff --git a/qapi/job.json b/qapi/job.json
>> > index 280c2f76f1..51bee470f0 100644
>> > --- a/qapi/job.json
>> > +++ b/qapi/job.json
>> > @@ -22,10 +22,17 @@
>> >  #
>> >  # @amend: image options amend job type, see "x-blockdev-amend" (since 5.1)
>> >  #
>> > +# @snapshot-load: snapshot load job type, see "loadvm" (since 5.2)
>> 
>> Do you mean 'see command @snapshot-load?
>
> Yes, I guess so.

Please write it that way then.

>> > +#
>> > +# @snapshot-save: snapshot save job type, see "savevm" (since 5.2)
>> 
>> @snapshot-save?
>> 
>> > +#
>> > +# @snapshot-delete: snapshot delete job type, see "delvm" (since 5.2)
>> 
>> @snapshot-delete?
>> 
>> > +#
>> >  # Since: 1.7
>> >  ##
>> >  { 'enum': 'JobType',
>> > -  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend'] }
>> > +  'data': ['commit', 'stream', 'mirror', 'backup', 'create', 'amend',
>> > +           'snapshot-load', 'snapshot-save', 'snapshot-delete'] }
>> >  
>> >  ##
>> >  # @JobStatus:
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index 5f6b06172c..d70f627b77 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -1720,3 +1720,138 @@
>> >  ##
>> >  { 'event': 'UNPLUG_PRIMARY',
>> >    'data': { 'device-id': 'str' } }
>> > +
>> > +##
>> > +# @snapshot-save:
>> > +#
>> > +# Save a VM snapshot
>> > +#
>> > +# @job-id: identifier for the newly created job
>> > +# @tag: name of the snapshot to create. If it already
>> > +# exists it will be replaced.
>> 
>> Sounds a bit dangerous.  Require a force flag for such an overwrite?
>> Not sure.
>
> Yes, replacing is quite likely to be a mistake.
>
> "@force" could mean many things, so "replace-existing: bool" is
> probably a clearer name.

Yes.

>> > +# @devices: list of block device node names to save a snapshot to
>> > +# @vmstate: block device node name to save vmstate to
>> 
>> Worth mentioning that omitting writable block devices is probably a bad
>> idea?
>
> Sure
>
>> > +#
>> > +# Applications should not assume that the snapshot save is complete
>> > +# when this command returns.
>> 
>> Is it complete then with the current code?  I'm asking because such
>> properties have a way to sneakily become de facto ABI.  We may not be
>> able to do anything about that now, other than documenting "don't do
>> that" like you did, but I'd like to understand the state of affairs all
>> the same.
>
> Yes, the actual snapshot is synchronous with return of the command.

Are there any failure modes where the command succeeds, and query-jobs
shows the error?
>
>> 
>> > +#                            Completion is indicated by the job
>> > +# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
>> > +# job aborts, errors can be obtained via the 'query-jobs' command,
>> > +# though.
>> 
>> Sure we want to these job basics here?
>
> This ties in with the previous point. If feel if we don't document
> the use of events here, then people are likely to blindly assume
> synchronous completion. By explicitly telling them to wait for the
> JOB_STATUS_CHANGE they are nudged towards a correct solution that
> won't break if it becomes async later.

No objection.

>> 
>> > +#         Note that at this time most vmstate procssing errors only
>> 
>> Typo: processing
>> 
>> Whatever a "vmstate processing error" is...
>> 
>> > +# get printed to stderr. This limitation will be fixed at a future
>> > +# date.
>> 
>> Is that a promise?  ;)
>
> I don't know when I'll have time, as I've not looked at just how
> complex the conversion is. It is *highly* desirable to fix this
> otherwise debugging failures is an exercise in extreme pain through
> lack of useful information.
>
>> 
>> > +#
>> > +# Note that the VM CPUs will be paused during the time it takes to
>> > +# save the snapshot
>> 
>> End the sentence with a period, please.
>> 
>> > +#
>> > +# If @devices is not specified, or is an empty list, then the
>> > +# historical default logic for picking devices will be used.
>> 
>> Why is this useful for QMP?
>> 
>> > +#
>> > +# If @vmstate is not specified, then the first valid block
>> > +# device will be used for vmstate.
>> 
>> Why is this useful for QMP?
>
> Both of these makes QEMU just "do the right thing" with the majority
> of QEMU guest configurations with no special knowledge needed by
> the mgmt app.
>
> It makes it possible for all existing apps to immediately stop using
> the loadvm/savevm commands via HMP passthrough, and convert to the
> QMP commands.

I appreciate your concern for easy migration from HMP savevm/loadvm to
QMP.  I'm unwilling to permanently screw up the QMP interface for it,
though.

> Without this, applications will need to first convert to use -blockdev
> before they can use the load-snapshot/save-snapshot commands, because
> the devices are specified exclusively using blockdev node names, not
> the legacy drive IDs.

Not true.  *Every* block node has a node name.  If the user doesn't
specify one, the system makes one up.  query-named-block-nodes shows
it.  So does query-block.

>                       I didn't want to make blockdev a mandatory
> dependancy unless apps want to opt-in to the fine grained control
> over disk choices

I guess I could accept defaulting @devices to all capable ones.  That's
what the "historical default logic for picking devices" does.  The
documentation should be improved to actually explain what the default
does, though.

I object to defaulting @vmstate to "first valid block device".  I don't
think requiring applications be explicit on where the vmstate needs to
go is too much of a burden.  If you assure me it is, I'll reconsider.

>> > +##
>> > +# @snapshot-load:
>> > +#
>> > +# Load a VM snapshot
>> > +#
>> > +# @job-id: identifier for the newly created job
>> > +# @tag: name of the snapshot to load.
>> > +# @devices: list of block device node names to load a snapshot from
>> > +# @vmstate: block device node name to load vmstate from
>> 
>> Worth mentioning that omitting block devices that may have changed since
>> the save is probably a bad idea?
>
> Yep.
>
>> 
>> > +#
>> > +# Applications should not assume that the snapshot load is complete
>> > +# when this command returns. Completion is indicated by the job
>> > +# status. Clients can wait for the JOB_STATUS_CHANGE event. If the
>> > +# job aborts, errors can be obtained via the 'query-jobs' command,
>> > +# though. Note that at this time most vmstate procssing errors only
>> > +# get printed to stderr. This limitation will be fixed at a future
>> > +# date.
>> 
>> Comments on snapshot-load apply.
>> 
>> > +#
>> > +# If @devices is not specified, or is an empty list, then the
>> > +# historical default logic for picking devices will be used.
>> 
>> Why is this useful for QMP?
>> 
>> > +#
>> > +# If @vmstate is not specified, then the first valid block
>> > +# device will be used for vmstate.
>> 
>> Why is this useful for QMP?
>> 
>> A more useful default could be "if exactly one the block devices being
>> restored contains a vmstate, use that".
>
> I feel it is more important to be symetric with save-snapshot.  ie if you
> supply or omit the same args for save-snapshot and load-snapshot, you
> know both will work, or neither will work. You dont get into a situation
> where you can succesfully save the snapshot, but not restore it.

For a value of "successful".

loadvm is unsafe unless guest-visible configuration and block device
contents match.

loadvm can make block device contents match for the block devices that
were snapshot by savevm.  For any others, it's the user's
responsibility.

If you have the exact same set of QCOW2 as you had at savevm time, and
didn't mess them up meanwhile, then exactly one of them has the vmstate
named @tag.  This is the sane way to use savevm/loadvm.

But you can also use it in other ways, and some of them even work.

If you remove QCOW2s other than the one holding the vmstate from the
set, you still got exactly one vmstate.  Aside: for safe loadvm, you
must replace these QCOW2s, and their replacements must provide the same
contents.

If you remove the one holding the vmstate, you got exactly no vmstate,
and loadvm won't work regardless of default.  D'oh!

If you add QCOW2s that don't have a vmstate named @tag, you still got
exactly one vmstate.  Aside: for safe loadvm, these guys must replace
non-QCOW2s, and they must provide the same contents.  Guests are more
likely to survive sneak addition of hardware than sneak removal,
however.

If you add QCOW2s that have a vmstate named @tag, you got multiple
vmstates.  This is the only situation where your lax "default to first
vmstate found, else error" makes loadvm "work" where "default to the
only vmstate if there is exactly one, else error" doesn't.

In my opinion, having the system silently pick one of the multiple
vmstates named @tag is as dangerous as it is unhelpful.  Even if you can
"control" the pick by ordering your command line and / or monitor
commands the right way.

>> > +##
>> > +# @snapshot-delete:
>> > +#
>> > +# Delete a VM snapshot
>> > +#
>> > +# @job-id: identifier for the newly created job
>> > +# @tag: name of the snapshot to delete.
>> > +# @devices: list of block device node names to delete a snapshot from
>> > +#
>> > +# Applications should not assume that the snapshot load is complete
>> > +# when this command returns. Completion is indicated by the job
>> > +# status. Clients can wait for the JOB_STATUS_CHANGE event.
>> 
>> Comments on snapshot-load apply.
>> 
>> One difference: no "If the job aborts, ..."  Intentional?
>
> I guess it can abort if the file is corrupt perhaps. Generally
> thogh if the named snapshot doesnt exist in the block device, it
> is considered success, not an error.

I'd keep the "applications should" admonishments exactly the same for
all three commands.  Suggestion, not demand.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-02  9:27       ` Markus Armbruster
@ 2020-09-02 11:05         ` Daniel P. Berrangé
  2020-09-03  9:48           ` Markus Armbruster
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-09-02 11:05 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Dr. David Alan Gilbert,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz, John Snow

On Wed, Sep 02, 2020 at 11:27:17AM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Tue, Sep 01, 2020 at 04:20:47PM +0200, Markus Armbruster wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> 
> >> > savevm, loadvm and delvm are some of the few HMP commands that have never
> >> > been converted to use QMP. The primary reason for this lack of conversion
> >> > is that they block execution of the thread for as long as they run.
> >> 
> >> Nope.  The primary reason is that the HMP interface is bonkers.
> >
> > I don't think that's very helpful description. The HMP interface has
> > some limitations, but it isn't bonkers - it just doesn't cope with
> > all the use cases we want. Many people use it succesfully without
> > issue
> 
> It's non-bonkers for the case it was designed for: one disk backed by
> QCOW2, plus maybe a CD-ROM.  The user is responsible for ensuring the
> CD-ROM's media at loadvm time matches the one at savevm time.  The user
> is further responsible for ensuring the guest-visible configuration
> matches the one at savevm time.  No worse than migration.

It is fine for multiple writable disks too, if they're all qcow2 backed.

> It becomes useless as soon as you have writable non-QCOW2 block devices.
> "Stop machine, take external snapshots, savevm, restart machine" should
> work, but doesn't.

External snapshots can be made to work today if you're willing to pause
the VM CPUs while you take the external snapshots, and run the migration
to capture VM state. The main reason apps like Boxes / Virt-manager don't
use external snapshots is that they require more complicated decision
making. For each type of storage you're dealing with you have a potentially
different way to manage the external snapshot. Boxes took the view that it
would only support qcow2 to avoid making those decisions. virt-manager has
support for all types of storage and simply doesn't want to add the complex
logic to deal with non-qcow2. 
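
As a rough sketch of that workflow with existing QMP commands (the node
name, file paths and the use of "exec:" migration are all illustrative):

  {"execute": "stop"}
  {"execute": "blockdev-snapshot-sync",
   "arguments": {"node-name": "disk0",
                 "snapshot-file": "/path/overlay.qcow2",
                 "format": "qcow2"}}
  {"execute": "migrate",
   "arguments": {"uri": "exec:cat > /path/vmstate.bin"}}
  ... wait for the migration to complete ...
  {"execute": "cont"}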

> It becomes bonkers as soon as you have more than one QCOW2: which one
> receives the system state?  It depends on the order in which they got
> configured or some craziness like that.  Undocumented of course.

Saying it is the first configured disk was not that crazy. This has been
a pretty well defined semantic historically, as disks are usually added
to QEMU in a predictable order, because that in turn controls how QEMU
assigns addresses, and thus the order in which guest OS detect them.

The introduction of use of the block layer for UEFI storage really
threw a spanner in the works from QEMU's side. Even then the semantics
aren't bad from a app's POV - first disk is still well defined as a
concept, as UEFI vars isn't really considered a disk from an app POV.

> Therefore, the commit message's claim "the primary reason for this lack
> of conversion is that they block execution of the thread for as long as
> they run" is factually wrong.

That part of the commit message is referring to the original reason for
not porting loadvm/savevm when QMP was first created 10+ years ago, which
was primarily due to its blocking nature. The issues around disk selection
post-date that. I'll modify it to describe both issues.


> >> > +#
> >> > +# Applications should not assume that the snapshot save is complete
> >> > +# when this command returns.
> >> 
> >> Is it complete then with the current code?  I'm asking because such
> >> properties have a way to sneakily become de facto ABI.  We may not be
> >> able to do anything about that now, other than documenting "don't do
> >> that" like you did, but I'd like to understand the state of affairs all
> >> the same.
> >
> > Yes, the actual snapshot is synchronous with return of the command.
> 
> Are there any failure modes where the command succeeds, and query-jobs
> shows the error?

As implemented in this series, the commands always succeed with the
errors only ever reported via query-jobs. So for error detection
you are already forced to use  the job framework.

If you want to see some examples, take a look at the very last patch
in the series which introduces I/O test 310. The test covers various
success and failure scenarios, so you can see the behaviour we are
currently generating with this series.
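
For instance, the failure case in that test only surfaces the error via
query-jobs, along these lines:

  {"execute": "query-jobs"}
  {"return": [{"current-progress": 1, "status": "concluded",
               "total-progress": 1, "type": "snapshot-load",
               "id": "snapload0",
               "error": "Error -22 while loading VM state"}]}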


> >> > +#
> >> > +# If @vmstate is not specified, then the first valid block
> >> > +# device will be used for vmstate.
> >> 
> >> Why is this useful for QMP?
> >
> > Both of these makes QEMU just "do the right thing" with the majority
> > of QEMU guest configurations with no special knowledge needed by
> > the mgmt app.
> >
> > It makes it possible for all existing apps to immediately stop using
> > the loadvm/savevm commands via HMP passthrough, and convert to the
> > QMP commands.
> 
> I appreciate your concern for easy migration from HMP savevm/loadvm to
> QMP.  I'm unwilling to permanently screw up the QMP interface for it,
> though.
> 
> > Without this, applications will need to first convert to use -blockdev
> > before they can use the load-snapshot/save-snapshot commands, because
> > the devices are specified exclusively using blockdev node names, not
> > the legacy drive IDs.
> 
> Not true.  *Every* block node has a node name.  If the user doesn't
> specify one, the system makes one up.  query-named-block-nodes shows
> it.  So does query-block.

query-named-block-nodes isn't friendly as a way to look up node names,
as it doesn't include the "id" value for the original -drive, so
correlating block nodes to drives is not straightforward.  query-block
does seem a bit better in that respect.

So if an app is currently using loadvm/savevm with human_monitor_command,
and is using -drive, it has to go through a mapping process to figure
out node names. Not especially friendly if it was happy with the
current choice of disks QEMU makes by default.
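
A sketch of that mapping step via query-block (output heavily trimmed,
and the values shown are illustrative):

  {"execute": "query-block"}
  {"return": [{"device": "drive0",
               "inserted": {"node-name": "#block153",
                            "drv": "qcow2",
                            "file": "disk0.qcow2"}}]}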

> >                       I didn't want to make blockdev a mandatory
> > dependancy unless apps want to opt-in to the fine grained control
> > over disk choices
> 
> I guess I could accept defaulting @devices to all capable ones.  That's
> what the "historical default logic for picking devices" does.  The
> documentation should be improved to actually explain what the default
> does, though.
> 
> I object to defaulting @vmstate to "first valid block device".  I don't
> think requiring applications be explicit on where the vmstate needs to
> go is too much of a burden.  If you assure me it is, I'll reconsider.

Both parameters have the same issue wrt needing to have the blockdev
node names. So once you have to specify one parameter, you might as
well specify both. IOW, either we make both optional with defaults, or
both mandatory with no default. I don't think it makes sense to mix.


> >> > +#
> >> > +# If @devices is not specified, or is an empty list, then the
> >> > +# historical default logic for picking devices will be used.
> >> 
> >> Why is this useful for QMP?
> >> 
> >> > +#
> >> > +# If @vmstate is not specified, then the first valid block
> >> > +# device will be used for vmstate.
> >> 
> >> Why is this useful for QMP?
> >> 
> >> A more useful default could be "if exactly one the block devices being
> >> restored contains a vmstate, use that".
> >
> > I feel it is more important to be symetric with save-snapshot.  ie if you
> > supply or omit the same args for save-snapshot and load-snapshot, you
> > know both will work, or neither will work. You dont get into a situation
> > where you can succesfully save the snapshot, but not restore it.
> 
> For a value of "successful".
> 
> loadvm is unsafe unless guest-visible configuration and block device
> contents match.
> 
> loadvm can make block device contents match for the block devices that
> were snapshot by savevm.  For any others, it's the user's
> responsibility.

Yep, this is likely another thing that should be documented for the QMP
commands. The VM frontend config must be identical across load/save,
otherwise VM state will fail to restore. Disk backends should be the
same too unless you're specifying an explicit list of node names.

> If you have the exact same set of QCOW2 as you had at savevm time, and
> didn't mess them up meanwhile, then exactly one of them has the vmstate
> named @tag.  This is the sane way to use savem/loadvm.

Yep.

> But you can also use it in other ways, and some of them even work.
> 
> If you remove QCOW2s other than the one holding the vmstate from the
> set, you still got exactly one vmstate.  Aside: for safe loadvm, you
> must replace these QCOW2s, and their replacements must provide the same
> contents.
> 
> If you remove the one holding the vmstate, you got exactly no vmstate,
> and loadvm won't work regardless of default.  D'oh!
> 
> If you add QCOW2s that don't have a vmstate named @tag, you still got
> exactly one vmstate.  Aside: for safe loadvm, these guys must replace
> non-QCOW2s, and they must provide the same contents.  Guests are more
> likely to survive sneak addition of hardware than sneak removal,
> however.
> 
> If you add QCOW2s that have a vmstage named @tag, you got multiple
> vmstates.  This is the only situation where your lax "default to first
> vmstate found, else error" makes loadvm "work" where "default to the
> only vmstate if is exactly one, else error" doesn't.
> 
> In my opinion, having the system silently pick one of the multiple
> vmstates named @tag is as dangerous as it is unhelpful.  Even if you can
> "control" the pick by ordering your command line and / or monitor
> commands the right way.

I think it would be reasonable to make the existence of multiple
vmstates on different disks be a hard error both for QMP and for
the existing HMP commands, if not specifying an explicit device,
as it's almost certainly an admin/mgmt app screw-up.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-02 11:05         ` Daniel P. Berrangé
@ 2020-09-03  9:48           ` Markus Armbruster
  2020-09-03 10:17             ` Kevin Wolf
  0 siblings, 1 reply; 19+ messages in thread
From: Markus Armbruster @ 2020-09-03  9:48 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Dr. David Alan Gilbert,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz, John Snow

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Sep 02, 2020 at 11:27:17AM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > On Tue, Sep 01, 2020 at 04:20:47PM +0200, Markus Armbruster wrote:
>> >> Daniel P. Berrangé <berrange@redhat.com> writes:
>> >> 
>> >> > savevm, loadvm and delvm are some of the few HMP commands that have never
>> >> > been converted to use QMP. The primary reason for this lack of conversion
>> >> > is that they block execution of the thread for as long as they run.
>> >> 
>> >> Nope.  The primary reason is that the HMP interface is bonkers.
>> >
>> > I don't think that's very helpful description. The HMP interface has
>> > some limitations, but it isn't bonkers - it just doesn't cope with
>> > all the use cases we want. Many people use it succesfully without
>> > issue
>> 
>> It's non-bonkers for the case it was designed for: one disk backed by
>> QCOW2, plus maybe a CD-ROM.  The user is responsible for ensuring the
>> CD-ROM's media at loadvm time matches the one at savevm time.  The user
>> is further responsible for ensuring the guest-visible configuration
>> matches the one at savevm time.  No worse than migration.
>
> It is fine for multiple writable disks too, if they're all qcow2 backed.
>
>> It becomes useless as soon as you have writable non-QCOW2 block devices.
>> "Stop machine, take external snapshots, savevm, restart machine" should
>> work, but doesn't.
>
> External snapshots can be made to work today if you're willing to pause
> the VM CPUs while you take the external snapshots, and run the migration
> to capture VM state.

Yes.

Pausing is necessary because we don't have a way to take snapshots when
migration completes.

Even if we had it, live migration to file would be a problematic way to
capture VM state, because it doesn't update changing state in place.

savevm is not live either, of course.

My point is that savevm is "QCOW2 or bust" unnecessarily.  Your proposed
QMP interface isn't.

>                      The main reason apps like Boxes / Virt-manager don't
> use external snapshots is that they require more complicated decision
> making. For each type of storage you're dealing with you have a potentially
> different way to manage the external snapshot. Boxes took the view that it
> would only support qcow2 to avoid making those decisions. virt-manager has
> support for all types of storage and simply doesn't want to add the complex
> logic to deal with non-qcow2. 
>
>> It becomes bonkers as soon as you have more than one QCOW2: which one
>> receives the system state?  It depends on the order in which they got
>> configured or some craziness like that.  Undocumented of course.
>
> Saying it is the first configured disk was not that crazy. This has been
> a pretty well defined semantic historically, as disks are usually added
> to QEMU in a predictable order, because that in turn controls how QEMU
> assigns addresses, and thus the order in which guest OS detect them.
>
> The introduction of the block layer for UEFI storage really
> threw a spanner in the works from QEMU's side. Even then the semantics
> aren't bad from an app's POV - first disk is still well defined as a
> concept, as UEFI vars isn't really considered a disk from an app POV.
>
>> Therefore, the commit message's claim "the primary reason for this lack
>> of conversion is that they block execution of the thread for as long as
>> they run" is factually wrong.
>
> That part of the commit message is referring to the original reason for
> not porting loadvm/savevm when QMP was first created 10+ years ago, which
> was primarily due to its blocking nature. The issues around disk selection
> post-date that. I'll modify it to describe both issues.
>
>
>> >> > +#
>> >> > +# Applications should not assume that the snapshot save is complete
>> >> > +# when this command returns.
>> >> 
>> >> Is it complete then with the current code?  I'm asking because such
>> >> properties have a way to sneakily become de facto ABI.  We may not be
>> >> able to do anything about that now, other than documenting "don't do
>> >> that" like you did, but I'd like to understand the state of affairs all
>> >> the same.
>> >
>> > Yes, the actual snapshot is synchronous with return of the command.
>> 
>> Are there any failure modes where the command succeeds, and query-jobs
>> shows the error?
>
> As implemented in this series, the commands always succeed with the
> errors only ever reported via query-jobs. So for error detection
> you are already forced to use the job framework.

This is what I hoped for when I asked.  It makes misuse of the interface
unlikely to survive basic testing.  Pretty much eliminates the risk of
misuse becoming de facto ABI over time.
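
For concreteness, the kind of exchange a client would have (a sketch
based on my reading of the series; the member names follow the proposed
schema, while the values and the error string are made up):

  -> { "execute": "snapshot-save",
       "arguments": { "job-id": "snapsave0",
                      "tag": "my-snap",
                      "vmstate": "disk0",
                      "devices": [ "disk0", "disk1" ] } }
  <- { "return": {} }
  ... later, once the job has finished ...
  -> { "execute": "query-jobs" }
  <- { "return": [ { "id": "snapsave0",
                     "type": "snapshot-save",
                     "status": "concluded",
                     "error": "Failed to save VM state" } ] }

A client that never polls query-jobs (or never waits for the
JOB_STATUS_CHANGE events) simply never learns whether the snapshot
worked, which is hard to miss in testing.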

> If you want to see some examples, take a look at the very last patch
> in the series which introduces I/O test 310. The test covers various
> success and failure scenarios, so you can see the behaviour we are
> currently generating with this series.
>
>
>> >> > +#
>> >> > +# If @vmstate is not specified, then the first valid block
>> >> > +# device will be used for vmstate.
>> >> 
>> >> Why is this useful for QMP?
>> >
>> > Both of these make QEMU just "do the right thing" with the majority
>> > of QEMU guest configurations with no special knowledge needed by
>> > the mgmt app.
>> >
>> > It makes it possible for all existing apps to immediately stop using
>> > the loadvm/savevm commands via HMP passthrough, and convert to the
>> > QMP commands.
>> 
>> I appreciate your concern for easy migration from HMP savevm/loadvm to
>> QMP.  I'm unwilling to permanently screw up the QMP interface for it,
>> though.
>> 
>> > Without this, applications will need to first convert to use -blockdev
>> > before they can use the load-snapshot/save-snapshot commands, because
>> > the devices are specified exclusively using blockdev node names, not
>> > the legacy drive IDs.
>> 
>> Not true.  *Every* block node has a node name.  If the user doesn't
>> specify one, the system makes one up.  query-named-block-nodes shows
>> it.  So does query-block.
>
> query-named-block-nodes isn't friendly as a way to look up node names,
> as it doesn't include the "id" value for the original -drive, so
> correlating block nodes to drives is not straightforward.  query-block
> does seem a bit better in that respect.
>
> So if an app is currently using loadvm/savevm with human_monitor_command,
> and is using -drive, they have to go through a mapping process to figure
> out node names. Not especially friendly if they were happy with the
> current choice of disks QEMU makes by default. 

I take "not especially friendly to clients converting from (bonkers) HMP
commands" over "similarly bonkers" any day.  Okay, I'm exaggerating for
effect, your proposed interface is much better even with this default.
I do hate this default, though.

QMP is (meant to be) explicit and non-magic.

If having to map from drive ID to node-name really is too much of a
burden, we can look for ways to make it easier, or we can make savevm
optionally accept drive IDs instead of node-names, like we do in several
other places for backward compatibility.
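
For what it's worth, the mapping an application needs is already visible
in query-block output; roughly (output trimmed, and the node name shown
is just an example of an auto-generated one):

  -> { "execute": "query-block" }
  <- { "return": [ { "device": "ide0-hd0",
                     "inserted": { "node-name": "#block154",
                                   "drv": "qcow2",
                                   "file": "guest.qcow2" } } ] }

i.e. one lookup per drive ID to learn the node-name to pass to the new
commands.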

>> >                       I didn't want to make blockdev a mandatory
>> > dependency unless apps want to opt in to the fine-grained control
>> > over disk choices
>> 
>> I guess I could accept defaulting @devices to all capable ones.  That's
>> what the "historical default logic for picking devices" does.  The

Correction: to all writable ones.

>> documentation should be improved to actually explain what the default
>> does, though.
>> 
>> I object to defaulting @vmstate to "first valid block device".  I don't
>> think requiring applications be explicit on where the vmstate needs to
>> go is too much of a burden.  If you assure me it is, I'll reconsider.
>
> Both parameters have the same issue wrt needing to have the blockdev
> node names. So once you have to specify one parameter, you might as
> well specify both. IOW, either we make both optional with defaults, or
> both mandatory with no default. I don't think it makes sense to mix.
>
>
>> >> > +#
>> >> > +# If @devices is not specified, or is an empty list, then the
>> >> > +# historical default logic for picking devices will be used.
>> >> 
>> >> Why is this useful for QMP?
>> >> 
>> >> > +#
>> >> > +# If @vmstate is not specified, then the first valid block
>> >> > +# device will be used for vmstate.
>> >> 
>> >> Why is this useful for QMP?
>> >> 
>> >> A more useful default could be "if exactly one of the block devices being
>> >> restored contains a vmstate, use that".
>> >
>> > I feel it is more important to be symmetric with save-snapshot.  ie if you
>> > supply or omit the same args for save-snapshot and load-snapshot, you
>> > know both will work, or neither will work. You don't get into a situation
>> > where you can successfully save the snapshot, but not restore it.
>> 
>> For a value of "successful".
>> 
>> loadvm is unsafe unless guest-visible configuration and block device
>> contents match.
>> 
>> loadvm can make block device contents match for the block devices that
>> were snapshot by savevm.  For any others, it's the user's
>> responsibility.
>
> Yep, this is likely another thing that should be documented for the QMP
> commands. The VM frontend config must be identical across load/save,
> otherwise VM state will fail to restore. Disk backends should be the
> same too unless you're specifying an explicit list of node names.

Yes, please.

>> If you have the exact same set of QCOW2 as you had at savevm time, and
>> didn't mess them up meanwhile, then exactly one of them has the vmstate
>> named @tag.  This is the sane way to use savevm/loadvm.
>
> Yep.
>
>> But you can also use it in other ways, and some of them even work.
>> 
>> If you remove QCOW2s other than the one holding the vmstate from the
>> set, you still got exactly one vmstate.  Aside: for safe loadvm, you
>> must replace these QCOW2s, and their replacements must provide the same
>> contents.
>> 
>> If you remove the one holding the vmstate, you got exactly no vmstate,
>> and loadvm won't work regardless of default.  D'oh!
>> 
>> If you add QCOW2s that don't have a vmstate named @tag, you still got
>> exactly one vmstate.  Aside: for safe loadvm, these guys must replace
>> non-QCOW2s, and they must provide the same contents.  Guests are more
>> likely to survive sneak addition of hardware than sneak removal,
>> however.
>> 
>> If you add QCOW2s that have a vmstate named @tag, you got multiple
>> vmstates.  This is the only situation where your lax "default to first
>> vmstate found, else error" makes loadvm "work" where "default to the
>> only vmstate if there is exactly one, else error" doesn't.
>> 
>> In my opinion, having the system silently pick one of the multiple
>> vmstates named @tag is as dangerous as it is unhelpful.  Even if you can
>> "control" the pick by ordering your command line and / or monitor
>> commands the right way.
>
> I think it would be reasonable to make the existence of multiple
> vmstates on different disks be a hard error both for QMP and for
> the existing HMP commands, if not specifying an explicit device,
> as it's almost certainly an admin/mgmt app screw-up.

I consider this a no-brainer for QMP.  If you somehow manage to get
yourself into a situation where you have multiple VM states, and need to
use one of them, you can ask explicitly for the one you want.
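
Concretely, something along these lines on the load side (a sketch;
member names per the series, values made up), naming the node that
actually holds the vmstate you want rather than relying on any default:

  -> { "execute": "snapshot-load",
       "arguments": { "job-id": "snapload0",
                      "tag": "my-snap",
                      "vmstate": "disk1",
                      "devices": [ "disk0", "disk1" ] } }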

HMP doesn't let you ask.  Hmm.  I think "no go, resolve the conflict
with QMP or offline" is better than "pray there is a right one, and pray
QEMU picks it", which is what we have now.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-03  9:48           ` Markus Armbruster
@ 2020-09-03 10:17             ` Kevin Wolf
  2020-09-03 10:21               ` Daniel P. Berrangé
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2020-09-03 10:17 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Krempa, Daniel P. Berrangé,
	Denis V. Lunev, qemu-block, Juan Quintela, qemu-devel,
	Dr. David Alan Gilbert, Pavel Dovgalyuk, Paolo Bonzini,
	Max Reitz, John Snow

On 03.09.2020 at 11:48, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Wed, Sep 02, 2020 at 11:27:17AM +0200, Markus Armbruster wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> 
> >> > On Tue, Sep 01, 2020 at 04:20:47PM +0200, Markus Armbruster wrote:
> >> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> >> 
> >> >> > savevm, loadvm and delvm are some of the few HMP commands that have never
> >> >> > been converted to use QMP. The primary reason for this lack of conversion
> >> >> > is that they block execution of the thread for as long as they run.
> >> >> 
> >> >> Nope.  The primary reason is that the HMP interface is bonkers.
> >> >
> >> > I don't think that's a very helpful description. The HMP interface has
> >> > some limitations, but it isn't bonkers - it just doesn't cope with
> >> > all the use cases we want. Many people use it successfully without
> >> > issue.
> >> 
> >> It's non-bonkers for the case it was designed for: one disk backed by
> >> QCOW2, plus maybe a CD-ROM.  The user is responsible for ensuring the
> >> CD-ROM's media at loadvm time matches the one at savevm time.  The user
> >> is further responsible for ensuring the guest-visible configuration
> >> matches the one at savevm time.  No worse than migration.
> >
> > It is fine for multiple writable disks too, if they're all qcow2 backed.
> >
> >> It becomes useless as soon as you have writable non-QCOW2 block devices.
> >> "Stop machine, take external snapshots, savevm, restart machine" should
> >> work, but doesn't.

This is because savevm can't snapshot all writable disks and fails in
this case, right? So the proposed alternative with a list of nodes to
create disk snapshots for would solve this.

> > External snapshots can be made to work today if you're willing to pause
> > the VM CPUs while you take the external snapshots, and run the migration
> > to capture VM state.
> 
> Yes.
> 
> Pausing is necessary because we don't have a way to take snapshots when
> migration completes.
> 
> Even if we had it, live migration to file would be a problematic way to
> capture VM state, because it doesn't update changing state in place.

It's also not the semantics you usually want for snapshots. You want to
take a snapshot of the state _now_, not of the state at some arbitrary
time in the future.

That's why I mentioned the post-copy semantics for live snapshots.

> savevm is not live either, of course.
> 
> My point is that savevm is "QCOW2 or bust" unnecessarily.  Your
> proposed QMP interface isn't.

Strictly speaking, not qcow2, but "block driver that supports
snapshots". Not that it matters in practice, but this happens to include
sheepdog.

It also means that we could implement a filter driver that saves the VM
state to an external file, which would allow saving the VM state even if
you use only raw images (of course, only useful if you have means to
snapshot the disk state externally or in combination with external
snapshots).

> > If you want to see some examples, take a look at the very last patch
> > in the series which introduces I/O test 310. The test covers various
> > success and failure scenarios, so you can see the behaviour we are
> > currently generating with this series.
> >
> >
> >> >> > +#
> >> >> > +# If @vmstate is not specified, then the first valid block
> >> >> > +# device will be used for vmstate.
> >> >> 
> >> >> Why is this useful for QMP?
> >> >
> >> > Both of these make QEMU just "do the right thing" with the majority
> >> > of QEMU guest configurations with no special knowledge needed by
> >> > the mgmt app.
> >> >
> >> > It makes it possible for all existing apps to immediately stop using
> >> > the loadvm/savevm commands via HMP passthrough, and convert to the
> >> > QMP commands.
> >> 
> >> I appreciate your concern for easy migration from HMP savevm/loadvm to
> >> QMP.  I'm unwilling to permanently screw up the QMP interface for it,
> >> though.
> >> 
> >> > Without this, applications will need to first convert to use -blockdev
> >> > before they can use the load-snapshot/save-snapshot commands, because
> >> > the devices are specified exclusively using blockdev node names, not
> >> > the legacy drive IDs.
> >> 
> >> Not true.  *Every* block node has a node name.  If the user doesn't
> >> specify one, the system makes one up.  query-named-block-nodes shows
> >> it.  So does query-block.
> >
> > query-named-block-nodes isn't friendly as a way to look up node names,
> > as it doesn't include the "id" value for the original -drive, so
> > correlating block nodes to drives is not straightforward.  query-block
> > does seem a bit better in that respect.
> >
> > So if an app is currently using loadvm/savevm with human_monitor_command,
> > and is using -drive, they have to go through a mapping process to figure
> > out node names. Not especially friendly if they were happy with the
> > current choice of disks QEMU makes by default. 
> 
> I take "not especially friendly to clients converting from (bonkers) HMP
> commands" over "similarly bonkers" any day.  Okay, I'm exaggerating for
> effect, your proposed interface is much better even with this default.
> I do hate this default, though.
> 
> QMP is (meant to be) explicit and non-magic.
> 
> If having to map from drive ID to node-name really is too much of a
> burden, we can look for ways to make it easier, or we can make savevm
> optionally accept drive IDs instead of node-names, like we do in several
> other places for backward compatibility.

Yes, letting commands accept both node-names and drive IDs is trivial
and we do it pretty much everywhere. Much better than randomly selecting
an image to save the VM state to.

But I'm not even sure if making people switch to -blockdev would be a
bad thing. If they insist on bad interfaces, they can keep using HMP
'savevm', too...

Kevin



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-03 10:17             ` Kevin Wolf
@ 2020-09-03 10:21               ` Daniel P. Berrangé
  2020-09-03 10:44                 ` Markus Armbruster
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-09-03 10:21 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Peter Krempa, Denis V. Lunev, qemu-block, Juan Quintela,
	qemu-devel, Markus Armbruster, Pavel Dovgalyuk, Paolo Bonzini,
	Max Reitz, John Snow, Dr. David Alan Gilbert

On Thu, Sep 03, 2020 at 12:17:29PM +0200, Kevin Wolf wrote:
> On 03.09.2020 at 11:48, Markus Armbruster wrote:
> > If having to map from drive ID to node-name really is too much of a
> > burden, we can look for ways to make it easier, or we can make savevm
> > optionally accept drive IDs instead of node-names, like we do in several
> > other places for backward compatibility.
> 
> Yes, letting commands accept both node-names and drive IDs is trivial
> and we do it pretty much everywhere. Much better than randomly selecting
> an image to save the VM state to.

Is there anything which guarantees that node-names and drive IDs will
never clash?  I didn't look for drive IDs as I was trying to ensure
an unambiguous lookup in case a string was both a valid node name
and a valid drive ID.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands
  2020-09-03 10:21               ` Daniel P. Berrangé
@ 2020-09-03 10:44                 ` Markus Armbruster
  0 siblings, 0 replies; 19+ messages in thread
From: Markus Armbruster @ 2020-09-03 10:44 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Max Reitz, Pavel Dovgalyuk,
	Paolo Bonzini, John Snow, Dr. David Alan Gilbert

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Sep 03, 2020 at 12:17:29PM +0200, Kevin Wolf wrote:
>> On 03.09.2020 at 11:48, Markus Armbruster wrote:
>> > If having to map from drive ID to node-name really is too much of a
>> > burden, we can look for ways to make it easier, or we can make savevm
>> > optionally accept drive IDs instead of node-names, like we do in several
>> > other places for backward compatibility.
>> 
>> Yes, letting commands accept both node-names and drive IDs is trivial
>> and we do it pretty much everywhere. Much better than randomly selecting
>> an image to save the VM state to.
>
> Is there anything which guarantees that node-names and drive IDs will
> never clash?  I didn't look for drive IDs as I was trying to ensure
> an unambiguous lookup in case a string was both a valid node name
> and a valid drive ID.

If I remember correctly, we always use a pair of members, technically
both optional, of which exactly one must be specified, because a single
string would not work.
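
Roughly the way existing commands declare it in the QAPI schema, if I
remember the schema correctly (block_resize being one example):

  { 'command': 'block_resize',
    'data': { '*device': 'str',
              '*node-name': 'str',
              'size': 'int' } }

with the semantics "exactly one of @device and @node-name must be
given".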



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP
  2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
                   ` (6 preceding siblings ...)
  2020-08-27 11:16 ` [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands Daniel P. Berrangé
@ 2020-09-11 11:52 ` Dr. David Alan Gilbert
  2020-09-11 16:45   ` Daniel P. Berrangé
  7 siblings, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2020-09-11 11:52 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, John Snow, qemu-devel, Markus Armbruster,
	Pavel Dovgalyuk, Paolo Bonzini, Max Reitz

Kevin:
  While we're still arguing about details of the last commit, can we get
the first few commits in? They seem to be generally nice cleanups/error
handling.

Dave

* Daniel P. Berrangé (berrange@redhat.com) wrote:
>  v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
>  v2: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg07523.html
> 
> When QMP was first introduced some 10+ years ago now, the snapshot
> related commands (savevm/loadvm/delvm) were not converted. This was
> primarily because their implementation causes blocking of the thread
> running the monitor commands. This was (and still is) considered
> undesirable behaviour both in HMP and QMP.
> 
> In theory someone was supposed to fix this flaw at some point in the
> past 10 years and bring them into the QMP world. Sadly, thus far it
> hasn't happened as people always had more important things to work
> on. Enterprise apps were much more interested in external snapshots
> than internal snapshots as they have many more features.
> 
> Meanwhile users still want to use internal snapshots as there is
> a certainly simplicity in having everything self-contained in one
> image, even though it has limitations. Thus the apps that end up
> executing the savevm/loadvm/delvm via the "human-monitor-command"
> QMP command.
> 
> IOW, the problematic blocking behaviour that was one of the reasons
> for not having savevm/loadvm/delvm in QMP is experienced by applications
> regardless. By not portting the commands to QMP due to one design flaw,
> we've forced apps and users to suffer from other design flaws of HMP (
> bad error reporting, strong type checking of args, no introspection) for
> an additional 10 years. This feels rather sub-optimal :-(
> 
> In practice users don't appear to care strongly about the fact that these
> commands block the VM while they run. I might have seen one bug report
> about it, but it certainly isn't something that comes up as a frequent
> topic except among us QEMU maintainers. Users do care about having
> access to the snapshot feature.
> 
> Where I am seeing frequent complaints is wrt the use of OVMF combined
> with snapshots which has some serious pain points. This is getting worse
> as the push to ditch legacy BIOS in favour of UEFI gain momentum both
> across OS vendors and mgmt apps. Solving it requires new parameters to
> the commands, but doing this in HMP is super unappealing.
> 
> After 10 years, I think it is time for us to be a little pragmatic about
> our handling of snapshots commands. My desire is that libvirt should never
> use "human-monitor-command" under any circumstances, because of the
> inherant flaws in HMP as a protocol for machine consumption.
> 
> Thus in this series I'm proposing a fairly direct mapping of the existing
> HMP commands for savevm/loadvm/delvm into QMP as a first step. This does
> not solve the blocking thread problem, but it does put in a place a
> design using the jobs framework which can facilitate solving it later.
> It does also solve the error reporting, type checking and introspection
> problems inherant to HMP. So we're winning on 3 out of the 4 problems,
> and pushed apps to a QMP design that will let us solve the last
> remaining problem.
> 
> With a QMP variant, we reasonably deal with the problems related to OVMF:
> 
>  - The logic to pick which disk to store the vmstate in is not
>    satisfactory.
> 
>    The first block driver state cannot be assumed to be the root disk
>    image, it might be OVMF varstore and we don't want to store vmstate
>    in there.
> 
>  - The logic to decide which disks must be snapshotted is hardwired
>    to all disks which are writable
> 
>    Again with OVMF there might be a writable varstore, but this can be
>    raw rather than qcow2 format, and thus unable to be snapshotted.
>    While users might wish to snapshot their varstore, in some/many/most
>    cases it is entirely unnecessary. Users are blocked from snapshotting
>    their VM though due to this varstore.
> 
> These are solved by adding two parameters to the commands. The first is
> a block device node name that identifies the image to store vmstate in,
> and the second is a list of node names to include for the snapshots.
> If the list of nodes isn't given, it falls back to the historical
> behaviour of using all disks matching some undocumented criteria.
> 
> In the block code I've only dealt with node names for block devices, as
> IIUC, this is all that libvirt should need in the -blockdev world it now
> lives in. IOW, I've made no attempt to cope with people wanting to use
> these QMP commands in combination with -drive args, as libvirt will
> never use -drive with a QEMU new enough to have these new commands.
> 
> The main limitations of this current impl are:
> 
>  - The snapshot process runs serialized in the main thread. ie QEMU
>    guest execution is blocked for the duration. The job framework
>    lets us fix this in future without changing the QMP semantics
>    exposed to the apps.
> 
>  - Most vmstate loading errors just go to stderr, as they are not
>    using Error **errp reporting. Thus the job framework just
>    reports a fairly generic message
> 
>      "Error -22 while loading VM state"
> 
>    Again this can be fixed later without changing the QMP semantics
>    exposed to apps.
> 
> I've done some minimal work in libvirt to start to make use of the new
> commands to validate their functionality, but this isn't finished yet.
> 
> My ultimate goal is to make the GNOME Boxes maintainer happy again by
> having internal snapshots work with OVMF:
> 
>   https://gitlab.gnome.org/GNOME/gnome-boxes/-/commit/c486da262f6566326fbcb5ef45c5f64048f16a6e
> 
> Changed in v3:
> 
>  - Schedule a bottom half to escape from coroutine context in
>    the jobs. This is needed because the locking in the snapshot
>    code goes horribly wrong when run from a background coroutine
>    instead of the main event thread.
> 
>  - Re-factor the way we iterate over devices, so that we correctly
>    report non-existent devices passed by the user over QMP.
> 
>  - Add QAPI docs notes about limitations wrt vmstate error
>    reporting (it all goes to stderr not an Error **errp)
>    so QMP only gets a fairly generic error message currently.
> 
>  - Add I/O test to validate many usage scenarios / errors
> 
>  - Add I/O test helpers to handle QMP events with a deterministic
>    ordering
> 
>  - Ensure 'delete-snapshot' reports an error if requesting
>    delete from devices that don't support snapshot, instead of
>    silently succeeding with no error.
> 
> Changed in v2:
> 
>  - Use new command names "snapshot-{load,save,delete}" to make it
>    clear that these are different from the "savevm|loadvm|delvm"
>    as they use the Job framework
> 
>  - Use an include list for block devs, not an exclude list
> 
> Daniel P. Berrangé (7):
>   migration: improve error reporting of block driver state name
>   block: push error reporting into bdrv_all_*_snapshot functions
>   migration: stop returning errno from load_snapshot()
>   block: add ability to specify list of blockdevs during snapshot
>   block: allow specifying name of block device for vmstate storage
>   iotests: add support for capturing and matching QMP events
>   migration: introduce snapshot-{save,load,delete} QMP commands
> 
>  block/monitor/block-hmp-cmds.c |   7 +-
>  block/snapshot.c               | 233 ++++++++++++++-------
>  include/block/snapshot.h       |  19 +-
>  include/migration/snapshot.h   |  10 +-
>  migration/savevm.c             | 258 +++++++++++++++++++----
>  monitor/hmp-cmds.c             |  11 +-
>  qapi/job.json                  |   9 +-
>  qapi/migration.json            | 135 ++++++++++++
>  replay/replay-snapshot.c       |   4 +-
>  softmmu/vl.c                   |   2 +-
>  tests/qemu-iotests/267.out     |  14 +-
>  tests/qemu-iotests/310         | 255 +++++++++++++++++++++++
>  tests/qemu-iotests/310.out     | 369 +++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/common.qemu | 107 +++++++++-
>  tests/qemu-iotests/group       |   1 +
>  15 files changed, 1289 insertions(+), 145 deletions(-)
>  create mode 100755 tests/qemu-iotests/310
>  create mode 100644 tests/qemu-iotests/310.out
> 
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP
  2020-09-11 11:52 ` [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Dr. David Alan Gilbert
@ 2020-09-11 16:45   ` Daniel P. Berrangé
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel P. Berrangé @ 2020-09-11 16:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, Peter Krempa, Denis V. Lunev, qemu-block,
	Juan Quintela, qemu-devel, Markus Armbruster, Pavel Dovgalyuk,
	Paolo Bonzini, Max Reitz, John Snow

On Fri, Sep 11, 2020 at 12:52:04PM +0100, Dr. David Alan Gilbert wrote:
> Kevin:
>   While we're still arguing about details of the last commit; can we get
> the first few commits in - they seem to be generally nice cleanups/error
> handling.

I'm going to try to send an update next week which just cuts out
the features that are causing disagreements. Someone else can
worry about them later...


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-09-11 16:46 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-27 11:15 [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Daniel P. Berrangé
2020-08-27 11:16 ` [PATCH v3 1/7] migration: improve error reporting of block driver state name Daniel P. Berrangé
2020-08-27 17:17   ` Dr. David Alan Gilbert
2020-08-27 11:16 ` [PATCH v3 2/7] block: push error reporting into bdrv_all_*_snapshot functions Daniel P. Berrangé
2020-08-27 11:16 ` [PATCH v3 3/7] migration: stop returning errno from load_snapshot() Daniel P. Berrangé
2020-08-27 11:16 ` [PATCH v3 4/7] block: add ability to specify list of blockdevs during snapshot Daniel P. Berrangé
2020-08-27 11:16 ` [PATCH v3 5/7] block: allow specifying name of block device for vmstate storage Daniel P. Berrangé
2020-08-27 11:16 ` [PATCH v3 6/7] iotests: add support for capturing and matching QMP events Daniel P. Berrangé
2020-08-27 11:16 ` [PATCH v3 7/7] migration: introduce snapshot-{save, load, delete} QMP commands Daniel P. Berrangé
2020-09-01 14:20   ` Markus Armbruster
2020-09-01 16:47     ` Daniel P. Berrangé
2020-09-02  9:27       ` Markus Armbruster
2020-09-02 11:05         ` Daniel P. Berrangé
2020-09-03  9:48           ` Markus Armbruster
2020-09-03 10:17             ` Kevin Wolf
2020-09-03 10:21               ` Daniel P. Berrangé
2020-09-03 10:44                 ` Markus Armbruster
2020-09-11 11:52 ` [PATCH v3 0/7] migration: bring improved savevm/loadvm/delvm to QMP Dr. David Alan Gilbert
2020-09-11 16:45   ` Daniel P. Berrangé
