* [PATCH 00/33] migration: capture error reports into Error object
@ 2021-02-04 17:18 Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state() Daniel P. Berrangé
                   ` (33 more replies)
  0 siblings, 34 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

Due to its long heritage, most of the migration code just invokes
'error_report' when it hits a problem. This was fine for HMP, since the
messages get redirected from stderr into the HMP console. It is not
OK for QMP, because the errors are not fed back to the QMP client.

This hasn't been a serious real-world problem with QMP so far because
live migration happens in the background, so at least on the target side
there is no QMP command that needs to capture the incoming migration.
It is a problem on the source side, but it isn't hit frequently as the
source side has fewer failure scenarios. Nonetheless, on both sides it
would be desirable if 'query-migrate' could report errors correctly.
With the introduction of the load-snapshot QMP commands, the need for
error reporting becomes more pressing.
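
To illustrate the difference described above, here is a minimal,
self-contained sketch. It does not use QEMU's real Error API: ToyError,
toy_error_setg(), TOY_MAGIC and both toy_load_header_*() functions are
hypothetical stand-ins invented for this example. The old-style function
prints to stderr, so the caller only ever sees a return code; the new-style
function hands the message back through an "errp"-like out parameter, which
is what a QMP command handler would need in order to return it to the client.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an error object; NOT QEMU's real Error type. */
typedef struct ToyError {
    char msg[256];
} ToyError;

/* Assumed magic value, used only to make the example concrete. */
#define TOY_MAGIC 0x5145564dU

/* Hypothetical helper: store a formatted message in *errp. */
static void toy_error_setg(ToyError **errp, const char *fmt, ...)
{
    va_list ap;
    ToyError *err;

    if (errp == NULL) {
        return;             /* caller does not want the details */
    }
    assert(*errp == NULL);  /* an error must not be set twice */

    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Old style: the message goes to stderr, the caller gets only a code. */
static int toy_load_header_old(unsigned int magic)
{
    if (magic != TOY_MAGIC) {
        fprintf(stderr, "Not a migration stream\n");
        return -1;
    }
    return 0;
}

/* New style: the message travels back to the caller via errp. */
static int toy_load_header_new(unsigned int magic, ToyError **errp)
{
    if (magic != TOY_MAGIC) {
        toy_error_setg(errp, "Not a migration stream, magic %x != %x",
                       magic, TOY_MAGIC);
        return -1;
    }
    return 0;
}
```

With the second form the caller decides what happens to the message: print
it, attach it to a command reply, or discard it by passing NULL. The caller
owns the returned object and must free it.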

Wiring up good error reporting is a large and difficult job, which
this series does NOT complete. The focus here has been on converting
all methods in savevm.c which have an 'int' return value capable of
reporting errors. This covers most of the infrastructure for controlling
the migration state serialization / protocol.

The remaining part that is missing error reporting is the set of
callbacks in the VMStateDescription struct, which can return failure
codes but have no "Error **errp" parameter. Thinking about how this
might be dealt with in future, a big bang conversion is likely
non-viable. We'll probably want to introduce a duplicate set of
callbacks with the "Error **errp" parameter and convert impls in
batches, eventually removing the original callbacks. I don't intend to
do that myself in the immediate future.
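
One possible shape for such a transition is sketched below. This is only
an illustration under stated assumptions, not a proposal for the real
VMStateDescription layout: ToyVMState, post_load_errp, toy_call_post_load()
and the two example callbacks are all hypothetical names. The dispatcher
prefers the new callback when a device has been converted and otherwise
falls back to the legacy one, synthesizing a generic message from the
return code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an error object; NOT QEMU's real Error type. */
typedef struct ToyError {
    char msg[128];
} ToyError;

/* Hypothetical helper: store a message in *errp if the caller wants it. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    ToyError *err;

    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);
    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Hypothetical description struct carrying both callback generations. */
typedef struct ToyVMState {
    const char *name;
    /* legacy callback: failure code only */
    int (*post_load)(void *opaque, int version_id);
    /* new callback: failure code plus error details */
    int (*post_load_errp)(void *opaque, int version_id, ToyError **errp);
} ToyVMState;

/* Prefer the converted callback; otherwise wrap the legacy return code. */
static int toy_call_post_load(const ToyVMState *vmsd, void *opaque,
                              int version_id, ToyError **errp)
{
    if (vmsd->post_load_errp) {
        return vmsd->post_load_errp(opaque, version_id, errp) < 0 ? -1 : 0;
    }
    if (vmsd->post_load) {
        int ret = vmsd->post_load(opaque, version_id);
        if (ret < 0) {
            char buf[128];
            snprintf(buf, sizeof(buf), "post_load of '%s' failed: %d",
                     vmsd->name, ret);
            toy_error_set(errp, buf);
            return -1;
        }
    }
    return 0;
}

/* Example of a device that has not been converted yet. */
static int legacy_post_load(void *opaque, int version_id)
{
    (void)opaque;
    (void)version_id;
    return -22;
}

/* Example of a converted device that can explain what went wrong. */
static int converted_post_load(void *opaque, int version_id, ToyError **errp)
{
    (void)opaque;
    (void)version_id;
    toy_error_set(errp, "ring size exceeds device limit");
    return -1;
}
```

Under this arrangement unconverted devices still yield an error object,
just a less specific one, so callers can rely on errp being set on failure
while the conversion proceeds in batches.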

IOW, this patch series probably solves 50% of the problem, but we
still do need the rest to get ideal error reporting.

In doing this savevm conversion I noticed a bunch of places which
see and then ignore errors. I only fixed one or two of them which
were clearly dubious. Other places in savevm.c where it seemed it
was probably ok to ignore errors, I've left using error_report()
on the basis that those are really warnings. Perhaps they could
be changed to warn_report() instead.

There are a lot of patches here, but I felt it was easier to review
for correctness if I converted one function at a time. The series
does not necessarily have to be reviewed/applied in one go.

Daniel P. Berrangé (33):
  migration: push Error **errp into qemu_loadvm_state()
  migration: push Error **errp into qemu_loadvm_state_header()
  migration: push Error **errp into qemu_loadvm_state_setup()
  migration: push Error **errp into qemu_load_device_state()
  migration: push Error **errp into qemu_loadvm_state_main()
  migration: push Error **errp into qemu_loadvm_section_start_full()
  migration: push Error **errp into qemu_loadvm_section_part_end()
  migration: push Error **errp into loadvm_process_command()
  migration: push Error **errp into loadvm_handle_cmd_packaged()
  migration: push Error **errp into loadvm_postcopy_handle_advise()
  migration: push Error **errp into ram_postcopy_incoming_init()
  migration: push Error **errp into loadvm_postcopy_handle_listen()
  migration: push Error **errp into loadvm_postcopy_handle_run()
  migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
  migration: make loadvm_postcopy_handle_resume() void
  migration: push Error **errp into loadvm_handle_recv_bitmap()
  migration: push Error **errp into loadvm_process_enable_colo()
  migration: push Error **errp into colo_init_ram_cache()
  migration: push Error **errp into check_section_footer()
  migration: push Error **errp into global_state_store()
  migration: remove error reporting from qemu_fopen_bdrv() callers
  migration: push Error **errp into qemu_savevm_state_iterate()
  migration: simplify some error reporting in save_snapshot()
  migration: push Error **errp into qemu_savevm_state_setup()
  migration: push Error **errp into qemu_savevm_state_complete_precopy()
  migration: push Error **errp into
    qemu_savevm_state_complete_precopy_non_iterable()
  migration: push Error **errp into qemu_savevm_state_complete_precopy()
  migration: push Error **errp into qemu_savevm_send_packaged()
  migration: push Error **errp into qemu_savevm_live_state()
  migration: push Error **errp into qemu_save_device_state()
  migration: push Error **errp into qemu_savevm_state_resume_prepare()
  migration: push Error **errp into postcopy_resume_handshake()
  migration: push Error **errp into postcopy_do_resume()

 include/migration/colo.h                      |   2 +-
 include/migration/global_state.h              |   2 +-
 migration/colo.c                              |  12 +-
 migration/global_state.c                      |   6 +-
 migration/migration.c                         |  80 ++-
 migration/postcopy-ram.c                      |   8 +-
 migration/postcopy-ram.h                      |   2 +-
 migration/ram.c                               |  17 +-
 migration/ram.h                               |   4 +-
 migration/savevm.c                            | 594 ++++++++++--------
 migration/savevm.h                            |  23 +-
 .../tests/internal-snapshots-qapi.out         |   3 +-
 12 files changed, 427 insertions(+), 326 deletions(-)

-- 
2.29.2





* [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 21:57   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 02/33] migration: push Error **errp into qemu_loadvm_state_header() Daniel P. Berrangé
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c |  4 ++--
 migration/savevm.c    | 36 ++++++++++++++++++++----------------
 migration/savevm.h    |  2 +-
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 1986cb8573..287a18d269 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -519,7 +519,7 @@ static void process_incoming_migration_co(void *opaque)
     postcopy_state_set(POSTCOPY_INCOMING_NONE);
     migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
                       MIGRATION_STATUS_ACTIVE);
-    ret = qemu_loadvm_state(mis->from_src_file);
+    ret = qemu_loadvm_state(mis->from_src_file, &local_err);
 
     ps = postcopy_state_get();
     trace_process_incoming_migration_co_end(ret, ps);
@@ -563,7 +563,7 @@ static void process_incoming_migration_co(void *opaque)
     }
 
     if (ret < 0) {
-        error_report("load of migration failed: %s", strerror(-ret));
+        error_report_err(local_err);
         goto fail;
     }
     mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
diff --git a/migration/savevm.c b/migration/savevm.c
index 6b320423c7..c8d93eee1e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2638,40 +2638,49 @@ out:
     return ret;
 }
 
-int qemu_loadvm_state(QEMUFile *f)
+int qemu_loadvm_state(QEMUFile *f, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
-    Error *local_err = NULL;
     int ret;
 
-    if (qemu_savevm_state_blocked(&local_err)) {
-        error_report_err(local_err);
-        return -EINVAL;
+    if (qemu_savevm_state_blocked(errp)) {
+        return -1;
     }
 
     ret = qemu_loadvm_state_header(f);
     if (ret) {
-        return ret;
+        error_setg(errp, "Error %d while loading VM state", ret);
+        return -1;
     }
 
     if (qemu_loadvm_state_setup(f) != 0) {
-        return -EINVAL;
+        error_setg(errp, "Error %d while loading VM state", -EINVAL);
+        return -1;
     }
 
     cpu_synchronize_all_pre_loadvm();
 
     ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_setg(errp, "Error %d while loading VM state", ret);
+        ret = -1;
+    }
     qemu_event_set(&mis->main_thread_load_event);
 
     trace_qemu_loadvm_state_post_main(ret);
 
     if (mis->have_listen_thread) {
+        error_setg(errp, "Error %d while loading VM state", ret);
         /* Listen thread still going, can't clean up yet */
         return ret;
     }
 
     if (ret == 0) {
         ret = qemu_file_get_error(f);
+        if (ret < 0) {
+            error_setg(errp, "Error %d while loading VM state", ret);
+            ret = -1;
+        }
     }
 
     /*
@@ -2690,8 +2699,8 @@ int qemu_loadvm_state(QEMUFile *f)
         uint8_t  section_type = qemu_get_byte(f);
 
         if (section_type != QEMU_VM_VMDESCRIPTION) {
-            error_report("Expected vmdescription section, but got %d",
-                         section_type);
+            error_setg(errp, "Expected vmdescription section, but got %d",
+                       section_type);
             /*
              * It doesn't seem worth failing at this point since
              * we apparently have an otherwise valid VM state
@@ -2921,7 +2930,6 @@ void qmp_xen_load_devices_state(const char *filename, Error **errp)
 {
     QEMUFile *f;
     QIOChannelFile *ioc;
-    int ret;
 
     /* Guest must be paused before loading the device state; the RAM state
      * will already have been loaded by xc
@@ -2940,11 +2948,8 @@ void qmp_xen_load_devices_state(const char *filename, Error **errp)
     f = qemu_fopen_channel_input(QIO_CHANNEL(ioc));
     object_unref(OBJECT(ioc));
 
-    ret = qemu_loadvm_state(f);
+    qemu_loadvm_state(f, errp);
     qemu_fclose(f);
-    if (ret < 0) {
-        error_setg(errp, QERR_IO_ERROR);
-    }
     migration_incoming_state_destroy();
 }
 
@@ -3018,14 +3023,13 @@ bool load_snapshot(const char *name, const char *vmstate,
         goto err_drain;
     }
     aio_context_acquire(aio_context);
-    ret = qemu_loadvm_state(f);
+    ret = qemu_loadvm_state(f, errp);
     migration_incoming_state_destroy();
     aio_context_release(aio_context);
 
     bdrv_drain_all_end();
 
     if (ret < 0) {
-        error_setg(errp, "Error %d while loading VM state", ret);
         return false;
     }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index ba64a7e271..1069e2dd4f 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -60,7 +60,7 @@ void qemu_savevm_send_colo_enable(QEMUFile *f);
 void qemu_savevm_live_state(QEMUFile *f);
 int qemu_save_device_state(QEMUFile *f);
 
-int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state(QEMUFile *f, Error **errp);
 void qemu_loadvm_state_cleanup(void);
 int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
 int qemu_load_device_state(QEMUFile *f);
-- 
2.29.2




* [PATCH 02/33] migration: push Error **errp into qemu_loadvm_state_header()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 21:58   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup() Daniel P. Berrangé
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index c8d93eee1e..870199b629 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2448,38 +2448,43 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     return 0;
 }
 
-static int qemu_loadvm_state_header(QEMUFile *f)
+static int qemu_loadvm_state_header(QEMUFile *f, Error **errp)
 {
     unsigned int v;
     int ret;
 
     v = qemu_get_be32(f);
     if (v != QEMU_VM_FILE_MAGIC) {
-        error_report("Not a migration stream");
-        return -EINVAL;
+        error_setg(errp, "Not a migration stream, magic %x != %x",
+                   v, QEMU_VM_FILE_MAGIC);
+        return -1;
     }
 
     v = qemu_get_be32(f);
     if (v == QEMU_VM_FILE_VERSION_COMPAT) {
-        error_report("SaveVM v2 format is obsolete and don't work anymore");
-        return -ENOTSUP;
+        error_setg(errp, "SaveVM v2 format is obsolete and don't work anymore");
+        return -1;
     }
     if (v != QEMU_VM_FILE_VERSION) {
-        error_report("Unsupported migration stream version");
-        return -ENOTSUP;
+        error_setg(errp, "Unsupported migration stream, version %x != %x",
+                   v, QEMU_VM_FILE_VERSION);
+        return -1;
     }
 
     if (migrate_get_current()->send_configuration) {
-        if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
-            error_report("Configuration section missing");
+        v = qemu_get_byte(f);
+        if (v != QEMU_VM_CONFIGURATION) {
+            error_setg(errp, "Configuration section missing, %x != %x",
+                       v, QEMU_VM_CONFIGURATION);
             qemu_loadvm_state_cleanup();
-            return -EINVAL;
+            return -1;
         }
         ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0);
 
         if (ret) {
+            error_setg(errp, "Error %d while loading VM state", ret);
             qemu_loadvm_state_cleanup();
-            return ret;
+            return -1;
         }
     }
     return 0;
@@ -2647,9 +2652,7 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
         return -1;
     }
 
-    ret = qemu_loadvm_state_header(f);
-    if (ret) {
-        error_setg(errp, "Error %d while loading VM state", ret);
+    if (qemu_loadvm_state_header(f, errp) < 0) {
         return -1;
     }
 
-- 
2.29.2




* [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state() Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 02/33] migration: push Error **errp into qemu_loadvm_state_header() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 21:59   ` Philippe Mathieu-Daudé
  2021-02-05  7:50   ` Markus Armbruster
  2021-02-04 17:18 ` [PATCH 04/33] migration: push Error **errp into qemu_load_device_state() Daniel P. Berrangé
                   ` (30 subsequent siblings)
  33 siblings, 2 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 870199b629..f4ed14a230 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2490,7 +2490,7 @@ static int qemu_loadvm_state_header(QEMUFile *f, Error **errp)
     return 0;
 }
 
-static int qemu_loadvm_state_setup(QEMUFile *f)
+static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
 {
     SaveStateEntry *se;
     int ret;
@@ -2509,7 +2509,7 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
         ret = se->ops->load_setup(f, se->opaque);
         if (ret < 0) {
             qemu_file_set_error(f, ret);
-            error_report("Load state of device %s failed", se->idstr);
+            error_setg(errp, "Load state of device %s failed", se->idstr);
             return ret;
         }
     }
@@ -2656,8 +2656,7 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
         return -1;
     }
 
-    if (qemu_loadvm_state_setup(f) != 0) {
-        error_setg(errp, "Error %d while loading VM state", -EINVAL);
+    if (qemu_loadvm_state_setup(f, errp) < 0) {
         return -1;
     }
 
-- 
2.29.2




* [PATCH 04/33] migration: push Error **errp into qemu_load_device_state()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (2 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 22:01   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main() Daniel P. Berrangé
                   ` (29 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/colo.c   | 3 +--
 migration/savevm.c | 4 ++--
 migration/savevm.h | 2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index de27662cab..e344b7cf32 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -748,9 +748,8 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     qemu_mutex_lock_iothread();
     vmstate_loading = true;
     colo_flush_ram_cache();
-    ret = qemu_load_device_state(fb);
+    ret = qemu_load_device_state(fb, errp);
     if (ret < 0) {
-        error_setg(errp, "COLO: load device state failed");
         vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
diff --git a/migration/savevm.c b/migration/savevm.c
index f4ed14a230..dd41292d4e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2726,7 +2726,7 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
     return ret;
 }
 
-int qemu_load_device_state(QEMUFile *f)
+int qemu_load_device_state(QEMUFile *f, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     int ret;
@@ -2734,7 +2734,7 @@ int qemu_load_device_state(QEMUFile *f)
     /* Load QEMU_VM_SECTION_FULL section */
     ret = qemu_loadvm_state_main(f, mis);
     if (ret < 0) {
-        error_report("Failed to load device state: %d", ret);
+        error_setg(errp, "Failed to load device state: %d", ret);
         return ret;
     }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index 1069e2dd4f..c727bc103e 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -63,6 +63,6 @@ int qemu_save_device_state(QEMUFile *f);
 int qemu_loadvm_state(QEMUFile *f, Error **errp);
 void qemu_loadvm_state_cleanup(void);
 int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-int qemu_load_device_state(QEMUFile *f);
+int qemu_load_device_state(QEMUFile *f, Error **errp);
 
 #endif
-- 
2.29.2




* [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (3 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 04/33] migration: push Error **errp into qemu_load_device_state() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-15 18:35   ` Dr. David Alan Gilbert
  2021-02-04 17:18 ` [PATCH 06/33] migration: push Error **errp into qemu_loadvm_section_start_full() Daniel P. Berrangé
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/colo.c   |  3 +-
 migration/savevm.c | 73 +++++++++++++++++++++++++++++++---------------
 migration/savevm.h |  3 +-
 3 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index e344b7cf32..4a050ac579 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -705,11 +705,10 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
     qemu_mutex_lock_iothread();
     cpu_synchronize_all_states();
-    ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+    ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
     qemu_mutex_unlock_iothread();
 
     if (ret < 0) {
-        error_setg(errp, "Load VM's live state (ram) error");
         return;
     }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index dd41292d4e..e47aec435c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1819,6 +1819,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
     QEMUFile *f = mis->from_src_file;
     int load_res;
     MigrationState *migr = migrate_get_current();
+    Error *local_err = NULL;
 
     object_ref(OBJECT(migr));
 
@@ -1833,7 +1834,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
      * in qemu_file, and thus we must be blocking now.
      */
     qemu_file_set_blocking(f, true);
-    load_res = qemu_loadvm_state_main(f, mis);
+    load_res = qemu_loadvm_state_main(f, mis, &local_err);
 
     /*
      * This is tricky, but, mis->from_src_file can change after it
@@ -1849,6 +1850,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
     if (load_res < 0) {
         qemu_file_set_error(f, load_res);
         dirty_bitmap_mig_cancel_incoming();
+        error_report_err(local_err);
         if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
             !migrate_postcopy_ram() && migrate_dirty_bitmaps())
         {
@@ -1859,12 +1861,10 @@ static void *postcopy_ram_listen_thread(void *opaque)
                          __func__, load_res);
             load_res = 0; /* prevent further exit() */
         } else {
-            error_report("%s: loadvm failed: %d", __func__, load_res);
             migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
                                            MIGRATION_STATUS_FAILED);
         }
-    }
-    if (load_res >= 0) {
+    } else {
         /*
          * This looks good, but it's possible that the device loading in the
          * main thread hasn't finished yet, and so we might not be in 'RUN'
@@ -2116,14 +2116,17 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
  * @mis: Incoming state
  * @length: Length of packaged data to read
  *
- * Returns: Negative values on error
- *
+ * Returns:
+ *   0: success
+ *   LOADVM_QUIT: success, but stop
+ *   -1: error
  */
 static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
 {
     int ret;
     size_t length;
     QIOChannelBuffer *bioc;
+    Error *local_err = NULL;
 
     length = qemu_get_be32(mis->from_src_file);
     trace_loadvm_handle_cmd_packaged(length);
@@ -2149,8 +2152,11 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
 
     QEMUFile *packf = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
 
-    ret = qemu_loadvm_state_main(packf, mis);
+    ret = qemu_loadvm_state_main(packf, mis, &local_err);
     trace_loadvm_handle_cmd_packaged_main(ret);
+    if (ret < 0) {
+        error_report_err(local_err);
+    }
     qemu_fclose(packf);
     object_unref(OBJECT(bioc));
 
@@ -2568,7 +2574,14 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
     return true;
 }
 
-int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+/*
+ * Returns:
+ *   0: success
+ *   LOADVM_QUIT: success, but stop
+ *   -1: error
+ */
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
+                           Error **errp)
 {
     uint8_t section_type;
     int ret = 0;
@@ -2579,7 +2592,9 @@ retry:
 
         if (qemu_file_get_error(f)) {
             ret = qemu_file_get_error(f);
-            break;
+            error_setg(errp,
+                       "Failed to load device state section ID: %d", ret);
+            goto out;
         }
 
         trace_qemu_loadvm_state_section(section_type);
@@ -2588,6 +2603,9 @@ retry:
         case QEMU_VM_SECTION_FULL:
             ret = qemu_loadvm_section_start_full(f, mis);
             if (ret < 0) {
+                error_setg(errp,
+                           "Failed to load device state section start: %d",
+                           ret);
                 goto out;
             }
             break;
@@ -2595,29 +2613,38 @@ retry:
         case QEMU_VM_SECTION_END:
             ret = qemu_loadvm_section_part_end(f, mis);
             if (ret < 0) {
+                error_setg(errp,
+                           "Failed to load device state section end: %d", ret);
                 goto out;
             }
             break;
         case QEMU_VM_COMMAND:
             ret = loadvm_process_command(f);
             trace_qemu_loadvm_state_section_command(ret);
-            if ((ret < 0) || (ret == LOADVM_QUIT)) {
+            if (ret < 0) {
+                error_setg(errp,
+                           "Failed to load device state command: %d", ret);
+                goto out;
+            }
+            if (ret == LOADVM_QUIT) {
                 goto out;
             }
             break;
         case QEMU_VM_EOF:
             /* This is the end of migration */
+            ret = 0;
             goto out;
         default:
-            error_report("Unknown savevm section type %d", section_type);
-            ret = -EINVAL;
+            error_setg(errp,
+                       "Unknown savevm section type %d", section_type);
+            ret = -1;
             goto out;
         }
     }
 
 out:
     if (ret < 0) {
-        qemu_file_set_error(f, ret);
+        qemu_file_set_error(f, -EINVAL);
 
         /* Cancel bitmaps incoming regardless of recovery */
         dirty_bitmap_mig_cancel_incoming();
@@ -2643,6 +2670,12 @@ out:
     return ret;
 }
 
+/*
+ * Returns:
+ *   0: success
+ *   LOADVM_QUIT: success, but stop
+ *   -1: error
+ */
 int qemu_loadvm_state(QEMUFile *f, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
@@ -2662,17 +2695,12 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
 
     cpu_synchronize_all_pre_loadvm();
 
-    ret = qemu_loadvm_state_main(f, mis);
-    if (ret < 0) {
-        error_setg(errp, "Error %d while loading VM state", ret);
-        ret = -1;
-    }
+    ret = qemu_loadvm_state_main(f, mis, errp);
     qemu_event_set(&mis->main_thread_load_event);
 
     trace_qemu_loadvm_state_post_main(ret);
 
     if (mis->have_listen_thread) {
-        error_setg(errp, "Error %d while loading VM state", ret);
         /* Listen thread still going, can't clean up yet */
         return ret;
     }
@@ -2729,13 +2757,10 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
 int qemu_load_device_state(QEMUFile *f, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
-    int ret;
 
     /* Load QEMU_VM_SECTION_FULL section */
-    ret = qemu_loadvm_state_main(f, mis);
-    if (ret < 0) {
-        error_setg(errp, "Failed to load device state: %d", ret);
-        return ret;
+    if (qemu_loadvm_state_main(f, mis, errp) < 0) {
+        return -1;
     }
 
     cpu_synchronize_all_post_init();
diff --git a/migration/savevm.h b/migration/savevm.h
index c727bc103e..1cec83c729 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -62,7 +62,8 @@ int qemu_save_device_state(QEMUFile *f);
 
 int qemu_loadvm_state(QEMUFile *f, Error **errp);
 void qemu_loadvm_state_cleanup(void);
-int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
+                           Error **errp);
 int qemu_load_device_state(QEMUFile *f, Error **errp);
 
 #endif
-- 
2.29.2




* [PATCH 06/33] migration: push Error **errp into qemu_loadvm_section_start_full()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (4 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 22:04   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 07/33] migration: push Error **errp into qemu_loadvm_section_part_end() Daniel P. Berrangé
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

This is particularly useful for loading snapshots as this is a likely
error scenario to hit when the source and dest VM configs do not
match. This is illustrated by the improved error reporting in the
QMP load snapshot test.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c                            | 49 +++++++++----------
 .../tests/internal-snapshots-qapi.out         |  3 +-
 2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index e47aec435c..f2eee0a4a7 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2350,7 +2350,8 @@ static bool check_section_footer(QEMUFile *f, SaveStateEntry *se)
 }
 
 static int
-qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
+qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis,
+                               Error **errp)
 {
     uint32_t instance_id, version_id, section_id;
     SaveStateEntry *se;
@@ -2360,18 +2361,18 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
     /* Read section start */
     section_id = qemu_get_be32(f);
     if (!qemu_get_counted_string(f, idstr)) {
-        error_report("Unable to read ID string for section %u",
-                     section_id);
-        return -EINVAL;
+        error_setg(errp, "Unable to read ID string for section %u",
+                   section_id);
+        return -1;
     }
     instance_id = qemu_get_be32(f);
     version_id = qemu_get_be32(f);
 
     ret = qemu_file_get_error(f);
     if (ret) {
-        error_report("%s: Failed to read instance/version ID: %d",
-                     __func__, ret);
-        return ret;
+        error_setg(errp, "Failed to read instance/version ID: %d",
+                   ret);
+        return -1;
     }
 
     trace_qemu_loadvm_state_section_startfull(section_id, idstr,
@@ -2379,36 +2380,37 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
     /* Find savevm section */
     se = find_se(idstr, instance_id);
     if (se == NULL) {
-        error_report("Unknown savevm section or instance '%s' %"PRIu32". "
-                     "Make sure that your current VM setup matches your "
-                     "saved VM setup, including any hotplugged devices",
-                     idstr, instance_id);
-        return -EINVAL;
+        error_setg(errp, "Unknown savevm section or instance '%s' %"PRIu32". "
+                   "Make sure that your current VM setup matches your "
+                   "saved VM setup, including any hotplugged devices",
+                   idstr, instance_id);
+        return -1;
     }
 
     /* Validate version */
     if (version_id > se->version_id) {
-        error_report("savevm: unsupported version %d for '%s' v%d",
-                     version_id, idstr, se->version_id);
-        return -EINVAL;
+        error_setg(errp, "savevm: unsupported version %d for '%s' v%d",
+                   version_id, idstr, se->version_id);
+        return -1;
     }
     se->load_version_id = version_id;
     se->load_section_id = section_id;
 
     /* Validate if it is a device's state */
     if (xen_enabled() && se->is_ram) {
-        error_report("loadvm: %s RAM loading not allowed on Xen", idstr);
-        return -EINVAL;
+        error_setg(errp, "loadvm: %s RAM loading not allowed on Xen", idstr);
+        return -1;
     }
 
     ret = vmstate_load(f, se);
     if (ret < 0) {
-        error_report("error while loading state for instance 0x%"PRIx32" of"
-                     " device '%s'", instance_id, idstr);
-        return ret;
+        error_setg(errp, "error while loading state for instance 0x%"PRIx32" of"
+                   " device '%s'", instance_id, idstr);
+        return -1;
     }
     if (!check_section_footer(f, se)) {
-        return -EINVAL;
+        error_setg(errp, "failed check for device state section footer");
+        return -1;
     }
 
     return 0;
@@ -2601,11 +2603,8 @@ retry:
         switch (section_type) {
         case QEMU_VM_SECTION_START:
         case QEMU_VM_SECTION_FULL:
-            ret = qemu_loadvm_section_start_full(f, mis);
+            ret = qemu_loadvm_section_start_full(f, mis, errp);
             if (ret < 0) {
-                error_setg(errp,
-                           "Failed to load device state section start: %d",
-                           ret);
                 goto out;
             }
             break;
diff --git a/tests/qemu-iotests/tests/internal-snapshots-qapi.out b/tests/qemu-iotests/tests/internal-snapshots-qapi.out
index 26ff4a838c..fd3e2a9ed0 100644
--- a/tests/qemu-iotests/tests/internal-snapshots-qapi.out
+++ b/tests/qemu-iotests/tests/internal-snapshots-qapi.out
@@ -345,13 +345,12 @@ Formatting 'TEST_DIR/t.qcow2.alt2', fmt=IMGFMT size=134217728
                                      "devices": ["diskfmt0"]}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "load-err-stderr"}}
-qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "load-err-stderr"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "STOP"}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "load-err-stderr"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "concluded", "id": "load-err-stderr"}}
 {"execute": "query-jobs"}
-{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
+{"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
 {"execute": "job-dismiss", "arguments": {"id": "load-err-stderr"}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "null", "id": "load-err-stderr"}}
-- 
2.29.2




* [PATCH 07/33] migration: push Error **errp into qemu_loadvm_section_part_end()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (5 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 06/33] migration: push Error **errp into qemu_loadvm_section_start_full() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:16   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 08/33] migration: push Error **errp into loadvm_process_command() Daniel P. Berrangé
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index f2eee0a4a7..350d5a315a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2417,7 +2417,8 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis,
 }
 
 static int
-qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
+qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis,
+                             Error **errp)
 {
     uint32_t section_id;
     SaveStateEntry *se;
@@ -2427,9 +2428,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
 
     ret = qemu_file_get_error(f);
     if (ret) {
-        error_report("%s: Failed to read section ID: %d",
-                     __func__, ret);
-        return ret;
+        error_setg(errp, "failed to read device state section end ID: %d",
+                   ret);
+        return -1;
     }
 
     trace_qemu_loadvm_state_section_partend(section_id);
@@ -2439,18 +2440,19 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
         }
     }
     if (se == NULL) {
-        error_report("Unknown savevm section %d", section_id);
-        return -EINVAL;
+        error_setg(errp, "unknown savevm section %d", section_id);
+        return -1;
     }
 
     ret = vmstate_load(f, se);
     if (ret < 0) {
-        error_report("error while loading state section id %d(%s)",
-                     section_id, se->idstr);
-        return ret;
+        error_setg(errp, "error while loading state section id %d(%s)",
+                   section_id, se->idstr);
+        return -1;
     }
     if (!check_section_footer(f, se)) {
-        return -EINVAL;
+        error_setg(errp, "failed check for device state section footer");
+        return -1;
     }
 
     return 0;
@@ -2610,10 +2612,8 @@ retry:
             break;
         case QEMU_VM_SECTION_PART:
         case QEMU_VM_SECTION_END:
-            ret = qemu_loadvm_section_part_end(f, mis);
+            ret = qemu_loadvm_section_part_end(f, mis, errp);
             if (ret < 0) {
-                error_setg(errp,
-                           "Failed to load device state section end: %d", ret);
                 goto out;
             }
             break;
-- 
2.29.2




* [PATCH 08/33] migration: push Error **errp into loadvm_process_command()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (6 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 07/33] migration: push Error **errp into qemu_loadvm_section_part_end() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:18   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 09/33] migration: push Error **errp into loadvm_handle_cmd_packaged() Daniel P. Berrangé
                   ` (25 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
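As a rough illustration of the three-way return convention documented
for loadvm_process_command() (0 to continue, LOADVM_QUIT to stop
cleanly, -1 on error), the sketch below shows how a caller loop might
tell the cases apart. All sketch_* and SKETCH_* names, the command IDs,
and the value chosen for the quit code are assumptions made for this
example only; the error is reduced to a plain string pointer to keep
the sketch short.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; only "positive, distinct from 0" matters. */
enum SketchExitCodes { SKETCH_QUIT = 1 };

/* Hypothetical command IDs, not the real MIG_CMD_* values. */
enum { SKETCH_CMD_PING = 1, SKETCH_CMD_STOP = 2 };

/* Returns 0 to continue, SKETCH_QUIT to stop cleanly, -1 on error. */
int sketch_process_command(int cmd, const char **errmsg)
{
    assert(errmsg != NULL);

    switch (cmd) {
    case SKETCH_CMD_PING:
        return 0;
    case SKETCH_CMD_STOP:
        return SKETCH_QUIT;
    default:
        *errmsg = "unknown command";   /* error detail set by the callee */
        return -1;
    }
}

/* Returns the number of commands handled, or -1 with *errmsg set. */
int sketch_main_loop(const int *cmds, int ncmds, const char **errmsg)
{
    int handled = 0;

    for (int i = 0; i < ncmds; i++) {
        int ret = sketch_process_command(cmds[i], errmsg);
        if (ret < 0) {
            return -1;                 /* error already recorded, just unwind */
        }
        handled++;
        if (ret == SKETCH_QUIT) {
            break;                     /* success, but leave the loop */
        }
    }
    return handled;
}
```

Because errors collapse to a single -1, the caller only needs a
`ret < 0` check and can leave the explanation to whatever the callee
stored.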
 migration/savevm.c | 87 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 64 insertions(+), 23 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 350d5a315a..450c36994f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2223,34 +2223,37 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis)
  * Process an incoming 'QEMU_VM_COMMAND'
  * 0           just a normal return
  * LOADVM_QUIT All good, but exit the loop
- * <0          Error
+ * -1          Error
  */
-static int loadvm_process_command(QEMUFile *f)
+static int loadvm_process_command(QEMUFile *f, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     uint16_t cmd;
     uint16_t len;
     uint32_t tmp32;
+    int ret;
 
     cmd = qemu_get_be16(f);
     len = qemu_get_be16(f);
 
     /* Check validity before continue processing of cmds */
     if (qemu_file_get_error(f)) {
-        return qemu_file_get_error(f);
+        error_setg(errp, "device state stream has error: %d",
+                   qemu_file_get_error(f));
+        return -1;
     }
 
     trace_loadvm_process_command(cmd, len);
     if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
-        error_report("MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
-        return -EINVAL;
+        error_setg(errp, "MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
+        return -1;
     }
 
     if (mig_cmd_args[cmd].len != -1 && mig_cmd_args[cmd].len != len) {
-        error_report("%s received with bad length - expecting %zu, got %d",
-                     mig_cmd_args[cmd].name,
-                     (size_t)mig_cmd_args[cmd].len, len);
-        return -ERANGE;
+        error_setg(errp, "%s received with bad length - expecting %zu, got %d",
+                   mig_cmd_args[cmd].name,
+                   (size_t)mig_cmd_args[cmd].len, len);
+        return -1;
     }
 
     switch (cmd) {
@@ -2262,7 +2265,7 @@ static int loadvm_process_command(QEMUFile *f)
         }
         mis->to_src_file = qemu_file_get_return_path(f);
         if (!mis->to_src_file) {
-            error_report("CMD_OPEN_RETURN_PATH failed");
+            error_setg(errp, "CMD_OPEN_RETURN_PATH failed");
             return -1;
         }
         break;
@@ -2271,36 +2274,76 @@ static int loadvm_process_command(QEMUFile *f)
         tmp32 = qemu_get_be32(f);
         trace_loadvm_process_command_ping(tmp32);
         if (!mis->to_src_file) {
-            error_report("CMD_PING (0x%x) received with no return path",
-                         tmp32);
+            error_setg(errp, "CMD_PING (0x%x) received with no return path",
+                       tmp32);
             return -1;
         }
         migrate_send_rp_pong(mis, tmp32);
         break;
 
     case MIG_CMD_PACKAGED:
-        return loadvm_handle_cmd_packaged(mis);
+        ret = loadvm_handle_cmd_packaged(mis);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_POSTCOPY_ADVISE:
-        return loadvm_postcopy_handle_advise(mis, len);
+        ret = loadvm_postcopy_handle_advise(mis, len);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_POSTCOPY_LISTEN:
-        return loadvm_postcopy_handle_listen(mis);
+        ret = loadvm_postcopy_handle_listen(mis);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_POSTCOPY_RUN:
-        return loadvm_postcopy_handle_run(mis);
+        ret = loadvm_postcopy_handle_run(mis);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_POSTCOPY_RAM_DISCARD:
-        return loadvm_postcopy_ram_handle_discard(mis, len);
+        ret = loadvm_postcopy_ram_handle_discard(mis, len);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_POSTCOPY_RESUME:
-        return loadvm_postcopy_handle_resume(mis);
+        ret = loadvm_postcopy_handle_resume(mis);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_RECV_BITMAP:
-        return loadvm_handle_recv_bitmap(mis, len);
+        ret = loadvm_handle_recv_bitmap(mis, len);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
 
     case MIG_CMD_ENABLE_COLO:
-        return loadvm_process_enable_colo(mis);
+        ret = loadvm_process_enable_colo(mis);
+        if (ret < 0) {
+            error_setg(errp, "Failed to load device state command: %d", ret);
+            return -1;
+        }
+        return ret;
     }
 
     return 0;
@@ -2618,11 +2661,9 @@ retry:
             }
             break;
         case QEMU_VM_COMMAND:
-            ret = loadvm_process_command(f);
+            ret = loadvm_process_command(f, errp);
             trace_qemu_loadvm_state_section_command(ret);
             if (ret < 0) {
-                error_setg(errp,
-                           "Failed to load device state command: %d", ret);
                 goto out;
             }
             if (ret == LOADVM_QUIT) {
-- 
2.29.2




* [PATCH 09/33] migration: push Error **errp into loadvm_handle_cmd_packaged()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (7 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 08/33] migration: push Error **errp into loadvm_process_command() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 10/33] migration: push Error **errp into loadvm_postcopy_handle_advise() Daniel P. Berrangé
                   ` (24 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 450c36994f..d9170b4364 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2121,18 +2121,18 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
  *   LOADVM_QUIT: success, but stop
  *   -1: error
  */
-static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
+static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis, Error **errp)
 {
     int ret;
     size_t length;
     QIOChannelBuffer *bioc;
-    Error *local_err = NULL;
 
     length = qemu_get_be32(mis->from_src_file);
     trace_loadvm_handle_cmd_packaged(length);
 
     if (length > MAX_VM_CMD_PACKAGED_SIZE) {
-        error_report("Unreasonably large packaged state: %zu", length);
+        error_setg(errp, "Unreasonably large packaged state: %zu > %d",
+                   length, MAX_VM_CMD_PACKAGED_SIZE);
         return -1;
     }
 
@@ -2143,20 +2143,17 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
                           length);
     if (ret != length) {
         object_unref(OBJECT(bioc));
-        error_report("CMD_PACKAGED: Buffer receive fail ret=%d length=%zu",
-                     ret, length);
-        return (ret < 0) ? ret : -EAGAIN;
+        error_setg(errp, "CMD_PACKAGED: Buffer receive fail ret=%d length=%zu",
+                   ret, length);
+        return -1;
     }
     bioc->usage += length;
     trace_loadvm_handle_cmd_packaged_received(ret);
 
     QEMUFile *packf = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
 
-    ret = qemu_loadvm_state_main(packf, mis, &local_err);
+    ret = qemu_loadvm_state_main(packf, mis, errp);
     trace_loadvm_handle_cmd_packaged_main(ret);
-    if (ret < 0) {
-        error_report_err(local_err);
-    }
     qemu_fclose(packf);
     object_unref(OBJECT(bioc));
 
@@ -2282,12 +2279,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         break;
 
     case MIG_CMD_PACKAGED:
-        ret = loadvm_handle_cmd_packaged(mis);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_handle_cmd_packaged(mis, errp);
 
     case MIG_CMD_POSTCOPY_ADVISE:
         ret = loadvm_postcopy_handle_advise(mis, len);
-- 
2.29.2




* [PATCH 10/33] migration: push Error **errp into loadvm_postcopy_handle_advise()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (8 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 09/33] migration: push Error **errp into loadvm_handle_cmd_packaged() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:21   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 11/33] migration: push Error **errp into ram_postcopy_incoming_init() Daniel P. Berrangé
                   ` (23 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 43 +++++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index d9170b4364..b0eb250d1c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1644,38 +1644,41 @@ enum LoadVMExitCodes {
  * quickly.
  */
 static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
-                                         uint16_t len)
+                                         uint16_t len,
+                                         Error **errp)
 {
     PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
     uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
-    Error *local_err = NULL;
 
     trace_loadvm_postcopy_handle_advise();
     if (ps != POSTCOPY_INCOMING_NONE) {
-        error_report("CMD_POSTCOPY_ADVISE in wrong postcopy state (%d)", ps);
+        error_setg(errp,
+                   "CMD_POSTCOPY_ADVISE in wrong postcopy state (%d)", ps);
         return -1;
     }
 
     switch (len) {
     case 0:
         if (migrate_postcopy_ram()) {
-            error_report("RAM postcopy is enabled but have 0 byte advise");
-            return -EINVAL;
+            error_setg(errp, "RAM postcopy is enabled but have 0 byte advise");
+            return -1;
         }
         return 0;
     case 8 + 8:
         if (!migrate_postcopy_ram()) {
-            error_report("RAM postcopy is disabled but have 16 byte advise");
-            return -EINVAL;
+            error_setg(errp,
+                       "RAM postcopy is disabled but have 16 byte advise");
+            return -1;
         }
         break;
     default:
-        error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
-        return -EINVAL;
+        error_setg(errp, "CMD_POSTCOPY_ADVISE invalid length (%d)", len);
+        return -1;
     }
 
     if (!postcopy_ram_supported_by_host(mis)) {
         postcopy_state_set(POSTCOPY_INCOMING_NONE);
+        error_setg(errp, "Postcopy RAM not supported by host");
         return -1;
     }
 
@@ -1697,9 +1700,9 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
          *      also fails when passed to an older qemu that doesn't
          *      do huge pages.
          */
-        error_report("Postcopy needs matching RAM page sizes (s=%" PRIx64
-                                                             " d=%" PRIx64 ")",
-                     remote_pagesize_summary, local_pagesize_summary);
+        error_setg(errp, "Postcopy needs matching RAM page sizes "
+                   "(s=%" PRIx64 " d=%" PRIx64 ")",
+                   remote_pagesize_summary, local_pagesize_summary);
         return -1;
     }
 
@@ -1709,17 +1712,18 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
          * Again, some differences could be dealt with, but for now keep it
          * simple.
          */
-        error_report("Postcopy needs matching target page sizes (s=%d d=%zd)",
-                     (int)remote_tps, qemu_target_page_size());
+        error_setg(errp,
+                   "Postcopy needs matching target page sizes (s=%d d=%zd)",
+                   (int)remote_tps, qemu_target_page_size());
         return -1;
     }
 
-    if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_ADVISE, &local_err)) {
-        error_report_err(local_err);
+    if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_ADVISE, errp)) {
         return -1;
     }
 
     if (ram_postcopy_incoming_init(mis)) {
+        error_setg(errp, "Postcopy RAM incoming init failed");
         return -1;
     }
 
@@ -2282,12 +2286,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return loadvm_handle_cmd_packaged(mis, errp);
 
     case MIG_CMD_POSTCOPY_ADVISE:
-        ret = loadvm_postcopy_handle_advise(mis, len);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_postcopy_handle_advise(mis, len, errp);
 
     case MIG_CMD_POSTCOPY_LISTEN:
         ret = loadvm_postcopy_handle_listen(mis);
-- 
2.29.2




* [PATCH 11/33] migration: push Error **errp into ram_postcopy_incoming_init()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (9 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 10/33] migration: push Error **errp into loadvm_postcopy_handle_advise() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 12/33] migration: push Error **errp into loadvm_postcopy_handle_listen() Daniel P. Berrangé
                   ` (22 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
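The init_range() change below threads errp through the iterator's
opaque pointer. A minimal sketch of that technique, using hypothetical
sketch_* names and a simplified error type (none of this is the real
RAMBlock iteration API), might look like:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an error object, for illustration only. */
typedef struct SketchError {
    const char *msg;
} SketchError;

typedef int (*SketchBlockFn)(size_t block_len, void *opaque);

/* Generic iterator: stops at the first callback that returns non-zero. */
int sketch_foreach_block(const size_t *lens, size_t n,
                         SketchBlockFn fn, void *opaque)
{
    for (size_t i = 0; i < n; i++) {
        int ret = fn(lens[i], opaque);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Callback: recovers the error pointer from the opaque argument. */
int sketch_init_range(size_t block_len, void *opaque)
{
    static SketchError discard_failed = { "failed to discard block" };
    SketchError **errp = opaque;

    /* Assumption for the sketch: a zero-length block counts as a failure. */
    if (block_len == 0) {
        if (errp != NULL) {
            assert(*errp == NULL);
            *errp = &discard_failed;
        }
        return -1;
    }
    return 0;
}

/* Entry point: passes errp as the opaque value instead of NULL. */
int sketch_incoming_init(const size_t *lens, size_t n, SketchError **errp)
{
    if (sketch_foreach_block(lens, n, sketch_init_range, errp)) {
        return -1;
    }
    return 0;
}
```

The iterator itself stays unaware of error objects; only the callback
and the entry point agree on what the opaque pointer carries.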
 migration/postcopy-ram.c | 8 ++++++--
 migration/postcopy-ram.h | 2 +-
 migration/ram.c          | 6 +++---
 migration/ram.h          | 2 +-
 migration/savevm.c       | 3 +--
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index ab482adef1..54b748757a 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -446,6 +446,7 @@ out:
  */
 static int init_range(RAMBlock *rb, void *opaque)
 {
+    Error **errp = opaque;
     const char *block_name = qemu_ram_get_idstr(rb);
     void *host_addr = qemu_ram_get_host_addr(rb);
     ram_addr_t offset = qemu_ram_get_offset(rb);
@@ -459,6 +460,8 @@ static int init_range(RAMBlock *rb, void *opaque)
      * (Precopy will just overwrite this data, so doesn't need the discard)
      */
     if (ram_discard_range(block_name, 0, length)) {
+        error_setg(errp, "failed to discard RAM block %s len=%zu",
+                   block_name, length);
         return -1;
     }
 
@@ -507,9 +510,10 @@ static int cleanup_range(RAMBlock *rb, void *opaque)
  * postcopy later; must be called prior to any precopy.
  * called from arch_init's similarly named ram_postcopy_incoming_init
  */
-int postcopy_ram_incoming_init(MigrationIncomingState *mis)
+int postcopy_ram_incoming_init(MigrationIncomingState *mis,
+                               Error **errp)
 {
-    if (foreach_not_ignored_block(init_range, NULL)) {
+    if (foreach_not_ignored_block(init_range, errp)) {
         return -1;
     }
 
diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h
index 6d2b3cf124..7458ac1199 100644
--- a/migration/postcopy-ram.h
+++ b/migration/postcopy-ram.h
@@ -27,7 +27,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis);
  * postcopy later; must be called prior to any precopy.
  * called from ram.c's similarly named ram_postcopy_incoming_init
  */
-int postcopy_ram_incoming_init(MigrationIncomingState *mis);
+int postcopy_ram_incoming_init(MigrationIncomingState *mis, Error **errp);
 
 /*
  * At the end of a migration where postcopy_ram_incoming_init was called.
diff --git a/migration/ram.c b/migration/ram.c
index 7811cde643..f6180e8f4f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3156,7 +3156,7 @@ static int ram_load_cleanup(void *opaque)
 /**
  * ram_postcopy_incoming_init: allocate postcopy data structures
  *
- * Returns 0 for success and negative if there was one error
+ * Returns 0 for success and -1 if there was an error
  *
  * @mis: current migration incoming state
  *
@@ -3164,9 +3164,9 @@ static int ram_load_cleanup(void *opaque)
  * postcopy-ram. postcopy-ram's similarly names
  * postcopy_ram_incoming_init does the work.
  */
-int ram_postcopy_incoming_init(MigrationIncomingState *mis)
+int ram_postcopy_incoming_init(MigrationIncomingState *mis, Error **errp)
 {
-    return postcopy_ram_incoming_init(mis);
+    return postcopy_ram_incoming_init(mis, errp);
 }
 
 /**
diff --git a/migration/ram.h b/migration/ram.h
index 011e85414e..1cea36ba51 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -61,7 +61,7 @@ void ram_postcopy_migrated_memory_release(MigrationState *ms);
 int ram_postcopy_send_discard_bitmap(MigrationState *ms);
 /* For incoming postcopy discard */
 int ram_discard_range(const char *block_name, uint64_t start, size_t length);
-int ram_postcopy_incoming_init(MigrationIncomingState *mis);
+int ram_postcopy_incoming_init(MigrationIncomingState *mis, Error **errp);
 
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
diff --git a/migration/savevm.c b/migration/savevm.c
index b0eb250d1c..c505526406 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1722,8 +1722,7 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
         return -1;
     }
 
-    if (ram_postcopy_incoming_init(mis)) {
-        error_setg(errp, "Postcopy RAM incoming init failed");
+    if (ram_postcopy_incoming_init(mis, errp)) {
         return -1;
     }
 
-- 
2.29.2




* [PATCH 12/33] migration: push Error **errp into loadvm_postcopy_handle_listen()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (10 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 11/33] migration: push Error **errp into ram_postcopy_incoming_init() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:23   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 13/33] migration: push Error **errp into loadvm_postcopy_handle_run() Daniel P. Berrangé
                   ` (21 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index c505526406..447596383f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1909,14 +1909,15 @@ static void *postcopy_ram_listen_thread(void *opaque)
 }
 
 /* After this message we must be able to immediately receive postcopy data */
-static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
+static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis,
+                                         Error **errp)
 {
     PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_LISTENING);
     trace_loadvm_postcopy_handle_listen();
-    Error *local_err = NULL;
 
     if (ps != POSTCOPY_INCOMING_ADVISE && ps != POSTCOPY_INCOMING_DISCARD) {
-        error_report("CMD_POSTCOPY_LISTEN in wrong postcopy state (%d)", ps);
+        error_setg(errp,
+                   "CMD_POSTCOPY_LISTEN in wrong postcopy state (%d)", ps);
         return -1;
     }
     if (ps == POSTCOPY_INCOMING_ADVISE) {
@@ -1937,12 +1938,12 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
     if (migrate_postcopy_ram()) {
         if (postcopy_ram_incoming_setup(mis)) {
             postcopy_ram_incoming_cleanup(mis);
+            error_setg(errp, "Failed to setup incoming postcopy RAM blocks");
             return -1;
         }
     }
 
-    if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, &local_err)) {
-        error_report_err(local_err);
+    if (postcopy_notify(POSTCOPY_NOTIFY_INBOUND_LISTEN, errp)) {
         return -1;
     }
 
@@ -2288,12 +2289,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return loadvm_postcopy_handle_advise(mis, len, errp);
 
     case MIG_CMD_POSTCOPY_LISTEN:
-        ret = loadvm_postcopy_handle_listen(mis);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_postcopy_handle_listen(mis, errp);
 
     case MIG_CMD_POSTCOPY_RUN:
         ret = loadvm_postcopy_handle_run(mis);
-- 
2.29.2




* [PATCH 13/33] migration: push Error **errp into loadvm_postcopy_handle_run()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (11 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 12/33] migration: push Error **errp into loadvm_postcopy_handle_listen() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:23   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 14/33] migration: push Error **errp into loadvm_postcopy_ram_handle_discard() Daniel P. Berrangé
                   ` (20 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 447596383f..fa7883ae5e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1998,13 +1998,13 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
 }
 
 /* After all discards we can start running and asking for pages */
-static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
+static int loadvm_postcopy_handle_run(MigrationIncomingState *mis, Error **errp)
 {
     PostcopyState ps = postcopy_state_get();
 
     trace_loadvm_postcopy_handle_run();
     if (ps != POSTCOPY_INCOMING_LISTENING) {
-        error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
+        error_setg(errp, "CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
         return -1;
     }
 
@@ -2292,12 +2292,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return loadvm_postcopy_handle_listen(mis, errp);
 
     case MIG_CMD_POSTCOPY_RUN:
-        ret = loadvm_postcopy_handle_run(mis);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_postcopy_handle_run(mis, errp);
 
     case MIG_CMD_POSTCOPY_RAM_DISCARD:
         ret = loadvm_postcopy_ram_handle_discard(mis, len);
-- 
2.29.2
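
The conversion pattern used in this patch can be sketched outside QEMU as
follows. This is only an illustration under stated assumptions: DemoError,
demo_error_set() and the demo_* functions are hypothetical stand-ins, not
QEMU's Error API. The callee describes the failure and returns -1; the caller
forwards both the errp pointer and the return value without adding a second,
less specific message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Store a formatted message in *errp. A NULL errp means "caller ignores it". */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL); /* never overwrite an error that is already set */
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

/* Hypothetical states, loosely modelled on the postcopy state machine. */
enum demo_state { DEMO_ADVISE, DEMO_LISTENING, DEMO_RUNNING };

/* Callee: describes the failure itself and returns -1. */
static int demo_handle_run(enum demo_state ps, DemoError **errp)
{
    if (ps != DEMO_LISTENING) {
        demo_error_set(errp, "CMD_POSTCOPY_RUN in wrong postcopy state (%d)",
                       (int)ps);
        return -1;
    }
    return 0;
}

/* Caller: forwards errp and the return value, adding no message of its own. */
static int demo_process_command(enum demo_state ps, DemoError **errp)
{
    return demo_handle_run(ps, errp);
}
```

A caller that wants the message passes the address of a NULL-initialised
DemoError pointer and frees it after use; a caller that does not care passes
NULL.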




* [PATCH 14/33] migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (12 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 13/33] migration: push Error **errp into loadvm_postcopy_handle_run() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:24   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 15/33] migration: make loadvm_postcopy_handle_resume() void Daniel P. Berrangé
                   ` (19 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index fa7883ae5e..2216c61c6f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1735,7 +1735,8 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
  * There can be 0..many of these messages, each encoding multiple pages.
  */
 static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
-                                              uint16_t len)
+                                              uint16_t len,
+                                              Error **errp)
 {
     int tmp;
     char ramid[256];
@@ -1748,7 +1749,8 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
         /* 1st discard */
         tmp = postcopy_ram_prepare_discard(mis);
         if (tmp) {
-            return tmp;
+            error_setg(errp, "Failed to prepare for RAM discard: %d", tmp);
+            return -1;
         }
         break;
 
@@ -1757,8 +1759,9 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
         break;
 
     default:
-        error_report("CMD_POSTCOPY_RAM_DISCARD in wrong postcopy state (%d)",
-                     ps);
+        error_setg(errp,
+                   "CMD_POSTCOPY_RAM_DISCARD in wrong postcopy state (%d)",
+                   ps);
         return -1;
     }
     /* We're expecting a
@@ -1767,29 +1770,29 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
      *    then at least 1 16 byte chunk
     */
     if (len < (1 + 1 + 1 + 1 + 2 * 8)) {
-        error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
+        error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
         return -1;
     }
 
     tmp = qemu_get_byte(mis->from_src_file);
     if (tmp != postcopy_ram_discard_version) {
-        error_report("CMD_POSTCOPY_RAM_DISCARD invalid version (%d)", tmp);
+        error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD invalid version (%d)", tmp);
         return -1;
     }
 
     if (!qemu_get_counted_string(mis->from_src_file, ramid)) {
-        error_report("CMD_POSTCOPY_RAM_DISCARD Failed to read RAMBlock ID");
+        error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD Failed to read RAMBlock ID");
         return -1;
     }
     tmp = qemu_get_byte(mis->from_src_file);
     if (tmp != 0) {
-        error_report("CMD_POSTCOPY_RAM_DISCARD missing nil (%d)", tmp);
+        error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD missing nil (%d)", tmp);
         return -1;
     }
 
     len -= 3 + strlen(ramid);
     if (len % 16) {
-        error_report("CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
+        error_setg(errp, "CMD_POSTCOPY_RAM_DISCARD invalid length (%d)", len);
         return -1;
     }
     trace_loadvm_postcopy_ram_handle_discard_header(ramid, len);
@@ -1801,7 +1804,8 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
         len -= 16;
         int ret = ram_discard_range(ramid, start_addr, block_length);
         if (ret) {
-            return ret;
+            error_setg(errp, "Failed to discard RAM range %s: %d", ramid, ret);
+            return -1;
         }
     }
     trace_loadvm_postcopy_ram_handle_discard_end();
@@ -2295,12 +2299,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return loadvm_postcopy_handle_run(mis, errp);
 
     case MIG_CMD_POSTCOPY_RAM_DISCARD:
-        ret = loadvm_postcopy_ram_handle_discard(mis, len);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_postcopy_ram_handle_discard(mis, len, errp);
 
     case MIG_CMD_POSTCOPY_RESUME:
         ret = loadvm_postcopy_handle_resume(mis);
-- 
2.29.2
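
The length checks in the discard handler can be sketched in isolation. The
code below is a simplified illustration, not the handler itself: DemoError,
demo_error_set() and demo_discard_chunks() are hypothetical names, the stream
reads are replaced by a ramid argument, and the "header > len" guard is an
extra assumption that is not part of the patch. It mirrors the arithmetic in
the diff: a minimum of 1 + 1 + 1 + 1 + 2 * 8 bytes, then 3 + strlen(ramid)
bytes of header, then a whole number of 16 byte chunks.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Store a formatted message in *errp. A NULL errp means "caller ignores it". */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

/*
 * Return the number of 16 byte (start, length) chunks in a discard payload
 * of 'len' bytes for RAM block 'ramid', or -1 with *errp set.
 */
static int demo_discard_chunks(uint16_t len, const char *ramid,
                               DemoError **errp)
{
    /* version byte + string length byte + name + nil terminator */
    size_t header = 3 + strlen(ramid);

    if (len < 1 + 1 + 1 + 1 + 2 * 8) {
        demo_error_set(errp, "CMD_POSTCOPY_RAM_DISCARD invalid length (%d)",
                       len);
        return -1;
    }
    /* Extra guard (an assumption, not in the patch): avoid wrapping len. */
    if (header > len) {
        demo_error_set(errp, "CMD_POSTCOPY_RAM_DISCARD header too long (%d)",
                       len);
        return -1;
    }
    len -= (uint16_t)header;
    if (len % 16) {
        demo_error_set(errp, "CMD_POSTCOPY_RAM_DISCARD invalid length (%d)",
                       len);
        return -1;
    }
    return len / 16;
}
```

As in the diff, the second "invalid length" message reports the length that
remains after the header has been subtracted, not the original payload
length.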




* [PATCH 15/33] migration: make loadvm_postcopy_handle_resume() void
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (13 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 14/33] migration: push Error **errp into loadvm_postcopy_ram_handle_discard() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 16/33] migration: push Error **errp into loadvm_handle_recv_bitmap() Daniel P. Berrangé
                   ` (18 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 2216c61c6f..041175162a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2067,12 +2067,12 @@ static void migrate_send_rp_req_pages_pending(MigrationIncomingState *mis)
     }
 }
 
-static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
+static void loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
 {
     if (mis->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
         error_report("%s: illegal resume received", __func__);
         /* Don't fail the load, only for this. */
-        return 0;
+        return;
     }
 
     /*
@@ -2113,8 +2113,6 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
      * migrate_send_rp_message_req_pages() is not thread safe, yet.
      */
     qemu_sem_post(&mis->postcopy_pause_sem_fault);
-
-    return 0;
 }
 
 /**
@@ -2302,12 +2300,8 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return loadvm_postcopy_ram_handle_discard(mis, len, errp);
 
     case MIG_CMD_POSTCOPY_RESUME:
-        ret = loadvm_postcopy_handle_resume(mis);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        loadvm_postcopy_handle_resume(mis);
+        return 0;
 
     case MIG_CMD_RECV_BITMAP:
         ret = loadvm_handle_recv_bitmap(mis, len);
-- 
2.29.2




* [PATCH 16/33] migration: push Error **errp into loadvm_handle_recv_bitmap()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (14 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 15/33] migration: make loadvm_postcopy_handle_resume() void Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 17/33] migration: push Error **errp into loadvm_process_enable_colo() Daniel P. Berrangé
                   ` (17 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 041175162a..b41c812188 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2173,7 +2173,8 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis, Error **errp)
  * len (1 byte) + ramblock_name (<255 bytes)
  */
 static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
-                                     uint16_t len)
+                                     uint16_t len,
+                                     Error **errp)
 {
     QEMUFile *file = mis->from_src_file;
     RAMBlock *rb;
@@ -2182,24 +2183,26 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
 
     cnt = qemu_get_counted_string(file, block_name);
     if (!cnt) {
-        error_report("%s: failed to read block name", __func__);
-        return -EINVAL;
+        error_setg(errp, "%s: failed to read block name", __func__);
+        return -1;
     }
 
     /* Validate before using the data */
     if (qemu_file_get_error(file)) {
-        return qemu_file_get_error(file);
+        error_setg(errp, "migration stream has error: %d",
+                   qemu_file_get_error(file));
+        return -1;
     }
 
     if (len != cnt + 1) {
-        error_report("%s: invalid payload length (%d)", __func__, len);
-        return -EINVAL;
+        error_setg(errp, "%s: invalid payload length (%d)", __func__, len);
+        return -1;
     }
 
     rb = qemu_ram_block_by_name(block_name);
     if (!rb) {
-        error_report("%s: block '%s' not found", __func__, block_name);
-        return -EINVAL;
+        error_setg(errp, "%s: block '%s' not found", __func__, block_name);
+        return -1;
     }
 
     migrate_send_rp_recv_bitmap(mis, block_name);
@@ -2304,12 +2307,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return 0;
 
     case MIG_CMD_RECV_BITMAP:
-        ret = loadvm_handle_recv_bitmap(mis, len);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_handle_recv_bitmap(mis, len, errp);
 
     case MIG_CMD_ENABLE_COLO:
         ret = loadvm_process_enable_colo(mis);
-- 
2.29.2




* [PATCH 17/33] migration: push Error **errp into loadvm_process_enable_colo()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (15 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 16/33] migration: push Error **errp into loadvm_handle_recv_bitmap() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 18/33] migration: push Error **errp into colo_init_ram_cache() Daniel P. Berrangé
                   ` (16 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 include/migration/colo.h |  2 +-
 migration/migration.c    |  6 +++---
 migration/savevm.c       | 25 +++++++++++--------------
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 768e1f04c3..1d38191360 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -25,7 +25,7 @@ void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
 /* loadvm */
-int migration_incoming_enable_colo(void);
+int migration_incoming_enable_colo(Error **errp);
 void migration_incoming_disable_colo(void);
 bool migration_incoming_colo_enabled(void);
 void *colo_process_incoming_thread(void *opaque);
diff --git a/migration/migration.c b/migration/migration.c
index 287a18d269..b9cf56e61f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -397,11 +397,11 @@ void migration_incoming_disable_colo(void)
     migration_colo_enabled = false;
 }
 
-int migration_incoming_enable_colo(void)
+int migration_incoming_enable_colo(Error **errp)
 {
     if (ram_block_discard_disable(true)) {
-        error_report("COLO: cannot disable RAM discard");
-        return -EBUSY;
+        error_setg(errp, "COLO: cannot disable RAM discard");
+        return -1;
     }
     migration_colo_enabled = true;
     return 0;
diff --git a/migration/savevm.c b/migration/savevm.c
index b41c812188..c59e76b478 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2212,15 +2212,18 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
     return 0;
 }
 
-static int loadvm_process_enable_colo(MigrationIncomingState *mis)
+static int loadvm_process_enable_colo(MigrationIncomingState *mis,
+                                      Error **errp)
 {
-    int ret = migration_incoming_enable_colo();
+    int ret;
+    if (migration_incoming_enable_colo(errp) < 0) {
+        return -1;
+    }
 
-    if (!ret) {
-        ret = colo_init_ram_cache();
-        if (ret) {
-            migration_incoming_disable_colo();
-        }
+    ret = colo_init_ram_cache();
+    if (ret < 0) {
+        error_setg(errp, "failed to init colo RAM cache: %d", ret);
+        migration_incoming_disable_colo();
     }
     return ret;
 }
@@ -2237,7 +2240,6 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
     uint16_t cmd;
     uint16_t len;
     uint32_t tmp32;
-    int ret;
 
     cmd = qemu_get_be16(f);
     len = qemu_get_be16(f);
@@ -2310,12 +2312,7 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
         return loadvm_handle_recv_bitmap(mis, len, errp);
 
     case MIG_CMD_ENABLE_COLO:
-        ret = loadvm_process_enable_colo(mis);
-        if (ret < 0) {
-            error_setg(errp, "Failed to load device state command: %d", ret);
-            return -1;
-        }
-        return ret;
+        return loadvm_process_enable_colo(mis, errp);
     }
 
     return 0;
-- 
2.29.2
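
The enable / initialise / roll back sequence in loadvm_process_enable_colo()
can be sketched on its own. Everything below is a hypothetical stand-in
(DemoError, demo_error_set() and the demo_* functions are not QEMU code, and
the boolean parameters only simulate failures). The point is the ordering:
the first step reports its own error, and a failure in the second step undoes
the first before returning.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Store a formatted message in *errp. A NULL errp means "caller ignores it". */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

static bool demo_colo_enabled;

/* Step 1: flip the global flag unless the (simulated) precondition fails. */
static int demo_enable(bool discard_busy, DemoError **errp)
{
    if (discard_busy) {
        demo_error_set(errp, "COLO: cannot disable RAM discard");
        return -1;
    }
    demo_colo_enabled = true;
    return 0;
}

static void demo_disable(void)
{
    demo_colo_enabled = false;
}

/* Step 2: simulated cache setup that may fail. */
static int demo_init_cache(bool fail, DemoError **errp)
{
    if (fail) {
        demo_error_set(errp, "failed to init colo RAM cache");
        return -1;
    }
    return 0;
}

/* Run both steps; undo step 1 if step 2 fails. */
static int demo_process_enable(bool discard_busy, bool cache_fails,
                               DemoError **errp)
{
    if (demo_enable(discard_busy, errp) < 0) {
        return -1;
    }
    if (demo_init_cache(cache_fails, errp) < 0) {
        demo_disable();
        return -1;
    }
    return 0;
}
```

Because each step sets the error exactly once, the caller never sees a
generic message stacked on top of the specific one.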




* [PATCH 18/33] migration: push Error **errp into colo_init_ram_cache()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (16 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 17/33] migration: push Error **errp into loadvm_process_enable_colo() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 19/33] migration: push Error **errp into check_section_footer() Daniel P. Berrangé
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/ram.c    | 11 ++++++-----
 migration/ram.h    |  2 +-
 migration/savevm.c |  8 +++-----
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f6180e8f4f..0b8c5f3c86 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3028,7 +3028,7 @@ static void colo_init_ram_state(void)
  * memory of the secondary VM, it is need to hold the global lock
  * to call this helper.
  */
-int colo_init_ram_cache(void)
+int colo_init_ram_cache(Error **errp)
 {
     RAMBlock *block;
 
@@ -3038,16 +3038,17 @@ int colo_init_ram_cache(void)
                                                     NULL,
                                                     false);
             if (!block->colo_cache) {
-                error_report("%s: Can't alloc memory for COLO cache of block %s,"
-                             "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
-                             block->used_length);
+                error_setg_errno(errp, errno,
+                                 "%s: Can't alloc memory for COLO cache of block %s,"
+                                 "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+                                 block->used_length);
                 RAMBLOCK_FOREACH_NOT_IGNORED(block) {
                     if (block->colo_cache) {
                         qemu_anon_ram_free(block->colo_cache, block->used_length);
                         block->colo_cache = NULL;
                     }
                 }
-                return -errno;
+                return -1;
             }
         }
     }
diff --git a/migration/ram.h b/migration/ram.h
index 1cea36ba51..88b0b6636b 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -74,7 +74,7 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
 int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 
 /* ram cache */
-int colo_init_ram_cache(void);
+int colo_init_ram_cache(Error **errp);
 void colo_flush_ram_cache(void);
 void colo_release_ram_cache(void);
 void colo_incoming_start_dirty_log(void);
diff --git a/migration/savevm.c b/migration/savevm.c
index c59e76b478..ace76e32f7 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2215,17 +2215,15 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
 static int loadvm_process_enable_colo(MigrationIncomingState *mis,
                                       Error **errp)
 {
-    int ret;
     if (migration_incoming_enable_colo(errp) < 0) {
         return -1;
     }
 
-    ret = colo_init_ram_cache();
-    if (ret < 0) {
-        error_setg(errp, "failed to init colo RAM cache: %d", ret);
+    if (colo_init_ram_cache(errp) < 0) {
         migration_incoming_disable_colo();
+        return -1;
     }
-    return ret;
+    return 0;
 }
 
 /*
-- 
2.29.2
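
One detail of the ram.c hunk is that errno is now recorded before the cleanup
loop runs, whereas the old code returned -errno after it. A minimal sketch of
why the ordering matters, using hypothetical stand-ins only (DemoError,
demo_error_set_errno(), demo_cleanup() and demo_alloc_cache() are not QEMU
functions, and the cleanup step clobbers errno deliberately to make the
effect visible):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Record 'what' followed by the text for os_errno, captured by the caller. */
static void demo_error_set_errno(DemoError **errp, int os_errno,
                                 const char *what)
{
    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s: %s",
             what, strerror(os_errno));
}

/* Simulated cleanup that overwrites errno, as real cleanup calls may. */
static void demo_cleanup(void)
{
    errno = 0;
}

/* Fail with 'simulated_errno' if it is non-zero, otherwise succeed. */
static int demo_alloc_cache(int simulated_errno, DemoError **errp)
{
    if (simulated_errno) {
        errno = simulated_errno;
        /* Capture errno first; cleanup below may change it. */
        demo_error_set_errno(errp, errno, "Can't alloc memory for cache");
        demo_cleanup();
        return -1;
    }
    return 0;
}
```

After a failure the message still names the original cause even though errno
itself no longer does.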




* [PATCH 19/33] migration: push Error **errp into check_section_footer()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (17 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 18/33] migration: push Error **errp into colo_init_ram_cache() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-05 16:26   ` Philippe Mathieu-Daudé
  2021-02-04 17:18 ` [PATCH 20/33] migration: push Error **errp into global_state_store() Daniel P. Berrangé
                   ` (14 subsequent siblings)
  33 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index ace76e32f7..289a3d55bb 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2320,9 +2320,9 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
  * Read a footer off the wire and check that it matches the expected section
  *
  * Returns: true if the footer was good
- *          false if there is a problem (and calls error_report to say why)
+ *          false if there is a problem
  */
-static bool check_section_footer(QEMUFile *f, SaveStateEntry *se)
+static bool check_section_footer(QEMUFile *f, SaveStateEntry *se, Error **errp)
 {
     int ret;
     uint8_t read_mark;
@@ -2337,21 +2337,21 @@ static bool check_section_footer(QEMUFile *f, SaveStateEntry *se)
 
     ret = qemu_file_get_error(f);
     if (ret) {
-        error_report("%s: Read section footer failed: %d",
-                     __func__, ret);
+        error_setg(errp, "read section footer failed: %d",
+                   ret);
         return false;
     }
 
     if (read_mark != QEMU_VM_SECTION_FOOTER) {
-        error_report("Missing section footer for %s", se->idstr);
+        error_setg(errp, "Missing section footer for %s", se->idstr);
         return false;
     }
 
     read_section_id = qemu_get_be32(f);
     if (read_section_id != se->load_section_id) {
-        error_report("Mismatched section id in footer for %s -"
-                     " read 0x%x expected 0x%x",
-                     se->idstr, read_section_id, se->load_section_id);
+        error_setg(errp, "Mismatched section id in footer for %s -"
+                   " read 0x%x expected 0x%x",
+                   se->idstr, read_section_id, se->load_section_id);
         return false;
     }
 
@@ -2418,8 +2418,7 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis,
                    " device '%s'", instance_id, idstr);
         return -1;
     }
-    if (!check_section_footer(f, se)) {
-        error_setg(errp, "failed check for device state section footer");
+    if (!check_section_footer(f, se, errp)) {
         return -1;
     }
 
@@ -2460,8 +2459,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis,
                    section_id, se->idstr);
         return -1;
     }
-    if (!check_section_footer(f, se)) {
-        error_setg(errp, "failed check for device state section footer");
+    if (!check_section_footer(f, se, errp)) {
         return -1;
     }
 
-- 
2.29.2
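
The footer check keeps its bool return and gains an errp parameter. A
self-contained sketch of that shape follows, reading from a byte buffer
instead of a QEMUFile. All names are hypothetical (DemoError,
demo_error_set(), demo_check_footer()), and the marker value 0x7e is an
assumption made for the example; the patch itself only refers to
QEMU_VM_SECTION_FOOTER by name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Store a formatted message in *errp. A NULL errp means "caller ignores it". */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

#define DEMO_SECTION_FOOTER 0x7e /* assumed marker value for this sketch */

/*
 * Check a 5 byte footer: one marker byte followed by a big-endian 32 bit
 * section id. Returns true if it matches, false with *errp set otherwise.
 */
static bool demo_check_footer(const uint8_t *buf, size_t len,
                              uint32_t expected_id, DemoError **errp)
{
    uint32_t id;

    if (len < 5) {
        demo_error_set(errp, "read section footer failed: short read");
        return false;
    }
    if (buf[0] != DEMO_SECTION_FOOTER) {
        demo_error_set(errp, "Missing section footer");
        return false;
    }
    id = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
         ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
    if (id != expected_id) {
        demo_error_set(errp, "Mismatched section id in footer -"
                       " read 0x%x expected 0x%x",
                       (unsigned)id, (unsigned)expected_id);
        return false;
    }
    return true;
}
```

With the message set inside the check, a caller only needs to test the bool
and return, which is what the two call sites in the diff now do.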




* [PATCH 20/33] migration: push Error **errp into global_state_store()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (18 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 19/33] migration: push Error **errp into check_section_footer() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 21/33] migration: remove error reporting from qemu_fopen_bdrv() callers Daniel P. Berrangé
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 include/migration/global_state.h | 2 +-
 migration/global_state.c         | 6 +++---
 migration/migration.c            | 8 ++++++--
 migration/savevm.c               | 5 ++---
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/migration/global_state.h b/include/migration/global_state.h
index 945eb35d5b..eeade88ef8 100644
--- a/include/migration/global_state.h
+++ b/include/migration/global_state.h
@@ -16,7 +16,7 @@
 #include "qapi/qapi-types-run-state.h"
 
 void register_global_state(void);
-int global_state_store(void);
+int global_state_store(Error **errp);
 void global_state_store_running(void);
 bool global_state_received(void);
 RunState global_state_get_runstate(void);
diff --git a/migration/global_state.c b/migration/global_state.c
index a33947ca32..36fda38aad 100644
--- a/migration/global_state.c
+++ b/migration/global_state.c
@@ -29,13 +29,13 @@ typedef struct {
 
 static GlobalState global_state;
 
-int global_state_store(void)
+int global_state_store(Error **errp)
 {
     if (!runstate_store((char *)global_state.runstate,
                         sizeof(global_state.runstate))) {
-        error_report("runstate name too big: %s", global_state.runstate);
+        error_setg(errp, "runstate name too big: %s", global_state.runstate);
         trace_migrate_state_too_big();
-        return -EINVAL;
+        return -1;
     }
     return 0;
 }
diff --git a/migration/migration.c b/migration/migration.c
index b9cf56e61f..395a1b10f5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2811,6 +2811,7 @@ static int postcopy_start(MigrationState *ms)
     int64_t bandwidth = migrate_max_postcopy_bandwidth();
     bool restart_block = false;
     int cur_state = MIGRATION_STATUS_ACTIVE;
+    Error *local_err = NULL;
     if (!migrate_pause_before_switchover()) {
         migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -2821,9 +2822,10 @@ static int postcopy_start(MigrationState *ms)
     trace_postcopy_start_set_run();
 
     qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
-    global_state_store();
+    global_state_store(&local_err);
     ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
     if (ret < 0) {
+        error_report_err(local_err);
         goto fail;
     }
 
@@ -3030,11 +3032,12 @@ static void migration_completion(MigrationState *s)
     int current_active_state = s->state;
 
     if (s->state == MIGRATION_STATUS_ACTIVE) {
+        Error *local_err = NULL;
         qemu_mutex_lock_iothread();
         s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
         s->vm_was_running = runstate_is_running();
-        ret = global_state_store();
+        ret = global_state_store(&local_err);
 
         if (!ret) {
             bool inactivate = !migrate_colo_enabled();
@@ -3055,6 +3058,7 @@ static void migration_completion(MigrationState *s)
         qemu_mutex_unlock_iothread();
 
         if (ret < 0) {
+            error_report_err(local_err);
             goto fail;
         }
     } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
diff --git a/migration/savevm.c b/migration/savevm.c
index 289a3d55bb..c18b7e6033 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2828,9 +2828,8 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 
     saved_vm_running = runstate_is_running();
 
-    ret = global_state_store();
-    if (ret) {
-        error_setg(errp, "Error saving global state");
+    ret = global_state_store(errp);
+    if (ret < 0) {
         return false;
     }
     vm_stop(RUN_STATE_SAVE_VM);
-- 
2.29.2
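
Where a caller keeps the Error in a local variable, it helps to report it on
the same path that set it. In the postcopy_start() hunk the Error is filled
in by global_state_store() but reported when vm_stop_force_state() fails, so
the two are not tied together. The sketch below shows one way to pair them,
assuming a report helper that must not be handed NULL. All names are
hypothetical stand-ins (DemoError, demo_error_set(), demo_error_report(),
demo_store(), demo_stop_vm(), demo_start()), and failing early on a store
error is a choice made for illustration, not something the patch decides.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[160];
} DemoError;

/* Store a formatted message in *errp. A NULL errp means "caller ignores it". */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

static int demo_reported; /* counts reports, for demonstration only */

/* Print and free an error. The caller must pass a non-NULL error. */
static void demo_error_report(DemoError *err)
{
    assert(err != NULL);
    fprintf(stderr, "%s\n", err->msg);
    free(err);
    demo_reported++;
}

/* Simulated step that reports failure through errp. */
static int demo_store(bool fail, DemoError **errp)
{
    if (fail) {
        demo_error_set(errp, "runstate name too big");
        return -1;
    }
    return 0;
}

/* Simulated step that only returns a code and has no Error to offer. */
static int demo_stop_vm(bool fail)
{
    return fail ? -1 : 0;
}

static int demo_start(bool store_fails, bool stop_fails)
{
    DemoError *local_err = NULL;

    if (demo_store(store_fails, &local_err) < 0) {
        demo_error_report(local_err); /* set on this path, safe to report */
        return -1;
    }
    if (demo_stop_vm(stop_fails) < 0) {
        /* local_err is still NULL here; there is nothing to report. */
        return -1;
    }
    return 0;
}
```

Pairing each report with the call that produced the error also avoids
leaking an Error that was set on a path which later succeeds.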




* [PATCH 21/33] migration: remove error reporting from qemu_fopen_bdrv() callers
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (19 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 20/33] migration: push Error **errp into global_state_store() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 22/33] migration: push Error **errp into qemu_savevm_state_iterate() Daniel P. Berrangé
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This method cannot fail since it merely allocates a single struct, so
the only possible failure (ENOMEM) will cause an abort() already.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index c18b7e6033..6a7b930b1c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2861,10 +2861,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 
     /* save the VM state */
     f = qemu_fopen_bdrv(bs, 1);
-    if (!f) {
-        error_setg(errp, "Could not open VM state file");
-        goto the_end;
-    }
+
     ret = qemu_savevm_state(f, errp);
     vm_state_size = qemu_ftell(f);
     ret2 = qemu_fclose(f);
@@ -3041,10 +3038,6 @@ bool load_snapshot(const char *name, const char *vmstate,
 
     /* restore the VM state */
     f = qemu_fopen_bdrv(bs_vm_state, 0);
-    if (!f) {
-        error_setg(errp, "Could not open VM state file");
-        goto err_drain;
-    }
 
     qemu_system_reset(SHUTDOWN_CAUSE_NONE);
     mis->from_src_file = f;
-- 
2.29.2
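
The reasoning in the commit message (allocation failure aborts, so the
function cannot return NULL) can be illustrated with a small sketch. The
names are hypothetical: demo_xmalloc0() stands in for an allocator that
aborts on out-of-memory, and DemoFile / demo_fopen() stand in for a
constructor that does nothing except allocate and fill in one struct.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical file wrapper holding only what the constructor is given. */
typedef struct DemoFile {
    void *backend;
    int writable;
} DemoFile;

/* Allocate zeroed memory; abort instead of returning NULL. */
static void *demo_xmalloc0(size_t size)
{
    void *p = calloc(1, size);

    if (!p) {
        abort();
    }
    return p;
}

/* Cannot fail: the only failure mode (out of memory) aborts above. */
static DemoFile *demo_fopen(void *backend, int writable)
{
    DemoFile *f = demo_xmalloc0(sizeof(*f));

    f->backend = backend;
    f->writable = writable;
    return f;
}
```

Given such a constructor, a "if (!f)" branch at the call site is dead code,
which is why the callers in the diff drop it.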




* [PATCH 22/33] migration: push Error **errp into qemu_savevm_state_iterate()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (20 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 21/33] migration: remove error reporting from qemu_fopen_bdrv() callers Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 23/33] migration: simplify some error reporting in save_snapshot() Daniel P. Berrangé
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
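Note: a minimal sketch of the three-way return convention this patch
gives the iterate call (negative: error set, 0: call again, positive:
finished) and of a caller loop that honours it. The Error struct,
set_error(), iterate_once(), run_to_completion() and the two demo steps
are hypothetical stand-ins, not QEMU's real Error API.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Error {
    char msg[128];
} Error;

/* Store a formatted message in *errp; a NULL errp means "ignore errors". */
static void set_error(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);      /* never overwrite an error already set */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Hypothetical worker: >0 finished, 0 call again, <0 is -errno. */
typedef int (*step_fn)(void *opaque);

/* Returns -1 with *errp set, 0 if more work remains, 1 when finished. */
static int iterate_once(step_fn step, void *opaque, Error **errp)
{
    int ret = step(opaque);

    if (ret < 0) {
        set_error(errp, "step failed: %s", strerror(-ret));
        return -1;
    }
    return ret > 0 ? 1 : 0;
}

/* Caller loop: stop on error, stop on completion, otherwise go again. */
static int run_to_completion(step_fn step, void *opaque, Error **errp)
{
    for (;;) {
        int ret = iterate_once(step, opaque, errp);

        if (ret < 0) {
            return -1;
        }
        if (ret > 0) {
            return 0;
        }
    }
}

/* Demo step: counts down to zero, then reports completion. */
static int countdown_step(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}

/* Demo step that always fails with -EIO. */
static int failing_step(void *opaque)
{
    (void)opaque;
    return -EIO;
}
```

The point of the sketch is that the error case leaves the loop at once
with the message attached to errp, so the caller never has to infer
failure from side state.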
 migration/migration.c |  8 +++++++-
 migration/savevm.c    | 47 ++++++++++++++++++++++++++-----------------
 migration/savevm.h    |  2 +-
 3 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 395a1b10f5..a85d101ad8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3393,6 +3393,8 @@ static MigIterateState migration_iteration_run(MigrationState *s)
                           pend_pre, pend_compat, pend_post);
 
     if (pending_size && pending_size >= s->threshold_size) {
+        int ret;
+        Error *local_err = NULL;
         /* Still a significant amount to transfer */
         if (!in_postcopy && pend_pre <= s->threshold_size &&
             qatomic_read(&s->start_postcopy)) {
@@ -3402,7 +3404,11 @@ static MigIterateState migration_iteration_run(MigrationState *s)
             return MIG_ITERATE_SKIP;
         }
         /* Just another iteration step */
-        qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
+        ret = qemu_savevm_state_iterate(s->to_dst_file, in_postcopy,
+                                        &local_err);
+        if (ret < 0) {
+            error_report_err(local_err);
+        }
     } else {
         trace_migration_thread_low_pending(pending_size);
         migration_completion(s);
diff --git a/migration/savevm.c b/migration/savevm.c
index 6a7b930b1c..23e4d5a1a2 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1220,8 +1220,9 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
  *   negative: there was one error, and we have -errno.
  *   0 : We haven't finished, caller have to go again
  *   1 : We have finished, we can go to complete phase
+ *  -1 : error reported, go to cleanup phase
  */
-int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
+int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, Error **errp)
 {
     SaveStateEntry *se;
     int ret = 1;
@@ -1261,11 +1262,13 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
         save_section_footer(f, se);
 
         if (ret < 0) {
-            error_report("failed to save SaveStateEntry with id(name): %d(%s)",
-                         se->section_id, se->idstr);
+            error_setg(errp,
+                       "failed to save SaveStateEntry with id(name): %d(%s)",
+                       se->section_id, se->idstr);
             qemu_file_set_error(f, ret);
+            return -1;
         }
-        if (ret <= 0) {
+        if (ret == 0) {
             /* Do not proceed to the next vmstate before this one reported
                completion of the current stage. This serializes the migration
                and reduces the probability that a faster changing state is
@@ -1517,7 +1520,6 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 {
     int ret;
     MigrationState *ms = migrate_get_current();
-    MigrationStatus status;
 
     if (migration_is_running(ms->state)) {
         error_setg(errp, QERR_MIGRATION_ACTIVE);
@@ -1538,34 +1540,43 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     qemu_savevm_state_setup(f);
     qemu_mutex_lock_iothread();
 
-    while (qemu_file_get_error(f) == 0) {
-        if (qemu_savevm_state_iterate(f, false) > 0) {
+    while (1) {
+        ret = qemu_savevm_state_iterate(f, false, errp);
+        if (ret < 0) {
+            goto fail;
+        }
+        if (ret > 0) {
             break;
         }
+        ret = qemu_file_get_error(f);
+        if (ret != 0) {
+            error_setg_errno(errp, -ret, "Error while writing VM state");
+            goto fail;
+        }
     }
 
+    qemu_savevm_state_complete_precopy(f, false, false);
     ret = qemu_file_get_error(f);
-    if (ret == 0) {
-        qemu_savevm_state_complete_precopy(f, false, false);
-        ret = qemu_file_get_error(f);
-    }
-    qemu_savevm_state_cleanup();
     if (ret != 0) {
         error_setg_errno(errp, -ret, "Error while writing VM state");
+        goto fail;
     }
 
-    if (ret != 0) {
-        status = MIGRATION_STATUS_FAILED;
-    } else {
-        status = MIGRATION_STATUS_COMPLETED;
-    }
-    migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
+    qemu_savevm_state_cleanup();
+    migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
+                      MIGRATION_STATUS_COMPLETED);
 
     /* f is outer parameter, it should not stay in global migration state after
      * this function finished */
     ms->to_dst_file = NULL;
 
     return ret;
+
+ fail:
+    qemu_savevm_state_cleanup();
+    migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP,
+                      MIGRATION_STATUS_FAILED);
+    return -1;
 }
 
 void qemu_savevm_live_state(QEMUFile *f)
diff --git a/migration/savevm.h b/migration/savevm.h
index 1cec83c729..e187640806 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -34,7 +34,7 @@ void qemu_savevm_state_setup(QEMUFile *f);
 bool qemu_savevm_state_guest_unplug_pending(void);
 int qemu_savevm_state_resume_prepare(MigrationState *s);
 void qemu_savevm_state_header(QEMUFile *f);
-int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
+int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, Error **errp);
 void qemu_savevm_state_cleanup(void);
 void qemu_savevm_state_complete_postcopy(QEMUFile *f);
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
-- 
2.29.2




* [PATCH 23/33] migration: simplify some error reporting in save_snapshot()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (21 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 22/33] migration: push Error **errp into qemu_savevm_state_iterate() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 24/33] migration: push Error **errp into qemu_savevm_state_setup() Daniel P. Berrangé
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

Rearrange the code to remove the need for a separate 'ret2' variable.
This duplicates the qemu_fclose() call, but the resulting control flow
is easier to follow.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
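Note: a minimal sketch of the resulting control flow, using stdio in
place of QEMUFile. The file is closed on the failure path and again on
the success path, so one 'ret' variable is enough. write_payload() and
save_payload() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical writer standing in for the state save call. */
static int write_payload(FILE *f, const char *data)
{
    return fputs(data, f) == EOF ? -1 : 0;
}

/* Takes ownership of f: it is closed on every path. */
static int save_payload(FILE *f, const char *data, long *size_out)
{
    int ret;

    assert(f != NULL && size_out != NULL);

    ret = write_payload(f, data);
    if (ret < 0) {
        fclose(f);              /* failure path: close and bail out */
        return -1;
    }

    *size_out = ftell(f);       /* must be read before the file is closed */
    ret = fclose(f);
    if (ret != 0) {             /* fclose() returns EOF on failure */
        return -1;
    }
    return 0;
}
```

The duplicated close call costs one line, but each branch now reads
top to bottom without a second status variable.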
 migration/savevm.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 23e4d5a1a2..fdf8b6edfb 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2788,7 +2788,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 {
     BlockDriverState *bs;
     QEMUSnapshotInfo sn1, *sn = &sn1;
-    int ret = -1, ret2;
+    int ret = -1;
     QEMUFile *f;
     int saved_vm_running;
     uint64_t vm_state_size;
@@ -2818,11 +2818,11 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
                 return false;
             }
         } else {
-            ret2 = bdrv_all_has_snapshot(name, has_devices, devices, errp);
-            if (ret2 < 0) {
+            ret = bdrv_all_has_snapshot(name, has_devices, devices, errp);
+            if (ret < 0) {
                 return false;
             }
-            if (ret2 == 1) {
+            if (ret == 1) {
                 error_setg(errp,
                            "Snapshot '%s' already exists in one or more devices",
                            name);
@@ -2874,13 +2874,14 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
     f = qemu_fopen_bdrv(bs, 1);
 
     ret = qemu_savevm_state(f, errp);
-    vm_state_size = qemu_ftell(f);
-    ret2 = qemu_fclose(f);
     if (ret < 0) {
+        qemu_fclose(f);
         goto the_end;
     }
-    if (ret2 < 0) {
-        ret = ret2;
+    vm_state_size = qemu_ftell(f);
+    ret = qemu_fclose(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to close vmstate file");
         goto the_end;
     }
 
-- 
2.29.2




* [PATCH 24/33] migration: push Error **errp into qemu_savevm_state_setup()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (22 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 23/33] migration: simplify some error reporting in save_snapshot() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:18 ` [PATCH 25/33] migration: push Error **errp into qemu_savevm_state_complete_precopy() Daniel P. Berrangé
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

In doing this the callers now actually honour the failures that can
be reported instead of carrying on as if everything was normal.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
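Note: a minimal sketch of the conversion described above, in which a
setup routine that used to return void now returns an int and fills in
errp, and the caller moves to a failed state instead of carrying on.
The Error struct, set_error(), run_setup(), start() and the state enum
are hypothetical stand-ins, not QEMU's real API.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object; not the real API. */
typedef struct Error {
    char msg[128];
} Error;

/* Store a formatted message in *errp; a NULL errp means "ignore errors". */
static void set_error(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;
    }
    assert(*errp == NULL);      /* never overwrite an error already set */
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Hypothetical per-device setup hook: 0 on success, -errno on failure. */
typedef int (*setup_fn)(void);

/* Stops at the first failing hook and reports it through errp. */
static int run_setup(const setup_fn *handlers, size_t n, Error **errp)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ret = handlers[i]();

        if (ret < 0) {
            set_error(errp, "Failed to setup handler %zu: %s",
                      i, strerror(-ret));
            return -1;
        }
    }
    return 0;
}

enum state { STATE_SETUP, STATE_ACTIVE, STATE_FAILED };

/* Caller that honours the failure rather than continuing regardless. */
static enum state start(const setup_fn *handlers, size_t n)
{
    Error *err = NULL;

    if (run_setup(handlers, n, &err) < 0) {
        fprintf(stderr, "%s\n", err->msg);
        free(err);
        return STATE_FAILED;
    }
    return STATE_ACTIVE;
}

static int setup_ok(void)
{
    return 0;
}

static int setup_fail(void)
{
    return -ENOMEM;
}
```

Only the outermost caller prints the message; the setup routine itself
just records it, which is what allows a QMP caller to receive it later.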
 migration/migration.c |  9 ++++++++-
 migration/savevm.c    | 18 ++++++++++++------
 migration/savevm.h    |  2 +-
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index a85d101ad8..e814d47796 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3522,6 +3522,7 @@ static void *migration_thread(void *opaque)
     int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     MigThrError thr_error;
     bool urgent = false;
+    Error *local_err = NULL;
 
     rcu_register_thread();
 
@@ -3556,7 +3557,12 @@ static void *migration_thread(void *opaque)
         qemu_savevm_send_colo_enable(s->to_dst_file);
     }
 
-    qemu_savevm_state_setup(s->to_dst_file);
+    if (qemu_savevm_state_setup(s->to_dst_file, &local_err) < 0) {
+        error_report_err(local_err);
+        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                          MIGRATION_STATUS_FAILED);
+        goto out;
+    }
 
     if (qemu_savevm_state_guest_unplug_pending()) {
         migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
@@ -3609,6 +3615,7 @@ static void *migration_thread(void *opaque)
 
     trace_migration_thread_after_loop();
     migration_iteration_finish(s);
+ out:
     object_unref(OBJECT(s));
     rcu_unregister_thread();
     return NULL;
diff --git a/migration/savevm.c b/migration/savevm.c
index fdf8b6edfb..318ba547bc 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1159,10 +1159,9 @@ bool qemu_savevm_state_guest_unplug_pending(void)
     return false;
 }
 
-void qemu_savevm_state_setup(QEMUFile *f)
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
 {
     SaveStateEntry *se;
-    Error *local_err = NULL;
     int ret;
 
     trace_savevm_state_setup();
@@ -1180,14 +1179,18 @@ void qemu_savevm_state_setup(QEMUFile *f)
         ret = se->ops->save_setup(f, se->opaque);
         save_section_footer(f, se);
         if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "Failed to setup device state handler");
             qemu_file_set_error(f, ret);
-            break;
+            return -1;
         }
     }
 
-    if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) {
-        error_report_err(local_err);
+    if (precopy_notify(PRECOPY_NOTIFY_SETUP, errp)) {
+        return -1;
     }
+
+    return 0;
 }
 
 int qemu_savevm_state_resume_prepare(MigrationState *s)
@@ -1537,8 +1540,11 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 
     qemu_mutex_unlock_iothread();
     qemu_savevm_state_header(f);
-    qemu_savevm_state_setup(f);
+    ret = qemu_savevm_state_setup(f, errp);
     qemu_mutex_lock_iothread();
+    if (ret < 0) {
+        goto fail;
+    }
 
     while (1) {
         ret = qemu_savevm_state_iterate(f, false, errp);
diff --git a/migration/savevm.h b/migration/savevm.h
index e187640806..b7133655f2 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -30,7 +30,7 @@
 #define QEMU_VM_SECTION_FOOTER       0x7e
 
 bool qemu_savevm_state_blocked(Error **errp);
-void qemu_savevm_state_setup(QEMUFile *f);
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
 bool qemu_savevm_state_guest_unplug_pending(void);
 int qemu_savevm_state_resume_prepare(MigrationState *s);
 void qemu_savevm_state_header(QEMUFile *f);
-- 
2.29.2




* [PATCH 25/33] migration: push Error **errp into qemu_savevm_state_complete_precopy()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (23 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 24/33] migration: push Error **errp into qemu_savevm_state_setup() Daniel P. Berrangé
@ 2021-02-04 17:18 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 26/33] migration: push Error **errp into qemu_savevm_state_complete_precopy_non_iterable() Daniel P. Berrangé
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 318ba547bc..3b46fbba32 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1329,7 +1329,8 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
 }
 
 static
-int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
+int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy,
+                                                Error **errp)
 {
     SaveStateEntry *se;
     int ret;
@@ -1355,6 +1356,8 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
         trace_savevm_section_end(se->idstr, se->section_id, ret);
         save_section_footer(f, se);
         if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "failed to complete precopy device state save");
             qemu_file_set_error(f, ret);
             return -1;
         }
@@ -1450,9 +1453,10 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
     cpu_synchronize_all_states();
 
     if (!in_postcopy || iterable_only) {
-        ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy);
-        if (ret) {
-            return ret;
+        if (qemu_savevm_state_complete_precopy_iterable(f, in_postcopy,
+                                                        &local_err) < 0) {
+            error_report_err(local_err);
+            return -1;
         }
     }
 
-- 
2.29.2




* [PATCH 26/33] migration: push Error **errp into qemu_savevm_state_complete_precopy_non_iterable()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (24 preceding siblings ...)
  2021-02-04 17:18 ` [PATCH 25/33] migration: push Error **errp into qemu_savevm_state_complete_precopy() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 27/33] migration: push Error **errp into qemu_savevm_state_complete_precopy() Daniel P. Berrangé
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/savevm.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 3b46fbba32..95e228a646 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1369,7 +1369,8 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy,
 static
 int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
                                                     bool in_postcopy,
-                                                    bool inactivate_disks)
+                                                    bool inactivate_disks,
+                                                    Error **errp)
 {
     g_autoptr(JSONWriter) vmdesc = NULL;
     int vmdesc_len;
@@ -1398,9 +1399,11 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
 
         save_section_header(f, se, QEMU_VM_SECTION_FULL);
         ret = vmstate_save(f, se, vmdesc);
-        if (ret) {
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "failed to save device state '%s'", se->idstr);
             qemu_file_set_error(f, ret);
-            return ret;
+            return -1;
         }
         trace_savevm_section_end(se->idstr, se->section_id, 0);
         save_section_footer(f, se);
@@ -1413,10 +1416,10 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
          * bdrv_invalidate_cache_all() on the other end won't fail. */
         ret = bdrv_inactivate_all();
         if (ret) {
-            error_report("%s: bdrv_inactivate_all() failed (%d)",
-                         __func__, ret);
+            error_setg_errno(errp, -ret,
+                             "failed to deactivate disks when completing precopy save");
             qemu_file_set_error(f, ret);
-            return ret;
+            return -1;
         }
     }
     if (!in_postcopy) {
@@ -1440,7 +1443,6 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
                                        bool inactivate_disks)
 {
-    int ret;
     Error *local_err = NULL;
     bool in_postcopy = migration_in_postcopy();
 
@@ -1464,10 +1466,11 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
         goto flush;
     }
 
-    ret = qemu_savevm_state_complete_precopy_non_iterable(f, in_postcopy,
-                                                          inactivate_disks);
-    if (ret) {
-        return ret;
+    if (qemu_savevm_state_complete_precopy_non_iterable(f, in_postcopy,
+                                                        inactivate_disks,
+                                                        &local_err) < 0) {
+        error_report_err(local_err);
+        return -1;
     }
 
 flush:
-- 
2.29.2




* [PATCH 27/33] migration: push Error **errp into qemu_savevm_state_complete_precopy()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (25 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 26/33] migration: push Error **errp into qemu_savevm_state_complete_precopy_non_iterable() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 28/33] migration: push Error **errp into qemu_savevm_send_packaged() Daniel P. Berrangé
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c | 14 +++++++++++---
 migration/savevm.c    | 18 +++++++++++-------
 migration/savevm.h    |  3 ++-
 3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index e814d47796..2ccb1b66b5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2845,7 +2845,11 @@ static int postcopy_start(MigrationState *ms)
      * Cause any non-postcopiable, but iterative devices to
      * send out their final data.
      */
-    qemu_savevm_state_complete_precopy(ms->to_dst_file, true, false);
+    if (qemu_savevm_state_complete_precopy(ms->to_dst_file, true, false,
+                                           &local_err) < 0) {
+        error_report_err(local_err);
+        goto fail;
+    }
 
     /*
      * in Finish migrate and with the io-lock held everything should
@@ -2898,7 +2902,10 @@ static int postcopy_start(MigrationState *ms)
      */
     qemu_savevm_send_postcopy_listen(fb);
 
-    qemu_savevm_state_complete_precopy(fb, false, false);
+    if (qemu_savevm_state_complete_precopy(fb, false, false, &local_err) < 0) {
+        error_report_err(local_err);
+        goto fail_closefb;
+    }
     if (migrate_postcopy_ram()) {
         qemu_savevm_send_ping(fb, 3);
     }
@@ -3049,7 +3056,8 @@ static void migration_completion(MigrationState *s)
             if (ret >= 0) {
                 qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
                 ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
-                                                         inactivate);
+                                                         inactivate,
+                                                         &local_err);
             }
             if (inactivate && ret >= 0) {
                 s->block_inactive = true;
diff --git a/migration/savevm.c b/migration/savevm.c
index 95e228a646..d6c36e6b6b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1441,7 +1441,8 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
 }
 
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
-                                       bool inactivate_disks)
+                                       bool inactivate_disks,
+                                       Error **errp)
 {
     Error *local_err = NULL;
     bool in_postcopy = migration_in_postcopy();
@@ -1456,8 +1457,7 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
 
     if (!in_postcopy || iterable_only) {
         if (qemu_savevm_state_complete_precopy_iterable(f, in_postcopy,
-                                                        &local_err) < 0) {
-            error_report_err(local_err);
+                                                        errp) < 0) {
             return -1;
         }
     }
@@ -1468,8 +1468,7 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
 
     if (qemu_savevm_state_complete_precopy_non_iterable(f, in_postcopy,
                                                         inactivate_disks,
-                                                        &local_err) < 0) {
-        error_report_err(local_err);
+                                                        errp) < 0) {
         return -1;
     }
 
@@ -1568,7 +1567,9 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
         }
     }
 
-    qemu_savevm_state_complete_precopy(f, false, false);
+    if (qemu_savevm_state_complete_precopy(f, false, false, errp) < 0) {
+        goto fail;
+    }
     ret = qemu_file_get_error(f);
     if (ret != 0) {
         error_setg_errno(errp, -ret, "Error while writing VM state");
@@ -1594,8 +1595,11 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 
 void qemu_savevm_live_state(QEMUFile *f)
 {
+    Error *local_err = NULL;
     /* save QEMU_VM_SECTION_END section */
-    qemu_savevm_state_complete_precopy(f, true, false);
+    if (qemu_savevm_state_complete_precopy(f, true, false, &local_err) < 0) {
+        error_report_err(local_err);
+    }
     qemu_put_byte(f, QEMU_VM_EOF);
 }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index b7133655f2..e3120a4fb0 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -38,7 +38,8 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, Error **errp);
 void qemu_savevm_state_cleanup(void);
 void qemu_savevm_state_complete_postcopy(QEMUFile *f);
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
-                                       bool inactivate_disks);
+                                       bool inactivate_disks,
+                                       Error **errp);
 void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
                                uint64_t *res_precopy_only,
                                uint64_t *res_compatible,
-- 
2.29.2




* [PATCH 28/33] migration: push Error **errp into qemu_savevm_send_packaged()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (26 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 27/33] migration: push Error **errp into qemu_savevm_state_complete_precopy() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 29/33] migration: push Error **errp into qemu_savevm_live_state() Daniel P. Berrangé
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
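Note: a minimal sketch of a size check whose message reports both the
offending length and the limit. A plain character buffer stands in for
the Error object, and DEMO_MAX_PACKAGED_SIZE is a hypothetical limit
chosen for illustration; the real MAX_VM_CMD_PACKAGED_SIZE is defined
elsewhere and is not shown in this patch. Whatever its definition, the
format specifier used for the limit has to match its actual type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limit of type size_t, so %zu is the matching specifier. */
#define DEMO_MAX_PACKAGED_SIZE ((size_t)1 << 24)

/* Returns 0 if len is acceptable, -1 with a message in msg otherwise. */
static int check_packaged_len(size_t len, char *msg, size_t msg_size)
{
    assert(msg != NULL && msg_size > 0);

    if (len > DEMO_MAX_PACKAGED_SIZE) {
        snprintf(msg, msg_size,
                 "unreasonably large packaged state: %zu > %zu",
                 len, DEMO_MAX_PACKAGED_SIZE);
        return -1;
    }
    msg[0] = '\0';              /* no error: leave an empty message */
    return 0;
}
```

Including the limit in the message lets whoever reads the error see
how far over the threshold the request was without consulting the source.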
 migration/migration.c | 4 +++-
 migration/savevm.c    | 9 +++++----
 migration/savevm.h    | 3 ++-
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 2ccb1b66b5..984276d066 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2927,7 +2927,9 @@ static int postcopy_start(MigrationState *ms)
     restart_block = false;
 
     /* Now send that blob */
-    if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage)) {
+    if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage,
+                                  &local_err)) {
+        error_report_err(local_err);
         goto fail_closefb;
     }
     qemu_fclose(fb);
diff --git a/migration/savevm.c b/migration/savevm.c
index d6c36e6b6b..deea8854db 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1002,15 +1002,16 @@ void qemu_savevm_send_open_return_path(QEMUFile *f)
  *
  * Returns:
  *    0 on success
- *    -ve on error
+ *    -1 on error
  */
-int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
+int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len,
+                              Error **errp)
 {
     uint32_t tmp;
 
     if (len > MAX_VM_CMD_PACKAGED_SIZE) {
-        error_report("%s: Unreasonably large packaged state: %zu",
-                     __func__, len);
+        error_setg(errp, "unreasonably large packaged state: %zu > %d",
+                   len, MAX_VM_CMD_PACKAGED_SIZE);
         return -1;
     }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index e3120a4fb0..2d46e848cd 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -46,7 +46,8 @@ void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
                                uint64_t *res_postcopy_only);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
-int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
+int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len,
+                              Error **errp);
 void qemu_savevm_send_postcopy_advise(QEMUFile *f);
 void qemu_savevm_send_postcopy_listen(QEMUFile *f);
 void qemu_savevm_send_postcopy_run(QEMUFile *f);
-- 
2.29.2




* [PATCH 29/33] migration: push Error **errp into qemu_savevm_live_state()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (27 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 28/33] migration: push Error **errp into qemu_savevm_send_packaged() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 30/33] migration: push Error **errp into qemu_save_device_state() Daniel P. Berrangé
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/colo.c   | 4 +++-
 migration/savevm.c | 8 ++++----
 migration/savevm.h | 2 +-
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4a050ac579..a76b72c984 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -470,7 +470,9 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
      * TODO: We may need a timeout mechanism to prevent COLO process
      * to be blocked here.
      */
-    qemu_savevm_live_state(s->to_dst_file);
+    if (qemu_savevm_live_state(s->to_dst_file, &local_err) < 0) {
+        goto out;
+    }
 
     qemu_fflush(fb);
 
diff --git a/migration/savevm.c b/migration/savevm.c
index deea8854db..884d12c6eb 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1594,14 +1594,14 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     return -1;
 }
 
-void qemu_savevm_live_state(QEMUFile *f)
+int qemu_savevm_live_state(QEMUFile *f, Error **errp)
 {
-    Error *local_err = NULL;
     /* save QEMU_VM_SECTION_END section */
-    if (qemu_savevm_state_complete_precopy(f, true, false, &local_err) < 0) {
-        error_report_err(local_err);
+    if (qemu_savevm_state_complete_precopy(f, true, false, errp) < 0) {
+        return -1;
     }
     qemu_put_byte(f, QEMU_VM_EOF);
+    return 0;
 }
 
 int qemu_save_device_state(QEMUFile *f)
diff --git a/migration/savevm.h b/migration/savevm.h
index 2d46e848cd..7abd75b668 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -59,7 +59,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *start_list,
                                            uint64_t *length_list);
 void qemu_savevm_send_colo_enable(QEMUFile *f);
-void qemu_savevm_live_state(QEMUFile *f);
+int qemu_savevm_live_state(QEMUFile *f, Error **errp);
 int qemu_save_device_state(QEMUFile *f);
 
 int qemu_loadvm_state(QEMUFile *f, Error **errp);
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 30/33] migration: push Error **errp into qemu_save_device_state()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (28 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 29/33] migration: push Error **errp into qemu_savevm_live_state() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 31/33] migration: push Error **errp into qemu_savevm_state_resume_prepare() Daniel P. Berrangé
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/colo.c   |  2 +-
 migration/savevm.c | 51 ++++++++++++++++++++++++++++------------------
 migration/savevm.h |  2 +-
 3 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index a76b72c984..fc824a9732 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -459,7 +459,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
     /* Note: device state is saved into buffer */
-    ret = qemu_save_device_state(fb);
+    ret = qemu_save_device_state(fb, &local_err);
 
     qemu_mutex_unlock_iothread();
     if (ret < 0) {
diff --git a/migration/savevm.c b/migration/savevm.c
index 884d12c6eb..994a7c7dab 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1604,9 +1604,10 @@ int qemu_savevm_live_state(QEMUFile *f, Error **errp)
     return 0;
 }
 
-int qemu_save_device_state(QEMUFile *f)
+int qemu_save_device_state(QEMUFile *f, Error **errp)
 {
     SaveStateEntry *se;
+    int ret;
 
     if (!migration_in_colo_state()) {
         qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
@@ -1615,7 +1616,6 @@ int qemu_save_device_state(QEMUFile *f)
     cpu_synchronize_all_states();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
-        int ret;
 
         if (se->is_ram) {
             continue;
@@ -1630,8 +1630,9 @@ int qemu_save_device_state(QEMUFile *f)
         save_section_header(f, se, QEMU_VM_SECTION_FULL);
 
         ret = vmstate_save(f, se, NULL);
-        if (ret) {
-            return ret;
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "failed to save device state");
+            return -1;
         }
 
         save_section_footer(f, se);
@@ -1639,7 +1640,12 @@ int qemu_save_device_state(QEMUFile *f)
 
     qemu_put_byte(f, QEMU_VM_EOF);
 
-    return qemu_file_get_error(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "I/O error saving device state");
+        return -1;
+    }
+    return 0;
 }
 
 static SaveStateEntry *find_se(const char *idstr, uint32_t instance_id)
@@ -2959,22 +2965,27 @@ void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
     qio_channel_set_name(QIO_CHANNEL(ioc), "migration-xen-save-state");
     f = qemu_fopen_channel_output(QIO_CHANNEL(ioc));
     object_unref(OBJECT(ioc));
-    ret = qemu_save_device_state(f);
-    if (ret < 0 || qemu_fclose(f) < 0) {
+    ret = qemu_save_device_state(f, errp);
+    if (ret < 0) {
+        goto the_end;
+    }
+
+    if (qemu_fclose(f) < 0) {
         error_setg(errp, QERR_IO_ERROR);
-    } else {
-        /* libxl calls the QMP command "stop" before calling
-         * "xen-save-devices-state" and in case of migration failure, libxl
-         * would call "cont".
-         * So call bdrv_inactivate_all (release locks) here to let the other
-         * side of the migration take control of the images.
-         */
-        if (live && !saved_vm_running) {
-            ret = bdrv_inactivate_all();
-            if (ret) {
-                error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
-                           __func__, ret);
-            }
+        goto the_end;
+    }
+
+    /* libxl calls the QMP command "stop" before calling
+     * "xen-save-devices-state" and in case of migration failure, libxl
+     * would call "cont".
+     * So call bdrv_inactivate_all (release locks) here to let the other
+     * side of the migration take control of the images.
+     */
+    if (live && !saved_vm_running) {
+        ret = bdrv_inactivate_all();
+        if (ret) {
+            error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
+                       __func__, ret);
         }
     }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index 7abd75b668..a91e097b51 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -60,7 +60,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *length_list);
 void qemu_savevm_send_colo_enable(QEMUFile *f);
 int qemu_savevm_live_state(QEMUFile *f, Error **errp);
-int qemu_save_device_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f, Error **errp);
 
 int qemu_loadvm_state(QEMUFile *f, Error **errp);
 void qemu_loadvm_state_cleanup(void);
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 31/33] migration: push Error **errp into qemu_savevm_state_resume_prepare()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (29 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 30/33] migration: push Error **errp into qemu_save_device_state() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 32/33] migration: push Error **errp into postcopy_resume_handshake() Daniel P. Berrangé
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate saving code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c | 9 ++++-----
 migration/savevm.c    | 5 +++--
 migration/savevm.h    | 2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 984276d066..3f0586842d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3165,16 +3165,15 @@ static int postcopy_resume_handshake(MigrationState *s)
 static int postcopy_do_resume(MigrationState *s)
 {
     int ret;
+    Error *local_err = NULL;
 
     /*
      * Call all the resume_prepare() hooks, so that modules can be
      * ready for the migration resume.
      */
-    ret = qemu_savevm_state_resume_prepare(s);
-    if (ret) {
-        error_report("%s: resume_prepare() failure detected: %d",
-                     __func__, ret);
-        return ret;
+    if (qemu_savevm_state_resume_prepare(s, &local_err) < 0) {
+        error_report_err(local_err);
+        return -1;
     }
 
     /*
diff --git a/migration/savevm.c b/migration/savevm.c
index 994a7c7dab..1d9790aa5b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1194,7 +1194,7 @@ int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
     return 0;
 }
 
-int qemu_savevm_state_resume_prepare(MigrationState *s)
+int qemu_savevm_state_resume_prepare(MigrationState *s, Error **errp)
 {
     SaveStateEntry *se;
     int ret;
@@ -1212,7 +1212,8 @@ int qemu_savevm_state_resume_prepare(MigrationState *s)
         }
         ret = se->ops->resume_prepare(s, se->opaque);
         if (ret < 0) {
-            return ret;
+            error_setg_errno(errp, -ret, "failed state resume prepare");
+            return -1;
         }
     }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index a91e097b51..b0c40e38a7 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -32,7 +32,7 @@
 bool qemu_savevm_state_blocked(Error **errp);
 int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
 bool qemu_savevm_state_guest_unplug_pending(void);
-int qemu_savevm_state_resume_prepare(MigrationState *s);
+int qemu_savevm_state_resume_prepare(MigrationState *s, Error **errp);
 void qemu_savevm_state_header(QEMUFile *f);
 int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy, Error **errp);
 void qemu_savevm_state_cleanup(void);
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 32/33] migration: push Error **errp into postcopy_resume_handshake()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (30 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 31/33] migration: push Error **errp into qemu_savevm_state_resume_prepare() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 17:19 ` [PATCH 33/33] migration: push Error **errp into postcopy_do_resume() Daniel P. Berrangé
  2021-02-04 18:22 ` [PATCH 00/33] migration: capture error reports into Error object Dr. David Alan Gilbert
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 3f0586842d..32a61b04bf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3146,7 +3146,7 @@ typedef enum MigThrError {
     MIG_THR_ERR_FATAL = 2,
 } MigThrError;
 
-static int postcopy_resume_handshake(MigrationState *s)
+static int postcopy_resume_handshake(MigrationState *s, Error **errp)
 {
     qemu_savevm_send_postcopy_resume(s->to_dst_file);
 
@@ -3158,13 +3158,14 @@ static int postcopy_resume_handshake(MigrationState *s)
         return 0;
     }
 
+    error_setg(errp, "postcopy resume handshake failed state %x != %x",
+               s->state, MIGRATION_STATUS_POSTCOPY_ACTIVE);
     return -1;
 }
 
 /* Return zero if success, or <0 for error */
 static int postcopy_do_resume(MigrationState *s)
 {
-    int ret;
     Error *local_err = NULL;
 
     /*
@@ -3180,10 +3181,9 @@ static int postcopy_do_resume(MigrationState *s)
      * Last handshake with destination on the resume (destination will
      * switch to postcopy-active afterwards)
      */
-    ret = postcopy_resume_handshake(s);
-    if (ret) {
-        error_report("%s: handshake failed: %d", __func__, ret);
-        return ret;
+    if (postcopy_resume_handshake(s, &local_err) < 0) {
+        error_report_err(local_err);
+        return -1;
     }
 
     return 0;
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH 33/33] migration: push Error **errp into postcopy_do_resume()
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (31 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 32/33] migration: push Error **errp into postcopy_resume_handshake() Daniel P. Berrangé
@ 2021-02-04 17:19 ` Daniel P. Berrangé
  2021-02-04 18:22 ` [PATCH 00/33] migration: capture error reports into Error object Dr. David Alan Gilbert
  33 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 17:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Daniel P. Berrangé,
	Dr. David Alan Gilbert, Hailiang Zhang

This is an incremental step in converting vmstate loading code to report
via Error objects instead of printing directly to the console/monitor.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 32a61b04bf..135a26349f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3163,17 +3163,14 @@ static int postcopy_resume_handshake(MigrationState *s, Error **errp)
     return -1;
 }
 
-/* Return zero if success, or <0 for error */
-static int postcopy_do_resume(MigrationState *s)
+/* Return zero if success, or -1 for error */
+static int postcopy_do_resume(MigrationState *s, Error **errp)
 {
-    Error *local_err = NULL;
-
     /*
      * Call all the resume_prepare() hooks, so that modules can be
      * ready for the migration resume.
      */
-    if (qemu_savevm_state_resume_prepare(s, &local_err) < 0) {
-        error_report_err(local_err);
+    if (qemu_savevm_state_resume_prepare(s, errp) < 0) {
         return -1;
     }
 
@@ -3181,8 +3178,7 @@ static int postcopy_do_resume(MigrationState *s)
      * Last handshake with destination on the resume (destination will
      * switch to postcopy-active afterwards)
      */
-    if (postcopy_resume_handshake(s, &local_err) < 0) {
-        error_report_err(local_err);
+    if (postcopy_resume_handshake(s, errp) < 0) {
         return -1;
     }
 
@@ -3196,6 +3192,7 @@ static int postcopy_do_resume(MigrationState *s)
  */
 static MigThrError postcopy_pause(MigrationState *s)
 {
+    Error *local_err = NULL;
     assert(s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 
     while (true) {
@@ -3235,7 +3232,7 @@ static MigThrError postcopy_pause(MigrationState *s)
             qemu_sem_post(&s->postcopy_pause_rp_sem);
 
             /* Do the resume logic */
-            if (postcopy_do_resume(s) == 0) {
+            if (postcopy_do_resume(s, &local_err) == 0) {
                 /* Let's continue! */
                 trace_postcopy_pause_continued();
                 return MIG_THR_ERR_RECOVERED;
@@ -3245,6 +3242,7 @@ static MigThrError postcopy_pause(MigrationState *s)
                  * pause again. Pause is always better than throwing
                  * data away.
                  */
+                error_report_err(local_err);
                 continue;
             }
         } else {
-- 
2.29.2



^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
                   ` (32 preceding siblings ...)
  2021-02-04 17:19 ` [PATCH 33/33] migration: push Error **errp into postcopy_do_resume() Daniel P. Berrangé
@ 2021-02-04 18:22 ` Dr. David Alan Gilbert
  2021-02-04 19:09   ` Daniel P. Berrangé
  33 siblings, 1 reply; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-04 18:22 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Juan Quintela, qemu-devel, Hailiang Zhang

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> Due to its long term heritage most of the migration code just invokes
> 'error_report' when problems hit. This was fine for HMP, since the
> messages get redirected from stderr, into the HMP console. It is not
> OK for QMP because the errors will not be fed back to the QMP client.
> 
> This wasn't a terrible real world problem with QMP so far because
> live migration happens in the background, so at least on the target side
> there is not a QMP command that needs to capture the incoming migration.
> It is a problem on the source side but it doesn't hit frequently as the
> source side has fewer failure scenarios. None the less on both sides it
> would be desirable if 'query-migrate' can report errors correctly.
> With the introduction of the load-snapshot QMP commands, the need for
> error reporting becomes more pressing.
> 
> Wiring up good error reporting is a large and difficult job, which
> this series does NOT complete. The focus here has been on converting
> all methods in savevm.c which have an 'int' return value capable of
> reporting errors. This covers most of the infrastructure for controlling
> the migration state serialization / protocol.
> 
> The remaining part that is missing error reporting are the callbacks in
> the VMStateDescription struct which can return failure codes, but have
> no "Error **errp" parameter. Thinking about how this might be dealt with
> in future, a big bang conversion is likely non-viable. We'll probably
> want to introduce a duplicate set of callbacks with the "Error **errp"
> parameter and convert impls in batches, eventually removing the
> original callbacks. I don't intend to do that myself in the immediate
> future.
> 
> IOW, this patch series probably solves 50% of the problem, but we
> still do need the rest to get ideal error reporting.
> 
> In doing this savevm conversion I noticed a bunch of places which
> see and then ignore errors. I only fixed one or two of them which
> were clearly dubious. Other places in savevm.c where it seemed it
> was probably ok to ignore errors, I've left using error_report()
> on the basis that those are really warnings. Perhaps they could
> be changed to warn_report() instead.
> 
> There are a lot of patches here, but I felt it was easier to review
> for correctness if I converted 1 function at a time. The series
> does not necessarily have to be reviewed/applied in 1 go.

After this series, what do my errors look like, and where do they end
up?
Do I get my nice backtrace showing that device failed, then that was
part of that one...

Dave

> Daniel P. Berrangé (33):
>   migration: push Error **errp into qemu_loadvm_state()
>   migration: push Error **errp into qemu_loadvm_state_header()
>   migration: push Error **errp into qemu_loadvm_state_setup()
>   migration: push Error **errp into qemu_load_device_state()
>   migration: push Error **errp into qemu_loadvm_state_main()
>   migration: push Error **errp into qemu_loadvm_section_start_full()
>   migration: push Error **errp into qemu_loadvm_section_part_end()
>   migration: push Error **errp into loadvm_process_command()
>   migration: push Error **errp into loadvm_handle_cmd_packaged()
>   migration: push Error **errp into loadvm_postcopy_handle_advise()
>   migration: push Error **errp into ram_postcopy_incoming_init()
>   migration: push Error **errp into loadvm_postcopy_handle_listen()
>   migration: push Error **errp into loadvm_postcopy_handle_run()
>   migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
>   migration: make loadvm_postcopy_handle_resume() void
>   migration: push Error **errp into loadvm_handle_recv_bitmap()
>   migration: push Error **errp into loadvm_process_enable_colo()
>   migration: push Error **errp into colo_init_ram_cache()
>   migration: push Error **errp into check_section_footer()
>   migration: push Error **errp into global_state_store()
>   migration: remove error reporting from qemu_fopen_bdrv() callers
>   migration: push Error **errp into qemu_savevm_state_iterate()
>   migration: simplify some error reporting in save_snapshot()
>   migration: push Error **errp into qemu_savevm_state_setup()
>   migration: push Error **errp into qemu_savevm_state_complete_precopy()
>   migration: push Error **errp into
>     qemu_savevm_state_complete_precopy_non_iterable()
>   migration: push Error **errp into qemu_savevm_state_complete_precopy()
>   migration: push Error **errp into qemu_savevm_send_packaged()
>   migration: push Error **errp into qemu_savevm_live_state()
>   migration: push Error **errp into qemu_save_device_state()
>   migration: push Error **errp into qemu_savevm_state_resume_prepare()
>   migration: push Error **errp into postcopy_resume_handshake()
>   migration: push Error **errp into postcopy_do_resume()
> 
>  include/migration/colo.h                      |   2 +-
>  include/migration/global_state.h              |   2 +-
>  migration/colo.c                              |  12 +-
>  migration/global_state.c                      |   6 +-
>  migration/migration.c                         |  80 ++-
>  migration/postcopy-ram.c                      |   8 +-
>  migration/postcopy-ram.h                      |   2 +-
>  migration/ram.c                               |  17 +-
>  migration/ram.h                               |   4 +-
>  migration/savevm.c                            | 594 ++++++++++--------
>  migration/savevm.h                            |  23 +-
>  .../tests/internal-snapshots-qapi.out         |   3 +-
>  12 files changed, 427 insertions(+), 326 deletions(-)
> 
> -- 
> 2.29.2
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-04 18:22 ` [PATCH 00/33] migration: capture error reports into Error object Dr. David Alan Gilbert
@ 2021-02-04 19:09   ` Daniel P. Berrangé
  2021-02-08 13:29     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-04 19:09 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > Due to its long term heritage most of the migration code just invokes
> > 'error_report' when problems hit. This was fine for HMP, since the
> > messages get redirected from stderr, into the HMP console. It is not
> > OK for QMP because the errors will not be fed back to the QMP client.
> > 
> > This wasn't a terrible real world problem with QMP so far because
> > live migration happens in the background, so at least on the target side
> > there is not a QMP command that needs to capture the incoming migration.
> > It is a problem on the source side but it doesn't hit frequently as the
> > source side has fewer failure scenarios. None the less on both sides it
> > would be desirable if 'query-migrate' can report errors correctly.
> > With the introduction of the load-snapshot QMP commands, the need for
> > error reporting becomes more pressing.
> > 
> > Wiring up good error reporting is a large and difficult job, which
> > this series does NOT complete. The focus here has been on converting
> > all methods in savevm.c which have an 'int' return value capable of
> > reporting errors. This covers most of the infrastructure for controlling
> > the migration state serialization / protocol.
> > 
> > The remaining part that is missing error reporting are the callbacks in
> > the VMStateDescription struct which can return failure codes, but have
> > no "Error **errp" parameter. Thinking about how this might be dealt with
> > in future, a big bang conversion is likely non-viable. We'll probably
> > want to introduce a duplicate set of callbacks with the "Error **errp"
> > parameter and convert impls in batches, eventually removing the
> > original callbacks. I don't intend to do that myself in the immediate
> > future.
> > 
> > IOW, this patch series probably solves 50% of the problem, but we
> > still do need the rest to get ideal error reporting.
> > 
> > In doing this savevm conversion I noticed a bunch of places which
> > see and then ignore errors. I only fixed one or two of them which
> > were clearly dubious. Other places in savevm.c where it seemed it
> > was probably ok to ignore errors, I've left using error_report()
> > on the basis that those are really warnings. Perhaps they could
> > be changed to warn_report() instead.
> > 
> > There are a lot of patches here, but I felt it was easier to review
> > for correctness if I converted 1 function at a time. The series
> > does not necessarily have to be reviewed/applied in 1 go.
> 
> After this series, what do my errors look like, and where do they end
> up?
> Do I get my nice backtrace showing that device failed, then that was
> part of that one...

It hasn't modified any of the VMStateDescription callbacks so any
of the per-device logic that was printing errors will still be using
error_report to the console as before.

The errors that have changed (at this stage) are only the higher
level ones that are in the generic part of the code. Where those
errors mentioned a device name/ID they still do.

In some of the parts I've modified there will have been multiple
error_reports collapsed into one error_setg() but the ones that
are eliminated are high level generic messages with no useful
info, so I don't think losing those is a problem per se.
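
As an illustration of that collapsing (a sketch only, not code from the
series): the Error type and the sketch_* helpers below are hypothetical
stand-ins, not the real qapi/error.h API, and the message text is
shortened. The low-level function sets the detailed error, the
intermediate caller passes errp straight through without adding a
generic message of its own, and whoever sits at the top reports once.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error: it only carries a message. */
typedef struct Error {
    char msg[256];
} Error;

/* Stand-in for error_setg(): store a formatted message if errp is set. */
static void sketch_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return; /* caller is not interested in the details */
    }
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Low-level step: records the detailed error instead of printing it. */
static int sketch_load_section(const char *idstr, Error **errp)
{
    sketch_error_setg(errp, "Unknown savevm section or instance '%s'",
                      idstr);
    return -1;
}

/* Generic caller: forwards errp and adds no "Error -22" style report. */
static int sketch_load_state(const char *idstr, Error **errp)
{
    if (sketch_load_section(idstr, errp) < 0) {
        return -1;
    }
    return 0;
}
```

A top-level caller would pass &err, check for -1, and hand err->msg to
either the QMP reply or the HMP console exactly once before freeing it.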

The example that I tested was the case where we load a snapshot
under a different config than we saved it with. This is the scenario
that gave the non-deterministic ordering in the iotest you disabled
from my previous series.

In that case, we changed from:

  qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
  {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}

To

  {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}

From an HMP loadvm POV, this means instead of seeing

  (hmp)  loadvm foo
  Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
  Error -22 while loading VM state

You will only see the detailed error message

  (hmp)  loadvm foo
  Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices

In this case I think losing the "Error -22 while loading VM state"
is fine, as it didn't add value IMHO.


If we get around to converting the VMStateDescription callbacks to
take an error object, then I think we'll possibly need to stack the
error message from the callback, with the higher level message.
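
A rough sketch of what such stacking could look like, modelled on the
idea behind error_prepend() but using hypothetical stand-ins (the Error
type, the sketch_* helpers, the post_load callback and both message
texts are invented for illustration and are not part of this series):
the callback sets the specific message and the generic layer puts its
context in front of it.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MSG_LEN 256

/* Hypothetical stand-in for QEMU's Error: it only carries a message. */
typedef struct Error {
    char msg[SKETCH_MSG_LEN];
} Error;

/* Stand-in for error_setg(): store a formatted message if errp is set. */
static void sketch_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;
    }
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Stand-in for prepending context to an error that is already set. */
static void sketch_error_prepend(Error **errp, const char *fmt, ...)
{
    char prefix[SKETCH_MSG_LEN];
    char combined[2 * SKETCH_MSG_LEN];
    va_list ap;

    if (!errp || !*errp) {
        return; /* nothing to add context to */
    }
    va_start(ap, fmt);
    vsnprintf(prefix, sizeof(prefix), fmt, ap);
    va_end(ap);
    snprintf(combined, sizeof(combined), "%s%s", prefix, (*errp)->msg);
    /* Copy back, truncating if the stacked message is too long. */
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", combined);
}

/* Invented per-device callback that reports a specific failure. */
static int sketch_device_post_load(Error **errp)
{
    sketch_error_setg(errp, "unsupported queue size %d", 4096);
    return -1;
}

/* Generic layer: stacks the device name on top of the callback's error. */
static int sketch_load_device(const char *idstr, Error **errp)
{
    if (sketch_device_post_load(errp) < 0) {
        sketch_error_prepend(errp, "Failed to load device '%s': ", idstr);
        return -1;
    }
    return 0;
}
```

With that shape the client would see a single message of the form
"Failed to load device '<id>': <callback detail>" instead of two
separate reports.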

Do you have any familiar/good examples of error message stacking I
can look at?  I should be able to say whether they would be impacted
by this series or not - if they are, then I hopefully only threw away
the fairly useless high level messages, like the "Error -22" message
above.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state()
  2021-02-04 17:18 ` [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state() Daniel P. Berrangé
@ 2021-02-04 21:57   ` Philippe Mathieu-Daudé
  2021-02-05  9:33     ` Daniel P. Berrangé
  0 siblings, 1 reply; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-04 21:57 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/migration.c |  4 ++--
>  migration/savevm.c    | 36 ++++++++++++++++++++----------------
>  migration/savevm.h    |  2 +-
>  3 files changed, 23 insertions(+), 19 deletions(-)
...

> diff --git a/migration/savevm.c b/migration/savevm.c
> index 6b320423c7..c8d93eee1e 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2638,40 +2638,49 @@ out:
>      return ret;
>  }
>  
> -int qemu_loadvm_state(QEMUFile *f)
> +int qemu_loadvm_state(QEMUFile *f, Error **errp)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
> -    Error *local_err = NULL;
>      int ret;
>  
> -    if (qemu_savevm_state_blocked(&local_err)) {
> -        error_report_err(local_err);
> -        return -EINVAL;
> +    if (qemu_savevm_state_blocked(errp)) {
> +        return -1;
>      }
>  
>      ret = qemu_loadvm_state_header(f);
>      if (ret) {
> -        return ret;
> +        error_setg(errp, "Error %d while loading VM state", ret);

Using error_setg_errno() instead (multiple occurrences):
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> +        return -1;
>      }
>  



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/33] migration: push Error **errp into qemu_loadvm_state_header()
  2021-02-04 17:18 ` [PATCH 02/33] migration: push Error **errp into qemu_loadvm_state_header() Daniel P. Berrangé
@ 2021-02-04 21:58   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-04 21:58 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
...

>      if (migrate_get_current()->send_configuration) {
> -        if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
> -            error_report("Configuration section missing");
> +        v = qemu_get_byte(f);
> +        if (v != QEMU_VM_CONFIGURATION) {
> +            error_setg(errp, "Configuration section missing, %x != %x",
> +                       v, QEMU_VM_CONFIGURATION);
>              qemu_loadvm_state_cleanup();
> -            return -EINVAL;
> +            return -1;
>          }
>          ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0);
>  
>          if (ret) {
> +            error_setg(errp, "Error %d while loading VM state", ret);

error_setg_errno(), otherwise:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>              qemu_loadvm_state_cleanup();
> -            return ret;
> +            return -1;
>          }
>      }
>      return 0;



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup()
  2021-02-04 17:18 ` [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup() Daniel P. Berrangé
@ 2021-02-04 21:59   ` Philippe Mathieu-Daudé
  2021-02-05  7:50   ` Markus Armbruster
  1 sibling, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-04 21:59 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 04/33] migration: push Error **errp into qemu_load_device_state()
  2021-02-04 17:18 ` [PATCH 04/33] migration: push Error **errp into qemu_load_device_state() Daniel P. Berrangé
@ 2021-02-04 22:01   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-04 22:01 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/colo.c   | 3 +--
>  migration/savevm.c | 4 ++--
>  migration/savevm.h | 2 +-
>  3 files changed, 4 insertions(+), 5 deletions(-)
...

> -int qemu_load_device_state(QEMUFile *f)
> +int qemu_load_device_state(QEMUFile *f, Error **errp)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      int ret;
> @@ -2734,7 +2734,7 @@ int qemu_load_device_state(QEMUFile *f)
>      /* Load QEMU_VM_SECTION_FULL section */
>      ret = qemu_loadvm_state_main(f, mis);
>      if (ret < 0) {
> -        error_report("Failed to load device state: %d", ret);
> +        error_setg(errp, "Failed to load device state: %d", ret);

error_setg_errno(), otherwise:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>          return ret;
>      }




* Re: [PATCH 06/33] migration: push Error **errp into qemu_loadvm_section_start_full()
  2021-02-04 17:18 ` [PATCH 06/33] migration: push Error **errp into qemu_loadvm_section_start_full() Daniel P. Berrangé
@ 2021-02-04 22:04   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-04 22:04 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> This is particularly useful for loading snapshots as this is a likely
> error scenario to hit when the source and dest VM configs do not
> match. This is illustrated by the improved error reporting in the
> QMP load snapshot test.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c                            | 49 +++++++++----------
>  .../tests/internal-snapshots-qapi.out         |  3 +-
>  2 files changed, 25 insertions(+), 27 deletions(-)
...

>      instance_id = qemu_get_be32(f);
>      version_id = qemu_get_be32(f);
>  
>      ret = qemu_file_get_error(f);
>      if (ret) {
> -        error_report("%s: Failed to read instance/version ID: %d",
> -                     __func__, ret);
> -        return ret;
> +        error_setg(errp, "Failed to read instance/version ID: %d",
> +                   ret);

error_setg_errno()

> +        return -1;
>      }
...

> @@ -2601,11 +2603,8 @@ retry:
>          switch (section_type) {
>          case QEMU_VM_SECTION_START:
>          case QEMU_VM_SECTION_FULL:
> -            ret = qemu_loadvm_section_start_full(f, mis);
> +            ret = qemu_loadvm_section_start_full(f, mis, errp);
>              if (ret < 0) {
> -                error_setg(errp,
> -                           "Failed to load device state section start: %d",
> -                           ret);

Ditto.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>                  goto out;
>              }
>              break;




* Re: [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup()
  2021-02-04 17:18 ` [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup() Daniel P. Berrangé
  2021-02-04 21:59   ` Philippe Mathieu-Daudé
@ 2021-02-05  7:50   ` Markus Armbruster
  1 sibling, 0 replies; 64+ messages in thread
From: Markus Armbruster @ 2021-02-05  7:50 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Hailiang Zhang, qemu-devel, Dr. David Alan Gilbert, Juan Quintela

Daniel P. Berrangé <berrange@redhat.com> writes:

> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 870199b629..f4ed14a230 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2490,7 +2490,7 @@ static int qemu_loadvm_state_header(QEMUFile *f, Error **errp)
>      return 0;
>  }
>  
> -static int qemu_loadvm_state_setup(QEMUFile *f)
> +static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp)
>  {
>      SaveStateEntry *se;
>      int ret;
> @@ -2509,7 +2509,7 @@ static int qemu_loadvm_state_setup(QEMUFile *f)
>          ret = se->ops->load_setup(f, se->opaque);
>          if (ret < 0) {
>              qemu_file_set_error(f, ret);
> -            error_report("Load state of device %s failed", se->idstr);
> +            error_setg(errp, "Load state of device %s failed", se->idstr);
>              return ret;
>          }
>      }
> @@ -2656,8 +2656,7 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
>          return -1;
>      }
>  
> -    if (qemu_loadvm_state_setup(f) != 0) {
> -        error_setg(errp, "Error %d while loading VM state", -EINVAL);
> +    if (qemu_loadvm_state_setup(f, errp) < 0) {
>          return -1;
>      }

Drive-by remark, *not* a demand: I don't like "0 on success, -1 on
failure".  When we return just one value on success and one on failure,
I prefer true and false.  Negative value on failure is of course fine
for returning error codes, and where we want to return arbitrary
non-negative values on success.
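
To make the two conventions concrete, here is a minimal, self-contained sketch. It does not use QEMU's real headers: `Error`, `toy_error_setg()`, `load_setup_int()` and `load_setup_bool()` are hypothetical stand-ins invented for illustration, and only the error message text is borrowed from the patch above.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    char msg[256];
} Error;

/* Simplified stand-in for error_setg(): stores a message if errp is given. */
static void toy_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return; /* caller does not want the details */
    }
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Convention used in the series: 0 on success, -1 on failure. */
static int load_setup_int(int device_ok, Error **errp)
{
    if (!device_ok) {
        toy_error_setg(errp, "Load state of device %s failed", "toy-dev");
        return -1;
    }
    return 0;
}

/* Convention preferred in the remark: true on success, false on failure. */
static bool load_setup_bool(int device_ok, Error **errp)
{
    if (!device_ok) {
        toy_error_setg(errp, "Load state of device %s failed", "toy-dev");
        return false;
    }
    return true;
}
```

With the bool variant a caller writes `if (!load_setup_bool(ok, &err))`, so there is no temptation to treat the return value as an errno; all detail lives in the Error object.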




* Re: [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state()
  2021-02-04 21:57   ` Philippe Mathieu-Daudé
@ 2021-02-05  9:33     ` Daniel P. Berrangé
  2021-02-05  9:35       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-05  9:33 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Juan Quintela, qemu-devel, Dr. David Alan Gilbert, Hailiang Zhang

On Thu, Feb 04, 2021 at 10:57:20PM +0100, Philippe Mathieu-Daudé wrote:
> On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> > This is an incremental step in converting vmstate loading code to report
> > via Error objects instead of printing directly to the console/monitor.
> > 
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> >  migration/migration.c |  4 ++--
> >  migration/savevm.c    | 36 ++++++++++++++++++++----------------
> >  migration/savevm.h    |  2 +-
> >  3 files changed, 23 insertions(+), 19 deletions(-)
> ...
> 
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 6b320423c7..c8d93eee1e 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2638,40 +2638,49 @@ out:
> >      return ret;
> >  }
> >  
> > -int qemu_loadvm_state(QEMUFile *f)
> > +int qemu_loadvm_state(QEMUFile *f, Error **errp)
> >  {
> >      MigrationIncomingState *mis = migration_incoming_get_current();
> > -    Error *local_err = NULL;
> >      int ret;
> >  
> > -    if (qemu_savevm_state_blocked(&local_err)) {
> > -        error_report_err(local_err);
> > -        return -EINVAL;
> > +    if (qemu_savevm_state_blocked(errp)) {
> > +        return -1;
> >      }
> >  
> >      ret = qemu_loadvm_state_header(f);
> >      if (ret) {
> > -        return ret;
> > +        error_setg(errp, "Error %d while loading VM state", ret);
> 
> Using error_setg_errno() instead (multiple occurrences):

I don't think we want to do that in general, because the code is
already not reliable at actually returning an errno value, sometimes
returning just "-1". At the end of this series it will almost always
be returning "-1", not an errno.  There are some places where an
errno is relevant though - specifically qemu_file_get_error calls.
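
A small sketch of why this matters, under the assumption that an errno-aware setter appends strerror() of the value it is given. The helpers `toy_format_errno()` and `toy_format_plain()` are hypothetical and only mimic that behaviour with plain libc calls:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: mimics an errno-aware setter by appending
 * strerror() of the given (positive) errno value to the message.
 */
static void toy_format_errno(char *buf, size_t len, int os_errno,
                             const char *msg)
{
    snprintf(buf, len, "%s: %s", msg, strerror(os_errno));
}

/* Hypothetical helper: reports the raw return value without interpreting it. */
static void toy_format_plain(char *buf, size_t len, int ret, const char *msg)
{
    snprintf(buf, len, "%s: %d", msg, ret);
}
```

If a function returned a bare -1 and the caller passed -ret (that is, 1) to the errno-aware variant, the message would carry whatever text errno 1 maps to on the host (EPERM on Linux), which has nothing to do with the real failure. The plain variant at least reports the value honestly.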

> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state()
  2021-02-05  9:33     ` Daniel P. Berrangé
@ 2021-02-05  9:35       ` Philippe Mathieu-Daudé
  2021-03-11 12:38         ` Daniel P. Berrangé
  0 siblings, 1 reply; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05  9:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Juan Quintela, QEMU Developers, Dr. David Alan Gilbert, Hailiang Zhang

On Fri, Feb 5, 2021 at 10:33 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Feb 04, 2021 at 10:57:20PM +0100, Philippe Mathieu-Daudé wrote:
> > On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> > > This is an incremental step in converting vmstate loading code to report
> > > via Error objects instead of printing directly to the console/monitor.
> > >
> > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > > ---
> > >  migration/migration.c |  4 ++--
> > >  migration/savevm.c    | 36 ++++++++++++++++++++----------------
> > >  migration/savevm.h    |  2 +-
> > >  3 files changed, 23 insertions(+), 19 deletions(-)
> > ...
> >
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index 6b320423c7..c8d93eee1e 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -2638,40 +2638,49 @@ out:
> > >      return ret;
> > >  }
> > >
> > > -int qemu_loadvm_state(QEMUFile *f)
> > > +int qemu_loadvm_state(QEMUFile *f, Error **errp)
> > >  {
> > >      MigrationIncomingState *mis = migration_incoming_get_current();
> > > -    Error *local_err = NULL;
> > >      int ret;
> > >
> > > -    if (qemu_savevm_state_blocked(&local_err)) {
> > > -        error_report_err(local_err);
> > > -        return -EINVAL;
> > > +    if (qemu_savevm_state_blocked(errp)) {
> > > +        return -1;
> > >      }
> > >
> > >      ret = qemu_loadvm_state_header(f);
> > >      if (ret) {
> > > -        return ret;
> > > +        error_setg(errp, "Error %d while loading VM state", ret);
> >
> > Using error_setg_errno() instead (multiple occurrences):
>
> I don't think we want to do that in general, because the code is
> already not reliable at actually returning an errno value, sometimes
> returning just "-1". At the end of this series it will almost always
> be returning "-1", not an errno.  There are some places where an
> errno is relevant though - specifically qemu_file_get_error calls.

Fair. Ignore my other identical comments on this series. R-b tag stands.

>
> > Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >
>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>




* Re: [PATCH 07/33] migration: push Error **errp into qemu_loadvm_section_part_end()
  2021-02-04 17:18 ` [PATCH 07/33] migration: push Error **errp into qemu_loadvm_section_part_end() Daniel P. Berrangé
@ 2021-02-05 16:16   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:16 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>




* Re: [PATCH 08/33] migration: push Error **errp into loadvm_process_command()
  2021-02-04 17:18 ` [PATCH 08/33] migration: push Error **errp into loadvm_process_command() Daniel P. Berrangé
@ 2021-02-05 16:18   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:18 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 87 ++++++++++++++++++++++++++++++++++------------
>  1 file changed, 64 insertions(+), 23 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 350d5a315a..450c36994f 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2223,34 +2223,37 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis)
>   * Process an incoming 'QEMU_VM_COMMAND'
>   * 0           just a normal return
>   * LOADVM_QUIT All good, but exit the loop
> - * <0          Error
> + * -1          Error
>   */
> -static int loadvm_process_command(QEMUFile *f)
> +static int loadvm_process_command(QEMUFile *f, Error **errp)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      uint16_t cmd;
>      uint16_t len;
>      uint32_t tmp32;
> +    int ret;
>  
>      cmd = qemu_get_be16(f);
>      len = qemu_get_be16(f);
>  
>      /* Check validity before continue processing of cmds */
>      if (qemu_file_get_error(f)) {

Maybe assign the result to 'ret' and use it here

> -        return qemu_file_get_error(f);
> +        error_setg(errp, "device state stream has error: %d",
> +                   qemu_file_get_error(f));

and here.

> +        return -1;
>      }
>  
>      trace_loadvm_process_command(cmd, len);
>      if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
> -        error_report("MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
> -        return -EINVAL;
> +        error_setg(errp, "MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
> +        return -1;

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>      }
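
A rough sketch of the "query once, reuse 'ret'" suggestion made above. It is self-contained and does not use the real QEMUFile API: `ToyFile`, `toy_file_get_error()` and `toy_process_command()` are hypothetical stand-ins, and the message is written into a caller-supplied buffer rather than an Error object.

```c
#include <stdio.h>

/* Hypothetical stand-in for QEMUFile; only tracks a sticky error code. */
typedef struct ToyFile {
    int last_error; /* 0, or a negative errno-style value */
    int queries;    /* how many times the error was queried */
} ToyFile;

/* Stand-in for the stream error getter. */
static int toy_file_get_error(ToyFile *f)
{
    f->queries++;
    return f->last_error;
}

/* Query the stream error once, then reuse the value for test and message. */
static int toy_process_command(ToyFile *f, char *errbuf, size_t len)
{
    int ret = toy_file_get_error(f);

    if (ret) {
        snprintf(errbuf, len, "device state stream has error: %d", ret);
        return -1;
    }
    return 0;
}
```

Storing the value means the check and the message are guaranteed to refer to the same error, and the getter is called only once.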




* Re: [PATCH 10/33] migration: push Error **errp into loadvm_postcopy_handle_advise()
  2021-02-04 17:18 ` [PATCH 10/33] migration: push Error **errp into loadvm_postcopy_handle_advise() Daniel P. Berrangé
@ 2021-02-05 16:21   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:21 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 43 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 22 deletions(-)
...

>  
>      if (ram_postcopy_incoming_init(mis)) {
> +        error_setg(errp, "Postcopy RAM incoming init failed");

We gain error precision, OK.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>          return -1;
>      }




* Re: [PATCH 12/33] migration: push Error **errp into loadvm_postcopy_handle_listen()
  2021-02-04 17:18 ` [PATCH 12/33] migration: push Error **errp into loadvm_postcopy_handle_listen() Daniel P. Berrangé
@ 2021-02-05 16:23   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:23 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index c505526406..447596383f 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1909,14 +1909,15 @@ static void *postcopy_ram_listen_thread(void *opaque)
>  }
...

> @@ -1937,12 +1938,12 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>      if (migrate_postcopy_ram()) {
>          if (postcopy_ram_incoming_setup(mis)) {
>              postcopy_ram_incoming_cleanup(mis);
> +            error_setg(errp, "Failed to setup incoming postcoyp RAM blocks");

New error, OK.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>              return -1;
>          }
>      }




* Re: [PATCH 13/33] migration: push Error **errp into loadvm_postcopy_handle_run()
  2021-02-04 17:18 ` [PATCH 13/33] migration: push Error **errp into loadvm_postcopy_handle_run() Daniel P. Berrangé
@ 2021-02-05 16:23   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:23 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>




* Re: [PATCH 14/33] migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
  2021-02-04 17:18 ` [PATCH 14/33] migration: push Error **errp into loadvm_postcopy_ram_handle_discard() Daniel P. Berrangé
@ 2021-02-05 16:24   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:24 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 31 +++++++++++++++----------------
>  1 file changed, 15 insertions(+), 16 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index fa7883ae5e..2216c61c6f 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1735,7 +1735,8 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
>   * There can be 0..many of these messages, each encoding multiple pages.
>   */
>  static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
> -                                              uint16_t len)
> +                                              uint16_t len,
> +                                              Error **errp)
>  {
>      int tmp;
>      char ramid[256];
> @@ -1748,7 +1749,8 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
>          /* 1st discard */
>          tmp = postcopy_ram_prepare_discard(mis);
>          if (tmp) {
> -            return tmp;
> +            error_setg(errp, "Failed to prepare for RAM discard: %d", tmp);

New error, OK.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> +            return -1;
>          }




* Re: [PATCH 19/33] migration: push Error **errp into check_section_footer()
  2021-02-04 17:18 ` [PATCH 19/33] migration: push Error **errp into check_section_footer() Daniel P. Berrangé
@ 2021-02-05 16:26   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-02-05 16:26 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Hailiang Zhang, Dr. David Alan Gilbert, Juan Quintela

On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/savevm.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-04 19:09   ` Daniel P. Berrangé
@ 2021-02-08 13:29     ` Dr. David Alan Gilbert
  2021-02-08 13:42       ` Daniel P. Berrangé
  0 siblings, 1 reply; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-08 13:29 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > Due to its long term heritage most of the migration code just invokes
> > > 'error_report' when problems hit. This was fine for HMP, since the
> > > messages get redirected from stderr, into the HMP console. It is not
> > > OK for QMP because the errors will not be fed back to the QMP client.
> > > 
> > > This wasn't a terrible real world problem with QMP so far because
> > > live migration happens in the background, so at least on the target side
> > > there is not a QMP command that needs to capture the incoming migration.
> > > It is a problem on the source side but it doesn't hit frequently as the
> > > source side has fewer failure scenarios. None the less on both sides it
> > > would be desirable if 'query-migrate' can report errors correctly.
> > > With the introduction of the load-snapshot QMP commands, the need for
> > > error reporting becomes more pressing.
> > > 
> > > Wiring up good error reporting is a large and difficult job, which
> > > this series does NOT complete. The focus here has been on converting
> > > all methods in savevm.c which have an 'int' return value capable of
> > > reporting errors. This covers most of the infrastructure for controlling
> > > the migration state serialization / protocol.
> > > 
> > > The remaining part that is missing error reporting are the callbacks in
> > > the VMStateDescription struct which can return failure codes, but have
> > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > in future, a big bang conversion is likely non-viable. We'll probably
> > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > parameter and convert impls in batches, eventually removing the
> > > original callbacks. I don't intend to do that myself in the immediate
> > > future.
> > > 
> > > IOW, this patch series probably solves 50% of the problem, but we
> > > still do need the rest to get ideal error reporting.
> > > 
> > > In doing this savevm conversion I noticed a bunch of places which
> > > see and then ignore errors. I only fixed one or two of them which
> > > were clearly dubious. Other places in savevm.c where it seemed it
> > > was probably ok to ignore errors, I've left using error_report()
> > > on the basis that those are really warnings. Perhaps they could
> > > be changed to warn_report() instead.
> > > 
> > > There are a lot of patches here, but I felt it was easier to review
> > > for correctness if I converted 1 function at a time. The series
> > > does not necessarily have to be reviewed/applied in 1 go.
> > 
> > After this series, what do my errors look like, and where do they end
> > up?
> > Do I get my nice backtrace showing that device failed, then that was
> > part of that one...
> 
> It hasn't modified any of the VMStateDescription callbacks so any
> of the per-device logic that was printing errors will still be using
> error_report to the console as before.
> 
> The errors that have changed (at this stage) are only the higher
> level ones that are in the generic part of the code. Where those
> errors mentioned a device name/ID they still do.
> 
> In some of the parts I've modified there will have been multiple
> error_reports collapsed into one error_setg() but the ones that
> are eliminated are high level generic messages with no useful
> info, so I don't think losing those is a problem per-se.
> 
> The example that I tested was the case where we load a snapshot
> under a different config than we saved it with. This is the scenario
> that gave the non-deterministic ordering in the iotest you disabled
> from my previous series.
> 
> In that case, we changed from:
> 
>   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
>   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> 
> To
> 
>   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> 
> From a HMP loadvm POV, this means instead of seeing
> 
>   (hmp)  loadvm foo
>   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
>   Error -22 while loading VM state
> 
> You will only see the detailed error message
> 
>   (hmp)  loadvm foo
>   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> 
> In this case I think losing the "Error -22 while loading VM state"
> is fine, as it didn't add value IMHO.
> 
> 
> If we get around to converting the VMStateDescription callbacks to
> take an error object, then I think we'll possibly need to stack the
> error message from the callback, with the higher level message.
> 
> Do you have any familiar/good examples of error message stacking I
> can look at ?  I should be able to say whether they would be impacted
> by this series or not - if they are, then I hopefully only threw away
> the fairly useless high level messages, like the "Error -22" message
> above.

Can you try migrating:
  ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
to
  ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng

what I currently get is:
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load virtio-rng:virtio
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
qemu-system-x86_64: load of migration failed: Invalid argument

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-08 13:29     ` Dr. David Alan Gilbert
@ 2021-02-08 13:42       ` Daniel P. Berrangé
  2021-02-08 14:29         ` Dr. David Alan Gilbert
  2021-02-15 18:38         ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-08 13:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

On Mon, Feb 08, 2021 at 01:29:03PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > Due to its long term heritage most of the migration code just invokes
> > > > 'error_report' when problems hit. This was fine for HMP, since the
> > > > messages get redirected from stderr, into the HMP console. It is not
> > > > OK for QMP because the errors will not be fed back to the QMP client.
> > > > 
> > > > This wasn't a terrible real world problem with QMP so far because
> > > > live migration happens in the background, so at least on the target side
> > > > there is not a QMP command that needs to capture the incoming migration.
> > > > It is a problem on the source side but it doesn't hit frequently as the
> > > > source side has fewer failure scenarios. None the less on both sides it
> > > > would be desirable if 'query-migrate' can report errors correctly.
> > > > With the introduction of the load-snapshot QMP commands, the need for
> > > > error reporting becomes more pressing.
> > > > 
> > > > Wiring up good error reporting is a large and difficult job, which
> > > > this series does NOT complete. The focus here has been on converting
> > > > all methods in savevm.c which have an 'int' return value capable of
> > > > reporting errors. This covers most of the infrastructure for controlling
> > > > the migration state serialization / protocol.
> > > > 
> > > > The remaining part that is missing error reporting are the callbacks in
> > > > the VMStateDescription struct which can return failure codes, but have
> > > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > > in future, a big bang conversion is likely non-viable. We'll probably
> > > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > > parameter and convert impls in batches, eventually removing the
> > > > original callbacks. I don't intend to do that myself in the immediate
> > > > future.
> > > > 
> > > > IOW, this patch series probably solves 50% of the problem, but we
> > > > still do need the rest to get ideal error reporting.
> > > > 
> > > > In doing this savevm conversion I noticed a bunch of places which
> > > > see and then ignore errors. I only fixed one or two of them which
> > > > were clearly dubious. Other places in savevm.c where it seemed it
> > > > was probably ok to ignore errors, I've left using error_report()
> > > > on the basis that those are really warnings. Perhaps they could
> > > > be changed to warn_report() instead.
> > > > 
> > > > There are a lot of patches here, but I felt it was easier to review
> > > > for correctness if I converted 1 function at a time. The series
> > > > does not necessarily have to be reviewed/applied in 1 go.
> > > 
> > > After this series, what do my errors look like, and where do they end
> > > up?
> > > Do I get my nice backtrace showing that device failed, then that was
> > > part of that one...
> > 
> > It hasn't modified any of the VMStateDescription callbacks so any
> > of the per-device logic that was printing errors will still be using
> > error_report to the console as before.
> > 
> > The errors that have changed (at this stage) are only the higher
> > level ones that are in the generic part of the code. Where those
> > errors mentioned a device name/ID they still do.
> > 
> > In some of the parts I've modified there will have been multiple
> > error_reports collapsed into one error_setg() but the ones that
> > are eliminated are high level generic messages with no useful
> > info, so I don't think losing those is a problem per se.
> > 
> > The example that I tested was the case where we load a snapshot
> > under a different config than we saved it with. This is the scenario
> > that gave the non-deterministic ordering in the iotest you disabled
> > from my previous series.
> > 
> > In that case, we changed from:
> > 
> >   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> > 
> > To
> > 
> >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> > 
> > From a HMP loadvm POV, this means instead of seeing
> > 
> >   (hmp)  loadvm foo
> >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> >   Error -22 while loading VM state
> > 
> > You will only see the detailed error message
> > 
> >   (hmp)  loadvm foo
> >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > 
> > In this case I think losing the "Error -22 while loading VM state"
> > is fine, as it didn't add value IMHO.
> > 
> > 
> > If we get around to converting the VMStateDescription callbacks to
> > take an error object, then I think we'll possibly need to stack the
> > error message from the callback, with the higher level message.
> > 
> > Do you have any familiar/good examples of error message stacking I
> > can look at ?  I should be able to say whether they would be impacted
> > by this series or not - if they are, then I hopefully only threw away
> > the fairly useless high level messages, like the "Error -22" message
> > above.
> 
> Can you try migrating:
>   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
> to
>   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng
> 
> what I currently get is:
> qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> qemu-system-x86_64: Failed to load PCIDevice:config
> qemu-system-x86_64: Failed to load virtio-rng:virtio
> qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> qemu-system-x86_64: load of migration failed: Invalid argument

After my patches the very last line is gone.

So the first 3 are still reported using error_report():

 qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
 qemu-system-x86_64: Failed to load PCIDevice:config
 qemu-system-x86_64: Failed to load virtio-rng:virtio

Then reported in process_incoming_migration_co() using the message
populated in the Error object, using error_report_err():

 qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'

Finally, this is no longer reported:

 qemu-system-x86_64: load of migration failed: Invalid argument

So in this case we've not lost any useful information.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-08 13:42       ` Daniel P. Berrangé
@ 2021-02-08 14:29         ` Dr. David Alan Gilbert
  2021-02-08 14:36           ` Daniel P. Berrangé
  2021-02-15 18:38         ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-08 14:29 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Feb 08, 2021 at 01:29:03PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > Due to its long term heritage most of the migration code just invokes
> > > > > 'error_report' when problems hit. This was fine for HMP, since the
> > > > > messages get redirected from stderr, into the HMP console. It is not
> > > > > OK for QMP because the errors will not be fed back to the QMP client.
> > > > > 
> > > > > This wasn't a terrible real world problem with QMP so far because
> > > > > live migration happens in the background, so at least on the target side
> > > > > there is not a QMP command that needs to capture the incoming migration.
> > > > > It is a problem on the source side but it doesn't hit frequently as the
> > > > > source side has fewer failure scenarios. None the less on both sides it
> > > > > would be desirable if 'query-migrate' can report errors correctly.
> > > > > With the introduction of the load-snapshot QMP commands, the need for
> > > > > error reporting becomes more pressing.
> > > > > 
> > > > > Wiring up good error reporting is a large and difficult job, which
> > > > > this series does NOT complete. The focus here has been on converting
> > > > > all methods in savevm.c which have an 'int' return value capable of
> > > > > reporting errors. This covers most of the infrastructure for controlling
> > > > > the migration state serialization / protocol.
> > > > > 
> > > > > The remaining part that is missing error reporting are the callbacks in
> > > > > the VMStateDescription struct which can return failure codes, but have
> > > > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > > > in future, a big bang conversion is likely non-viable. We'll probably
> > > > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > > > parameter and convert impls in batches, eventually removing the
> > > > > original callbacks. I don't intend to do that myself in the immediate
> > > > > future.
> > > > > 
> > > > > IOW, this patch series probably solves 50% of the problem, but we
> > > > > still do need the rest to get ideal error reporting.
> > > > > 
> > > > > In doing this savevm conversion I noticed a bunch of places which
> > > > > see and then ignore errors. I only fixed one or two of them which
> > > > > were clearly dubious. Other places in savevm.c where it seemed it
> > > > > was probably ok to ignore errors, I've left using error_report()
> > > > > on the basis that those are really warnings. Perhaps they could
> > > > > be changed to warn_report() instead.
> > > > > 
> > > > > There are a lot of patches here, but I felt it was easier to review
> > > > > for correctness if I converted 1 function at a time. The series
> > > > > does not necessarily have to be reviewed/applied in 1 go.
> > > > 
> > > > After this series, what do my errors look like, and where do they end
> > > > up?
> > > > Do I get my nice backtrace showing that device failed, then that was
> > > > part of that one...
> > > 
> > > It hasn't modified any of the VMStateDescription callbacks so any
> > > of the per-device logic that was printing errors will still be using
> > > error_report to the console as before.
> > > 
> > > The errors that have changed (at this stage) are only the higher
> > > level ones that are in the generic part of the code. Where those
> > > errors mentioned a device name/ID they still do.
> > > 
> > > In some of the parts I've modified there will have been multiple
> > > error_reports collapsed into one error_setg() but the ones that
> > > are eliminated are high level generic messages with no useful
> > > info, so I don't think losing those is a problem per se.
> > > 
> > > The example that I tested was the case where we load a snapshot
> > > under a different config than we saved it with. This is the scenario
> > > that gave the non-deterministic ordering in the iotest you disabled
> > > from my previous series.
> > > 
> > > In that case, we changed from:
> > > 
> > >   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> > > 
> > > To
> > > 
> > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> > > 
> > > From a HMP loadvm POV, this means instead of seeing
> > > 
> > >   (hmp)  loadvm foo
> > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > >   Error -22 while loading VM state
> > > 
> > > You will only see the detailed error message
> > > 
> > >   (hmp)  loadvm foo
> > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > 
> > > In this case I think losing the "Error -22 while loading VM state"
> > > is fine, as it didn't add value IMHO.
> > > 
> > > 
> > > If we get around to converting the VMStateDescription callbacks to
> > > take an error object, then I think we'll possibly need to stack the
> > > error message from the callback, with the higher level message.
> > > 
> > > Do you have any familiar/good examples of error message stacking I
> > > can look at ?  I should be able to say whether they would be impacted
> > > by this series or not - if they are, then I hopefully only threw away
> > > the fairly useless high level messages, like the "Error -22" message
> > > above.
> > 
> > Can you try migrating:
> >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
> > to
> >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng
> > 
> > what I currently get is:
> > qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> > qemu-system-x86_64: Failed to load PCIDevice:config
> > qemu-system-x86_64: Failed to load virtio-rng:virtio
> > qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > qemu-system-x86_64: load of migration failed: Invalid argument
> 
> After my patches the very last line is gone.
> 
> So the first 3 are still reported using error_report():
> 
>  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
>  qemu-system-x86_64: Failed to load PCIDevice:config
>  qemu-system-x86_64: Failed to load virtio-rng:virtio

So those are still ending up in the stderr/log?

> Then reported in process_incoming_migration_co() using the message
> populated in the Error object, using error_report_err():
> 
>  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'

Does that mean we've not got that error associated with the others?  It
could be a pain where we've got multiple devices (e.g. NICs or storage)
and need to realise which one is failing.

> Finally, this is no longer reported:
> 
>  qemu-system-x86_64: load of migration failed: Invalid argument
> 
> So in this case we've not lost any useful information.

You occasionally get errors other than Invalid argument; in
particular you can get EIO, which can help you determine whether the
source killed the migration connection first.

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-08 14:29         ` Dr. David Alan Gilbert
@ 2021-02-08 14:36           ` Daniel P. Berrangé
  0 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-08 14:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

On Mon, Feb 08, 2021 at 02:29:41PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Mon, Feb 08, 2021 at 01:29:03PM +0000, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > > Due to its long term heritage most of the migration code just invokes
> > > > > > 'error_report' when problems hit. This was fine for HMP, since the
> > > > > > messages get redirected from stderr, into the HMP console. It is not
> > > > > > OK for QMP because the errors will not be fed back to the QMP client.
> > > > > > 
> > > > > > This wasn't a terrible real world problem with QMP so far because
> > > > > > live migration happens in the background, so at least on the target side
> > > > > > there is not a QMP command that needs to capture the incoming migration.
> > > > > > It is a problem on the source side but it doesn't hit frequently as the
> > > > > > source side has fewer failure scenarios. None the less on both sides it
> > > > > > would be desirable if 'query-migrate' can report errors correctly.
> > > > > > With the introduction of the load-snapshot QMP commands, the need for
> > > > > > error reporting becomes more pressing.
> > > > > > 
> > > > > > Wiring up good error reporting is a large and difficult job, which
> > > > > > this series does NOT complete. The focus here has been on converting
> > > > > > all methods in savevm.c which have an 'int' return value capable of
> > > > > > reporting errors. This covers most of the infrastructure for controlling
> > > > > > the migration state serialization / protocol.
> > > > > > 
> > > > > > The remaining part that is missing error reporting are the callbacks in
> > > > > > the VMStateDescription struct which can return failure codes, but have
> > > > > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > > > > in future, a big bang conversion is likely non-viable. We'll probably
> > > > > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > > > > parameter and convert impls in batches, eventually removing the
> > > > > > original callbacks. I don't intend to do that myself in the immediate
> > > > > > future.
> > > > > > 
> > > > > > IOW, this patch series probably solves 50% of the problem, but we
> > > > > > still do need the rest to get ideal error reporting.
> > > > > > 
> > > > > > In doing this savevm conversion I noticed a bunch of places which
> > > > > > see and then ignore errors. I only fixed one or two of them which
> > > > > > were clearly dubious. Other places in savevm.c where it seemed it
> > > > > > was probably ok to ignore errors, I've left using error_report()
> > > > > > on the basis that those are really warnings. Perhaps they could
> > > > > > be changed to warn_report() instead.
> > > > > > 
> > > > > > There are a lot of patches here, but I felt it was easier to review
> > > > > > for correctness if I converted 1 function at a time. The series
> > > > > > does not necessarily have to be reviewed/applied in 1 go.
> > > > > 
> > > > > After this series, what do my errors look like, and where do they end
> > > > > up?
> > > > > Do I get my nice backtrace showing that device failed, then that was
> > > > > part of that one...
> > > > 
> > > > It hasn't modified any of the VMStateDescription callbacks so any
> > > > of the per-device logic that was printing errors will still be using
> > > > error_report to the console as before.
> > > > 
> > > > The errors that have changed (at this stage) are only the higher
> > > > level ones that are in the generic part of the code. Where those
> > > > errors mentioned a device name/ID they still do.
> > > > 
> > > > In some of the parts I've modified there will have been multiple
> > > > error_reports collapsed into one error_setg() but the ones that
> > > > are eliminated are high level generic messages with no useful
> > > > info, so I don't think losing those is a problem per se.
> > > > 
> > > > The example that I tested was the case where we load a snapshot
> > > > under a different config than we saved it with. This is the scenario
> > > > that gave the non-deterministic ordering in the iotest you disabled
> > > > from my previous series.
> > > > 
> > > > In that case, we changed from:
> > > > 
> > > >   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> > > > 
> > > > To
> > > > 
> > > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> > > > 
> > > > From a HMP loadvm POV, this means instead of seeing
> > > > 
> > > >   (hmp)  loadvm foo
> > > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > >   Error -22 while loading VM state
> > > > 
> > > > You will only see the detailed error message
> > > > 
> > > >   (hmp)  loadvm foo
> > > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > > 
> > > > In this case I think losing the "Error -22 while loading VM state"
> > > > is fine, as it didn't add value IMHO.
> > > > 
> > > > 
> > > > If we get around to converting the VMStateDescription callbacks to
> > > > take an error object, then I think we'll possibly need to stack the
> > > > error message from the callback, with the higher level message.
> > > > 
> > > > Do you have any familiar/good examples of error message stacking I
> > > > can look at ?  I should be able to say whether they would be impacted
> > > > by this series or not - if they are, then I hopefully only threw away
> > > > the fairly useless high level messages, like the "Error -22" message
> > > > above.
> > > 
> > > Can you try migrating:
> > >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
> > > to
> > >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng
> > > 
> > > what I currently get is:
> > > qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> > > qemu-system-x86_64: Failed to load PCIDevice:config
> > > qemu-system-x86_64: Failed to load virtio-rng:virtio
> > > qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > > qemu-system-x86_64: load of migration failed: Invalid argument
> > 
> > After my patches the very last line is gone.
> > 
> > So the first 3 are still reported using error_report():
> > 
> >  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> >  qemu-system-x86_64: Failed to load PCIDevice:config
> >  qemu-system-x86_64: Failed to load virtio-rng:virtio
> 
> So those are still ending up in the stderr/log?

yes.

> > Then reported in process_incoming_migration_co() using the message
> > populated in the Error object, using error_report_err():
> > 
> >  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> 
> Does that mean we've not got that error associated with the others?  It
> could be a pain where we've got multiple devices (e.g. NICs or storage)
> and need to realise which one is failing.

In the case of migration, this message will still get put into stderr
with the others.

In the case of HMP "loadvm", this message will also still get into
stderr with the others.

In the case of QMP "load-snapshot", this message will get reported
back to the app via the "query-jobs" error field, and not appear on
stderr.  Obviously long term it would be preferable if we can get
all the other messages chained up into the Error object too, so we
get the full set in one place.
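
As a rough illustration of what chaining the messages into one error
could look like, here is a minimal, self-contained sketch. DemoError,
demo_error_set() and demo_error_prepend() are hypothetical names made
up for this example; they are not QEMU's Error API and nothing in this
series implements them. QEMU's own error_prepend() follows a similar
idea in-tree.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's real Error type. */
typedef struct {
    char msg[512];
} DemoError;

/* Record the innermost (most specific) failure message. */
static void demo_error_set(DemoError *err, const char *msg)
{
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
}

/* Put higher-level context in front of the message already stored. */
static void demo_error_prepend(DemoError *err, const char *context)
{
    char tmp[sizeof(err->msg)];

    /* Build "context: old message" in a scratch buffer, then copy back. */
    snprintf(tmp, sizeof(tmp), "%s: %s", context, err->msg);
    memcpy(err->msg, tmp, sizeof(err->msg));
}
```

Setting "Bad config data" and then prepending "Failed to load
PCIDevice:config" followed by "Failed to load virtio-rng:virtio" leaves
a single message reading "Failed to load virtio-rng:virtio: Failed to
load PCIDevice:config: Bad config data", so the whole chain could be
handed to a QMP client in one field.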

> 
> > Finally, this is no longer reported:
> > 
> >  qemu-system-x86_64: load of migration failed: Invalid argument
> > 
> > So in this case we've not lost any useful information.
> 
> You occasionally get errors other than Invalid argument; in
> particular you can get EIO, which can help you determine whether the
> source killed the migration connection first.

All the places which checked qemu_file_get_error() and reported the
errno should still be turned into Error objects, so I believe the
EIO scenario should still be reported.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main()
  2021-02-04 17:18 ` [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main() Daniel P. Berrangé
@ 2021-02-15 18:35   ` Dr. David Alan Gilbert
  2021-02-15 18:58     ` Daniel P. Berrangé
  2021-03-11 12:17     ` Daniel P. Berrangé
  0 siblings, 2 replies; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-15 18:35 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Juan Quintela, qemu-devel, Hailiang Zhang

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> This is an incremental step in converting vmstate loading code to report
> via Error objects instead of printing directly to the console/monitor.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/colo.c   |  3 +-
>  migration/savevm.c | 73 +++++++++++++++++++++++++++++++---------------
>  migration/savevm.h |  3 +-
>  3 files changed, 52 insertions(+), 27 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index e344b7cf32..4a050ac579 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -705,11 +705,10 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
>  
>      qemu_mutex_lock_iothread();
>      cpu_synchronize_all_states();
> -    ret = qemu_loadvm_state_main(mis->from_src_file, mis);
> +    ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
>      qemu_mutex_unlock_iothread();
>  
>      if (ret < 0) {
> -        error_setg(errp, "Load VM's live state (ram) error");
>          return;
>      }
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index dd41292d4e..e47aec435c 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1819,6 +1819,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
>      QEMUFile *f = mis->from_src_file;
>      int load_res;
>      MigrationState *migr = migrate_get_current();
> +    Error *local_err = NULL;
>  
>      object_ref(OBJECT(migr));
>  
> @@ -1833,7 +1834,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
>       * in qemu_file, and thus we must be blocking now.
>       */
>      qemu_file_set_blocking(f, true);
> -    load_res = qemu_loadvm_state_main(f, mis);
> +    load_res = qemu_loadvm_state_main(f, mis, &local_err);
>  
>      /*
>       * This is tricky, but, mis->from_src_file can change after it
> @@ -1849,6 +1850,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
>      if (load_res < 0) {
>          qemu_file_set_error(f, load_res);
>          dirty_bitmap_mig_cancel_incoming();
> +        error_report_err(local_err);
>          if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
>              !migrate_postcopy_ram() && migrate_dirty_bitmaps())
>          {
> @@ -1859,12 +1861,10 @@ static void *postcopy_ram_listen_thread(void *opaque)
>                           __func__, load_res);
>              load_res = 0; /* prevent further exit() */
>          } else {
> -            error_report("%s: loadvm failed: %d", __func__, load_res);
>              migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
>                                             MIGRATION_STATUS_FAILED);
>          }
> -    }
> -    if (load_res >= 0) {
> +    } else {
>          /*
>           * This looks good, but it's possible that the device loading in the
>           * main thread hasn't finished yet, and so we might not be in 'RUN'
> @@ -2116,14 +2116,17 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
>   * @mis: Incoming state
>   * @length: Length of packaged data to read
>   *
> - * Returns: Negative values on error
> - *
> + * Returns:
> + *   0: success
> + *   LOADVM_QUIT: success, but stop
> + *   -1: error
>   */
>  static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
>  {
>      int ret;
>      size_t length;
>      QIOChannelBuffer *bioc;
> +    Error *local_err = NULL;
>  
>      length = qemu_get_be32(mis->from_src_file);
>      trace_loadvm_handle_cmd_packaged(length);
> @@ -2149,8 +2152,11 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
>  
>      QEMUFile *packf = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
>  
> -    ret = qemu_loadvm_state_main(packf, mis);
> +    ret = qemu_loadvm_state_main(packf, mis, &local_err);
>      trace_loadvm_handle_cmd_packaged_main(ret);
> +    if (ret < 0) {
> +        error_report_err(local_err);
> +    }
>      qemu_fclose(packf);
>      object_unref(OBJECT(bioc));
>  
> @@ -2568,7 +2574,14 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
>      return true;
>  }
>  
> -int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> +/*
> + * Returns:
> + *   0: success
> + *   LOADVM_QUIT: success, but stop
> + *   -1: error
> + */
> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
> +                           Error **errp)
>  {
>      uint8_t section_type;
>      int ret = 0;
> @@ -2579,7 +2592,9 @@ retry:
>  
>          if (qemu_file_get_error(f)) {
>              ret = qemu_file_get_error(f);
> -            break;
> +            error_setg(errp,
> +                       "Failed to load device state section ID: %d", ret);

Can I ask why these don't use strerror(ret) ?

The test I'm running is: start a VM with an actual guest and a useful
amount of RAM:

./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -nographic -m 8G -drive if=virtio,file=/home/vmimages/fedora-33-nest.qcow

./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -nographic -m 8G -drive if=virtio,file=/home/vmimages/fedora-33-nest.qcow -incoming tcp:0:4444

source:
  migrate_set_speed 1m
  migrate -d tcp:0:4444
  <Now quickly>
  migrate_cancel

In the old world I get:
qemu-system-x86_64: load of migration failed: Input/output error

In your world I get:
qemu-system-x86_64: Failed to load device state section ID: -5

(5 being EIO)

Dave


> +            goto out;
>          }
>  
>          trace_qemu_loadvm_state_section(section_type);
> @@ -2588,6 +2603,9 @@ retry:
>          case QEMU_VM_SECTION_FULL:
>              ret = qemu_loadvm_section_start_full(f, mis);
>              if (ret < 0) {
> +                error_setg(errp,
> +                           "Failed to load device state section start: %d",
> +                           ret);
>                  goto out;
>              }
>              break;
> @@ -2595,29 +2613,38 @@ retry:
>          case QEMU_VM_SECTION_END:
>              ret = qemu_loadvm_section_part_end(f, mis);
>              if (ret < 0) {
> +                error_setg(errp,
> +                           "Failed to load device state section end: %d", ret);
>                  goto out;
>              }
>              break;
>          case QEMU_VM_COMMAND:
>              ret = loadvm_process_command(f);
>              trace_qemu_loadvm_state_section_command(ret);
> -            if ((ret < 0) || (ret == LOADVM_QUIT)) {
> +            if (ret < 0) {
> +                error_setg(errp,
> +                           "Failed to load device state command: %d", ret);
> +                goto out;
> +            }
> +            if (ret == LOADVM_QUIT) {
>                  goto out;
>              }
>              break;
>          case QEMU_VM_EOF:
>              /* This is the end of migration */
> +            ret = 0;
>              goto out;
>          default:
> -            error_report("Unknown savevm section type %d", section_type);
> -            ret = -EINVAL;
> +            error_setg(errp,
> +                       "Unknown savevm section type %d", section_type);
> +            ret = -1;
>              goto out;
>          }
>      }
>  
>  out:
>      if (ret < 0) {
> -        qemu_file_set_error(f, ret);
> +        qemu_file_set_error(f, -EINVAL);
>  
>          /* Cancel bitmaps incoming regardless of recovery */
>          dirty_bitmap_mig_cancel_incoming();
> @@ -2643,6 +2670,12 @@ out:
>      return ret;
>  }
>  
> +/*
> + * Returns:
> + *   0: success
> + *   LOADVM_QUIT: success, but stop
> + *   -1: error
> + */
>  int qemu_loadvm_state(QEMUFile *f, Error **errp)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
> @@ -2662,17 +2695,12 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
>  
>      cpu_synchronize_all_pre_loadvm();
>  
> -    ret = qemu_loadvm_state_main(f, mis);
> -    if (ret < 0) {
> -        error_setg(errp, "Error %d while loading VM state", ret);
> -        ret = -1;
> -    }
> +    ret = qemu_loadvm_state_main(f, mis, errp);
>      qemu_event_set(&mis->main_thread_load_event);
>  
>      trace_qemu_loadvm_state_post_main(ret);
>  
>      if (mis->have_listen_thread) {
> -        error_setg(errp, "Error %d while loading VM state", ret);
>          /* Listen thread still going, can't clean up yet */
>          return ret;
>      }
> @@ -2729,13 +2757,10 @@ int qemu_loadvm_state(QEMUFile *f, Error **errp)
>  int qemu_load_device_state(QEMUFile *f, Error **errp)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
> -    int ret;
>  
>      /* Load QEMU_VM_SECTION_FULL section */
> -    ret = qemu_loadvm_state_main(f, mis);
> -    if (ret < 0) {
> -        error_setg(errp, "Failed to load device state: %d", ret);
> -        return ret;
> +    if (qemu_loadvm_state_main(f, mis, errp) < 0) {
> +        return -1;
>      }
>  
>      cpu_synchronize_all_post_init();
> diff --git a/migration/savevm.h b/migration/savevm.h
> index c727bc103e..1cec83c729 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -62,7 +62,8 @@ int qemu_save_device_state(QEMUFile *f);
>  
>  int qemu_loadvm_state(QEMUFile *f, Error **errp);
>  void qemu_loadvm_state_cleanup(void);
> -int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
> +                           Error **errp);
>  int qemu_load_device_state(QEMUFile *f, Error **errp);
>  
>  #endif
> -- 
> 2.29.2
> 
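For reference, the three-way return convention documented in the patch above (0, LOADVM_QUIT, -1 with an error set) can be sketched in isolation. This is only an illustration: SketchError, sketch_error_set(), sketch_load_section() and the section constants are made-up stand-ins, not QEMU's Error API or savevm.c code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not QEMU's real definitions. */
enum { SKETCH_LOADVM_QUIT = 1 };
enum { SKETCH_SECTION_DATA = 0, SKETCH_SECTION_QUIT = 1 };

typedef struct SketchError {
    char msg[128];
} SketchError;

/* Store "<what> <value>" in *errp, unless the caller passed NULL
 * or an error has already been recorded. */
static void sketch_error_set(SketchError **errp, const char *what, int value)
{
    assert(what != NULL);
    if (errp == NULL || *errp != NULL) {
        return;
    }
    *errp = calloc(1, sizeof(**errp));
    if (*errp != NULL) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s %d", what, value);
    }
}

/*
 * Returns:
 *   0: success
 *   SKETCH_LOADVM_QUIT: success, but stop
 *   -1: error, with *errp describing the failure
 */
static int sketch_load_section(int section_type, SketchError **errp)
{
    switch (section_type) {
    case SKETCH_SECTION_DATA:
        return 0;
    case SKETCH_SECTION_QUIT:
        return SKETCH_LOADVM_QUIT;
    default:
        sketch_error_set(errp, "Unknown savevm section type", section_type);
        return -1;
    }
}
```

The point of the shape is that callers only need to test for "< 0" and can hand the message upwards unchanged, while the positive "stop" value never carries an error.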
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-08 13:42       ` Daniel P. Berrangé
  2021-02-08 14:29         ` Dr. David Alan Gilbert
@ 2021-02-15 18:38         ` Dr. David Alan Gilbert
  2021-02-15 18:58           ` Daniel P. Berrangé
  1 sibling, 1 reply; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-15 18:38 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Feb 08, 2021 at 01:29:03PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > Due to its long term heritage most of the migration code just invokes
> > > > > 'error_report' when problems hit. This was fine for HMP, since the
> > > > > messages get redirected from stderr, into the HMP console. It is not
> > > > > OK for QMP because the errors will not be fed back to the QMP client.
> > > > > 
> > > > > This wasn't a terrible real world problem with QMP so far because
> > > > > live migration happens in the background, so at least on the target side
> > > > > there is not a QMP command that needs to capture the incoming migration.
> > > > > It is a problem on the source side but it doesn't hit frequently as the
> > > > > source side has fewer failure scenarios. None the less on both sides it
> > > > > would be desirable if 'query-migrate' can report errors correctly.
> > > > > With the introduction of the load-snapshot QMP commands, the need for
> > > > > error reporting becomes more pressing.
> > > > > 
> > > > > Wiring up good error reporting is a large and difficult job, which
> > > > > this series does NOT complete. The focus here has been on converting
> > > > > all methods in savevm.c which have an 'int' return value capable of
> > > > > reporting errors. This covers most of the infrastructure for controlling
> > > > > the migration state serialization / protocol.
> > > > > 
> > > > > The remaining part that is missing error reporting are the callbacks in
> > > > > the VMStateDescription struct which can return failure codes, but have
> > > > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > > > in future, a big bang conversion is likely non-viable. We'll probably
> > > > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > > > parameter and convert impls in batches, eventually removing the
> > > > > > original callbacks. I don't intend to do that myself in the immediate
> > > > > future.
> > > > > 
> > > > > IOW, this patch series probably solves 50% of the problem, but we
> > > > > still do need the rest to get ideal error reporting.
> > > > > 
> > > > > In doing this savevm conversion I noticed a bunch of places which
> > > > > see and then ignore errors. I only fixed one or two of them which
> > > > > were clearly dubious. Other places in savevm.c where it seemed it
> > > > > was probably ok to ignore errors, I've left using error_report()
> > > > > on the basis that those are really warnings. Perhaps they could
> > > > > be changed to warn_report() instead.
> > > > > 
> > > > > > There are a lot of patches here, but I felt it was easier to review
> > > > > > for correctness if I converted 1 function at a time. The series
> > > > > > does not necessarily have to be reviewed/applied in 1 go.
> > > > 
> > > > After this series, what do my errors look like, and where do they end
> > > > up?
> > > > > Do I get my nice backtrace showing that device failed, then that was
> > > > part of that one...
> > > 
> > > It hasn't modified any of the VMStateDescription callbacks so any
> > > of the per-device logic that was printing errors will still be using
> > > error_report to the console as before.
> > > 
> > > The errors that have changed (at this stage) are only the higher
> > > level ones that are in the generic part of the code. Where those
> > > errors mentioned a device name/ID they still do.
> > > 
> > > In some of the parts I've modified there will have been multiple
> > > error_reports collapsed into one error_setg() but the ones that
> > > are eliminated are high level generic messages with no useful
> > > > info, so I don't think losing those is a problem per-se.
> > > 
> > > The example that I tested was the case where we load a snapshot
> > > > under a different config than we saved it with. This is the scenario
> > > that gave the non-deterministic ordering in the iotest you disabled
> > > from my previous series.
> > > 
> > > In that case, we changed from:
> > > 
> > >   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> > > 
> > > To
> > > 
> > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> > > 
> > > From a HMP loadvm POV, this means instead of seeing
> > > 
> > >   (hmp)  loadvm foo
> > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > >   Error -22 while loading VM state
> > > 
> > > You will only see the detailed error message
> > > 
> > >   (hmp)  loadvm foo
> > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > 
> > > > In this case I think losing the "Error -22 while loading VM state"
> > > is fine, as it didn't add value IMHO.
> > > 
> > > 
> > > If we get around to converting the VMStateDescription callbacks to
> > > take an error object, then I think we'll possibly need to stack the
> > > error message from the callback, with the higher level message.
> > > 
> > > Do you have any familiar/good examples of error message stacking I
> > > can look at ?  I should be able to say whether they would be impacted
> > > by this series or not - if they are, then I hopefully only threw away
> > > the fairly useless high level messages, like the "Error -22" message
> > > above.
> > 
> > Can you try migrating:
> >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
> > to
> >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng
> > 
> > what I currently get is:
> > qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> > qemu-system-x86_64: Failed to load PCIDevice:config
> > qemu-system-x86_64: Failed to load virtio-rng:virtio
> > qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > qemu-system-x86_64: load of migration failed: Invalid argument
> 
> After my patches the very last line is gone.
> 
> So, still reporting using  error_report() is the first 3:
> 
>  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
>  qemu-system-x86_64: Failed to load PCIDevice:config
>  qemu-system-x86_64: Failed to load virtio-rng:virtio
> 
> Then reported in process_incoming_migration_co() using the message
> populated in the Error object, using error_report_err():
> 
>  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> 
> Finally, this is no longer reported:
> 
>  qemu-system-x86_64: load of migration failed: Invalid argument
> 
> So in this case we've not lost any useful information

One thing to check, and I *think* you're OK, but we have one place where
we actually check the error number:

migration.c:
static MigThrError migration_detect_error(MigrationState *s)
...
    /* Try to detect any file errors */
    ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
    if (!ret) {
        /* Everything is fine */
        assert(!local_error);
        return MIG_THR_ERR_NONE;
    }

    if (local_error) {
        migrate_set_error(s, local_error);
        error_free(local_error);
    }

    if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret == -EIO) {
        /*
         * For postcopy, we allow the network to be down for a
         * while. After that, it can be continued by a
         * recovery phase.
         */
        return postcopy_pause(s);
    } else {

This is to go into postcopy pause if the network connection broke (but
not if, for example, a device moaned about being in an invalid state).

If I read this correctly, file errors are still being preserved - is
that correct?
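
To make the concern concrete, the decision in that snippet boils down to something like the following. This is a stripped-down sketch, not the migration.c code: SketchAction and sketch_detect_error() are hypothetical names, and the only thing carried over is the rule that a real -EIO during active postcopy leads to a pause while anything else is a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome type; the real code returns a MigThrError. */
typedef enum {
    SKETCH_ACTION_NONE,   /* no file error recorded */
    SKETCH_ACTION_PAUSE,  /* recoverable: wait for postcopy recovery */
    SKETCH_ACTION_FAIL    /* unrecoverable: fail the migration */
} SketchAction;

/*
 * file_err is 0 or a negative errno value, as stored on the file.
 * Only a genuine I/O error while postcopy is active is treated as
 * recoverable; any other error code fails the migration.
 */
static SketchAction sketch_detect_error(bool postcopy_active, int file_err)
{
    assert(file_err <= 0);

    if (file_err == 0) {
        return SKETCH_ACTION_NONE;
    }
    if (postcopy_active && file_err == -EIO) {
        return SKETCH_ACTION_PAUSE;
    }
    return SKETCH_ACTION_FAIL;
}
```

So if a device-level failure were ever stored on the file as -EIO, it would be mistaken for a broken connection, which is why the real errno needs to survive the conversion.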

Dave


> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-15 18:38         ` Dr. David Alan Gilbert
@ 2021-02-15 18:58           ` Daniel P. Berrangé
  2021-02-15 19:01             ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-15 18:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Juan Quintela, Hailiang Zhang, qemu-devel

On Mon, Feb 15, 2021 at 06:38:05PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Mon, Feb 08, 2021 at 01:29:03PM +0000, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > > Due to its long term heritage most of the migration code just invokes
> > > > > > 'error_report' when problems hit. This was fine for HMP, since the
> > > > > > messages get redirected from stderr, into the HMP console. It is not
> > > > > > OK for QMP because the errors will not be fed back to the QMP client.
> > > > > > 
> > > > > > This wasn't a terrible real world problem with QMP so far because
> > > > > > live migration happens in the background, so at least on the target side
> > > > > > there is not a QMP command that needs to capture the incoming migration.
> > > > > > It is a problem on the source side but it doesn't hit frequently as the
> > > > > > source side has fewer failure scenarios. None the less on both sides it
> > > > > > would be desirable if 'query-migrate' can report errors correctly.
> > > > > > With the introduction of the load-snapshot QMP commands, the need for
> > > > > > error reporting becomes more pressing.
> > > > > > 
> > > > > > Wiring up good error reporting is a large and difficult job, which
> > > > > > this series does NOT complete. The focus here has been on converting
> > > > > > all methods in savevm.c which have an 'int' return value capable of
> > > > > > reporting errors. This covers most of the infrastructure for controlling
> > > > > > the migration state serialization / protocol.
> > > > > > 
> > > > > > The remaining part that is missing error reporting are the callbacks in
> > > > > > the VMStateDescription struct which can return failure codes, but have
> > > > > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > > > > in future, a big bang conversion is likely non-viable. We'll probably
> > > > > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > > > > parameter and convert impls in batches, eventually removing the
> > > > > > > original callbacks. I don't intend to do that myself in the immediate
> > > > > > future.
> > > > > > 
> > > > > > IOW, this patch series probably solves 50% of the problem, but we
> > > > > > still do need the rest to get ideal error reporting.
> > > > > > 
> > > > > > In doing this savevm conversion I noticed a bunch of places which
> > > > > > see and then ignore errors. I only fixed one or two of them which
> > > > > > were clearly dubious. Other places in savevm.c where it seemed it
> > > > > > was probably ok to ignore errors, I've left using error_report()
> > > > > > on the basis that those are really warnings. Perhaps they could
> > > > > > be changed to warn_report() instead.
> > > > > > 
> > > > > > > There are a lot of patches here, but I felt it was easier to review
> > > > > > > for correctness if I converted 1 function at a time. The series
> > > > > > > does not necessarily have to be reviewed/applied in 1 go.
> > > > > 
> > > > > After this series, what do my errors look like, and where do they end
> > > > > up?
> > > > > > Do I get my nice backtrace showing that device failed, then that was
> > > > > part of that one...
> > > > 
> > > > It hasn't modified any of the VMStateDescription callbacks so any
> > > > of the per-device logic that was printing errors will still be using
> > > > error_report to the console as before.
> > > > 
> > > > The errors that have changed (at this stage) are only the higher
> > > > level ones that are in the generic part of the code. Where those
> > > > errors mentioned a device name/ID they still do.
> > > > 
> > > > In some of the parts I've modified there will have been multiple
> > > > error_reports collapsed into one error_setg() but the ones that
> > > > are eliminated are high level generic messages with no useful
> > > > > info, so I don't think losing those is a problem per-se.
> > > > 
> > > > The example that I tested was the case where we load a snapshot
> > > > > under a different config than we saved it with. This is the scenario
> > > > that gave the non-deterministic ordering in the iotest you disabled
> > > > from my previous series.
> > > > 
> > > > In that case, we changed from:
> > > > 
> > > >   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> > > > 
> > > > To
> > > > 
> > > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> > > > 
> > > > From a HMP loadvm POV, this means instead of seeing
> > > > 
> > > >   (hmp)  loadvm foo
> > > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > >   Error -22 while loading VM state
> > > > 
> > > > You will only see the detailed error message
> > > > 
> > > >   (hmp)  loadvm foo
> > > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > > 
> > > > > In this case I think losing the "Error -22 while loading VM state"
> > > > is fine, as it didn't add value IMHO.
> > > > 
> > > > 
> > > > If we get around to converting the VMStateDescription callbacks to
> > > > take an error object, then I think we'll possibly need to stack the
> > > > error message from the callback, with the higher level message.
> > > > 
> > > > Do you have any familiar/good examples of error message stacking I
> > > > can look at ?  I should be able to say whether they would be impacted
> > > > by this series or not - if they are, then I hopefully only threw away
> > > > the fairly useless high level messages, like the "Error -22" message
> > > > above.
> > > 
> > > Can you try migrating:
> > >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
> > > to
> > >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng
> > > 
> > > what I currently get is:
> > > qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> > > qemu-system-x86_64: Failed to load PCIDevice:config
> > > qemu-system-x86_64: Failed to load virtio-rng:virtio
> > > qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > > qemu-system-x86_64: load of migration failed: Invalid argument
> > 
> > After my patches the very last line is gone.
> > 
> > So, still reporting using  error_report() is the first 3:
> > 
> >  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> >  qemu-system-x86_64: Failed to load PCIDevice:config
> >  qemu-system-x86_64: Failed to load virtio-rng:virtio
> > 
> > Then reported in process_incoming_migration_co() using the message
> > populated in the Error object, using error_report_err():
> > 
> >  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > 
> > Finally, this is no longer reported:
> > 
> >  qemu-system-x86_64: load of migration failed: Invalid argument
> > 
> > So in this case we've not lost any useful information
> 
> One thing to check, and I *think* you're OK, but we have one place where
> we actually check the error number:
> 
> migration.c:
> static MigThrError migration_detect_error(MigrationState *s)
> ...
>     /* Try to detect any file errors */
>     ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
>     if (!ret) {
>         /* Everything is fine */
>         assert(!local_error);
>         return MIG_THR_ERR_NONE;
>     }
> 
>     if (local_error) {
>         migrate_set_error(s, local_error);
>         error_free(local_error);
>     }
> 
>     if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret == -EIO) {
>         /*
>          * For postcopy, we allow the network to be down for a
>          * while. After that, it can be continued by a
>          * recovery phase.
>          */
>         return postcopy_pause(s);
>     } else {
> 
> This is to go into postcopy pause if the network connection broke (but
> not if for example a device moaned about being in an invalid state)
> 
> If I read this correctly, file errors are still being preserved - is
> that correct?

Yes, in places where QEMUFile is reporting an actual I/O error I've
tried to preserve that. I only removed the setting of fake I/O errors.
So if anything, we ought to get more accurate at detecting the
recoverable scenarios once we fully clean up errors.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main()
  2021-02-15 18:35   ` Dr. David Alan Gilbert
@ 2021-02-15 18:58     ` Daniel P. Berrangé
  2021-03-11 12:17     ` Daniel P. Berrangé
  1 sibling, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-15 18:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

On Mon, Feb 15, 2021 at 06:35:15PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > This is an incremental step in converting vmstate loading code to report
> > via Error objects instead of printing directly to the console/monitor.
> > 
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> >  migration/colo.c   |  3 +-
> >  migration/savevm.c | 73 +++++++++++++++++++++++++++++++---------------
> >  migration/savevm.h |  3 +-
> >  3 files changed, 52 insertions(+), 27 deletions(-)
> > 
> > diff --git a/migration/colo.c b/migration/colo.c
> > index e344b7cf32..4a050ac579 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -705,11 +705,10 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
> >  
> >      qemu_mutex_lock_iothread();
> >      cpu_synchronize_all_states();
> > -    ret = qemu_loadvm_state_main(mis->from_src_file, mis);
> > +    ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
> >      qemu_mutex_unlock_iothread();
> >  
> >      if (ret < 0) {
> > -        error_setg(errp, "Load VM's live state (ram) error");
> >          return;
> >      }
> >  
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index dd41292d4e..e47aec435c 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -1819,6 +1819,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >      QEMUFile *f = mis->from_src_file;
> >      int load_res;
> >      MigrationState *migr = migrate_get_current();
> > +    Error *local_err = NULL;
> >  
> >      object_ref(OBJECT(migr));
> >  
> > @@ -1833,7 +1834,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >       * in qemu_file, and thus we must be blocking now.
> >       */
> >      qemu_file_set_blocking(f, true);
> > -    load_res = qemu_loadvm_state_main(f, mis);
> > +    load_res = qemu_loadvm_state_main(f, mis, &local_err);
> >  
> >      /*
> >       * This is tricky, but, mis->from_src_file can change after it
> > @@ -1849,6 +1850,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >      if (load_res < 0) {
> >          qemu_file_set_error(f, load_res);
> >          dirty_bitmap_mig_cancel_incoming();
> > +        error_report_err(local_err);
> >          if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
> >              !migrate_postcopy_ram() && migrate_dirty_bitmaps())
> >          {
> > @@ -1859,12 +1861,10 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >                           __func__, load_res);
> >              load_res = 0; /* prevent further exit() */
> >          } else {
> > -            error_report("%s: loadvm failed: %d", __func__, load_res);
> >              migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> >                                             MIGRATION_STATUS_FAILED);
> >          }
> > -    }
> > -    if (load_res >= 0) {
> > +    } else {
> >          /*
> >           * This looks good, but it's possible that the device loading in the
> >           * main thread hasn't finished yet, and so we might not be in 'RUN'
> > @@ -2116,14 +2116,17 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
> >   * @mis: Incoming state
> >   * @length: Length of packaged data to read
> >   *
> > - * Returns: Negative values on error
> > - *
> > + * Returns:
> > + *   0: success
> > + *   LOADVM_QUIT: success, but stop
> > + *   -1: error
> >   */
> >  static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >  {
> >      int ret;
> >      size_t length;
> >      QIOChannelBuffer *bioc;
> > +    Error *local_err = NULL;
> >  
> >      length = qemu_get_be32(mis->from_src_file);
> >      trace_loadvm_handle_cmd_packaged(length);
> > @@ -2149,8 +2152,11 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >  
> >      QEMUFile *packf = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
> >  
> > -    ret = qemu_loadvm_state_main(packf, mis);
> > +    ret = qemu_loadvm_state_main(packf, mis, &local_err);
> >      trace_loadvm_handle_cmd_packaged_main(ret);
> > +    if (ret < 0) {
> > +        error_report_err(local_err);
> > +    }
> >      qemu_fclose(packf);
> >      object_unref(OBJECT(bioc));
> >  
> > @@ -2568,7 +2574,14 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
> >      return true;
> >  }
> >  
> > -int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> > +/*
> > + * Returns:
> > + *   0: success
> > + *   LOADVM_QUIT: success, but stop
> > + *   -1: error
> > + */
> > +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
> > +                           Error **errp)
> >  {
> >      uint8_t section_type;
> >      int ret = 0;
> > @@ -2579,7 +2592,9 @@ retry:
> >  
> >          if (qemu_file_get_error(f)) {
> >              ret = qemu_file_get_error(f);
> > -            break;
> > +            error_setg(errp,
> > +                       "Failed to load device state section ID: %d", ret);
> 
> Can I ask why these don't use strerror(ret) ?

No good reason.

> 
> The test I'm running is, start a VM with an actual guest and a useful
> amount of ram:
> 
> ./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -nographic -m 8G -drive if=virtio,file=/home/vmimages/fedora-33-nest.qcow
> 
> ./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -nographic -m 8G -drive if=virtio,file=/home/vmimages/fedora-33-nest.qcow -incoming tcp:0:4444
> 
> source:
>   migrate_set_speed 1m
>   migrate -d tcp:0:4444
>   <Now quickly>
>   migrate_cancel
> 
> In the old world I get:
> qemu-system-x86_64: load of migration failed: Input/output error
> 
> In your world I get:
> qemu-system-x86_64: Failed to load device state section ID: -5
> 
> (5 being EIO)

Yep, looks like I should fix that.
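
Roughly what the fixed message would look like, as a standalone sketch rather than the actual patch: sketch_format_load_error() is a made-up helper that turns the negative errno into text with strerror(). In QEMU itself, error_setg_errno() would probably be the natural way to get the same effect, but that is an assumption about the eventual fix, not something decided in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a load failure using the textual form
 * of the error instead of the raw number. 'ret' is a negative errno
 * value, as returned by the file error accessor.
 */
static void sketch_format_load_error(char *buf, size_t len, int ret)
{
    assert(ret < 0);
    snprintf(buf, len, "Failed to load device state section ID: %s",
             strerror(-ret));
}
```

With ret = -EIO this gives the same kind of text the old "load of migration failed" message carried, instead of a bare "-5".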


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-15 18:58           ` Daniel P. Berrangé
@ 2021-02-15 19:01             ` Dr. David Alan Gilbert
  2021-02-16  9:30               ` Daniel P. Berrangé
  0 siblings, 1 reply; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-15 19:01 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Juan Quintela, Hailiang Zhang, qemu-devel

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Feb 15, 2021 at 06:38:05PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Mon, Feb 08, 2021 at 01:29:03PM +0000, Dr. David Alan Gilbert wrote:
> > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > On Thu, Feb 04, 2021 at 06:22:49PM +0000, Dr. David Alan Gilbert wrote:
> > > > > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > > > > Due to its long term heritage most of the migration code just invokes
> > > > > > > 'error_report' when problems hit. This was fine for HMP, since the
> > > > > > > messages get redirected from stderr, into the HMP console. It is not
> > > > > > > OK for QMP because the errors will not be fed back to the QMP client.
> > > > > > > 
> > > > > > > This wasn't a terrible real world problem with QMP so far because
> > > > > > > live migration happens in the background, so at least on the target side
> > > > > > > there is not a QMP command that needs to capture the incoming migration.
> > > > > > > It is a problem on the source side but it doesn't hit frequently as the
> > > > > > > source side has fewer failure scenarios. None the less on both sides it
> > > > > > > would be desirable if 'query-migrate' can report errors correctly.
> > > > > > > With the introduction of the load-snapshot QMP commands, the need for
> > > > > > > error reporting becomes more pressing.
> > > > > > > 
> > > > > > > Wiring up good error reporting is a large and difficult job, which
> > > > > > > this series does NOT complete. The focus here has been on converting
> > > > > > > all methods in savevm.c which have an 'int' return value capable of
> > > > > > > reporting errors. This covers most of the infrastructure for controlling
> > > > > > > the migration state serialization / protocol.
> > > > > > > 
> > > > > > > The remaining part that is missing error reporting are the callbacks in
> > > > > > > the VMStateDescription struct which can return failure codes, but have
> > > > > > > no "Error **errp" parameter. Thinking about how this might be dealt with
> > > > > > > in future, a big bang conversion is likely non-viable. We'll probably
> > > > > > > want to introduce a duplicate set of callbacks with the "Error **errp"
> > > > > > > parameter and convert impls in batches, eventually removing the
> > > > > > > > original callbacks. I don't intend to do that myself in the immediate
> > > > > > > future.
> > > > > > > 
> > > > > > > IOW, this patch series probably solves 50% of the problem, but we
> > > > > > > still do need the rest to get ideal error reporting.
> > > > > > > 
> > > > > > > In doing this savevm conversion I noticed a bunch of places which
> > > > > > > see and then ignore errors. I only fixed one or two of them which
> > > > > > > were clearly dubious. Other places in savevm.c where it seemed it
> > > > > > > was probably ok to ignore errors, I've left using error_report()
> > > > > > > on the basis that those are really warnings. Perhaps they could
> > > > > > > be changed to warn_report() instead.
> > > > > > > 
> > > > > > > > There are a lot of patches here, but I felt it was easier to review
> > > > > > > > for correctness if I converted 1 function at a time. The series
> > > > > > > > does not necessarily have to be reviewed/applied in 1 go.
> > > > > > 
> > > > > > After this series, what do my errors look like, and where do they end
> > > > > > up?
> > > > > > > Do I get my nice backtrace showing that device failed, then that was
> > > > > > part of that one...
> > > > > 
> > > > > It hasn't modified any of the VMStateDescription callbacks so any
> > > > > of the per-device logic that was printing errors will still be using
> > > > > error_report to the console as before.
> > > > > 
> > > > > The errors that have changed (at this stage) are only the higher
> > > > > level ones that are in the generic part of the code. Where those
> > > > > errors mentioned a device name/ID they still do.
> > > > > 
> > > > > In some of the parts I've modified there will have been multiple
> > > > > error_reports collapsed into one error_setg() but the ones that
> > > > > are eliminated are high level generic messages with no useful
> > > > > info, so I don't think losing those is a problem per-se.
> > > > > 
> > > > > The example that I tested was the case where we load a snapshot
> > > > > under a different config than we saved it with. This is the scenario
> > > > > that gave the non-deterministic ordering in the iotest you disabled
> > > > > from my previous series.
> > > > > 
> > > > > In that case, we changed from:
> > > > > 
> > > > >   qemu-system-x86_64: Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Error -22 while loading VM state"}]}
> > > > > 
> > > > > To
> > > > > 
> > > > >   {"return": [{"current-progress": 1, "status": "concluded", "total-progress": 1, "type": "snapshot-load", "id": "load-err-stderr", "error": "Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices"}]}
> > > > > 
> > > > > From a HMP loadvm POV, this means instead of seeing
> > > > > 
> > > > >   (hmp)  loadvm foo
> > > > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > > >   Error -22 while loading VM state
> > > > > 
> > > > > You will only see the detailed error message
> > > > > 
> > > > >   (hmp)  loadvm foo
> > > > >   Unknown savevm section or instance '0000:00:02.0/virtio-rng' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
> > > > > 
> > > > > In this case I think losing the "Error -22 while loading VM state"
> > > > > is fine, as it didn't add value IMHO.
> > > > > 
> > > > > 
> > > > > If we get around to converting the VMStateDescription callbacks to
> > > > > take an error object, then I think we'll possibly need to stack the
> > > > > error message from the callback, with the higher level message.
> > > > > 
> > > > > Do you have any familiar/good examples of error message stacking I
> > > > > can look at ?  I should be able to say whether they would be impacted
> > > > > by this series or not - if they are, then I hopefully only threw away
> > > > > the fairly useless high level messages, like the "Error -22" message
> > > > > above.
> > > > 
> > > > Can you try migrating:
> > > >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng,disable-modern=true
> > > > to
> > > >   ./x86_64-softmmu/qemu-system-x86_64 -M pc -nographic -device virtio-rng
> > > > 
> > > > what I currently get is:
> > > > qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> > > > qemu-system-x86_64: Failed to load PCIDevice:config
> > > > qemu-system-x86_64: Failed to load virtio-rng:virtio
> > > > qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > > > qemu-system-x86_64: load of migration failed: Invalid argument
> > > 
> > > After my patches the very last line is gone.
> > > 
> > > So, still reported using error_report() are the first 3:
> > > 
> > >  qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x6 read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
> > >  qemu-system-x86_64: Failed to load PCIDevice:config
> > >  qemu-system-x86_64: Failed to load virtio-rng:virtio
> > > 
> > > Then reported in process_incoming_migration_co() using the message
> > > populated in the Error object, using error_report_err():
> > > 
> > >  qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:04.0/virtio-rng'
> > > 
> > > Finally, this is no longer reported:
> > > 
> > >  qemu-system-x86_64: load of migration failed: Invalid argument
> > > 
> > > So in this case we've not lost any useful information
> > 
> > One thing to check, and I *think* you're OK, but we have one place where
> > we actually check the error number:
> > 
> > migration.c:
> > 3414 static MigThrError migration_detect_error(MigrationState *s)
> > ...
> > 3426     /* Try to detect any file errors */
> > 3427     ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
> > 3428     if (!ret) {
> > 3429         /* Everything is fine */
> > 3430         assert(!local_error);
> > 3431         return MIG_THR_ERR_NONE;
> > 3432     }
> > 3433 
> > 3434     if (local_error) {
> > 3435         migrate_set_error(s, local_error);
> > 3436         error_free(local_error);
> > 3437     }
> > 3438 
> > 3439     if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret == -EIO) {
> > 3440         /*
> > 3441          * For postcopy, we allow the network to be down for a
> > 3442          * while. After that, it can be continued by a
> > 3443          * recovery phase.
> > 3444          */
> > 3445         return postcopy_pause(s);
> > 3446     } else {
> > 
> > This is to go into postcopy pause if the network connection broke (but
> > not if for example a device moaned about being in an invalid state)
> > 
> > If I read this correctly, file errors are still being preserved - is
> > that correct?
> 
> Yes, in places where QemuFile is reporting an actual I/O error I've
> tried to preserve that. Only removed setting of fake I/O errors. So
> if anything, we ought to get more accurate at detecting the recoverable
> scenarios once we fully clean up errors.

OK, good.

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-15 19:01             ` Dr. David Alan Gilbert
@ 2021-02-16  9:30               ` Daniel P. Berrangé
  2021-02-16 19:32                 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-02-16  9:30 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Juan Quintela, Hailiang Zhang, qemu-devel

On Mon, Feb 15, 2021 at 07:01:28PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Mon, Feb 15, 2021 at 06:38:05PM +0000, Dr. David Alan Gilbert wrote:
> > > One thing to check, and I *think* you're OK, but we have one place where
> > > we actually check the error number:
> > > 
> > > migration.c:
> > > 3414 static MigThrError migration_detect_error(MigrationState *s)
> > > ...
> > > 3426     /* Try to detect any file errors */
> > > 3427     ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
> > > 3428     if (!ret) {
> > > 3429         /* Everything is fine */
> > > 3430         assert(!local_error);
> > > 3431         return MIG_THR_ERR_NONE;
> > > 3432     }
> > > 3433 
> > > 3434     if (local_error) {
> > > 3435         migrate_set_error(s, local_error);
> > > 3436         error_free(local_error);
> > > 3437     }
> > > 3438 
> > > 3439     if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret == -EIO) {
> > > 3440         /*
> > > 3441          * For postcopy, we allow the network to be down for a
> > > 3442          * while. After that, it can be continued by a
> > > 3443          * recovery phase.
> > > 3444          */
> > > 3445         return postcopy_pause(s);
> > > 3446     } else {
> > > 
> > > This is to go into postcopy pause if the network connection broke (but
> > > not if for example a device moaned about being in an invalid state)
> > > 
> > > If I read this correctly, file errors are still being preserved - is
> > > that correct?
> > 
> > Yes, in places where QemuFile is reporting an actual I/O error I've
> > tried to preserve that. Only removed setting of fake I/O errors. So
> > if anything, we ought to get more accurate at detecting the recoverable
> > scenarios once we fully clean up errors.
> 
> OK, good.

One scenario to possibly check though is that in a few places we used
error_report_err() but didn't immediately return an error code back to
the caller, instead carrying on doing other calls. It is possible that
we thus reported an error about bad data, and then later hit the EIO
check for QemuFile.
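
As an illustration only (a standalone sketch, not QEMU code; every
name prefixed with sketch_/SKETCH_ is made up for this example), the
decision quoted above depends purely on the errno stored on the file.
An error message that was already reported about bad data does not
feed into it, so a later -EIO could still select the pause path:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the migration states; not QEMU's real enum. */
typedef enum {
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_POSTCOPY_ACTIVE,
} SketchStatus;

typedef enum {
    SKETCH_ERR_NONE,
    SKETCH_ERR_FATAL,
    SKETCH_ERR_PAUSE,
} SketchThrError;

/*
 * Same shape as the quoted check: only the errno recorded on the file
 * decides between pausing and failing. Anything already printed from
 * an Error object plays no part here.
 */
static SketchThrError sketch_detect_error(SketchStatus state, int file_err)
{
    assert(file_err <= 0);      /* 0 or a negative errno value */

    if (file_err == 0) {
        return SKETCH_ERR_NONE;
    }
    if (state == SKETCH_STATUS_POSTCOPY_ACTIVE && file_err == -EIO) {
        return SKETCH_ERR_PAUSE;
    }
    return SKETCH_ERR_FATAL;
}
```

With this shape, a device complaint that ends up stored as -EINVAL
is fatal, while the same complaint followed by a genuine -EIO on the
file would be treated as recoverable.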


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 00/33] migration: capture error reports into Error object
  2021-02-16  9:30               ` Daniel P. Berrangé
@ 2021-02-16 19:32                 ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 64+ messages in thread
From: Dr. David Alan Gilbert @ 2021-02-16 19:32 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: Juan Quintela, Hailiang Zhang, qemu-devel

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Feb 15, 2021 at 07:01:28PM +0000, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > On Mon, Feb 15, 2021 at 06:38:05PM +0000, Dr. David Alan Gilbert wrote:
> > > > One thing to check, and I *think* you're OK, but we have one place where
> > > > we actually check the error number:
> > > > 
> > > > migration.c:
> > > > 3414 static MigThrError migration_detect_error(MigrationState *s)
> > > > ...
> > > > 3426     /* Try to detect any file errors */
> > > > 3427     ret = qemu_file_get_error_obj(s->to_dst_file, &local_error);
> > > > 3428     if (!ret) {
> > > > 3429         /* Everything is fine */
> > > > 3430         assert(!local_error);
> > > > 3431         return MIG_THR_ERR_NONE;
> > > > 3432     }
> > > > 3433 
> > > > 3434     if (local_error) {
> > > > 3435         migrate_set_error(s, local_error);
> > > > 3436         error_free(local_error);
> > > > 3437     }
> > > > 3438 
> > > > 3439     if (state == MIGRATION_STATUS_POSTCOPY_ACTIVE && ret == -EIO) {
> > > > 3440         /*
> > > > 3441          * For postcopy, we allow the network to be down for a
> > > > 3442          * while. After that, it can be continued by a
> > > > 3443          * recovery phase.
> > > > 3444          */
> > > > 3445         return postcopy_pause(s);
> > > > 3446     } else {
> > > > 
> > > > This is to go into postcopy pause if the network connection broke (but
> > > > not if for example a device moaned about being in an invalid state)
> > > > 
> > > > If I read this correctly, file errors are still being preserved - is
> > > > that correct?
> > > 
> > > Yes, in places where QemuFile is reporting an actual I/O error I've
> > > tried to preserve that. Only removed setting of fake I/O errors. So
> > > if anything, we ought to get more accurate at detecting the recoverable
> > > scenarios once we fully clean up errors.
> > 
> > OK, good.
> 
> One scenario to possibly check though is that in a few places we used
> error_report_err() but didn't immediately return an error code back to
> the caller, instead carrying on doing other calls. It is possible that
> we thus reported an error about bad data, and then later hit the EIO
> check for QemuFile.

That's generally OK; it gets pretty painful to do the qemu file checks
after every read.
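
A rough standalone sketch of why that works (hypothetical names
throughout, only loosely modelled on the idea of a sticky file error;
this is not the QEMUFile implementation): once a read fails, the
first error is remembered and later reads just return zero, so the
caller can check once after a batch of reads instead of after each
one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory reader with a sticky error field. */
typedef struct {
    const uint8_t *buf;
    size_t len;
    size_t pos;
    int last_error;     /* 0, or the first negative errno that was hit */
} SketchFile;

static uint8_t sketch_get_byte(SketchFile *f)
{
    assert(f != NULL);
    if (f->last_error) {
        return 0;               /* already failed: keep returning zero */
    }
    if (f->pos >= f->len) {
        f->last_error = -EIO;   /* remember the first failure only */
        return 0;
    }
    return f->buf[f->pos++];
}

/* Reads four bytes big-endian without checking for errors in between. */
static uint32_t sketch_get_be32(SketchFile *f)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        v = (v << 8) | sketch_get_byte(f);
    }
    return v;
}

/* The single check a caller makes after a batch of reads. */
static int sketch_get_error(const SketchFile *f)
{
    return f->last_error;
}
```

The trade-off is the one discussed above: values read after the
failure are garbage (zeros here), so code acting on them may complain
about bad data before the I/O error is noticed.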

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main()
  2021-02-15 18:35   ` Dr. David Alan Gilbert
  2021-02-15 18:58     ` Daniel P. Berrangé
@ 2021-03-11 12:17     ` Daniel P. Berrangé
  1 sibling, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-03-11 12:17 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: Hailiang Zhang, qemu-devel, Juan Quintela

On Mon, Feb 15, 2021 at 06:35:15PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > This is an incremental step in converting vmstate loading code to report
> > via Error objects instead of printing directly to the console/monitor.
> > 
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> >  migration/colo.c   |  3 +-
> >  migration/savevm.c | 73 +++++++++++++++++++++++++++++++---------------
> >  migration/savevm.h |  3 +-
> >  3 files changed, 52 insertions(+), 27 deletions(-)
> > 
> > diff --git a/migration/colo.c b/migration/colo.c
> > index e344b7cf32..4a050ac579 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -705,11 +705,10 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
> >  
> >      qemu_mutex_lock_iothread();
> >      cpu_synchronize_all_states();
> > -    ret = qemu_loadvm_state_main(mis->from_src_file, mis);
> > +    ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
> >      qemu_mutex_unlock_iothread();
> >  
> >      if (ret < 0) {
> > -        error_setg(errp, "Load VM's live state (ram) error");
> >          return;
> >      }
> >  
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index dd41292d4e..e47aec435c 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -1819,6 +1819,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >      QEMUFile *f = mis->from_src_file;
> >      int load_res;
> >      MigrationState *migr = migrate_get_current();
> > +    Error *local_err = NULL;
> >  
> >      object_ref(OBJECT(migr));
> >  
> > @@ -1833,7 +1834,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >       * in qemu_file, and thus we must be blocking now.
> >       */
> >      qemu_file_set_blocking(f, true);
> > -    load_res = qemu_loadvm_state_main(f, mis);
> > +    load_res = qemu_loadvm_state_main(f, mis, &local_err);
> >  
> >      /*
> >       * This is tricky, but, mis->from_src_file can change after it
> > @@ -1849,6 +1850,7 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >      if (load_res < 0) {
> >          qemu_file_set_error(f, load_res);
> >          dirty_bitmap_mig_cancel_incoming();
> > +        error_report_err(local_err);
> >          if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
> >              !migrate_postcopy_ram() && migrate_dirty_bitmaps())
> >          {
> > @@ -1859,12 +1861,10 @@ static void *postcopy_ram_listen_thread(void *opaque)
> >                           __func__, load_res);
> >              load_res = 0; /* prevent further exit() */
> >          } else {
> > -            error_report("%s: loadvm failed: %d", __func__, load_res);
> >              migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> >                                             MIGRATION_STATUS_FAILED);
> >          }
> > -    }
> > -    if (load_res >= 0) {
> > +    } else {
> >          /*
> >           * This looks good, but it's possible that the device loading in the
> >           * main thread hasn't finished yet, and so we might not be in 'RUN'
> > @@ -2116,14 +2116,17 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis)
> >   * @mis: Incoming state
> >   * @length: Length of packaged data to read
> >   *
> > - * Returns: Negative values on error
> > - *
> > + * Returns:
> > + *   0: success
> > + *   LOADVM_QUIT: success, but stop
> > + *   -1: error
> >   */
> >  static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >  {
> >      int ret;
> >      size_t length;
> >      QIOChannelBuffer *bioc;
> > +    Error *local_err = NULL;
> >  
> >      length = qemu_get_be32(mis->from_src_file);
> >      trace_loadvm_handle_cmd_packaged(length);
> > @@ -2149,8 +2152,11 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >  
> >      QEMUFile *packf = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
> >  
> > -    ret = qemu_loadvm_state_main(packf, mis);
> > +    ret = qemu_loadvm_state_main(packf, mis, &local_err);
> >      trace_loadvm_handle_cmd_packaged_main(ret);
> > +    if (ret < 0) {
> > +        error_report_err(local_err);
> > +    }
> >      qemu_fclose(packf);
> >      object_unref(OBJECT(bioc));
> >  
> > @@ -2568,7 +2574,14 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
> >      return true;
> >  }
> >  
> > -int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> > +/*
> > + * Returns:
> > + *   0: success
> > + *   LOADVM_QUIT: success, but stop
> > + *   -1: error
> > + */
> > +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis,
> > +                           Error **errp)
> >  {
> >      uint8_t section_type;
> >      int ret = 0;
> > @@ -2579,7 +2592,9 @@ retry:
> >  
> >          if (qemu_file_get_error(f)) {
> >              ret = qemu_file_get_error(f);
> > -            break;
> > +            error_setg(errp,
> > +                       "Failed to load device state section ID: %d", ret);
> 
> Can I ask why these don't use strerror(ret) ?
> 
> The test I'm running is, start a VM with an actual guest and a useful
> amount of ram:
> 
> ./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -nographic -m 8G -drive if=virtio,file=/home/vmimages/fedora-33-nest.qcow
> 
> ./x86_64-softmmu/qemu-system-x86_64 -M pc,accel=kvm -nographic -m 8G -drive if=virtio,file=/home/vmimages/fedora-33-nest.qcow -incoming tcp:0:4444
> 
> source:
>   migrate_set_speed 1m
>   migrate -d tcp:0:4444
>   <Now quickly>
>   migrate_cancel
> 
> In the old world I get:
> qemu-system-x86_64: load of migration failed: Input/output error
> 
> In your world I get:
> qemu-system-x86_64: Failed to load device state section ID: -5
> 
> (5 being EIO)

Ok, so it looks like I do indeed need to pay more attention to
correctly using error_setg_errno() instead of error_setg(), as
Philippe suggested in the earlier patches.
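
To make the difference concrete, here is a standalone sketch (plain
libc, hypothetical sketch_* helpers, not the QEMU Error API). It
assumes, as I understand error_setg_errno() to behave, that the
errno-aware variant appends the strerror() text to the message,
whereas formatting the raw return value with %d yields the "-5" seen
in the test above:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: formats only the numeric code, like "...: %d" did. */
static void sketch_msg_plain(char *buf, size_t len, const char *what, int ret)
{
    assert(ret < 0);            /* expects a negative errno value */
    snprintf(buf, len, "%s: %d", what, ret);
}

/* Hypothetical: appends the strerror() text for the same code. */
static void sketch_msg_errno(char *buf, size_t len, const char *what, int ret)
{
    assert(ret < 0);            /* expects a negative errno value */
    snprintf(buf, len, "%s: %s", what, strerror(-ret));
}
```

Called with -EIO, the first helper produces a message ending in the
bare number, while the second ends in the platform's description of
EIO, which is the human-readable form the old "load of migration
failed" message carried.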


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state()
  2021-02-05  9:35       ` Philippe Mathieu-Daudé
@ 2021-03-11 12:38         ` Daniel P. Berrangé
  0 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2021-03-11 12:38 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Juan Quintela, QEMU Developers, Dr. David Alan Gilbert, Hailiang Zhang

On Fri, Feb 05, 2021 at 10:35:28AM +0100, Philippe Mathieu-Daudé wrote:
> On Fri, Feb 5, 2021 at 10:33 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Thu, Feb 04, 2021 at 10:57:20PM +0100, Philippe Mathieu-Daudé wrote:
> > > On 2/4/21 6:18 PM, Daniel P. Berrangé wrote:
> > > > This is an incremental step in converting vmstate loading code to report
> > > > via Error objects instead of printing directly to the console/monitor.
> > > >
> > > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > > > ---
> > > >  migration/migration.c |  4 ++--
> > > >  migration/savevm.c    | 36 ++++++++++++++++++++----------------
> > > >  migration/savevm.h    |  2 +-
> > > >  3 files changed, 23 insertions(+), 19 deletions(-)
> > > ...
> > >
> > > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > > index 6b320423c7..c8d93eee1e 100644
> > > > --- a/migration/savevm.c
> > > > +++ b/migration/savevm.c
> > > > @@ -2638,40 +2638,49 @@ out:
> > > >      return ret;
> > > >  }
> > > >
> > > > -int qemu_loadvm_state(QEMUFile *f)
> > > > +int qemu_loadvm_state(QEMUFile *f, Error **errp)
> > > >  {
> > > >      MigrationIncomingState *mis = migration_incoming_get_current();
> > > > -    Error *local_err = NULL;
> > > >      int ret;
> > > >
> > > > -    if (qemu_savevm_state_blocked(&local_err)) {
> > > > -        error_report_err(local_err);
> > > > -        return -EINVAL;
> > > > +    if (qemu_savevm_state_blocked(errp)) {
> > > > +        return -1;
> > > >      }
> > > >
> > > >      ret = qemu_loadvm_state_header(f);
> > > >      if (ret) {
> > > > -        return ret;
> > > > +        error_setg(errp, "Error %d while loading VM state", ret);
> > >
> > > Using error_setg_errno() instead (multiple occurrences):
> >
> > I don't think we want to do that in general, because the code is
> > already not reliable at actually returning an errno value, sometimes
> > returning just "-1". At the end of this series it will almost always
> > be returning "-1", not an errno.  There are some places where an
> > errno is relevant though - specifically qemu_file_get_error calls.
> 
> Fair. Ignore my other same comments in this. R-b tag stands.

On further investigation you are right. Not using error_setg_errno
has caused a regression in error quality as shown by Dave in a later
patch in this series.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




end of thread, other threads:[~2021-03-11 12:41 UTC | newest]

Thread overview: 64+ messages
2021-02-04 17:18 [PATCH 00/33] migration: capture error reports into Error object Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 01/33] migration: push Error **errp into qemu_loadvm_state() Daniel P. Berrangé
2021-02-04 21:57   ` Philippe Mathieu-Daudé
2021-02-05  9:33     ` Daniel P. Berrangé
2021-02-05  9:35       ` Philippe Mathieu-Daudé
2021-03-11 12:38         ` Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 02/33] migration: push Error **errp into qemu_loadvm_state_header() Daniel P. Berrangé
2021-02-04 21:58   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 03/33] migration: push Error **errp into qemu_loadvm_state_setup() Daniel P. Berrangé
2021-02-04 21:59   ` Philippe Mathieu-Daudé
2021-02-05  7:50   ` Markus Armbruster
2021-02-04 17:18 ` [PATCH 04/33] migration: push Error **errp into qemu_load_device_state() Daniel P. Berrangé
2021-02-04 22:01   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 05/33] migration: push Error **errp into qemu_loadvm_state_main() Daniel P. Berrangé
2021-02-15 18:35   ` Dr. David Alan Gilbert
2021-02-15 18:58     ` Daniel P. Berrangé
2021-03-11 12:17     ` Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 06/33] migration: push Error **errp into qemu_loadvm_section_start_full() Daniel P. Berrangé
2021-02-04 22:04   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 07/33] migration: push Error **errp into qemu_loadvm_section_part_end() Daniel P. Berrangé
2021-02-05 16:16   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 08/33] migration: push Error **errp into loadvm_process_command() Daniel P. Berrangé
2021-02-05 16:18   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 09/33] migration: push Error **errp into loadvm_handle_cmd_packaged() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 10/33] migration: push Error **errp into loadvm_postcopy_handle_advise() Daniel P. Berrangé
2021-02-05 16:21   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 11/33] migration: push Error **errp into ram_postcopy_incoming_init() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 12/33] migration: push Error **errp into loadvm_postcopy_handle_listen() Daniel P. Berrangé
2021-02-05 16:23   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 13/33] migration: push Error **errp into loadvm_postcopy_handle_run() Daniel P. Berrangé
2021-02-05 16:23   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 14/33] migration: push Error **errp into loadvm_postcopy_ram_handle_discard() Daniel P. Berrangé
2021-02-05 16:24   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 15/33] migration: make loadvm_postcopy_handle_resume() void Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 16/33] migration: push Error **errp into loadvm_handle_recv_bitmap() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 17/33] migration: push Error **errp into loadvm_process_enable_colo() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 18/33] migration: push Error **errp into colo_init_ram_cache() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 19/33] migration: push Error **errp into check_section_footer() Daniel P. Berrangé
2021-02-05 16:26   ` Philippe Mathieu-Daudé
2021-02-04 17:18 ` [PATCH 20/33] migration: push Error **errp into global_state_store() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 21/33] migration: remove error reporting from qemu_fopen_bdrv() callers Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 22/33] migration: push Error **errp into qemu_savevm_state_iterate() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 23/33] migration: simplify some error reporting in save_snapshot() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 24/33] migration: push Error **errp into qemu_savevm_state_setup() Daniel P. Berrangé
2021-02-04 17:18 ` [PATCH 25/33] migration: push Error **errp into qemu_savevm_state_complete_precopy() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 26/33] migration: push Error **errp into qemu_savevm_state_complete_precopy_non_iterable() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 27/33] migration: push Error **errp into qemu_savevm_state_complete_precopy() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 28/33] migration: push Error **errp into qemu_savevm_send_packaged() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 29/33] migration: push Error **errp into qemu_savevm_live_state() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 30/33] migration: push Error **errp into qemu_save_device_state() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 31/33] migration: push Error **errp into qemu_savevm_state_resume_prepare() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 32/33] migration: push Error **errp into postcopy_resume_handshake() Daniel P. Berrangé
2021-02-04 17:19 ` [PATCH 33/33] migration: push Error **errp into postcopy_do_resume() Daniel P. Berrangé
2021-02-04 18:22 ` [PATCH 00/33] migration: capture error reports into Error object Dr. David Alan Gilbert
2021-02-04 19:09   ` Daniel P. Berrangé
2021-02-08 13:29     ` Dr. David Alan Gilbert
2021-02-08 13:42       ` Daniel P. Berrangé
2021-02-08 14:29         ` Dr. David Alan Gilbert
2021-02-08 14:36           ` Daniel P. Berrangé
2021-02-15 18:38         ` Dr. David Alan Gilbert
2021-02-15 18:58           ` Daniel P. Berrangé
2021-02-15 19:01             ` Dr. David Alan Gilbert
2021-02-16  9:30               ` Daniel P. Berrangé
2021-02-16 19:32                 ` Dr. David Alan Gilbert
