* [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover
@ 2023-10-04 22:02 Peter Xu
  2023-10-04 22:02 ` [PATCH v3 01/10] migration: Display error in query-migrate irrespective of status Peter Xu
                   ` (9 more replies)
  0 siblings, 10 replies; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

v3:
- Rebased to master after the majority of Fabiano's series was merged
- Dropped patch 2: "migration: Let migrate_set_error() take ownership"
  (Looking again, I'm happy with neither my patch nor the current code.
   But in that case I prefer to change nothing rather than swap one ugly
   approach for another.)
- Dropped a few of Fabiano's R-b tags, due to either the rebase (the
  return path thread is now recycled when pausing postcopy) or the
  additional error_free() after I dropped patch 2

v1: https://lore.kernel.org/r/20230829214235.69309-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20230912222145.731099-1-peterx@redhat.com

This series allows better error handling in the postcopy return path
thread: we'll now store the errors in MigrationState so they can be seen
from query-migrate later, whereas before we only did error_report() and
never remembered the error.

Meanwhile, it allows double failures to happen during postcopy recovery;
IOW, one can fail again right in the RECOVER phase on both sides, even
though the RECOVER phase should be an extremely small window.
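
Schematically, the state flow that this series makes robust looks like
this (previously, a failure inside RECOVER could leave QEMU stuck there):

  POSTCOPY_ACTIVE --failure--> POSTCOPY_PAUSED --recover--> POSTCOPY_RECOVER
                                      ^                           |
                                      +----- second failure ------+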

Please have a look, thanks.

Fabiano Rosas (1):
  tests/migration-test: Add a test for postcopy hangs during RECOVER

Peter Xu (9):
  migration: Display error in query-migrate irrespective of status
  migration: Introduce migrate_has_error()
  migration: Refactor error handling in source return path
  migration: Deliver return path file error to migrate state too
  qemufile: Always return a verbose error
  migration: Remember num of ramblocks to sync during recovery
  migration: Add migration_rp_wait|kick()
  migration: Allow network to fail even during recovery
  migration: Allow RECOVER->PAUSED conversion for dest qemu

 qapi/migration.json          |   5 +-
 migration/migration.h        |  21 +++-
 migration/qemu-file.h        |   1 +
 migration/ram.h              |   5 +-
 migration/migration.c        | 206 ++++++++++++++++++++++-------------
 migration/qemu-file.c        |  17 ++-
 migration/ram.c              |  76 +++++++------
 migration/savevm.c           |   3 +-
 tests/qtest/migration-test.c |  94 ++++++++++++++++
 migration/trace-events       |   4 +-
 10 files changed, 312 insertions(+), 120 deletions(-)

-- 
2.41.0




* [PATCH v3 01/10] migration: Display error in query-migrate irrespective of status
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  7:28   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 02/10] migration: Introduce migrate_has_error() Peter Xu
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

Display it as long as it is set, irrespective of the FAILED status.  E.g.,
it may also be applicable to the PAUSED stage of postcopy, to provide a
hint on what has gone wrong.

The error_mutex seems to have been overlooked when referencing the error;
take it here to be safe.

This changes QAPI behavior by exposing the error message outside of the
FAILED status, but that is intended and not expected to break anyone.
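
For illustration only (not part of the patch): after this change, a
hypothetical query-migrate on a paused postcopy migration could return
something like:

  -> { "execute": "query-migrate" }
  <- { "return": { "status": "postcopy-paused",
                   "error-desc": "Channel error: Connection reset by peer",
                   ... } }

whereas before, error-desc was only present with status 'failed'.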

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/migration.json   | 5 ++---
 migration/migration.c | 8 +++++---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 8843e74b59..c241b6d318 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -230,9 +230,8 @@
 #     throttled during auto-converge.  This is only present when
 #     auto-converge has started throttling guest cpus.  (Since 2.7)
 #
-# @error-desc: the human readable error description string, when
-#     @status is 'failed'. Clients should not attempt to parse the
-#     error strings.  (Since 2.7)
+# @error-desc: the human readable error description string. Clients
+#     should not attempt to parse the error strings.  (Since 2.7)
 #
 # @postcopy-blocktime: total time when all vCPU were blocked during
 #     postcopy live migration.  This is only present when the
diff --git a/migration/migration.c b/migration/migration.c
index 585d3c8f55..010056d6f3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1057,9 +1057,6 @@ static void fill_source_migration_info(MigrationInfo *info)
         break;
     case MIGRATION_STATUS_FAILED:
         info->has_status = true;
-        if (s->error) {
-            info->error_desc = g_strdup(error_get_pretty(s->error));
-        }
         break;
     case MIGRATION_STATUS_CANCELLED:
         info->has_status = true;
@@ -1069,6 +1066,11 @@ static void fill_source_migration_info(MigrationInfo *info)
         break;
     }
     info->status = state;
+
+    QEMU_LOCK_GUARD(&s->error_mutex);
+    if (s->error) {
+        info->error_desc = g_strdup(error_get_pretty(s->error));
+    }
 }
 
 static void fill_destination_migration_info(MigrationInfo *info)
-- 
2.41.0




* [PATCH v3 02/10] migration: Introduce migrate_has_error()
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
  2023-10-04 22:02 ` [PATCH v3 01/10] migration: Display error in query-migrate irrespective of status Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  7:30   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 03/10] migration: Refactor error handling in source return path Peter Xu
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

Introduce a helper to detect whether MigrationState.error is set for
whatever reason.

This is preparation work for any thread (e.g. the source return path
thread) to set errors on MigrationState in a unified way, rather than
relying on its own mechanism (mark_source_rp_bad()).
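
As a minimal sketch of the intended pattern (illustration only, not from
this patch; setup_channel() is a hypothetical helper):

    static void *example_worker_thread(void *opaque)
    {
        MigrationState *s = opaque;
        Error *err = NULL;

        if (setup_channel(&err) < 0) {      /* hypothetical failure */
            /* Record the error so query-migrate can report it */
            migrate_set_error(s, err);
            error_free(err);    /* migrate_set_error() keeps a copy */
        }

        /* Any thread can now ask whether a failure was recorded */
        if (migrate_has_error(s)) {
            return NULL;
        }

        /* ... continue normal work ... */
        return NULL;
    }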

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.h | 1 +
 migration/migration.c | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/migration/migration.h b/migration/migration.h
index 972597f4de..4106a1dc54 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -476,6 +476,7 @@ bool  migration_has_all_channels(void);
 uint64_t migrate_max_downtime(void);
 
 void migrate_set_error(MigrationState *s, const Error *error);
+bool migrate_has_error(MigrationState *s);
 
 void migrate_fd_connect(MigrationState *s, Error *error_in);
 
diff --git a/migration/migration.c b/migration/migration.c
index 010056d6f3..4c6de8c2dd 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1231,6 +1231,13 @@ void migrate_set_error(MigrationState *s, const Error *error)
     }
 }
 
+bool migrate_has_error(MigrationState *s)
+{
+    /* The lock is not helpful here, but still follow the rule */
+    QEMU_LOCK_GUARD(&s->error_mutex);
+    return qatomic_read(&s->error);
+}
+
 static void migrate_error_free(MigrationState *s)
 {
     QEMU_LOCK_GUARD(&s->error_mutex);
-- 
2.41.0




* [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
  2023-10-04 22:02 ` [PATCH v3 01/10] migration: Display error in query-migrate irrespective of status Peter Xu
  2023-10-04 22:02 ` [PATCH v3 02/10] migration: Introduce migrate_has_error() Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  6:11   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2023-10-04 22:02 ` [PATCH v3 04/10] migration: Deliver return path file error to migrate state too Peter Xu
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

rp_state.error was a boolean used to show that an error happened in the
return path thread.  That not only duplicates error reporting
(migrate_set_error), but is also not good enough: we only do
error_report() and set the flag to true, so we can never keep a record of
the exact error and show it in query-migrate.

To make this better, a few things are done:

  - Use error_setg() rather than error_report() across the whole lifecycle
    of the return path thread, keeping the error in an Error*.

  - Use migrate_set_error() to apply that captured error to the global
    migration object when an error occurred in this thread.

  - With the above, mark_source_rp_bad() is no longer needed; remove it,
    along with rp_state.error itself.  (A condensed sketch of the result
    follows below.)
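
Condensed, the resulting shape of the return path thread is (illustrative
only; the real changes are in the diff below):

    Error *err = NULL;

    while (!migrate_has_error(ms) && !qemu_file_get_error(rp)) {
        /* handlers now take Error **errp instead of marking a flag */
        migrate_handle_rp_req_pages(ms, NULL, start, len, &err);
        if (err) {
            goto out;
        }
    }
out:
    if (err) {
        migrate_set_error(ms, err);    /* keep it for query-migrate */
        error_free(err);
    }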

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.h  |   1 -
 migration/ram.h        |   5 +-
 migration/migration.c  | 123 ++++++++++++++++++-----------------------
 migration/ram.c        |  41 +++++++-------
 migration/trace-events |   4 +-
 5 files changed, 79 insertions(+), 95 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 4106a1dc54..33a7831da4 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -308,7 +308,6 @@ struct MigrationState {
         /* Protected by qemu_file_lock */
         QEMUFile     *from_dst_file;
         QemuThread    rp_thread;
-        bool          error;
         /*
          * We can also check non-zero of rp_thread, but there's no "official"
          * way to do this, so this bool makes it slightly more elegant.
diff --git a/migration/ram.h b/migration/ram.h
index 145c915ca7..14ed666d58 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -51,7 +51,8 @@ uint64_t ram_bytes_total(void);
 void mig_throttle_counter_reset(void);
 
 uint64_t ram_pagesize_summary(void);
-int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
+int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len,
+                         Error **errp);
 void ram_postcopy_migrated_memory_release(MigrationState *ms);
 /* For outgoing discard bitmap */
 void ram_postcopy_send_discard_bitmap(MigrationState *ms);
@@ -71,7 +72,7 @@ void ramblock_recv_bitmap_set(RAMBlock *rb, void *host_addr);
 void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
 int64_t ramblock_recv_bitmap_send(QEMUFile *file,
                                   const char *block_name);
-int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
+int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
 bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
 void postcopy_preempt_shutdown_file(MigrationState *s);
 void *postcopy_preempt_thread(void *opaque);
diff --git a/migration/migration.c b/migration/migration.c
index 4c6de8c2dd..e821e80094 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -99,7 +99,7 @@ static int migration_maybe_pause(MigrationState *s,
                                  int *current_active_state,
                                  int new_state);
 static void migrate_fd_cancel(MigrationState *s);
-static int await_return_path_close_on_source(MigrationState *s);
+static void await_return_path_close_on_source(MigrationState *s);
 
 static bool migration_needs_multiple_sockets(void)
 {
@@ -1427,7 +1427,6 @@ int migrate_init(MigrationState *s, Error **errp)
     s->to_dst_file = NULL;
     s->state = MIGRATION_STATUS_NONE;
     s->rp_state.from_dst_file = NULL;
-    s->rp_state.error = false;
     s->mbps = 0.0;
     s->pages_per_second = 0.0;
     s->downtime = 0;
@@ -1750,16 +1749,6 @@ void qmp_migrate_continue(MigrationStatus state, Error **errp)
     qemu_sem_post(&s->pause_sem);
 }
 
-/* migration thread support */
-/*
- * Something bad happened to the RP stream, mark an error
- * The caller shall print or trace something to indicate why
- */
-static void mark_source_rp_bad(MigrationState *s)
-{
-    s->rp_state.error = true;
-}
-
 static struct rp_cmd_args {
     ssize_t     len; /* -1 = variable */
     const char *name;
@@ -1781,7 +1770,7 @@ static struct rp_cmd_args {
  * and we don't need to send pages that have already been sent.
  */
 static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
-                                       ram_addr_t start, size_t len)
+                                        ram_addr_t start, size_t len, Error **errp)
 {
     long our_host_ps = qemu_real_host_page_size();
 
@@ -1793,37 +1782,36 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
      */
     if (!QEMU_IS_ALIGNED(start, our_host_ps) ||
         !QEMU_IS_ALIGNED(len, our_host_ps)) {
-        error_report("%s: Misaligned page request, start: " RAM_ADDR_FMT
-                     " len: %zd", __func__, start, len);
-        mark_source_rp_bad(ms);
+        error_setg(errp, "MIG_RP_MSG_REQ_PAGES: Misaligned page request, start:"
+                   RAM_ADDR_FMT " len: %zd", start, len);
         return;
     }
 
-    if (ram_save_queue_pages(rbname, start, len)) {
-        mark_source_rp_bad(ms);
-    }
+    ram_save_queue_pages(rbname, start, len, errp);
 }
 
-static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name)
+static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
+                                         Error **errp)
 {
     RAMBlock *block = qemu_ram_block_by_name(block_name);
 
     if (!block) {
-        error_report("%s: invalid block name '%s'", __func__, block_name);
+        error_setg(errp, "MIG_RP_MSG_RECV_BITMAP has invalid block name '%s'",
+                   block_name);
         return -EINVAL;
     }
 
     /* Fetch the received bitmap and refresh the dirty bitmap */
-    return ram_dirty_bitmap_reload(s, block);
+    return ram_dirty_bitmap_reload(s, block, errp);
 }
 
-static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
+static int migrate_handle_rp_resume_ack(MigrationState *s,
+                                        uint32_t value, Error **errp)
 {
     trace_source_return_path_thread_resume_ack(value);
 
     if (value != MIGRATION_RESUME_ACK_VALUE) {
-        error_report("%s: illegal resume_ack value %"PRIu32,
-                     __func__, value);
+        error_setg(errp, "illegal resume_ack value %"PRIu32, value);
         return -1;
     }
 
@@ -1882,48 +1870,46 @@ static void *source_return_path_thread(void *opaque)
     uint32_t tmp32, sibling_error;
     ram_addr_t start = 0; /* =0 to silence warning */
     size_t  len = 0, expected_len;
+    Error *err = NULL;
     int res;
 
     trace_source_return_path_thread_entry();
     rcu_register_thread();
 
-    while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
+    while (!migrate_has_error(ms) && !qemu_file_get_error(rp) &&
            migration_is_setup_or_active(ms->state)) {
         trace_source_return_path_thread_loop_top();
+
         header_type = qemu_get_be16(rp);
         header_len = qemu_get_be16(rp);
 
         if (qemu_file_get_error(rp)) {
-            mark_source_rp_bad(ms);
             goto out;
         }
 
         if (header_type >= MIG_RP_MSG_MAX ||
             header_type == MIG_RP_MSG_INVALID) {
-            error_report("RP: Received invalid message 0x%04x length 0x%04x",
-                         header_type, header_len);
-            mark_source_rp_bad(ms);
+            error_setg(&err, "Received invalid message 0x%04x length 0x%04x",
+                       header_type, header_len);
             goto out;
         }
 
         if ((rp_cmd_args[header_type].len != -1 &&
             header_len != rp_cmd_args[header_type].len) ||
             header_len > sizeof(buf)) {
-            error_report("RP: Received '%s' message (0x%04x) with"
-                         "incorrect length %d expecting %zu",
-                         rp_cmd_args[header_type].name, header_type, header_len,
-                         (size_t)rp_cmd_args[header_type].len);
-            mark_source_rp_bad(ms);
+            error_setg(&err, "Received '%s' message (0x%04x) with"
+                       "incorrect length %d expecting %zu",
+                       rp_cmd_args[header_type].name, header_type, header_len,
+                       (size_t)rp_cmd_args[header_type].len);
             goto out;
         }
 
         /* We know we've got a valid header by this point */
         res = qemu_get_buffer(rp, buf, header_len);
         if (res != header_len) {
-            error_report("RP: Failed reading data for message 0x%04x"
-                         " read %d expected %d",
-                         header_type, res, header_len);
-            mark_source_rp_bad(ms);
+            error_setg(&err, "Failed reading data for message 0x%04x"
+                       " read %d expected %d",
+                       header_type, res, header_len);
             goto out;
         }
 
@@ -1933,8 +1919,7 @@ static void *source_return_path_thread(void *opaque)
             sibling_error = ldl_be_p(buf);
             trace_source_return_path_thread_shut(sibling_error);
             if (sibling_error) {
-                error_report("RP: Sibling indicated error %d", sibling_error);
-                mark_source_rp_bad(ms);
+                error_setg(&err, "Sibling indicated error %d", sibling_error);
             }
             /*
              * We'll let the main thread deal with closing the RP
@@ -1952,7 +1937,10 @@ static void *source_return_path_thread(void *opaque)
         case MIG_RP_MSG_REQ_PAGES:
             start = ldq_be_p(buf);
             len = ldl_be_p(buf + 8);
-            migrate_handle_rp_req_pages(ms, NULL, start, len);
+            migrate_handle_rp_req_pages(ms, NULL, start, len, &err);
+            if (err) {
+                goto out;
+            }
             break;
 
         case MIG_RP_MSG_REQ_PAGES_ID:
@@ -1967,32 +1955,32 @@ static void *source_return_path_thread(void *opaque)
                 expected_len += tmp32;
             }
             if (header_len != expected_len) {
-                error_report("RP: Req_Page_id with length %d expecting %zd",
-                             header_len, expected_len);
-                mark_source_rp_bad(ms);
+                error_setg(&err, "Req_Page_id with length %d expecting %zd",
+                           header_len, expected_len);
+                goto out;
+            }
+            migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len,
+                                        &err);
+            if (err) {
                 goto out;
             }
-            migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len);
             break;
 
         case MIG_RP_MSG_RECV_BITMAP:
             if (header_len < 1) {
-                error_report("%s: missing block name", __func__);
-                mark_source_rp_bad(ms);
+                error_setg(&err, "MIG_RP_MSG_RECV_BITMAP missing block name");
                 goto out;
             }
             /* Format: len (1B) + idstr (<255B). This ends the idstr. */
             buf[buf[0] + 1] = '\0';
-            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1))) {
-                mark_source_rp_bad(ms);
+            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1), &err)) {
                 goto out;
             }
             break;
 
         case MIG_RP_MSG_RESUME_ACK:
             tmp32 = ldl_be_p(buf);
-            if (migrate_handle_rp_resume_ack(ms, tmp32)) {
-                mark_source_rp_bad(ms);
+            if (migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
                 goto out;
             }
             break;
@@ -2008,9 +1996,14 @@ static void *source_return_path_thread(void *opaque)
     }
 
 out:
-    if (qemu_file_get_error(rp)) {
+    if (err) {
+        /*
+         * Collect any error in return-path thread and report it to the
+         * migration state object.
+         */
+        migrate_set_error(ms, err);
+        error_free(err);
         trace_source_return_path_thread_bad_end();
-        mark_source_rp_bad(ms);
     }
 
     trace_source_return_path_thread_end();
@@ -2036,13 +2029,10 @@ static int open_return_path_on_source(MigrationState *ms)
     return 0;
 }
 
-/* Returns 0 if the RP was ok, otherwise there was an error on the RP */
-static int await_return_path_close_on_source(MigrationState *ms)
+static void await_return_path_close_on_source(MigrationState *ms)
 {
-    int ret;
-
     if (!ms->rp_state.rp_thread_created) {
-        return 0;
+        return;
     }
 
     trace_migration_return_path_end_before();
@@ -2060,18 +2050,10 @@ static int await_return_path_close_on_source(MigrationState *ms)
         }
     }
 
-    trace_await_return_path_close_on_source_joining();
     qemu_thread_join(&ms->rp_state.rp_thread);
     ms->rp_state.rp_thread_created = false;
-    trace_await_return_path_close_on_source_close();
-
-    ret = ms->rp_state.error;
-    ms->rp_state.error = false;
-
     migration_release_dst_files(ms);
-
-    trace_migration_return_path_end_after(ret);
-    return ret;
+    trace_migration_return_path_end_after();
 }
 
 static inline void
@@ -2367,7 +2349,10 @@ static void migration_completion(MigrationState *s)
         goto fail;
     }
 
-    if (await_return_path_close_on_source(s)) {
+    await_return_path_close_on_source(s);
+
+    /* If return path has error, should have been set here */
+    if (migrate_has_error(s)) {
         goto fail;
     }
 
diff --git a/migration/ram.c b/migration/ram.c
index e4bfd39f08..c54e071ea3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1951,7 +1951,8 @@ static void migration_page_queue_free(RAMState *rs)
  * @start: starting address from the start of the RAMBlock
  * @len: length (in bytes) to send
  */
-int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
+int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len,
+                         Error **errp)
 {
     RAMBlock *ramblock;
     RAMState *rs = ram_state;
@@ -1968,7 +1969,7 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
              * Shouldn't happen, we can't reuse the last RAMBlock if
              * it's the 1st request.
              */
-            error_report("ram_save_queue_pages no previous block");
+            error_setg(errp, "MIG_RP_MSG_REQ_PAGES has no previous block");
             return -1;
         }
     } else {
@@ -1976,16 +1977,17 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
 
         if (!ramblock) {
             /* We shouldn't be asked for a non-existent RAMBlock */
-            error_report("ram_save_queue_pages no block '%s'", rbname);
+            error_setg(errp, "MIG_RP_MSG_REQ_PAGES has no block '%s'", rbname);
             return -1;
         }
         rs->last_req_rb = ramblock;
     }
     trace_ram_save_queue_pages(ramblock->idstr, start, len);
     if (!offset_in_ramblock(ramblock, start + len - 1)) {
-        error_report("%s request overrun start=" RAM_ADDR_FMT " len="
-                     RAM_ADDR_FMT " blocklen=" RAM_ADDR_FMT,
-                     __func__, start, len, ramblock->used_length);
+        error_setg(errp, "MIG_RP_MSG_REQ_PAGES request overrun, "
+                   "start=" RAM_ADDR_FMT " len="
+                   RAM_ADDR_FMT " blocklen=" RAM_ADDR_FMT,
+                   start, len, ramblock->used_length);
         return -1;
     }
 
@@ -2017,9 +2019,9 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
         assert(len % page_size == 0);
         while (len) {
             if (ram_save_host_page_urgent(pss)) {
-                error_report("%s: ram_save_host_page_urgent() failed: "
-                             "ramblock=%s, start_addr=0x"RAM_ADDR_FMT,
-                             __func__, ramblock->idstr, start);
+                error_setg(errp, "ram_save_host_page_urgent() failed: "
+                           "ramblock=%s, start_addr=0x"RAM_ADDR_FMT,
+                           ramblock->idstr, start);
                 ret = -1;
                 break;
             }
@@ -4151,7 +4153,7 @@ static void ram_dirty_bitmap_reload_notify(MigrationState *s)
  * This is only used when the postcopy migration is paused but wants
  * to resume from a middle point.
  */
-int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
+int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
 {
     int ret = -EINVAL;
     /* from_dst_file is always valid because we're within rp_thread */
@@ -4163,8 +4165,8 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
     trace_ram_dirty_bitmap_reload_begin(block->idstr);
 
     if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
-        error_report("%s: incorrect state %s", __func__,
-                     MigrationStatus_str(s->state));
+        error_setg(errp, "Reload bitmap in incorrect state %s",
+                   MigrationStatus_str(s->state));
         return -EINVAL;
     }
 
@@ -4181,9 +4183,8 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
 
     /* The size of the bitmap should match with our ramblock */
     if (size != local_size) {
-        error_report("%s: ramblock '%s' bitmap size mismatch "
-                     "(0x%"PRIx64" != 0x%"PRIx64")", __func__,
-                     block->idstr, size, local_size);
+        error_setg(errp, "ramblock '%s' bitmap size mismatch (0x%"PRIx64
+                   " != 0x%"PRIx64")", block->idstr, size, local_size);
         ret = -EINVAL;
         goto out;
     }
@@ -4193,16 +4194,16 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
 
     ret = qemu_file_get_error(file);
     if (ret || size != local_size) {
-        error_report("%s: read bitmap failed for ramblock '%s': %d"
-                     " (size 0x%"PRIx64", got: 0x%"PRIx64")",
-                     __func__, block->idstr, ret, local_size, size);
+        error_setg(errp, "read bitmap failed for ramblock '%s': %d"
+                   " (size 0x%"PRIx64", got: 0x%"PRIx64")",
+                   block->idstr, ret, local_size, size);
         ret = -EIO;
         goto out;
     }
 
     if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
-        error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIx64,
-                     __func__, block->idstr, end_mark);
+        error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
+                   block->idstr, end_mark);
         ret = -EINVAL;
         goto out;
     }
diff --git a/migration/trace-events b/migration/trace-events
index 002abe3a4e..5739f6b266 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -147,8 +147,6 @@ multifd_tls_outgoing_handshake_complete(void *ioc) "ioc=%p"
 multifd_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname, void *err)  "ioc=%p ioctype=%s hostname=%s err=%p"
 
 # migration.c
-await_return_path_close_on_source_close(void) ""
-await_return_path_close_on_source_joining(void) ""
 migrate_set_state(const char *new_state) "new state %s"
 migrate_fd_cleanup(void) ""
 migrate_fd_error(const char *error_desc) "error=%s"
@@ -165,7 +163,7 @@ migration_completion_postcopy_end_after_complete(void) ""
 migration_rate_limit_pre(int ms) "%d ms"
 migration_rate_limit_post(int urgent) "urgent: %d"
 migration_return_path_end_before(void) ""
-migration_return_path_end_after(int rp_error) "%d"
+migration_return_path_end_after(void) ""
 migration_thread_after_loop(void) ""
 migration_thread_file_err(void) ""
 migration_thread_setup_complete(void) ""
-- 
2.41.0




* [PATCH v3 04/10] migration: Deliver return path file error to migrate state too
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (2 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 03/10] migration: Refactor error handling in source return path Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  7:32   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 05/10] qemufile: Always return a verbose error Peter Xu
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

We've already done this for most of the return path thread errors, but not
yet for the I/O errors that happen on the return path qemufile.  Do that
too.

Re-export qemu_file_get_error_obj().
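
Illustration only (this mirrors the one-line change in the diff below):
the return path thread can now turn a qemufile I/O failure into a
migration error in one step:

    if (qemu_file_get_error(rp)) {
        /* Fetch an Error* describing the file error ... */
        qemu_file_get_error_obj(rp, &err);
        /* ... and jump to "out:", which does migrate_set_error(ms, err) */
        goto out;
    }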

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/qemu-file.h | 1 +
 migration/migration.c | 1 +
 migration/qemu-file.c | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 03e718c264..75efe503c4 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -120,6 +120,7 @@ int coroutine_mixed_fn qemu_peek_byte(QEMUFile *f, int offset);
 void qemu_file_skip(QEMUFile *f, int size);
 int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp);
 void qemu_file_set_error_obj(QEMUFile *f, int ret, Error *err);
+int qemu_file_get_error_obj(QEMUFile *f, Error **errp);
 void qemu_file_set_error(QEMUFile *f, int ret);
 int qemu_file_shutdown(QEMUFile *f);
 QEMUFile *qemu_file_get_return_path(QEMUFile *f);
diff --git a/migration/migration.c b/migration/migration.c
index e821e80094..b28b504b4c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1884,6 +1884,7 @@ static void *source_return_path_thread(void *opaque)
         header_len = qemu_get_be16(rp);
 
         if (qemu_file_get_error(rp)) {
+            qemu_file_get_error_obj(rp, &err);
             goto out;
         }
 
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 5e8207dae4..ffa9c0a48a 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -146,7 +146,7 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
  * is not 0.
  *
  */
-static int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
+int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
 {
     if (errp) {
         *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
-- 
2.41.0




* [PATCH v3 05/10] qemufile: Always return a verbose error
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (3 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 04/10] migration: Deliver return path file error to migrate state too Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  7:42   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 06/10] migration: Remember num of ramblocks to sync during recovery Peter Xu
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

There are a lot of cases where we only have an errno set in last_error,
without a detailed error description.  When this happens, try to generate
an error containing the errno as a descriptive error.

This will be helpful in cases where one relies on the Error*.  E.g., the
migration state only caches an Error* in MigrationState.error.  With this,
we'll display correct error messages in e.g. query-migrate when the error
was only set by qemu_file_set_error().
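
Illustration only (not part of the patch): a caller is now guaranteed a
usable Error whenever a failure was recorded:

    Error *err = NULL;
    int ret = qemu_file_get_error_obj(f, &err);

    if (ret) {
        /*
         * err is now always non-NULL here: either a copy of the stored
         * Error object, or one synthesized from errno, e.g.
         * "Channel error: Connection reset by peer".
         */
        migrate_set_error(s, err);
        error_free(err);
    }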

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/qemu-file.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index ffa9c0a48a..c12a905a34 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -142,15 +142,24 @@ void qemu_file_set_hooks(QEMUFile *f, const QEMUFileHooks *hooks)
  *
  * Return negative error value if there has been an error on previous
  * operations, return 0 if no error happened.
- * Optional, it returns Error* in errp, but it may be NULL even if return value
- * is not 0.
  *
+ * If errp is specified, a verbose error message will be copied over.
  */
 int qemu_file_get_error_obj(QEMUFile *f, Error **errp)
 {
+    if (!f->last_error) {
+        return 0;
+    }
+
+    /* There is an error */
     if (errp) {
-        *errp = f->last_error_obj ? error_copy(f->last_error_obj) : NULL;
+        if (f->last_error_obj) {
+            *errp = error_copy(f->last_error_obj);
+        } else {
+            error_setg_errno(errp, -f->last_error, "Channel error");
+        }
     }
+
     return f->last_error;
 }
 
-- 
2.41.0




* [PATCH v3 06/10] migration: Remember num of ramblocks to sync during recovery
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (4 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 05/10] qemufile: Always return a verbose error Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  7:43   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 07/10] migration: Add migration_rp_wait|kick() Peter Xu
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

Instead of only relying on the count of rp_sem, make the counter part of
RAMState so it can be used by both threads to synchronize on the process.

rp_sem will be further reused in follow-up patches as a way to kick the
main thread, e.g., on recovery failures.
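
Condensed from the diff below, the handshake between the two threads
becomes:

    /* migration thread, ram_dirty_bitmap_sync_all() */
    qatomic_set(&rs->postcopy_bmap_sync_requested, 0);
    RAMBLOCK_FOREACH_NOT_IGNORED(block) {
        qemu_savevm_send_recv_bitmap(file, block->idstr);
        qatomic_inc(&rs->postcopy_bmap_sync_requested);
    }
    while (qatomic_read(&rs->postcopy_bmap_sync_requested)) {
        qemu_sem_wait(&s->rp_state.rp_sem);    /* wait for rp thread */
    }

    /* return path thread, ram_dirty_bitmap_reload(), per ramblock */
    qatomic_dec(&rs->postcopy_bmap_sync_requested);
    qemu_sem_post(&s->rp_state.rp_sem);        /* wake migration thread */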

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/ram.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index c54e071ea3..ef4af3fbce 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -394,6 +394,14 @@ struct RAMState {
     /* Queue of outstanding page requests from the destination */
     QemuMutex src_page_req_mutex;
     QSIMPLEQ_HEAD(, RAMSrcPageRequest) src_page_requests;
+
+    /*
+     * This is only used when postcopy is in recovery phase, to communicate
+     * between the migration thread and the return path thread on dirty
+     * bitmap synchronizations.  This field is unused in other stages of
+     * RAM migration.
+     */
+    unsigned int postcopy_bmap_sync_requested;
 };
 typedef struct RAMState RAMState;
 
@@ -4121,20 +4129,20 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
 {
     RAMBlock *block;
     QEMUFile *file = s->to_dst_file;
-    int ramblock_count = 0;
 
     trace_ram_dirty_bitmap_sync_start();
 
+    qatomic_set(&rs->postcopy_bmap_sync_requested, 0);
     RAMBLOCK_FOREACH_NOT_IGNORED(block) {
         qemu_savevm_send_recv_bitmap(file, block->idstr);
         trace_ram_dirty_bitmap_request(block->idstr);
-        ramblock_count++;
+        qatomic_inc(&rs->postcopy_bmap_sync_requested);
     }
 
     trace_ram_dirty_bitmap_sync_wait();
 
     /* Wait until all the ramblocks' dirty bitmap synced */
-    while (ramblock_count--) {
+    while (qatomic_read(&rs->postcopy_bmap_sync_requested)) {
         qemu_sem_wait(&s->rp_state.rp_sem);
     }
 
@@ -4161,6 +4169,7 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
     unsigned long *le_bitmap, nbits = block->used_length >> TARGET_PAGE_BITS;
     uint64_t local_size = DIV_ROUND_UP(nbits, 8);
     uint64_t size, end_mark;
+    RAMState *rs = ram_state;
 
     trace_ram_dirty_bitmap_reload_begin(block->idstr);
 
@@ -4226,6 +4235,8 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
     /* We'll recalculate migration_dirty_pages in ram_state_resume_prepare(). */
     trace_ram_dirty_bitmap_reload_complete(block->idstr);
 
+    qatomic_dec(&rs->postcopy_bmap_sync_requested);
+
     /*
      * We succeeded to sync bitmap for current ramblock. If this is
      * the last one to sync, we need to notify the main send thread.
-- 
2.41.0




* [PATCH v3 07/10] migration: Add migration_rp_wait|kick()
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (5 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 06/10] migration: Remember num of ramblocks to sync during recovery Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  7:49   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 08/10] migration: Allow network to fail even during recovery Peter Xu
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

These are just simple wrappers around rp_sem for wait() and kick(), making
it even clearer how it is used.  They are prepared to be used for other
things too.
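
For illustration, the typical pairing after this patch (both sides taken
from the diff below):

    /* migration thread: wait for the return path thread */
    while (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
        migration_rp_wait(s);
    }

    /* return path thread: wake the migration thread */
    migration_rp_kick(s);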

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.h | 15 +++++++++++++++
 migration/migration.c | 14 ++++++++++++--
 migration/ram.c       | 16 +++++++---------
 3 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 33a7831da4..573aa69f19 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -315,6 +315,12 @@ struct MigrationState {
          * be cleared in the rp_thread!
          */
         bool          rp_thread_created;
+        /*
+         * Used to synchronize between migration main thread and return
+         * path thread.  The migration thread can wait() on this sem, while
+         * other threads (e.g., return path thread) can kick it using a
+         * post().
+         */
         QemuSemaphore rp_sem;
         /*
          * We post to this when we got one PONG from dest. So far it's an
@@ -526,4 +532,13 @@ void migration_populate_vfio_info(MigrationInfo *info);
 void migration_reset_vfio_bytes_transferred(void);
 void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page);
 
+/* Migration thread waiting for return path thread. */
+void migration_rp_wait(MigrationState *s);
+/*
+ * Kick the migration thread waiting for return path messages.  NOTE: the
+ * name can be slightly confusing (when read as "kick the rp thread"), just
+ * to remember the target is always the migration thread.
+ */
+void migration_rp_kick(MigrationState *s);
+
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index b28b504b4c..1b7ed2d35a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1749,6 +1749,16 @@ void qmp_migrate_continue(MigrationStatus state, Error **errp)
     qemu_sem_post(&s->pause_sem);
 }
 
+void migration_rp_wait(MigrationState *s)
+{
+    qemu_sem_wait(&s->rp_state.rp_sem);
+}
+
+void migration_rp_kick(MigrationState *s)
+{
+    qemu_sem_post(&s->rp_state.rp_sem);
+}
+
 static struct rp_cmd_args {
     ssize_t     len; /* -1 = variable */
     const char *name;
@@ -1820,7 +1830,7 @@ static int migrate_handle_rp_resume_ack(MigrationState *s,
                       MIGRATION_STATUS_POSTCOPY_ACTIVE);
 
     /* Notify send thread that time to continue send pages */
-    qemu_sem_post(&s->rp_state.rp_sem);
+    migration_rp_kick(s);
 
     return 0;
 }
@@ -2447,7 +2457,7 @@ static int postcopy_resume_handshake(MigrationState *s)
     qemu_savevm_send_postcopy_resume(s->to_dst_file);
 
     while (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
-        qemu_sem_wait(&s->rp_state.rp_sem);
+        migration_rp_wait(s);
     }
 
     if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
diff --git a/migration/ram.c b/migration/ram.c
index ef4af3fbce..43ba62be83 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4143,7 +4143,7 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
 
     /* Wait until all the ramblocks' dirty bitmap synced */
     while (qatomic_read(&rs->postcopy_bmap_sync_requested)) {
-        qemu_sem_wait(&s->rp_state.rp_sem);
+        migration_rp_wait(s);
     }
 
     trace_ram_dirty_bitmap_sync_complete();
@@ -4151,11 +4151,6 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
     return 0;
 }
 
-static void ram_dirty_bitmap_reload_notify(MigrationState *s)
-{
-    qemu_sem_post(&s->rp_state.rp_sem);
-}
-
 /*
  * Read the received bitmap, revert it as the initial dirty bitmap.
  * This is only used when the postcopy migration is paused but wants
@@ -4238,10 +4233,13 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
     qatomic_dec(&rs->postcopy_bmap_sync_requested);
 
     /*
-     * We succeeded to sync bitmap for current ramblock. If this is
-     * the last one to sync, we need to notify the main send thread.
+     * We succeeded to sync bitmap for current ramblock. Always kick the
+     * migration thread to check whether all requested bitmaps are
+     * reloaded.  NOTE: it's racy to only kick when requested==0, because
+     * we don't know whether the migration thread may still be increasing
+     * it.
      */
-    ram_dirty_bitmap_reload_notify(s);
+    migration_rp_kick(s);
 
     ret = 0;
 out:
-- 
2.41.0




* [PATCH v3 08/10] migration: Allow network to fail even during recovery
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (6 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 07/10] migration: Add migration_rp_wait|kick() Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05 13:25   ` Fabiano Rosas
  2023-10-04 22:02 ` [PATCH v3 09/10] migration: Allow RECOVER->PAUSED conversion for dest qemu Peter Xu
  2023-10-04 22:02 ` [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER Peter Xu
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela, Xiaohui Li

Normally the postcopy recover phase should only exist for a super short
period: the duration during which QEMU tries to recover from an
interrupted postcopy migration, when a handshake is carried out to
continue the procedure with state changes from PAUSED -> RECOVER ->
POSTCOPY_ACTIVE again.

Here the RECOVER phase should be super small; it happens right after the
admin specifies a new, working network link for QEMU to reconnect to the
dest QEMU.

However, there can still be cases where the channel is broken within this
small RECOVER window.

If that happens, with the current code there's no way the src QEMU can get
kicked out of the RECOVER stage.  Nor is there a way to retry the recovery
over another channel once one is established.

This patch allows the RECOVER phase to fail itself too - we're mostly
ready, with just a few small things missing, e.g. properly kicking the
main migration thread out of its sleep on rp_sem when we find ourselves in
the RECOVER stage.  When this happens, the RECOVER itself fails and rolls
back to the PAUSED stage.  The user can then retry another round of
recovery.

To make it even stronger, teach the QMP command migrate-pause to
explicitly kick the src/dst QEMU out when needed, so that even if for some
reason the migration thread wasn't already kicked out by a failing
return-path thread, the admin can still kick it out.

This will be a super, super corner case, but still try to cover it.

One can try to test this with two proxy channels for migration:

  (a) socat unix-listen:/tmp/src.sock,reuseaddr,fork tcp:localhost:10000
  (b) socat tcp-listen:10000,reuseaddr,fork unix:/tmp/dst.sock

So the migration channel will be:

                      (a)          (b)
  src -> /tmp/src.sock -> tcp:10000 -> /tmp/dst.sock -> dst

Then, to make QEMU hang at the RECOVER stage, one can do the following:

  (1) stop the postcopy using the QMP command migrate-pause
  (2) kill the 2nd proxy (b)
  (3) try to recover the postcopy using /tmp/src.sock on src
  (4) src QEMU will go into the RECOVER stage but won't be able to
      continue from there, because the channel is actually broken at (b)

Before this patch, step (4) will leave the src QEMU stuck in the RECOVER
stage, without a way to kick QEMU out or continue the postcopy again.
After this patch, (4) will quickly fail QEMU and bounce back to the PAUSED
stage.

The admin can also kick QEMU from (4) into PAUSED using migrate-pause when
needed.

After bouncing back to the PAUSED stage, one can recover again.
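
For illustration, the admin-side QMP sequence to force a stuck RECOVER
back to PAUSED and retry could look like this (socket paths are examples
only):

  (src)  -> { "execute": "migrate-pause" }
  (dst)  -> { "execute": "migrate-pause" }
  ... both sides now report "postcopy-paused" in query-migrate ...
  (dst)  -> { "execute": "migrate-recover",
              "arguments": { "uri": "unix:/tmp/dst2.sock" } }
  (src)  -> { "execute": "migrate",
              "arguments": { "uri": "unix:/tmp/src2.sock", "resume": true } }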

Reported-by: Xiaohui Li <xiaohli@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.h |  8 ++++--
 migration/migration.c | 63 +++++++++++++++++++++++++++++++++++++++----
 migration/ram.c       |  4 ++-
 3 files changed, 67 insertions(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 573aa69f19..f985d3dedb 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -492,6 +492,7 @@ int migrate_init(MigrationState *s, Error **errp);
 bool migration_is_blocked(Error **errp);
 /* True if outgoing migration has entered postcopy phase */
 bool migration_in_postcopy(void);
+bool migration_postcopy_is_alive(int state);
 MigrationState *migrate_get_current(void);
 
 uint64_t ram_get_total_transferred_pages(void);
@@ -532,8 +533,11 @@ void migration_populate_vfio_info(MigrationInfo *info);
 void migration_reset_vfio_bytes_transferred(void);
 void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page);
 
-/* Migration thread waiting for return path thread. */
-void migration_rp_wait(MigrationState *s);
+/*
+ * Migration thread waiting for return path thread.  Return non-zero if an
+ * error is detected.
+ */
+int migration_rp_wait(MigrationState *s);
 /*
  * Kick the migration thread waiting for return path messages.  NOTE: the
  * name can be slightly confusing (when read as "kick the rp thread"), just
diff --git a/migration/migration.c b/migration/migration.c
index 1b7ed2d35a..1a7f214fcf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1345,6 +1345,17 @@ bool migration_in_postcopy(void)
     }
 }
 
+bool migration_postcopy_is_alive(int state)
+{
+    switch (state) {
+    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+    case MIGRATION_STATUS_POSTCOPY_RECOVER:
+        return true;
+    default:
+        return false;
+    }
+}
+
 bool migration_in_postcopy_after_devices(MigrationState *s)
 {
     return migration_in_postcopy() && s->postcopy_after_devices;
@@ -1552,8 +1563,15 @@ void qmp_migrate_pause(Error **errp)
     MigrationIncomingState *mis = migration_incoming_get_current();
     int ret = 0;
 
-    if (ms->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+    if (migration_postcopy_is_alive(ms->state)) {
         /* Source side, during postcopy */
+        Error *error = NULL;
+
+        /* Tell the core migration that we're pausing */
+        error_setg(&error, "Postcopy migration is paused by the user");
+        migrate_set_error(ms, error);
+        error_free(error);
+
         qemu_mutex_lock(&ms->qemu_file_lock);
         if (ms->to_dst_file) {
             ret = qemu_file_shutdown(ms->to_dst_file);
@@ -1562,10 +1580,17 @@ void qmp_migrate_pause(Error **errp)
         if (ret) {
             error_setg(errp, "Failed to pause source migration");
         }
+
+        /*
+         * Kick the migration thread out of any waiting windows (on behalf
+         * of the rp thread).
+         */
+        migration_rp_kick(ms);
+
         return;
     }
 
-    if (mis->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+    if (migration_postcopy_is_alive(mis->state)) {
         ret = qemu_file_shutdown(mis->from_src_file);
         if (ret) {
             error_setg(errp, "Failed to pause destination migration");
@@ -1574,7 +1599,7 @@ void qmp_migrate_pause(Error **errp)
     }
 
     error_setg(errp, "migrate-pause is currently only supported "
-               "during postcopy-active state");
+               "during postcopy-active or postcopy-recover state");
 }
 
 bool migration_is_blocked(Error **errp)
@@ -1749,9 +1774,21 @@ void qmp_migrate_continue(MigrationStatus state, Error **errp)
     qemu_sem_post(&s->pause_sem);
 }
 
-void migration_rp_wait(MigrationState *s)
+int migration_rp_wait(MigrationState *s)
 {
+    /* If migration has failure already, ignore the wait */
+    if (migrate_has_error(s)) {
+        return -1;
+    }
+
     qemu_sem_wait(&s->rp_state.rp_sem);
+
+    /* After wait, double check that there's no failure */
+    if (migrate_has_error(s)) {
+        return -1;
+    }
+
+    return 0;
 }
 
 void migration_rp_kick(MigrationState *s)
@@ -2017,6 +2054,20 @@ out:
         trace_source_return_path_thread_bad_end();
     }
 
+    if (ms->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
+        /*
+         * This is extremely unlikely: we got yet another network issue
+         * during recovery of the 1st network failure.  During this
+         * period the main migration thread can be waiting on rp_sem for
+         * this thread to sync with the other side.
+         *
+         * When this happens, explicitly kick the migration thread out of
+         * RECOVER stage and back to PAUSED, so the admin can try
+         * everything again.
+         */
+        migration_rp_kick(ms);
+    }
+
     trace_source_return_path_thread_end();
     rcu_unregister_thread();
     return NULL;
@@ -2457,7 +2508,9 @@ static int postcopy_resume_handshake(MigrationState *s)
     qemu_savevm_send_postcopy_resume(s->to_dst_file);
 
     while (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) {
-        migration_rp_wait(s);
+        if (migration_rp_wait(s)) {
+            return -1;
+        }
     }
 
     if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
diff --git a/migration/ram.c b/migration/ram.c
index 43ba62be83..2565f53f5c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4143,7 +4143,9 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
 
     /* Wait until all the ramblocks' dirty bitmap synced */
     while (qatomic_read(&rs->postcopy_bmap_sync_requested)) {
-        migration_rp_wait(s);
+        if (migration_rp_wait(s)) {
+            return -1;
+        }
     }
 
     trace_ram_dirty_bitmap_sync_complete();
-- 
2.41.0




* [PATCH v3 09/10] migration: Allow RECOVER->PAUSED conversion for dest qemu
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (7 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 08/10] migration: Allow network to fail even during recovery Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05  8:24   ` Juan Quintela
  2023-10-04 22:02 ` [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER Peter Xu
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

There's a bug on the dest side: if a double fault is triggered on the dest
qemu (a network issue during postcopy-recover), we won't set PAUSED
correctly, because we assumed we always came from ACTIVE.

Fix that by always overwriting the state to PAUSED.

We could also check for these two states, but maybe that's overkill.  We
did the same on the src QEMU, which unconditionally switches to PAUSED
anyway.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/savevm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 60eec7c31f..497ce02bd7 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2734,7 +2734,8 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
         qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex);
     }
 
-    migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+    /* Current state can be either ACTIVE or RECOVER */
+    migrate_set_state(&mis->state, mis->state,
                       MIGRATION_STATUS_POSTCOPY_PAUSED);
 
     /* Notify the fault thread for the invalidated file handle */
-- 
2.41.0




* [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
                   ` (8 preceding siblings ...)
  2023-10-04 22:02 ` [PATCH v3 09/10] migration: Allow RECOVER->PAUSED conversion for dest qemu Peter Xu
@ 2023-10-04 22:02 ` Peter Xu
  2023-10-05 13:24   ` Fabiano Rosas
  9 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-04 22:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, Fabiano Rosas, Juan Quintela

From: Fabiano Rosas <farosas@suse.de>

To do so, create two paired sockets, but make them not provide real data.
Feed those fake sockets to the src/dst QEMUs for recovery, to let them go
into the RECOVER stage without coming out of it.  Then test that we can
always kick them out and recover again with the right ports.

This patch is based on Fabiano's version here:

https://lore.kernel.org/r/877cowmdu0.fsf@suse.de

Signed-off-by: Fabiano Rosas <farosas@suse.de>
[peterx: write commit message, remove case 1, fix bugs, and more]
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration-test.c | 94 ++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 46f1c275a2..fb7a3765e4 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -729,6 +729,7 @@ typedef struct {
     /* Postcopy specific fields */
     void *postcopy_data;
     bool postcopy_preempt;
+    bool postcopy_recovery_test_fail;
 } MigrateCommon;
 
 static int test_migrate_start(QTestState **from, QTestState **to,
@@ -1381,6 +1382,78 @@ static void test_postcopy_preempt_tls_psk(void)
 }
 #endif
 
+static void wait_for_postcopy_status(QTestState *one, const char *status)
+{
+    wait_for_migration_status(one, status,
+                              (const char * []) { "failed", "active",
+                                                  "completed", NULL });
+}
+
+static void postcopy_recover_fail(QTestState *from, QTestState *to)
+{
+    int ret, pair1[2], pair2[2];
+    char c;
+
+    /* Create two unrelated socketpairs */
+    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair1);
+    g_assert_cmpint(ret, ==, 0);
+
+    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair2);
+    g_assert_cmpint(ret, ==, 0);
+
+    /*
+     * Give the guests unpaired ends of the sockets, so they'll all block
+     * on reading.  This mimics a wrongly established channel.
+     */
+    qtest_qmp_fds_assert_success(from, &pair1[0], 1,
+                                 "{ 'execute': 'getfd',"
+                                 "  'arguments': { 'fdname': 'fd-mig' }}");
+    qtest_qmp_fds_assert_success(to, &pair2[0], 1,
+                                 "{ 'execute': 'getfd',"
+                                 "  'arguments': { 'fdname': 'fd-mig' }}");
+
+    /*
+     * Write the 1st byte as QEMU_VM_COMMAND (0x8) for the dest socket, to
+     * emulate the 1st byte of a real recovery, but stops from there to
+     * keep dest QEMU in RECOVER.  This is needed so that we can kick off
+     * the recover process on dest QEMU (by triggering the G_IO_IN event).
+     *
+     * NOTE: this trick is not needed on src QEMUs, because src doesn't
+     * rely on a pre-existing G_IO_IN event, so it will always trigger the
+     * upcoming recovery anyway even if it can read nothing.
+     */
+#define QEMU_VM_COMMAND              0x08
+    c = QEMU_VM_COMMAND;
+    ret = send(pair2[1], &c, 1, 0);
+    g_assert_cmpint(ret, ==, 1);
+
+    migrate_recover(to, "fd:fd-mig");
+    migrate_qmp(from, "fd:fd-mig", "{'resume': true}");
+
+    /*
+     * Make sure both QEMU instances will go into RECOVER stage, then test
+     * kicking them out using migrate-pause.
+     */
+    wait_for_postcopy_status(from, "postcopy-recover");
+    wait_for_postcopy_status(to, "postcopy-recover");
+
+    /*
+     * This would be issued by the admin upon noticing the hang; we should
+     * make sure we're able to kick this out.
+     */
+    migrate_pause(from);
+    wait_for_postcopy_status(from, "postcopy-paused");
+
+    /* Do the same test on dest */
+    migrate_pause(to);
+    wait_for_postcopy_status(to, "postcopy-paused");
+
+    close(pair1[0]);
+    close(pair1[1]);
+    close(pair2[0]);
+    close(pair2[1]);
+}
+
 static void test_postcopy_recovery_common(MigrateCommon *args)
 {
     QTestState *from, *to;
@@ -1420,6 +1493,15 @@ static void test_postcopy_recovery_common(MigrateCommon *args)
                               (const char * []) { "failed", "active",
                                                   "completed", NULL });
 
+    if (args->postcopy_recovery_test_fail) {
+        /*
+         * Test when a wrong socket is specified for recovery, and then the
+         * ability to kick it out and continue with a correct socket.
+         */
+        postcopy_recover_fail(from, to);
+        /* continue with a good recovery */
+    }
+
     /*
      * Create a new socket to emulate a new channel that is different
      * from the broken migration channel; tell the destination to
@@ -1459,6 +1541,15 @@ static void test_postcopy_recovery_compress(void)
     test_postcopy_recovery_common(&args);
 }
 
+static void test_postcopy_recovery_double_fail(void)
+{
+    MigrateCommon args = {
+        .postcopy_recovery_test_fail = true,
+    };
+
+    test_postcopy_recovery_common(&args);
+}
+
 #ifdef CONFIG_GNUTLS
 static void test_postcopy_recovery_tls_psk(void)
 {
@@ -2841,6 +2932,9 @@ int main(int argc, char **argv)
             qtest_add_func("/migration/postcopy/recovery/compress/plain",
                            test_postcopy_recovery_compress);
         }
+        qtest_add_func("/migration/postcopy/recovery/double-failures",
+                       test_postcopy_recovery_double_fail);
+
     }
 
     qtest_add_func("/migration/bad_dest", test_baddest);
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-04 22:02 ` [PATCH v3 03/10] migration: Refactor error handling in source return path Peter Xu
@ 2023-10-05  6:11   ` Philippe Mathieu-Daudé
  2023-10-05 16:05     ` Peter Xu
  2023-10-05  8:22   ` Juan Quintela
  2023-10-05 12:57   ` Fabiano Rosas
  2 siblings, 1 reply; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-10-05  6:11 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: Fabiano Rosas, Juan Quintela, Markus Armbruster

Hi Peter,

On 5/10/23 00:02, Peter Xu wrote:
> rp_state.error was a boolean used to show an error happened in the return
> path thread.  That's not only duplicating error reporting (migrate_set_error),
> but also not good enough in that we only do error_report() and set it to
> true; we can never keep a history of the exact error and show it in
> query-migrate.
> 
> To make this better, a few things are done:
> 
>    - Use error_setg() rather than error_report() across the whole lifecycle
>      of the return path thread, keeping the error in an Error*.
> 
>    - Use migrate_set_error() to apply that captured error to the global
>      migration object when an error occurred in this thread.
> 
>    - With the above, there is no need for mark_source_rp_bad(); remove it,
>      along with rp_state.error itself.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   migration/migration.h  |   1 -
>   migration/ram.h        |   5 +-
>   migration/migration.c  | 123 ++++++++++++++++++-----------------------
>   migration/ram.c        |  41 +++++++-------
>   migration/trace-events |   4 +-
>   5 files changed, 79 insertions(+), 95 deletions(-)


> -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
> +int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
>   {
>       int ret = -EINVAL;
>       /* from_dst_file is always valid because we're within rp_thread */


> @@ -4193,16 +4194,16 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
>   
>       ret = qemu_file_get_error(file);
>       if (ret || size != local_size) {
> -        error_report("%s: read bitmap failed for ramblock '%s': %d"
> -                     " (size 0x%"PRIx64", got: 0x%"PRIx64")",
> -                     __func__, block->idstr, ret, local_size, size);
> +        error_setg(errp, "read bitmap failed for ramblock '%s': %d"
> +                   " (size 0x%"PRIx64", got: 0x%"PRIx64")",
> +                   block->idstr, ret, local_size, size);
>           ret = -EIO;
>           goto out;
>       }
>   
>       if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
> -        error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIx64,
> -                     __func__, block->idstr, end_mark);
> +        error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
> +                   block->idstr, end_mark);
>           ret = -EINVAL;
>           goto out;
>       }

This function returns -EIO/-EINVAL errors, propagated to its 2 callers
  - migrate_handle_rp_recv_bitmap()
  - migrate_handle_rp_resume_ack()
which are only used in source_return_path_thread() where the return
value is only checked as boolean.

Could we simplify them by returning a boolean (which is the pattern for
functions taking an Error** as last parameter)?
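
i.e. roughly this shape (just a sketch of the convention, untested; the
names are the ones from this patch):

    static bool migrate_handle_rp_resume_ack(MigrationState *s,
                                             uint32_t value, Error **errp)
    {
        if (value != MIGRATION_RESUME_ACK_VALUE) {
            error_setg(errp, "illegal resume_ack value %"PRIu32, value);
            return false;
        }
        /* ... */
        return true;
    }

so the caller in source_return_path_thread() can simply do:

        if (!migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
            goto out;
        }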

Regards,

Phil.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 01/10] migration: Display error in query-migrate irrelevant of status
  2023-10-04 22:02 ` [PATCH v3 01/10] migration: Display error in query-migrate irrelevant of status Peter Xu
@ 2023-10-05  7:28   ` Juan Quintela
  0 siblings, 0 replies; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  7:28 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> Display it as long as being set, irrelevant of FAILED status.  E.g., it may
> also be applicable to PAUSED stage of postcopy, to provide hint on what has
> gone wrong.
>
> The error_mutex seems to be overlooked when referencing the error, add it
> to be very safe.
>
> This will change QAPI behavior by showing up error message outside !FAILED
> status, but it's intended and doesn't expect to break anyone.
>
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

queued.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 02/10] migration: Introduce migrate_has_error()
  2023-10-04 22:02 ` [PATCH v3 02/10] migration: Introduce migrate_has_error() Peter Xu
@ 2023-10-05  7:30   ` Juan Quintela
  0 siblings, 0 replies; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  7:30 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> Introduce a helper to detect whether MigrationState.error is set for
> whatever reason.
>
> This is preparation work for any thread (e.g. the source return path
> thread) to set up errors in a unified way in MigrationState, rather than
> relying on its own way to set errors (mark_source_rp_bad()).
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>
queued



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 04/10] migration: Deliver return path file error to migrate state too
  2023-10-04 22:02 ` [PATCH v3 04/10] migration: Deliver return path file error to migrate state too Peter Xu
@ 2023-10-05  7:32   ` Juan Quintela
  0 siblings, 0 replies; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  7:32 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> We've already done this for most of the return path thread errors, but not
> yet for the IO errors that happened on the return path qemufile.  Do that too.
>
> Re-export qemu_file_get_error_obj().
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 05/10] qemufile: Always return a verbose error
  2023-10-04 22:02 ` [PATCH v3 05/10] qemufile: Always return a verbose error Peter Xu
@ 2023-10-05  7:42   ` Juan Quintela
  0 siblings, 0 replies; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  7:42 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> There are a lot of cases where we only have an errno set in last_error but
> no detailed error description.  When this happens, try to generate an
> error containing the errno as a descriptive error.
>
> This will be helpful in cases where one relies on the Error*.  E.g.,
> migration state only caches Error* in MigrationState.error.  With this,
> we'll display correct error messages in e.g. query-migrate when the error
> was only set by qemu_file_set_error().
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

I had to adjust it due to the static missing from the previous patch.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 06/10] migration: Remember num of ramblocks to sync during recovery
  2023-10-04 22:02 ` [PATCH v3 06/10] migration: Remember num of ramblocks to sync during recovery Peter Xu
@ 2023-10-05  7:43   ` Juan Quintela
  0 siblings, 0 replies; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  7:43 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> Instead of only relying on the count of rp_sem, make the counter part of
> RAMState so it can be used in both threads to synchronize on the process.
>
> rp_sem will be further reused in follow up patches, as a way to kick the
> main thread, e.g., on recovery failures.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

queued.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 07/10] migration: Add migration_rp_wait|kick()
  2023-10-04 22:02 ` [PATCH v3 07/10] migration: Add migration_rp_wait|kick() Peter Xu
@ 2023-10-05  7:49   ` Juan Quintela
  2023-10-05 20:47     ` Peter Xu
  0 siblings, 1 reply; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  7:49 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> It's just a simple wrapper for rp_sem on either wait() or kick(), making it
> even clearer how it is used.  It is prepared to be used for other things too.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

I agree with the idea, but I think that the problem is the name of the
semaphore.


> +void migration_rp_wait(MigrationState *s)
> +{
> +    qemu_sem_wait(&s->rp_state.rp_sem);

I am not sure if it would be better to have the wrappers or to just rename
the semaphore.

If we rename the semaphore to something like return_path_ready, this becomes:

    qemu_sem_wait(&s->rp_state.return_path_ready);

    qemu_sem_post(&s->rp_state.return_path_ready);

Or something similar?



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-04 22:02 ` [PATCH v3 03/10] migration: Refactor error handling in source return path Peter Xu
  2023-10-05  6:11   ` Philippe Mathieu-Daudé
@ 2023-10-05  8:22   ` Juan Quintela
  2023-10-05 19:35     ` Peter Xu
  2023-10-05 12:57   ` Fabiano Rosas
  2 siblings, 1 reply; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  8:22 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> rp_state.error was a boolean used to show an error happened in the return
> path thread.  That's not only duplicating error reporting (migrate_set_error),
> but also not good enough in that we only do error_report() and set it to
> true; we can never keep a history of the exact error and show it in
> query-migrate.
>
> To make this better, a few things are done:
>
>   - Use error_setg() rather than error_report() across the whole lifecycle
>     of the return path thread, keeping the error in an Error*.

Good.

>   - Use migrate_set_error() to apply that captured error to the global
>     migration object when an error occurred in this thread.

Good.

>   - With the above, there is no need for mark_source_rp_bad(); remove it,
>     along with rp_state.error itself.

Good.

>  uint64_t ram_pagesize_summary(void);
> -int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
> +int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len,
> +                         Error **errp);


good.

> @@ -1793,37 +1782,36 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
>       */
>      if (!QEMU_IS_ALIGNED(start, our_host_ps) ||
>          !QEMU_IS_ALIGNED(len, our_host_ps)) {
> -        error_report("%s: Misaligned page request, start: " RAM_ADDR_FMT
> -                     " len: %zd", __func__, start, len);
> -        mark_source_rp_bad(ms);
> +        error_setg(errp, "MIG_RP_MSG_REQ_PAGES: Misaligned page request, start:"
> +                   RAM_ADDR_FMT " len: %zd", start, len);
>          return;
>      }
>  
> -    if (ram_save_queue_pages(rbname, start, len)) {
> -        mark_source_rp_bad(ms);
> -    }
> +    ram_save_queue_pages(rbname, start, len, errp);

ram_save_queue_pages() returns an int.
I think this function should return an int.

Next is independent of this patch:

> -static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name)
> +static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
> +                                         Error **errp)
>  {
>      RAMBlock *block = qemu_ram_block_by_name(block_name);
>  
>      if (!block) {
> -        error_report("%s: invalid block name '%s'", __func__, block_name);
> +        error_setg(errp, "MIG_RP_MSG_RECV_BITMAP has invalid block name '%s'",
> +                   block_name);
>          return -EINVAL;

We return -EINVAL.

>      }
>  
>      /* Fetch the received bitmap and refresh the dirty bitmap */
> -    return ram_dirty_bitmap_reload(s, block);
> +    return ram_dirty_bitmap_reload(s, block, errp);
>  }
>  
> -static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
> +static int migrate_handle_rp_resume_ack(MigrationState *s,
> +                                        uint32_t value, Error **errp)
>  {
>      trace_source_return_path_thread_resume_ack(value);
>  
>      if (value != MIGRATION_RESUME_ACK_VALUE) {
> -        error_report("%s: illegal resume_ack value %"PRIu32,
> -                     __func__, value);
> +        error_setg(errp, "illegal resume_ack value %"PRIu32, value);
>          return -1;

And here -1.

In both callers we just check whether it is different from zero.  We never
use the return value as an errno, so I think we should move to -1; if there
is an error, that is what errp is for.
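
For instance (sketch only), in migrate_handle_rp_recv_bitmap():

    if (!block) {
        error_setg(errp, "MIG_RP_MSG_RECV_BITMAP has invalid block name '%s'",
                   block_name);
        return -1;   /* plain failure; the details live in errp */
    }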


> -/* Returns 0 if the RP was ok, otherwise there was an error on the RP */
> -static int await_return_path_close_on_source(MigrationState *ms)
> +static void await_return_path_close_on_source(MigrationState *ms)
>  {
> -    int ret;
> -
>      if (!ms->rp_state.rp_thread_created) {
> -        return 0;
> +        return;
>      }
>  
>      trace_migration_return_path_end_before();
> @@ -2060,18 +2050,10 @@ static int await_return_path_close_on_source(MigrationState *ms)
>          }
>      }
>  
> -    trace_await_return_path_close_on_source_joining();
>      qemu_thread_join(&ms->rp_state.rp_thread);
>      ms->rp_state.rp_thread_created = false;
> -    trace_await_return_path_close_on_source_close();
> -
> -    ret = ms->rp_state.error;
> -    ms->rp_state.error = false;
> -
>      migration_release_dst_files(ms);
> -
> -    trace_migration_return_path_end_after(ret);
> -    return ret;
> +    trace_migration_return_path_end_after();
>  }
>  
>  static inline void
> @@ -2367,7 +2349,10 @@ static void migration_completion(MigrationState *s)
>          goto fail;
>      }
>  
> -    if (await_return_path_close_on_source(s)) {
> +    await_return_path_close_on_source(s);
> +
> +    /* If return path has error, should have been set here */
> +    if (migrate_has_error(s)) {
>          goto fail;
>      }

In general, I think this is bad.  We are moving from

int foo(..)
{

}

....

if (foo()) {
     goto fail;
}

to:

void foo(..)
{

}

....

foo();

if (bar()) {
     goto fail;
}

I would prefer to move the other way around: handle the error
synchronously.  My plan is that at some point in time
qemu_file_get_error() disappears, i.e. we return the error when we
receive it and we handle it synchronously.

And yes, that is something that will take a lot of time, but I hope we
move in that direction, rather than trusting more internal error setting,
using void functions and then checking with yet another function.
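
Concretely, something like this (only a sketch; the errp parameter for
await_return_path_close_on_source() is hypothetical here):

    Error *err = NULL;

    if (!await_return_path_close_on_source(s, &err)) {
        migrate_set_error(s, err);
        error_free(err);
        goto fail;
    }

so the caller consumes the error right where it is produced, instead of
fishing it back later with migrate_has_error().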


On top of your changes:

> -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
> +int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
>  {
>      int ret = -EINVAL;
>      /* from_dst_file is always valid because we're within rp_thread */
> @@ -4163,8 +4165,8 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
>      trace_ram_dirty_bitmap_reload_begin(block->idstr);
>  
>      if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
> -        error_report("%s: incorrect state %s", __func__,
> -                     MigrationStatus_str(s->state));
> +        error_setg(errp, "Reload bitmap in incorrect state %s",
> +                   MigrationStatus_str(s->state));
>          return -EINVAL;

return -1

same for the rest of the cases. Callers only check for != 0, and if you
want details, you need to look at errp.

See the nice series for migration/rdma.c for why this is better (and
more consistent).

Rest of the patch is very nice.

Thanks, Juan.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 09/10] migration: Allow RECOVER->PAUSED convertion for dest qemu
  2023-10-04 22:02 ` [PATCH v3 09/10] migration: Allow RECOVER->PAUSED convertion for dest qemu Peter Xu
@ 2023-10-05  8:24   ` Juan Quintela
  0 siblings, 0 replies; 35+ messages in thread
From: Juan Quintela @ 2023-10-05  8:24 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas

Peter Xu <peterx@redhat.com> wrote:
> There's a bug on dest where, if a double fault is triggered on dest qemu (a
> network issue during postcopy-recover), we won't set PAUSED correctly
> because we assumed we always came from ACTIVE.
>
> Fix that by always overwriting the state to PAUSED.
>
> We could also check for these two states, but maybe it's overkill.  We
> did the same on the src QEMU to unconditionally switch to PAUSED anyway.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

queued



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-04 22:02 ` [PATCH v3 03/10] migration: Refactor error handling in source return path Peter Xu
  2023-10-05  6:11   ` Philippe Mathieu-Daudé
  2023-10-05  8:22   ` Juan Quintela
@ 2023-10-05 12:57   ` Fabiano Rosas
  2023-10-05 19:35     ` Peter Xu
  2 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-05 12:57 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: peterx, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> @@ -1882,48 +1870,46 @@ static void *source_return_path_thread(void *opaque)
>      uint32_t tmp32, sibling_error;
>      ram_addr_t start = 0; /* =0 to silence warning */
>      size_t  len = 0, expected_len;
> +    Error *err = NULL;
>      int res;
>  
>      trace_source_return_path_thread_entry();
>      rcu_register_thread();
>  
> -    while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
> +    while (!migrate_has_error(ms) && !qemu_file_get_error(rp) &&
>             migration_is_setup_or_active(ms->state)) {
>          trace_source_return_path_thread_loop_top();
> +
>          header_type = qemu_get_be16(rp);
>          header_len = qemu_get_be16(rp);
>  
>          if (qemu_file_get_error(rp)) {
> -            mark_source_rp_bad(ms);
>              goto out;
>          }

This error will be lost because outside the loop we only check for err.

>  
>          if (header_type >= MIG_RP_MSG_MAX ||
>              header_type == MIG_RP_MSG_INVALID) {
> -            error_report("RP: Received invalid message 0x%04x length 0x%04x",
> -                         header_type, header_len);
> -            mark_source_rp_bad(ms);
> +            error_setg(&err, "Received invalid message 0x%04x length 0x%04x",
> +                       header_type, header_len);
>              goto out;
>          }
>  
>          if ((rp_cmd_args[header_type].len != -1 &&
>              header_len != rp_cmd_args[header_type].len) ||
>              header_len > sizeof(buf)) {
> -            error_report("RP: Received '%s' message (0x%04x) with"
> -                         "incorrect length %d expecting %zu",
> -                         rp_cmd_args[header_type].name, header_type, header_len,
> -                         (size_t)rp_cmd_args[header_type].len);
> -            mark_source_rp_bad(ms);
> +            error_setg(&err, "Received '%s' message (0x%04x) with"
> +                       "incorrect length %d expecting %zu",
> +                       rp_cmd_args[header_type].name, header_type, header_len,
> +                       (size_t)rp_cmd_args[header_type].len);
>              goto out;
>          }
>  
>          /* We know we've got a valid header by this point */
>          res = qemu_get_buffer(rp, buf, header_len);
>          if (res != header_len) {
> -            error_report("RP: Failed reading data for message 0x%04x"
> -                         " read %d expected %d",
> -                         header_type, res, header_len);
> -            mark_source_rp_bad(ms);
> +            error_setg(&err, "Failed reading data for message 0x%04x"
> +                       " read %d expected %d",
> +                       header_type, res, header_len);
>              goto out;
>          }
>  
> @@ -1933,8 +1919,7 @@ static void *source_return_path_thread(void *opaque)
>              sibling_error = ldl_be_p(buf);
>              trace_source_return_path_thread_shut(sibling_error);
>              if (sibling_error) {
> -                error_report("RP: Sibling indicated error %d", sibling_error);
> -                mark_source_rp_bad(ms);
> +                error_setg(&err, "Sibling indicated error %d", sibling_error);
>              }
>              /*
>               * We'll let the main thread deal with closing the RP
> @@ -1952,7 +1937,10 @@ static void *source_return_path_thread(void *opaque)
>          case MIG_RP_MSG_REQ_PAGES:
>              start = ldq_be_p(buf);
>              len = ldl_be_p(buf + 8);
> -            migrate_handle_rp_req_pages(ms, NULL, start, len);
> +            migrate_handle_rp_req_pages(ms, NULL, start, len, &err);
> +            if (err) {
> +                goto out;
> +            }
>              break;
>  
>          case MIG_RP_MSG_REQ_PAGES_ID:
> @@ -1967,32 +1955,32 @@ static void *source_return_path_thread(void *opaque)
>                  expected_len += tmp32;
>              }
>              if (header_len != expected_len) {
> -                error_report("RP: Req_Page_id with length %d expecting %zd",
> -                             header_len, expected_len);
> -                mark_source_rp_bad(ms);
> +                error_setg(&err, "Req_Page_id with length %d expecting %zd",
> +                           header_len, expected_len);
> +                goto out;
> +            }
> +            migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len,
> +                                        &err);
> +            if (err) {
>                  goto out;
>              }
> -            migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len);
>              break;
>  
>          case MIG_RP_MSG_RECV_BITMAP:
>              if (header_len < 1) {
> -                error_report("%s: missing block name", __func__);
> -                mark_source_rp_bad(ms);
> +                error_setg(&err, "MIG_RP_MSG_RECV_BITMAP missing block name");
>                  goto out;
>              }
>              /* Format: len (1B) + idstr (<255B). This ends the idstr. */
>              buf[buf[0] + 1] = '\0';
> -            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1))) {
> -                mark_source_rp_bad(ms);
> +            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1), &err)) {
>                  goto out;
>              }
>              break;
>  
>          case MIG_RP_MSG_RESUME_ACK:
>              tmp32 = ldl_be_p(buf);
> -            if (migrate_handle_rp_resume_ack(ms, tmp32)) {
> -                mark_source_rp_bad(ms);
> +            if (migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
>                  goto out;
>              }
>              break;
> @@ -2008,9 +1996,14 @@ static void *source_return_path_thread(void *opaque)
>      }
>  
>  out:
> -    if (qemu_file_get_error(rp)) {
> +    if (err) {

Need to keep both checks here (see the sketch at the end of this mail).

> +        /*
> +         * Collect any error in return-path thread and report it to the
> +         * migration state object.
> +         */
> +        migrate_set_error(ms, err);
> +        error_free(err);
>          trace_source_return_path_thread_bad_end();
> -        mark_source_rp_bad(ms);
>      }
>  
>      trace_source_return_path_thread_end();
> @@ -2036,13 +2029,10 @@ static int open_return_path_on_source(MigrationState *ms)
>      return 0;
>  }


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-04 22:02 ` [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER Peter Xu
@ 2023-10-05 13:24   ` Fabiano Rosas
  2023-10-05 13:37     ` Fabiano Rosas
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-05 13:24 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: peterx, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> From: Fabiano Rosas <farosas@suse.de>
>
> To do so, create two paired sockets, but make them provide no real data.
> Feed those fake sockets to the src/dst QEMUs for recovery, to let them go
> into the RECOVER stage without coming out.  Test that we can always kick
> them out and recover again with the right ports.
>
> This patch is based on Fabiano's version here:
>
> https://lore.kernel.org/r/877cowmdu0.fsf@suse.de
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> [peterx: write commit message, remove case 1, fix bugs, and more]
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  tests/qtest/migration-test.c | 94 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 94 insertions(+)
>
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 46f1c275a2..fb7a3765e4 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -729,6 +729,7 @@ typedef struct {
>      /* Postcopy specific fields */
>      void *postcopy_data;
>      bool postcopy_preempt;
> +    bool postcopy_recovery_test_fail;
>  } MigrateCommon;
>  
>  static int test_migrate_start(QTestState **from, QTestState **to,
> @@ -1381,6 +1382,78 @@ static void test_postcopy_preempt_tls_psk(void)
>  }
>  #endif
>  
> +static void wait_for_postcopy_status(QTestState *one, const char *status)
> +{
> +    wait_for_migration_status(one, status,
> +                              (const char * []) { "failed", "active",
> +                                                  "completed", NULL });
> +}
> +
> +static void postcopy_recover_fail(QTestState *from, QTestState *to)
> +{
> +    int ret, pair1[2], pair2[2];
> +    char c;
> +
> +    /* Create two unrelated socketpairs */
> +    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair1);
> +    g_assert_cmpint(ret, ==, 0);
> +
> +    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair2);
> +    g_assert_cmpint(ret, ==, 0);
> +
> +    /*
> +     * Give the guests unpaired ends of the sockets, so they'll all block
> +     * at reading.  This mimics a wrong channel being established.
> +     */
> +    qtest_qmp_fds_assert_success(from, &pair1[0], 1,
> +                                 "{ 'execute': 'getfd',"
> +                                 "  'arguments': { 'fdname': 'fd-mig' }}");
> +    qtest_qmp_fds_assert_success(to, &pair2[0], 1,
> +                                 "{ 'execute': 'getfd',"
> +                                 "  'arguments': { 'fdname': 'fd-mig' }}");
> +
> +    /*
> +     * Write the 1st byte as QEMU_VM_COMMAND (0x8) for the dest socket, to
> +     * emulate the 1st byte of a real recovery, but stop there to
> +     * keep dest QEMU in RECOVER.  This is needed so that we can kick off
> +     * the recovery process on dest QEMU (by triggering the G_IO_IN event).
> +     *
> +     * NOTE: this trick is not needed on the src QEMU, because src doesn't
> +     * rely on a pre-existing G_IO_IN event, so it will always trigger the
> +     * upcoming recovery anyway even if it can read nothing.
> +     */
> +#define QEMU_VM_COMMAND              0x08
> +    c = QEMU_VM_COMMAND;
> +    ret = send(pair2[1], &c, 1, 0);
> +    g_assert_cmpint(ret, ==, 1);
> +
> +    migrate_recover(to, "fd:fd-mig");
> +    migrate_qmp(from, "fd:fd-mig", "{'resume': true}");
> +
> +    /*
> +     * Make sure both QEMU instances will go into RECOVER stage, then test
> +     * kicking them out using migrate-pause.
> +     */
> +    wait_for_postcopy_status(from, "postcopy-recover");
> +    wait_for_postcopy_status(to, "postcopy-recover");

Is this wait out of place? I think we're trying to resume too fast after
migrate_recover():

# {
#     "error": {
#         "class": "GenericError",
#         "desc": "Cannot resume if there is no paused migration"
#     }
# }

> +
> +    /*
> +     * This would be issued by the admin upon noticing the hang; we should
> +     * make sure we're able to kick this out.
> +     */
> +    migrate_pause(from);
> +    wait_for_postcopy_status(from, "postcopy-paused");
> +
> +    /* Do the same test on dest */
> +    migrate_pause(to);
> +    wait_for_postcopy_status(to, "postcopy-paused");
> +
> +    close(pair1[0]);
> +    close(pair1[1]);
> +    close(pair2[0]);
> +    close(pair2[1]);
> +}
> +
>  static void test_postcopy_recovery_common(MigrateCommon *args)
>  {
>      QTestState *from, *to;
> @@ -1420,6 +1493,15 @@ static void test_postcopy_recovery_common(MigrateCommon *args)
>                                (const char * []) { "failed", "active",
>                                                    "completed", NULL });
>  
> +    if (args->postcopy_recovery_test_fail) {
> +        /*
> +         * Test when a wrong socket is specified for recovery, and then the
> +         * ability to kick it out and continue with a correct socket.
> +         */
> +        postcopy_recover_fail(from, to);
> +        /* continue with a good recovery */
> +    }
> +
>      /*
>       * Create a new socket to emulate a new channel that is different
>       * from the broken migration channel; tell the destination to
> @@ -1459,6 +1541,15 @@ static void test_postcopy_recovery_compress(void)
>      test_postcopy_recovery_common(&args);
>  }
>  
> +static void test_postcopy_recovery_double_fail(void)
> +{
> +    MigrateCommon args = {
> +        .postcopy_recovery_test_fail = true,
> +    };
> +
> +    test_postcopy_recovery_common(&args);
> +}
> +
>  #ifdef CONFIG_GNUTLS
>  static void test_postcopy_recovery_tls_psk(void)
>  {
> @@ -2841,6 +2932,9 @@ int main(int argc, char **argv)
>              qtest_add_func("/migration/postcopy/recovery/compress/plain",
>                             test_postcopy_recovery_compress);
>          }
> +        qtest_add_func("/migration/postcopy/recovery/double-failures",
> +                       test_postcopy_recovery_double_fail);
> +
>      }
>  
>      qtest_add_func("/migration/bad_dest", test_baddest);


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 08/10] migration: Allow network to fail even during recovery
  2023-10-04 22:02 ` [PATCH v3 08/10] migration: Allow network to fail even during recovery Peter Xu
@ 2023-10-05 13:25   ` Fabiano Rosas
  0 siblings, 0 replies; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-05 13:25 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: peterx, Juan Quintela, Xiaohui Li

Peter Xu <peterx@redhat.com> writes:

> Normally the postcopy recover phase should only exist for a super short
> period, i.e. the duration when QEMU is trying to recover from an
> interrupted postcopy migration, during which a handshake is carried out
> to continue the procedure with state changes from PAUSED -> RECOVER ->
> POSTCOPY_ACTIVE again.
>
> Here the RECOVER phase should be super small; it happens right after the
> admin specifies a new but working network link for QEMU to reconnect to
> the dest QEMU.
>
> However there can still be cases where the channel is broken in this small
> RECOVER window.
>
> If it happens, with the current code there's no way the src QEMU can get
> kicked out of the RECOVER stage, and no way either to retry the recovery
> on another channel once established.
>
> This patch allows the RECOVER phase to fail itself too - we're mostly
> ready, just some small things missing, e.g. properly kicking the main
> migration thread out when it is sleeping on rp_sem and we find that we're
> at the RECOVER stage.  When this happens, the RECOVER itself fails and
> rolls back to the PAUSED stage.  Then the user can retry another round of
> recovery.
>
> To make it even stronger, teach the QMP command migrate-pause to explicitly
> kick src/dst QEMU out when needed, so even if for some reason the migration
> thread didn't get kicked out already by a failing return-path thread, the
> admin can also kick it out.
>
> This will be a super, super corner case, but still try to cover that.
>
> One can try to test this with two proxy channels for migration:
>
>   (a) socat unix-listen:/tmp/src.sock,reuseaddr,fork tcp:localhost:10000
>   (b) socat tcp-listen:10000,reuseaddr,fork unix:/tmp/dst.sock
>
> So the migration channel will be:
>
>                       (a)          (b)
>   src -> /tmp/src.sock -> tcp:10000 -> /tmp/dst.sock -> dst
>
> Then to make QEMU hang at RECOVER stage, one can do below:
>
>   (1) stop the postcopy using QMP command postcopy-pause
>   (2) kill the 2nd proxy (b)
>   (3) try to recover the postcopy using /tmp/src.sock on src
>   (4) src QEMU will go into RECOVER stage but won't be able to continue
>       from there, because the channel is actually broken at (b)
>
> Before this patch, step (4) will leave the src QEMU stuck in the RECOVER
> stage, without a way to kick the QEMU out or continue the postcopy again.
> After this patch, (4) will quickly fail QEMU and bounce back to the PAUSED
> stage.
>
> The admin can also kick QEMU from (4) into PAUSED using migrate-pause
> when needed.
>
> After bouncing back to PAUSED stage, one can recover again.
>
> Reported-by: Xiaohui Li <xiaohli@redhat.com>
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-05 13:24   ` Fabiano Rosas
@ 2023-10-05 13:37     ` Fabiano Rosas
  2023-10-05 20:55       ` Peter Xu
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-05 13:37 UTC (permalink / raw)
  To: Peter Xu, qemu-devel; +Cc: peterx, Juan Quintela

Fabiano Rosas <farosas@suse.de> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> From: Fabiano Rosas <farosas@suse.de>
>>
>> To do so, create two paired sockets, but make them provide no real data.
>> Feed those fake sockets to the src/dst QEMUs for recovery, to let them go
>> into the RECOVER stage without coming out.  Test that we can always kick
>> them out and recover again with the right ports.
>>
>> This patch is based on Fabiano's version here:
>>
>> https://lore.kernel.org/r/877cowmdu0.fsf@suse.de
>>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> [peterx: write commit message, remove case 1, fix bugs, and more]
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>  tests/qtest/migration-test.c | 94 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 94 insertions(+)
>>
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 46f1c275a2..fb7a3765e4 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -729,6 +729,7 @@ typedef struct {
>>      /* Postcopy specific fields */
>>      void *postcopy_data;
>>      bool postcopy_preempt;
>> +    bool postcopy_recovery_test_fail;
>>  } MigrateCommon;
>>  
>>  static int test_migrate_start(QTestState **from, QTestState **to,
>> @@ -1381,6 +1382,78 @@ static void test_postcopy_preempt_tls_psk(void)
>>  }
>>  #endif
>>  
>> +static void wait_for_postcopy_status(QTestState *one, const char *status)
>> +{
>> +    wait_for_migration_status(one, status,
>> +                              (const char * []) { "failed", "active",
>> +                                                  "completed", NULL });
>> +}
>> +
>> +static void postcopy_recover_fail(QTestState *from, QTestState *to)
>> +{
>> +    int ret, pair1[2], pair2[2];
>> +    char c;
>> +
>> +    /* Create two unrelated socketpairs */
>> +    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair1);
>> +    g_assert_cmpint(ret, ==, 0);
>> +
>> +    ret = qemu_socketpair(PF_LOCAL, SOCK_STREAM, 0, pair2);
>> +    g_assert_cmpint(ret, ==, 0);
>> +
>> +    /*
>> +     * Give the guests unpaired ends of the sockets, so they'll all block
>> +     * at reading.  This mimics a wrong channel being established.
>> +     */
>> +    qtest_qmp_fds_assert_success(from, &pair1[0], 1,
>> +                                 "{ 'execute': 'getfd',"
>> +                                 "  'arguments': { 'fdname': 'fd-mig' }}");
>> +    qtest_qmp_fds_assert_success(to, &pair2[0], 1,
>> +                                 "{ 'execute': 'getfd',"
>> +                                 "  'arguments': { 'fdname': 'fd-mig' }}");
>> +
>> +    /*
>> +     * Write the 1st byte as QEMU_VM_COMMAND (0x8) for the dest socket, to
>> +     * emulate the 1st byte of a real recovery, but stop there to
>> +     * keep dest QEMU in RECOVER.  This is needed so that we can kick off
>> +     * the recovery process on dest QEMU (by triggering the G_IO_IN event).
>> +     *
>> +     * NOTE: this trick is not needed on the src QEMU, because src doesn't
>> +     * rely on a pre-existing G_IO_IN event, so it will always trigger the
>> +     * upcoming recovery anyway even if it can read nothing.
>> +     */
>> +#define QEMU_VM_COMMAND              0x08
>> +    c = QEMU_VM_COMMAND;
>> +    ret = send(pair2[1], &c, 1, 0);
>> +    g_assert_cmpint(ret, ==, 1);
>> +
>> +    migrate_recover(to, "fd:fd-mig");
>> +    migrate_qmp(from, "fd:fd-mig", "{'resume': true}");
>> +
>> +    /*
>> +     * Make sure both QEMU instances will go into RECOVER stage, then test
>> +     * kicking them out using migrate-pause.
>> +     */
>> +    wait_for_postcopy_status(from, "postcopy-recover");
>> +    wait_for_postcopy_status(to, "postcopy-recover");
>
> Is this wait out of place? I think we're trying to resume too fast after
> migrate_recover():
>
> # {
> #     "error": {
> #         "class": "GenericError",
> #         "desc": "Cannot resume if there is no paused migration"
> #     }
> # }
>

Ugh, sorry about the long lines:

{
    "error": {
        "class": "GenericError",
        "desc": "Cannot resume if there is no paused migration"
    }
}



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-05  6:11   ` Philippe Mathieu-Daudé
@ 2023-10-05 16:05     ` Peter Xu
  2023-10-08 11:39       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-05 16:05 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: qemu-devel, Fabiano Rosas, Juan Quintela, Markus Armbruster

On Thu, Oct 05, 2023 at 08:11:33AM +0200, Philippe Mathieu-Daudé wrote:
> Hi Peter,
> 
> On 5/10/23 00:02, Peter Xu wrote:
> > rp_state.error was a boolean used to show an error happened in the return
> > path thread.  That's not only duplicating error reporting (migrate_set_error),
> > but also not good enough in that we only do error_report() and set it to
> > true; we can never keep a history of the exact error and show it in
> > query-migrate.
> > 
> > To make this better, a few things are done:
> > 
> >    - Use error_setg() rather than error_report() across the whole lifecycle
> >      of the return path thread, keeping the error in an Error*.
> > 
> >    - Use migrate_set_error() to apply that captured error to the global
> >      migration object when an error occurred in this thread.
> > 
> >    - With the above, there is no need for mark_source_rp_bad(); remove it,
> >      along with rp_state.error itself.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   migration/migration.h  |   1 -
> >   migration/ram.h        |   5 +-
> >   migration/migration.c  | 123 ++++++++++++++++++-----------------------
> >   migration/ram.c        |  41 +++++++-------
> >   migration/trace-events |   4 +-
> >   5 files changed, 79 insertions(+), 95 deletions(-)
> 
> 
> > -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
> > +int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
> >   {
> >       int ret = -EINVAL;
> >       /* from_dst_file is always valid because we're within rp_thread */
> 
> 
> > @@ -4193,16 +4194,16 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
> >       ret = qemu_file_get_error(file);
> >       if (ret || size != local_size) {
> > -        error_report("%s: read bitmap failed for ramblock '%s': %d"
> > -                     " (size 0x%"PRIx64", got: 0x%"PRIx64")",
> > -                     __func__, block->idstr, ret, local_size, size);
> > +        error_setg(errp, "read bitmap failed for ramblock '%s': %d"
> > +                   " (size 0x%"PRIx64", got: 0x%"PRIx64")",
> > +                   block->idstr, ret, local_size, size);
> >           ret = -EIO;
> >           goto out;
> >       }
> >       if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
> > -        error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIx64,
> > -                     __func__, block->idstr, end_mark);
> > +        error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
> > +                   block->idstr, end_mark);
> >           ret = -EINVAL;
> >           goto out;
> >       }
> 
> This function returns -EIO/-EINVAL errors, propagated to its 2 callers
>  - migrate_handle_rp_recv_bitmap()
>  - migrate_handle_rp_resume_ack()

It was only called in migrate_handle_rp_recv_bitmap(), but I think I see
what you meant.

> which are only used in source_return_path_thread() where the return
> value is only checked as boolean.
> 
> Could we simplify them by returning a boolean (which is the pattern for
> functions taking an Error** as last parameter)?

Yes, with errp passed in, the "int" retcode is somewhat redundant.  I can
add one more patch on top of this as further cleanup, as below.

Thanks,

===8<===
From b1052befd72beb129012afddf5647339fe4e257c Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 5 Oct 2023 12:03:44 -0400
Subject: [PATCH] migration: Change ram_dirty_bitmap_reload() retval to bool
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Now we have an Error** passed into the return path thread stack, which is
even clearer than an int retval.  Change ram_dirty_bitmap_reload() and its
callers to use a bool instead, replacing the errnos.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/ram.h       |  2 +-
 migration/migration.c | 18 +++++++++---------
 migration/ram.c       | 24 +++++++++++-------------
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/migration/ram.h b/migration/ram.h
index 14ed666d58..af0290f8ab 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -72,7 +72,7 @@ void ramblock_recv_bitmap_set(RAMBlock *rb, void *host_addr);
 void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
 int64_t ramblock_recv_bitmap_send(QEMUFile *file,
                                   const char *block_name);
-int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
+bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
 bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
 void postcopy_preempt_shutdown_file(MigrationState *s);
 void *postcopy_preempt_thread(void *opaque);
diff --git a/migration/migration.c b/migration/migration.c
index 1a7f214fcf..e7375810be 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1837,29 +1837,29 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
     ram_save_queue_pages(rbname, start, len, errp);
 }
 
-static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
-                                         Error **errp)
+static bool migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
+                                          Error **errp)
 {
     RAMBlock *block = qemu_ram_block_by_name(block_name);
 
     if (!block) {
         error_setg(errp, "MIG_RP_MSG_RECV_BITMAP has invalid block name '%s'",
                    block_name);
-        return -EINVAL;
+        return false;
     }
 
     /* Fetch the received bitmap and refresh the dirty bitmap */
     return ram_dirty_bitmap_reload(s, block, errp);
 }
 
-static int migrate_handle_rp_resume_ack(MigrationState *s,
-                                        uint32_t value, Error **errp)
+static bool migrate_handle_rp_resume_ack(MigrationState *s,
+                                         uint32_t value, Error **errp)
 {
     trace_source_return_path_thread_resume_ack(value);
 
     if (value != MIGRATION_RESUME_ACK_VALUE) {
         error_setg(errp, "illegal resume_ack value %"PRIu32, value);
-        return -1;
+        return false;
     }
 
     /* Now both sides are active. */
@@ -1869,7 +1869,7 @@ static int migrate_handle_rp_resume_ack(MigrationState *s,
     /* Notify send thread that time to continue send pages */
     migration_rp_kick(s);
 
-    return 0;
+    return true;
 }
 
 /*
@@ -2021,14 +2021,14 @@ static void *source_return_path_thread(void *opaque)
             }
             /* Format: len (1B) + idstr (<255B). This ends the idstr. */
             buf[buf[0] + 1] = '\0';
-            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1), &err)) {
+            if (!migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1), &err)) {
                 goto out;
             }
             break;
 
         case MIG_RP_MSG_RESUME_ACK:
             tmp32 = ldl_be_p(buf);
-            if (migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
+            if (!migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
                 goto out;
             }
             break;
diff --git a/migration/ram.c b/migration/ram.c
index 2565f53f5c..982fbbeee1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4157,23 +4157,25 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
  * Read the received bitmap, revert it as the initial dirty bitmap.
  * This is only used when the postcopy migration is paused but wants
  * to resume from a middle point.
+ *
+ * Returns true on success, false on error.
  */
-int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
+bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
 {
-    int ret = -EINVAL;
     /* from_dst_file is always valid because we're within rp_thread */
     QEMUFile *file = s->rp_state.from_dst_file;
     unsigned long *le_bitmap, nbits = block->used_length >> TARGET_PAGE_BITS;
     uint64_t local_size = DIV_ROUND_UP(nbits, 8);
     uint64_t size, end_mark;
     RAMState *rs = ram_state;
+    bool result = false;
 
     trace_ram_dirty_bitmap_reload_begin(block->idstr);
 
     if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
         error_setg(errp, "Reload bitmap in incorrect state %s",
                    MigrationStatus_str(s->state));
-        return -EINVAL;
+        return false;
     }
 
     /*
@@ -4191,26 +4193,22 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
     if (size != local_size) {
         error_setg(errp, "ramblock '%s' bitmap size mismatch (0x%"PRIx64
                    " != 0x%"PRIx64")", block->idstr, size, local_size);
-        ret = -EINVAL;
         goto out;
     }
 
     size = qemu_get_buffer(file, (uint8_t *)le_bitmap, local_size);
     end_mark = qemu_get_be64(file);
 
-    ret = qemu_file_get_error(file);
-    if (ret || size != local_size) {
-        error_setg(errp, "read bitmap failed for ramblock '%s': %d"
-                   " (size 0x%"PRIx64", got: 0x%"PRIx64")",
-                   block->idstr, ret, local_size, size);
-        ret = -EIO;
+    if (qemu_file_get_error(file) || size != local_size) {
+        error_setg(errp, "read bitmap failed for ramblock '%s': "
+                   "(size 0x%"PRIx64", got: 0x%"PRIx64")",
+                   block->idstr, local_size, size);
         goto out;
     }
 
     if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
         error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
                    block->idstr, end_mark);
-        ret = -EINVAL;
         goto out;
     }
 
@@ -4243,10 +4241,10 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
      */
     migration_rp_kick(s);
 
-    ret = 0;
+    result = true;
 out:
     g_free(le_bitmap);
-    return ret;
+    return result;
 }
 
 static int ram_resume_prepare(MigrationState *s, void *opaque)
-- 
2.41.0


-- 
Peter Xu



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-05  8:22   ` Juan Quintela
@ 2023-10-05 19:35     ` Peter Xu
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Xu @ 2023-10-05 19:35 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, Fabiano Rosas

On Thu, Oct 05, 2023 at 10:22:52AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > rp_state.error was a boolean used to show an error happened in the return
> > path thread.  That's not only duplicating error reporting (migrate_set_error),
> > but also not good enough in that we only do error_report() and set it to
> > true; we can never keep a history of the exact error and show it in
> > query-migrate.
> >
> > To make this better, a few things are done:
> >
> >   - Use error_setg() rather than error_report() across the whole lifecycle
> >     of the return path thread, keeping the error in an Error*.
> 
> Good.
> 
> >   - Use migrate_set_error() to apply that captured error to the global
> >     migration object when an error occurred in this thread.
> 
> Good.
> 
> >   - With the above, there is no need for mark_source_rp_bad(); remove it,
> >     along with rp_state.error itself.
> 
> Good.
> 
> >  uint64_t ram_pagesize_summary(void);
> > -int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
> > +int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len,
> > +                         Error **errp);
> 
> 
> good.
> 
> > @@ -1793,37 +1782,36 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
> >       */
> >      if (!QEMU_IS_ALIGNED(start, our_host_ps) ||
> >          !QEMU_IS_ALIGNED(len, our_host_ps)) {
> > -        error_report("%s: Misaligned page request, start: " RAM_ADDR_FMT
> > -                     " len: %zd", __func__, start, len);
> > -        mark_source_rp_bad(ms);
> > +        error_setg(errp, "MIG_RP_MSG_REQ_PAGES: Misaligned page request, start:"
> > +                   RAM_ADDR_FMT " len: %zd", start, len);
> >          return;
> >      }
> >  
> > -    if (ram_save_queue_pages(rbname, start, len)) {
> > -        mark_source_rp_bad(ms);
> > -    }
> > +    ram_save_queue_pages(rbname, start, len, errp);
> 
> ram_save_queue_pages() returns an int.
> I think this function should return an int.

Phil suggested something similar for the other patch; instead of also
letting this function return an int, I'll add one more patch to let it
return a boolean to show whether there's an error, keeping the real error
in errp.

> 
> Next is independent of this patch:
> 
> > -static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name)
> > +static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
> > +                                         Error **errp)
> >  {
> >      RAMBlock *block = qemu_ram_block_by_name(block_name);
> >  
> >      if (!block) {
> > -        error_report("%s: invalid block name '%s'", __func__, block_name);
> > +        error_setg(errp, "MIG_RP_MSG_RECV_BITMAP has invalid block name '%s'",
> > +                   block_name);
> >          return -EINVAL;
> 
> We return -EINVAL.
> 
> >      }
> >  
> >      /* Fetch the received bitmap and refresh the dirty bitmap */
> > -    return ram_dirty_bitmap_reload(s, block);
> > +    return ram_dirty_bitmap_reload(s, block, errp);
> >  }
> >  
> > -static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
> > +static int migrate_handle_rp_resume_ack(MigrationState *s,
> > +                                        uint32_t value, Error **errp)
> >  {
> >      trace_source_return_path_thread_resume_ack(value);
> >  
> >      if (value != MIGRATION_RESUME_ACK_VALUE) {
> > -        error_report("%s: illegal resume_ack value %"PRIu32,
> > -                     __func__, value);
> > +        error_setg(errp, "illegal resume_ack value %"PRIu32, value);
> >          return -1;
> 
> And here -1.
> 
> In both callers we just check whether it is different from zero.  We never
> use the return value as an errno, so I think we should move to -1; if there
> is an error, that is what errp is for.

Right.  I'll switch all rp-return thread paths to use boolean as return, as
long as there's errp.

> 
> 
> > -/* Returns 0 if the RP was ok, otherwise there was an error on the RP */
> > -static int await_return_path_close_on_source(MigrationState *ms)
> > +static void await_return_path_close_on_source(MigrationState *ms)
> >  {
> > -    int ret;
> > -
> >      if (!ms->rp_state.rp_thread_created) {
> > -        return 0;
> > +        return;
> >      }
> >  
> >      trace_migration_return_path_end_before();
> > @@ -2060,18 +2050,10 @@ static int await_return_path_close_on_source(MigrationState *ms)
> >          }
> >      }
> >  
> > -    trace_await_return_path_close_on_source_joining();
> >      qemu_thread_join(&ms->rp_state.rp_thread);
> >      ms->rp_state.rp_thread_created = false;
> > -    trace_await_return_path_close_on_source_close();
> > -
> > -    ret = ms->rp_state.error;
> > -    ms->rp_state.error = false;
> > -
> >      migration_release_dst_files(ms);
> > -
> > -    trace_migration_return_path_end_after(ret);
> > -    return ret;
> > +    trace_migration_return_path_end_after();
> >  }
> >  
> >  static inline void
> > @@ -2367,7 +2349,10 @@ static void migration_completion(MigrationState *s)
> >          goto fail;
> >      }
> >  
> > -    if (await_return_path_close_on_source(s)) {
> > +    await_return_path_close_on_source(s);
> > +
> > +    /* If return path has error, should have been set here */
> > +    if (migrate_has_error(s)) {
> >          goto fail;
> >      }
> 
> In general, I think this is bad.  We are moving from:
> 
> int foo(..)
> {
> 
> }
> 
> ....
> 
> if (foo()) {
>      goto fail;
> }
> 
> to:
> 
> void foo(..)
> {
> 
> }
> 
> ....
> 
> foo();
> 
> if (bar()) {
>      goto fail;
> }
> 
> I would prefer to move the other way around: propagate the error
> synchronously. My plan is that at some point in time
> qemu_file_get_error() disappears, i.e. we return the error when we
> receive it and we handle it synchronously.
> 
> And yes, that is something that will take a lot of time, but I hope we
> move in that direction, rather than relying more on setting internal
> errors, using void functions and then checking with yet more functions.

IIUC "synchronous" here means we can have the Error* returned from
pthread_join(), but I worry that might be too late, that the real return
path Error* doesn't get its chance to set into MigrationState.error because
there can already be some error set.

I can at least move that check into await_return_path_close_on_source()
again, so it keeps returning something.  Does that sound like okay for now?
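
I.e., roughly this sketch (relying on migrate_has_error() introduced
earlier in this series), so migration_completion() can keep the
"if (await_return_path_close_on_source(s)) { goto fail; }" pattern:

static int await_return_path_close_on_source(MigrationState *ms)
{
    if (!ms->rp_state.rp_thread_created) {
        return 0;
    }

    trace_migration_return_path_end_before();

    /* ... kick the rp thread out of blocking reads if needed, as before ... */

    qemu_thread_join(&ms->rp_state.rp_thread);
    ms->rp_state.rp_thread_created = false;
    migration_release_dst_files(ms);
    trace_migration_return_path_end_after();

    /* The rp thread stored any failure via migrate_set_error() already */
    return migrate_has_error(ms) ? -1 : 0;
}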

> 
> 
> On top of your changes:
> 
> > -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
> > +int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
> >  {
> >      int ret = -EINVAL;
> >      /* from_dst_file is always valid because we're within rp_thread */
> > @@ -4163,8 +4165,8 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
> >      trace_ram_dirty_bitmap_reload_begin(block->idstr);
> >  
> >      if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
> > -        error_report("%s: incorrect state %s", __func__,
> > -                     MigrationStatus_str(s->state));
> > +        error_setg(errp, "Reload bitmap in incorrect state %s",
> > +                   MigrationStatus_str(s->state));
> >          return -EINVAL;
> 
> return -1
> 
> same for the rest of the cases. Callers only check for != 0, and if you
> want details, you need to look at errp.

I'll let them return boolean here too to be consistent.

Thanks,

-- 
Peter Xu




* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-05 12:57   ` Fabiano Rosas
@ 2023-10-05 19:35     ` Peter Xu
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Xu @ 2023-10-05 19:35 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Thu, Oct 05, 2023 at 09:57:58AM -0300, Fabiano Rosas wrote:
> > @@ -2008,9 +1996,14 @@ static void *source_return_path_thread(void *opaque)
> >      }
> >  
> >  out:
> > -    if (qemu_file_get_error(rp)) {
> > +    if (err) {
> 
> Need to keep both checks here.

The next patch did that.  Let me squash it into this one.
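
After squashing, the "out:" label would keep both checks, roughly like
below (a sketch; the exact error message is illustrative only):

out:
    if (err) {
        migrate_set_error(ms, err);
        error_free(err);
    }

    if (qemu_file_get_error(rp)) {
        /*
         * The file error may not have been captured by any handler
         * above; make sure it still lands in the migrate state.
         */
        Error *file_err = NULL;

        error_setg(&file_err, "Detected IO failure on the return path");
        migrate_set_error(ms, file_err);
        error_free(file_err);
    }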

Thanks,

-- 
Peter Xu




* Re: [PATCH v3 07/10] migration: Add migration_rp_wait|kick()
  2023-10-05  7:49   ` Juan Quintela
@ 2023-10-05 20:47     ` Peter Xu
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Xu @ 2023-10-05 20:47 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, Fabiano Rosas

On Thu, Oct 05, 2023 at 09:49:25AM +0200, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > It's just a simple wrapper for rp_sem on either wait() or kick(), make it
> > even clearer on how it is used.  Prepared to be used even for other things.
> >
> > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> 
> I agree with the idea, but I think that the problem is the name of the
> semaphore.
> 
> > +void migration_rp_wait(MigrationState *s)
> > +{
> > +    qemu_sem_wait(&s->rp_state.rp_sem);
> 
> I am not sure if it would be better to have the wrappers or to just
> rename the semaphore.
> 
> If we rename the semaphore to something like return_path_ready, this becomes:
> 
>     qemu_sem_wait(&s->rp_state.return_path_ready);
> 
>     qemu_sem_post(&s->rp_state.return_path_ready);
> 
> Or something similar?

I'd prefer keeping a pair of helpers, but I'm open to other suggestions,
e.g. I can rename the sem at the same time, or have a better name just for
the helpers.
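
E.g., keeping the wrappers on top of a renamed semaphore could look like
this (return_path_ready is just a placeholder name):

void migration_rp_wait(MigrationState *s)
{
    /* Block until the return path thread kicks us */
    qemu_sem_wait(&s->rp_state.return_path_ready);
}

void migration_rp_kick(MigrationState *s)
{
    /* Wake up whoever is waiting in migration_rp_wait() */
    qemu_sem_post(&s->rp_state.return_path_ready);
}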

Thanks,

-- 
Peter Xu




* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-05 13:37     ` Fabiano Rosas
@ 2023-10-05 20:55       ` Peter Xu
  2023-10-05 21:10         ` Fabiano Rosas
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-05 20:55 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
> >> +    /*
> >> +     * Make sure both QEMU instances will go into RECOVER stage, then test
> >> +     * kicking them out using migrate-pause.
> >> +     */
> >> +    wait_for_postcopy_status(from, "postcopy-recover");
> >> +    wait_for_postcopy_status(to, "postcopy-recover");
> >
> > Is this wait out of place? I think we're trying to resume too fast after
> > migrate_recover():
> >
> > # {
> > #     "error": {
> > #         "class": "GenericError",
> > #         "desc": "Cannot resume if there is no paused migration"
> > #     }
> > # }
> >
> 
> Ugh, sorry about the long lines:
> 
> {
>     "error": {
>         "class": "GenericError",
>         "desc": "Cannot resume if there is no paused migration"
>     }
> }

Sorry I didn't get you here.  Could you elaborate your question?

Here we wait on both sides to make sure they'll both be in the RECOVER
stage, and we know they won't proceed further because the pipes contain
almost nothing, so both sides will simply block on them.  What did I
miss?

Thanks,

-- 
Peter Xu




* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-05 20:55       ` Peter Xu
@ 2023-10-05 21:10         ` Fabiano Rosas
  2023-10-05 21:44           ` Peter Xu
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-05 21:10 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
>> >> +    /*
>> >> +     * Make sure both QEMU instances will go into RECOVER stage, then test
>> >> +     * kicking them out using migrate-pause.
>> >> +     */
>> >> +    wait_for_postcopy_status(from, "postcopy-recover");
>> >> +    wait_for_postcopy_status(to, "postcopy-recover");
>> >
>> > Is this wait out of place? I think we're trying to resume too fast after
>> > migrate_recover():
>> >
>> > # {
>> > #     "error": {
>> > #         "class": "GenericError",
>> > #         "desc": "Cannot resume if there is no paused migration"
>> > #     }
>> > # }
>> >
>> 
>> Ugh, sorry about the long lines:
>> 
>> {
>>     "error": {
>>         "class": "GenericError",
>>         "desc": "Cannot resume if there is no paused migration"
>>     }
>> }
>
> Sorry I didn't get you here.  Could you elaborate your question?
>

The test is sometimes failing with the above message.

But indeed my question doesn't make sense; I forgot migrate_recover
happens on the destination. Never mind.

The bug is still present nonetheless. We're going into migrate_prepare
in some state other than POSTCOPY_PAUSED.



* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-05 21:10         ` Fabiano Rosas
@ 2023-10-05 21:44           ` Peter Xu
  2023-10-05 22:01             ` Fabiano Rosas
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-10-05 21:44 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Thu, Oct 05, 2023 at 06:10:20PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
> >> >> +    /*
> >> >> +     * Make sure both QEMU instances will go into RECOVER stage, then test
> >> >> +     * kicking them out using migrate-pause.
> >> >> +     */
> >> >> +    wait_for_postcopy_status(from, "postcopy-recover");
> >> >> +    wait_for_postcopy_status(to, "postcopy-recover");
> >> >
> >> > Is this wait out of place? I think we're trying to resume too fast after
> >> > migrate_recover():
> >> >
> >> > # {
> >> > #     "error": {
> >> > #         "class": "GenericError",
> >> > #         "desc": "Cannot resume if there is no paused migration"
> >> > #     }
> >> > # }
> >> >
> >> 
> >> Ugh, sorry about the long lines:
> >> 
> >> {
> >>     "error": {
> >>         "class": "GenericError",
> >>         "desc": "Cannot resume if there is no paused migration"
> >>     }
> >> }
> >
> > Sorry I didn't get you here.  Could you elaborate your question?
> >
> 
> The test is sometimes failing with the above message.
> 
> But indeed my question doesn't make sense. I forgot migrate_recover
> happens on the destination. Nevermind.
> 
> The bug is still present nonetheless. We're going into migrate_prepare
> in some state other than POSTCOPY_PAUSED.

Oh I see.  Interestingly I cannot reproduce on my host, just like last
time..

What is your setup for running the test?  Anything special?  Here's my
cmdline:

$ cat reproduce.sh 
index=$1
loop=0

while :; do
        echo "Starting loop=$loop..."
        QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -p /x86_64/migration/postcopy/recovery/double-failures
        if [[ $? != 0 ]]; then
                echo "index $index REPRODUCED (loop=$loop) !"
                break
        fi
        loop=$(( loop + 1 ))
done

Survives 200+ loops and kept going.

However I think I saw what's wrong here, could you help try below fixup?

Thanks,

===8<===
From 52bd2cd5ddf472e0bb99789dba3660a626382630 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 5 Oct 2023 17:38:42 -0400
Subject: [PATCH] fixup! tests/migration-test: Add a test for postcopy hangs
 during RECOVER

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration-test.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index fb7a3765e4..1bdae0a579 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1489,9 +1489,8 @@ static void test_postcopy_recovery_common(MigrateCommon *args)
      * migrate-recover command can only succeed if destination machine
      * is in the paused state
      */
-    wait_for_migration_status(to, "postcopy-paused",
-                              (const char * []) { "failed", "active",
-                                                  "completed", NULL });
+    wait_for_postcopy_status(to, "postcopy-paused");
+    wait_for_postcopy_status(from, "postcopy-paused");
 
     if (args->postcopy_recovery_test_fail) {
         /*
@@ -1514,9 +1513,6 @@ static void test_postcopy_recovery_common(MigrateCommon *args)
      * Try to rebuild the migration channel using the resume flag and
      * the newly created channel
      */
-    wait_for_migration_status(from, "postcopy-paused",
-                              (const char * []) { "failed", "active",
-                                                  "completed", NULL });
     migrate_qmp(from, uri, "{'resume': true}");
 
     /* Restore the postcopy bandwidth to unlimited */
-- 
2.41.0

-- 
Peter Xu




* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-05 21:44           ` Peter Xu
@ 2023-10-05 22:01             ` Fabiano Rosas
  2023-10-09 16:50               ` Fabiano Rosas
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-05 22:01 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Thu, Oct 05, 2023 at 06:10:20PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
>> >> >> +    /*
>> >> >> +     * Make sure both QEMU instances will go into RECOVER stage, then test
>> >> >> +     * kicking them out using migrate-pause.
>> >> >> +     */
>> >> >> +    wait_for_postcopy_status(from, "postcopy-recover");
>> >> >> +    wait_for_postcopy_status(to, "postcopy-recover");
>> >> >
>> >> > Is this wait out of place? I think we're trying to resume too fast after
>> >> > migrate_recover():
>> >> >
>> >> > # {
>> >> > #     "error": {
>> >> > #         "class": "GenericError",
>> >> > #         "desc": "Cannot resume if there is no paused migration"
>> >> > #     }
>> >> > # }
>> >> >
>> >> 
>> >> Ugh, sorry about the long lines:
>> >> 
>> >> {
>> >>     "error": {
>> >>         "class": "GenericError",
>> >>         "desc": "Cannot resume if there is no paused migration"
>> >>     }
>> >> }
>> >
>> > Sorry I didn't get you here.  Could you elaborate your question?
>> >
>> 
>> The test is sometimes failing with the above message.
>> 
>> But indeed my question doesn't make sense. I forgot migrate_recover
>> happens on the destination. Nevermind.
>> 
>> The bug is still present nonetheless. We're going into migrate_prepare
>> in some state other than POSTCOPY_PAUSED.
>
> Oh I see.  Interestingly I cannot reproduce on my host, just like last
> time..
>
> What is your setup for running the test?  Anything special?  Here's my
> cmdline:

The crudest oneliner:

for i in $(seq 1 9999); do echo "$i ============="; \
QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test -r /x86_64/migration/postcopy/recovery || break ; done

I suspect my system has something specific to it that affects the timing
of the tests. But I have no idea what it could be.

$ lscpu       
Architecture:            x86_64      
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               GenuineIntel
  Model name:            11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
    CPU family:          6
    Model:               141
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            1
    CPU max MHz:         4800.0000
    CPU min MHz:         800.0000
    BogoMIPS:            4992.00

>
> $ cat reproduce.sh 
> index=$1
> loop=0
>
> while :; do
>         echo "Starting loop=$loop..."
>         QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -p /x86_64/migration/postcopy/recovery/double-failures
>         if [[ $? != 0 ]]; then
>                 echo "index $index REPRODUCED (loop=$loop) !"
>                 break
>         fi
>         loop=$(( loop + 1 ))
> done
>
> Survives 200+ loops and kept going.
>
> However I think I saw what's wrong here, could you help try below fixup?
>

Sure. I won't get to it until tomorrow though.



* Re: [PATCH v3 03/10] migration: Refactor error handling in source return path
  2023-10-05 16:05     ` Peter Xu
@ 2023-10-08 11:39       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-10-08 11:39 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Juan Quintela, Markus Armbruster

On 5/10/23 18:05, Peter Xu wrote:
> On Thu, Oct 05, 2023 at 08:11:33AM +0200, Philippe Mathieu-Daudé wrote:
>> Hi Peter,
>>
>> On 5/10/23 00:02, Peter Xu wrote:
>>> rp_state.error was a boolean used to show error happened in return path
>>> thread.  That's not only duplicating error reporting (migrate_set_error),
>>> but also not good enough in that we only do error_report() and set it to
>>> true, we never can keep a history of the exact error and show it in
>>> query-migrate.
>>>
>>> To make this better, a few things done:
>>>
>>>     - Use error_setg() rather than error_report() across the whole lifecycle
>>>       of return path thread, keeping the error in an Error*.
>>>
>>>     - Use migrate_set_error() to apply that captured error to the global
>>>       migration object when error occured in this thread.
>>>
>>>     - With above, no need to have mark_source_rp_bad(), remove it, alongside
>>>       with rp_state.error itself.
>>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    migration/migration.h  |   1 -
>>>    migration/ram.h        |   5 +-
>>>    migration/migration.c  | 123 ++++++++++++++++++-----------------------
>>>    migration/ram.c        |  41 +++++++-------
>>>    migration/trace-events |   4 +-
>>>    5 files changed, 79 insertions(+), 95 deletions(-)
>>
>>
>>> -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
>>> +int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
>>>    {
>>>        int ret = -EINVAL;
>>>        /* from_dst_file is always valid because we're within rp_thread */
>>
>>
>>> @@ -4193,16 +4194,16 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block)
>>>        ret = qemu_file_get_error(file);
>>>        if (ret || size != local_size) {
>>> -        error_report("%s: read bitmap failed for ramblock '%s': %d"
>>> -                     " (size 0x%"PRIx64", got: 0x%"PRIx64")",
>>> -                     __func__, block->idstr, ret, local_size, size);
>>> +        error_setg(errp, "read bitmap failed for ramblock '%s': %d"
>>> +                   " (size 0x%"PRIx64", got: 0x%"PRIx64")",
>>> +                   block->idstr, ret, local_size, size);
>>>            ret = -EIO;
>>>            goto out;
>>>        }
>>>        if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
>>> -        error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIx64,
>>> -                     __func__, block->idstr, end_mark);
>>> +        error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
>>> +                   block->idstr, end_mark);
>>>            ret = -EINVAL;
>>>            goto out;
>>>        }
>>
>> This function returns -EIO/-EINVAL errors, propagated to its 2 callers
>>   - migrate_handle_rp_recv_bitmap()
>>   - migrate_handle_rp_resume_ack()
> 
> It was only called in migrate_handle_rp_recv_bitmap(), but I think I see
> what you meant..
> 
>> which are only used in source_return_path_thread() where the return
>> value is only checked as boolean.
>>
>> Could we simplify them to return a boolean (which is the pattern for
>> functions taking an Error** as the last parameter)?
> 
> Yes, with errp passed in, the "int" retcode is slightly duplicated.  I can
> add one more patch on top of this as further cleanup, as below.
> 
> Thanks,
> 
> ===8<===
>  From b1052befd72beb129012afddf5647339fe4e257c Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Thu, 5 Oct 2023 12:03:44 -0400
> Subject: [PATCH] migration: Change ram_dirty_bitmap_reload() retval to bool
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Now we have a Error** passed into the return path thread stack, which is
> even clearer than an int retval.  Change ram_dirty_bitmap_reload() and the
> callers to use a bool instead to replace errnos.
> 
> Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   migration/ram.h       |  2 +-
>   migration/migration.c | 18 +++++++++---------
>   migration/ram.c       | 24 +++++++++++-------------
>   3 files changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/migration/ram.h b/migration/ram.h
> index 14ed666d58..af0290f8ab 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -72,7 +72,7 @@ void ramblock_recv_bitmap_set(RAMBlock *rb, void *host_addr);
>   void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
>   int64_t ramblock_recv_bitmap_send(QEMUFile *file,
>                                     const char *block_name);
> -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
> +bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb, Error **errp);
>   bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
>   void postcopy_preempt_shutdown_file(MigrationState *s);
>   void *postcopy_preempt_thread(void *opaque);
> diff --git a/migration/migration.c b/migration/migration.c
> index 1a7f214fcf..e7375810be 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1837,29 +1837,29 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
>       ram_save_queue_pages(rbname, start, len, errp);
>   }
>   
> -static int migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
> -                                         Error **errp)
> +static bool migrate_handle_rp_recv_bitmap(MigrationState *s, char *block_name,
> +                                          Error **errp)
>   {
>       RAMBlock *block = qemu_ram_block_by_name(block_name);
>   
>       if (!block) {
>           error_setg(errp, "MIG_RP_MSG_RECV_BITMAP has invalid block name '%s'",
>                      block_name);
> -        return -EINVAL;
> +        return false;
>       }
>   
>       /* Fetch the received bitmap and refresh the dirty bitmap */
>       return ram_dirty_bitmap_reload(s, block, errp);
>   }
>   
> -static int migrate_handle_rp_resume_ack(MigrationState *s,
> -                                        uint32_t value, Error **errp)
> +static bool migrate_handle_rp_resume_ack(MigrationState *s,
> +                                         uint32_t value, Error **errp)
>   {
>       trace_source_return_path_thread_resume_ack(value);
>   
>       if (value != MIGRATION_RESUME_ACK_VALUE) {
>           error_setg(errp, "illegal resume_ack value %"PRIu32, value);
> -        return -1;
> +        return false;
>       }
>   
>       /* Now both sides are active. */
> @@ -1869,7 +1869,7 @@ static int migrate_handle_rp_resume_ack(MigrationState *s,
>       /* Notify send thread that time to continue send pages */
>       migration_rp_kick(s);
>   
> -    return 0;
> +    return true;
>   }
>   
>   /*
> @@ -2021,14 +2021,14 @@ static void *source_return_path_thread(void *opaque)
>               }
>               /* Format: len (1B) + idstr (<255B). This ends the idstr. */
>               buf[buf[0] + 1] = '\0';
> -            if (migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1), &err)) {
> +            if (!migrate_handle_rp_recv_bitmap(ms, (char *)(buf + 1), &err)) {
>                   goto out;
>               }
>               break;
>   
>           case MIG_RP_MSG_RESUME_ACK:
>               tmp32 = ldl_be_p(buf);
> -            if (migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
> +            if (!migrate_handle_rp_resume_ack(ms, tmp32, &err)) {
>                   goto out;
>               }
>               break;
> diff --git a/migration/ram.c b/migration/ram.c
> index 2565f53f5c..982fbbeee1 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -4157,23 +4157,25 @@ static int ram_dirty_bitmap_sync_all(MigrationState *s, RAMState *rs)
>    * Read the received bitmap, revert it as the initial dirty bitmap.
>    * This is only used when the postcopy migration is paused but wants
>    * to resume from a middle point.
> + *
> + * Returns true if succeeded, false for errors.
>    */
> -int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
> +bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
>   {
> -    int ret = -EINVAL;
>       /* from_dst_file is always valid because we're within rp_thread */
>       QEMUFile *file = s->rp_state.from_dst_file;
>       unsigned long *le_bitmap, nbits = block->used_length >> TARGET_PAGE_BITS;
>       uint64_t local_size = DIV_ROUND_UP(nbits, 8);
>       uint64_t size, end_mark;
>       RAMState *rs = ram_state;
> +    bool result = false;
>   
>       trace_ram_dirty_bitmap_reload_begin(block->idstr);
>   
>       if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
>           error_setg(errp, "Reload bitmap in incorrect state %s",
>                      MigrationStatus_str(s->state));
> -        return -EINVAL;
> +        return false;
>       }
>   
>       /*
> @@ -4191,26 +4193,22 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
>       if (size != local_size) {
>           error_setg(errp, "ramblock '%s' bitmap size mismatch (0x%"PRIx64
>                      " != 0x%"PRIx64")", block->idstr, size, local_size);
> -        ret = -EINVAL;
>           goto out;
>       }
>   
>       size = qemu_get_buffer(file, (uint8_t *)le_bitmap, local_size);
>       end_mark = qemu_get_be64(file);
>   
> -    ret = qemu_file_get_error(file);
> -    if (ret || size != local_size) {
> -        error_setg(errp, "read bitmap failed for ramblock '%s': %d"
> -                   " (size 0x%"PRIx64", got: 0x%"PRIx64")",
> -                   block->idstr, ret, local_size, size);
> -        ret = -EIO;
> +    if (qemu_file_get_error(file) || size != local_size) {
> +        error_setg(errp, "read bitmap failed for ramblock '%s': "
> +                   "(size 0x%"PRIx64", got: 0x%"PRIx64")",
> +                   block->idstr, local_size, size);
>           goto out;
>       }
>   
>       if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
>           error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
>                      block->idstr, end_mark);
> -        ret = -EINVAL;
>           goto out;
>       }
>   
> @@ -4243,10 +4241,10 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
>        */
>       migration_rp_kick(s);
>   
> -    ret = 0;
> +    result = true;
>   out:
>       g_free(le_bitmap);
> -    return ret;
> +    return result;
>   }
>   
>   static int ram_resume_prepare(MigrationState *s, void *opaque)

Yes, exactly what I meant. For the embedded patch:
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

One step further is to use g_autofree for le_bitmap to remove this
annoying 'out' label. I'll send the patch.
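
Something along these lines (just a sketch of the pattern, with the
trace point and the final bitmap reversal elided; the actual patch may
differ):

bool ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block, Error **errp)
{
    QEMUFile *file = s->rp_state.from_dst_file;
    unsigned long nbits = block->used_length >> TARGET_PAGE_BITS;
    uint64_t local_size = DIV_ROUND_UP(nbits, 8);
    /* Freed automatically on every return path, so no 'out:' needed */
    g_autofree unsigned long *le_bitmap = bitmap_new(nbits);
    uint64_t size, end_mark;

    if (s->state != MIGRATION_STATUS_POSTCOPY_RECOVER) {
        error_setg(errp, "Reload bitmap in incorrect state %s",
                   MigrationStatus_str(s->state));
        return false;
    }

    size = qemu_get_be64(file);
    if (size != local_size) {
        error_setg(errp, "ramblock '%s' bitmap size mismatch (0x%"PRIx64
                   " != 0x%"PRIx64")", block->idstr, size, local_size);
        return false;
    }

    size = qemu_get_buffer(file, (uint8_t *)le_bitmap, local_size);
    end_mark = qemu_get_be64(file);

    if (qemu_file_get_error(file) || size != local_size) {
        error_setg(errp, "read bitmap failed for ramblock '%s': "
                   "(size 0x%"PRIx64", got: 0x%"PRIx64")",
                   block->idstr, local_size, size);
        return false;
    }

    if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) {
        error_setg(errp, "ramblock '%s' end mark incorrect: 0x%"PRIx64,
                   block->idstr, end_mark);
        return false;
    }

    /* ... revert the dirty bitmap and kick the migration thread ... */
    return true;
}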



* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-05 22:01             ` Fabiano Rosas
@ 2023-10-09 16:50               ` Fabiano Rosas
  2023-10-10 16:00                 ` Peter Xu
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-10-09 16:50 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

Fabiano Rosas <farosas@suse.de> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> On Thu, Oct 05, 2023 at 06:10:20PM -0300, Fabiano Rosas wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>> 
>>> > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote:
>>> >> >> +    /*
>>> >> >> +     * Make sure both QEMU instances will go into RECOVER stage, then test
>>> >> >> +     * kicking them out using migrate-pause.
>>> >> >> +     */
>>> >> >> +    wait_for_postcopy_status(from, "postcopy-recover");
>>> >> >> +    wait_for_postcopy_status(to, "postcopy-recover");
>>> >> >
>>> >> > Is this wait out of place? I think we're trying to resume too fast after
>>> >> > migrate_recover():
>>> >> >
>>> >> > # {
>>> >> > #     "error": {
>>> >> > #         "class": "GenericError",
>>> >> > #         "desc": "Cannot resume if there is no paused migration"
>>> >> > #     }
>>> >> > # }
>>> >> >
>>> >> 
>>> >> Ugh, sorry about the long lines:
>>> >> 
>>> >> {
>>> >>     "error": {
>>> >>         "class": "GenericError",
>>> >>         "desc": "Cannot resume if there is no paused migration"
>>> >>     }
>>> >> }
>>> >
>>> > Sorry I didn't get you here.  Could you elaborate your question?
>>> >
>>> 
>>> The test is sometimes failing with the above message.
>>> 
>>> But indeed my question doesn't make sense. I forgot migrate_recover
>>> happens on the destination. Nevermind.
>>> 
>>> The bug is still present nonetheless. We're going into migrate_prepare
>>> in some state other than POSTCOPY_PAUSED.
>>
>> Oh I see.  Interestingly I cannot reproduce on my host, just like last
>> time..
>>
>> What is your setup for running the test?  Anything special?  Here's my
>> cmdline:
>
> The crudest oneliner:
>
> for i in $(seq 1 9999); do echo "$i ============="; \
> QTEST_QEMU_BINARY=./qemu-system-x86_64 \
> ./tests/qtest/migration-test -r /x86_64/migration/postcopy/recovery || break ; done
>
> I suspect my system has something specific to it that affects the timing
> of the tests. But I have no idea what it could be.
>
> $ lscpu       
> Architecture:            x86_64      
>   CPU op-mode(s):        32-bit, 64-bit
>   Address sizes:         39 bits physical, 48 bits virtual
>   Byte Order:            Little Endian
> CPU(s):                  16
>   On-line CPU(s) list:   0-15
> Vendor ID:               GenuineIntel
>   Model name:            11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
>     CPU family:          6
>     Model:               141
>     Thread(s) per core:  2
>     Core(s) per socket:  8
>     Socket(s):           1
>     Stepping:            1
>     CPU max MHz:         4800.0000
>     CPU min MHz:         800.0000
>     BogoMIPS:            4992.00
>
>>
>> $ cat reproduce.sh 
>> index=$1
>> loop=0
>>
>> while :; do
>>         echo "Starting loop=$loop..."
>>         QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -p /x86_64/migration/postcopy/recovery/double-failures
>>         if [[ $? != 0 ]]; then
>>                 echo "index $index REPRODUCED (loop=$loop) !"
>>                 break
>>         fi
>>         loop=$(( loop + 1 ))
>> done
>>
>> Survives 200+ loops and kept going.
>>
>> However I think I saw what's wrong here, could you help try below fixup?
>>
>
> Sure. I won't get to it until tomorrow though.

It seems to have fixed the issue. 3500 iterations and still going.



* Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER
  2023-10-09 16:50               ` Fabiano Rosas
@ 2023-10-10 16:00                 ` Peter Xu
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Xu @ 2023-10-10 16:00 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Juan Quintela

On Mon, Oct 09, 2023 at 01:50:08PM -0300, Fabiano Rosas wrote:
> It seems to have fixed the issue. 3500 iterations and still going.

I'll go with that then, but feel free to report whenever that's hit again.

Thanks a lot.

-- 
Peter Xu




end of thread

Thread overview: 35+ messages
2023-10-04 22:02 [PATCH v3 00/10] migration: Better error handling in rp thread, allow failures in recover Peter Xu
2023-10-04 22:02 ` [PATCH v3 01/10] migration: Display error in query-migrate irrelevant of status Peter Xu
2023-10-05  7:28   ` Juan Quintela
2023-10-04 22:02 ` [PATCH v3 02/10] migration: Introduce migrate_has_error() Peter Xu
2023-10-05  7:30   ` Juan Quintela
2023-10-04 22:02 ` [PATCH v3 03/10] migration: Refactor error handling in source return path Peter Xu
2023-10-05  6:11   ` Philippe Mathieu-Daudé
2023-10-05 16:05     ` Peter Xu
2023-10-08 11:39       ` Philippe Mathieu-Daudé
2023-10-05  8:22   ` Juan Quintela
2023-10-05 19:35     ` Peter Xu
2023-10-05 12:57   ` Fabiano Rosas
2023-10-05 19:35     ` Peter Xu
2023-10-04 22:02 ` [PATCH v3 04/10] migration: Deliver return path file error to migrate state too Peter Xu
2023-10-05  7:32   ` Juan Quintela
2023-10-04 22:02 ` [PATCH v3 05/10] qemufile: Always return a verbose error Peter Xu
2023-10-05  7:42   ` Juan Quintela
2023-10-04 22:02 ` [PATCH v3 06/10] migration: Remember num of ramblocks to sync during recovery Peter Xu
2023-10-05  7:43   ` Juan Quintela
2023-10-04 22:02 ` [PATCH v3 07/10] migration: Add migration_rp_wait|kick() Peter Xu
2023-10-05  7:49   ` Juan Quintela
2023-10-05 20:47     ` Peter Xu
2023-10-04 22:02 ` [PATCH v3 08/10] migration: Allow network to fail even during recovery Peter Xu
2023-10-05 13:25   ` Fabiano Rosas
2023-10-04 22:02 ` [PATCH v3 09/10] migration: Allow RECOVER->PAUSED convertion for dest qemu Peter Xu
2023-10-05  8:24   ` Juan Quintela
2023-10-04 22:02 ` [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER Peter Xu
2023-10-05 13:24   ` Fabiano Rosas
2023-10-05 13:37     ` Fabiano Rosas
2023-10-05 20:55       ` Peter Xu
2023-10-05 21:10         ` Fabiano Rosas
2023-10-05 21:44           ` Peter Xu
2023-10-05 22:01             ` Fabiano Rosas
2023-10-09 16:50               ` Fabiano Rosas
2023-10-10 16:00                 ` Peter Xu
