qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Restore vmstate on cancelled/failed migration
@ 2023-05-17 12:37 Vladimir Sementsov-Ogievskiy
  2023-05-17 12:37 ` [PATCH 1/5] runstate: add runstate_get() Vladimir Sementsov-Ogievskiy
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-17 12:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, leobras, peterx, quintela, vsementsov, yc-core

Hi all.

The problem I want to solve is that guest-panicked state may be lost
when migration is failed (or cancelled) after source stop.

Still, I try to go further and restore all possible paused states in the
same way. The key patch is the last one and others are refactoring and
preparation.

Vladimir Sementsov-Ogievskiy (5):
  runstate: add runstate_get()
  migration: never fail in global_state_store()
  runstate: drop unused runstate_store()
  migration: switch from .vm_was_running to .vm_old_state
  migration: restore vmstate on migration failure

 include/migration/global_state.h |  2 +-
 include/sysemu/runstate.h        |  2 +-
 migration/global_state.c         | 23 +++++++------
 migration/migration.c            | 56 +++++++++++++++-----------------
 migration/migration.h            |  9 +++--
 migration/savevm.c               |  6 +---
 softmmu/runstate.c               | 25 +++++++-------
 7 files changed, 59 insertions(+), 64 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/5] runstate: add runstate_get()
  2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
@ 2023-05-17 12:37 ` Vladimir Sementsov-Ogievskiy
  2023-05-18 11:13   ` Juan Quintela
  2023-05-17 12:37 ` [PATCH 2/5] migration: never fail in global_state_store() Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-17 12:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, leobras, peterx, quintela, vsementsov, yc-core

It's necessary to restore the state after failed/cancelled migration in
further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 include/sysemu/runstate.h | 1 +
 softmmu/runstate.c        | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index f3ed52548e..85f5d9a419 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -6,6 +6,7 @@
 
 bool runstate_check(RunState state);
 void runstate_set(RunState new_state);
+RunState runstate_get(void);
 bool runstate_is_running(void);
 bool runstate_needs_reset(void);
 bool runstate_store(char *str, size_t size);
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 2f2396c819..1e6f0bcecc 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -221,6 +221,11 @@ void runstate_set(RunState new_state)
     current_run_state = new_state;
 }
 
+RunState runstate_get(void)
+{
+    return current_run_state;
+}
+
 bool runstate_is_running(void)
 {
     return runstate_check(RUN_STATE_RUNNING);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/5] migration: never fail in global_state_store()
  2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
  2023-05-17 12:37 ` [PATCH 1/5] runstate: add runstate_get() Vladimir Sementsov-Ogievskiy
@ 2023-05-17 12:37 ` Vladimir Sementsov-Ogievskiy
  2023-05-18 11:18   ` Juan Quintela
  2023-05-26  8:28   ` Juan Quintela
  2023-05-17 12:37 ` [PATCH 3/5] runstate: drop unused runstate_store() Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-17 12:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, leobras, peterx, quintela, vsementsov, yc-core

Actually global_state_store() can never fail. Let's get rid of extra
error paths.

To make things clear, use new runstate_get() and use same approach for
global_state_store() and global_state_store_running().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 include/migration/global_state.h |  2 +-
 migration/global_state.c         | 23 ++++++++---------
 migration/migration.c            | 43 +++++++++++++++-----------------
 migration/savevm.c               |  6 +----
 4 files changed, 33 insertions(+), 41 deletions(-)

diff --git a/include/migration/global_state.h b/include/migration/global_state.h
index 945eb35d5b..d7c2cd3216 100644
--- a/include/migration/global_state.h
+++ b/include/migration/global_state.h
@@ -16,7 +16,7 @@
 #include "qapi/qapi-types-run-state.h"
 
 void register_global_state(void);
-int global_state_store(void);
+void global_state_store(void);
 void global_state_store_running(void);
 bool global_state_received(void);
 RunState global_state_get_runstate(void);
diff --git a/migration/global_state.c b/migration/global_state.c
index a33947ca32..4e2a9d8ec0 100644
--- a/migration/global_state.c
+++ b/migration/global_state.c
@@ -29,23 +29,22 @@ typedef struct {
 
 static GlobalState global_state;
 
-int global_state_store(void)
+static void global_state_do_store(RunState state)
 {
-    if (!runstate_store((char *)global_state.runstate,
-                        sizeof(global_state.runstate))) {
-        error_report("runstate name too big: %s", global_state.runstate);
-        trace_migrate_state_too_big();
-        return -EINVAL;
-    }
-    return 0;
+    const char *state_str = RunState_str(state);
+    assert(strlen(state_str) < sizeof(global_state.runstate));
+    strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
+              state_str, '\0');
+}
+
+void global_state_store(void)
+{
+    global_state_do_store(runstate_get());
 }
 
 void global_state_store_running(void)
 {
-    const char *state = RunState_str(RUN_STATE_RUNNING);
-    assert(strlen(state) < sizeof(global_state.runstate));
-    strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
-              state, '\0');
+    global_state_do_store(RUN_STATE_RUNNING);
 }
 
 bool global_state_received(void)
diff --git a/migration/migration.c b/migration/migration.c
index 439e8651df..b42d028191 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2313,27 +2313,26 @@ static void migration_completion(MigrationState *s)
         s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
         s->vm_was_running = runstate_is_running();
-        ret = global_state_store();
-
-        if (!ret) {
-            ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
-            trace_migration_completion_vm_stop(ret);
-            if (ret >= 0) {
-                ret = migration_maybe_pause(s, &current_active_state,
-                                            MIGRATION_STATUS_DEVICE);
-            }
-            if (ret >= 0) {
-                /*
-                 * Inactivate disks except in COLO, and track that we
-                 * have done so in order to remember to reactivate
-                 * them if migration fails or is cancelled.
-                 */
-                s->block_inactive = !migrate_colo();
-                qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
-                ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
-                                                         s->block_inactive);
-            }
+        global_state_store();
+
+        ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+        trace_migration_completion_vm_stop(ret);
+        if (ret >= 0) {
+            ret = migration_maybe_pause(s, &current_active_state,
+                                        MIGRATION_STATUS_DEVICE);
+        }
+        if (ret >= 0) {
+            /*
+             * Inactivate disks except in COLO, and track that we
+             * have done so in order to remember to reactivate
+             * them if migration fails or is cancelled.
+             */
+            s->block_inactive = !migrate_colo();
+            qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
+            ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
+                                                     s->block_inactive);
         }
+
         qemu_mutex_unlock_iothread();
 
         if (ret < 0) {
@@ -3120,9 +3119,7 @@ static void *bg_migration_thread(void *opaque)
     qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     s->vm_was_running = runstate_is_running();
 
-    if (global_state_store()) {
-        goto fail;
-    }
+    global_state_store();
     /* Forcibly stop VM before saving state of vCPUs and devices */
     if (vm_stop_force_state(RUN_STATE_PAUSED)) {
         goto fail;
diff --git a/migration/savevm.c b/migration/savevm.c
index 032044b1d5..778030d087 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2919,11 +2919,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 
     saved_vm_running = runstate_is_running();
 
-    ret = global_state_store();
-    if (ret) {
-        error_setg(errp, "Error saving global state");
-        return false;
-    }
+    global_state_store();
     vm_stop(RUN_STATE_SAVE_VM);
 
     bdrv_drain_all_begin();
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/5] runstate: drop unused runstate_store()
  2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
  2023-05-17 12:37 ` [PATCH 1/5] runstate: add runstate_get() Vladimir Sementsov-Ogievskiy
  2023-05-17 12:37 ` [PATCH 2/5] migration: never fail in global_state_store() Vladimir Sementsov-Ogievskiy
@ 2023-05-17 12:37 ` Vladimir Sementsov-Ogievskiy
  2023-05-18 11:19   ` Juan Quintela
  2023-05-17 12:37 ` [PATCH 4/5] migration: switch from .vm_was_running to .vm_old_state Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-17 12:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, leobras, peterx, quintela, vsementsov, yc-core

The function is unused since previous commit. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 include/sysemu/runstate.h |  1 -
 softmmu/runstate.c        | 12 ------------
 2 files changed, 13 deletions(-)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 85f5d9a419..7beb29c2e2 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -9,7 +9,6 @@ void runstate_set(RunState new_state);
 RunState runstate_get(void);
 bool runstate_is_running(void);
 bool runstate_needs_reset(void);
-bool runstate_store(char *str, size_t size);
 
 typedef void VMChangeStateHandler(void *opaque, bool running, RunState state);
 
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 1e6f0bcecc..0370230a5e 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -175,18 +175,6 @@ bool runstate_check(RunState state)
     return current_run_state == state;
 }
 
-bool runstate_store(char *str, size_t size)
-{
-    const char *state = RunState_str(current_run_state);
-    size_t len = strlen(state) + 1;
-
-    if (len > size) {
-        return false;
-    }
-    memcpy(str, state, len);
-    return true;
-}
-
 static void runstate_init(void)
 {
     const RunStateTransition *p;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/5] migration: switch from .vm_was_running to .vm_old_state
  2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2023-05-17 12:37 ` [PATCH 3/5] runstate: drop unused runstate_store() Vladimir Sementsov-Ogievskiy
@ 2023-05-17 12:37 ` Vladimir Sementsov-Ogievskiy
  2023-05-18 11:20   ` Juan Quintela
  2023-05-17 12:37 ` [PATCH 5/5] migration: restore vmstate on migration failure Vladimir Sementsov-Ogievskiy
  2023-05-18 11:23 ` [PATCH 0/5] Restore vmstate on cancelled/failed migration Juan Quintela
  5 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-17 12:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, leobras, peterx, quintela, vsementsov, yc-core

No logic change here, only refactoring. That's a preparation for next
commit where we finally restore the stopped vm state on migration
failure or cancellation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 migration/migration.c | 11 ++++++-----
 migration/migration.h |  9 ++++++---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index b42d028191..4ccb9f9f3b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1422,7 +1422,7 @@ void migrate_init(MigrationState *s)
 
     s->start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     s->total_time = 0;
-    s->vm_was_running = false;
+    s->vm_old_state = -1;
     s->iteration_initial_bytes = 0;
     s->threshold_size = 0;
 }
@@ -2312,7 +2312,8 @@ static void migration_completion(MigrationState *s)
         qemu_mutex_lock_iothread();
         s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
-        s->vm_was_running = runstate_is_running();
+
+        s->vm_old_state = runstate_get();
         global_state_store();
 
         ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
@@ -2792,12 +2793,12 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_COLO:
         assert(migrate_colo());
         migrate_start_colo_process(s);
-        s->vm_was_running = true;
+        s->vm_old_state = RUN_STATE_RUNNING;
         /* Fallthrough */
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
-        if (s->vm_was_running) {
+        if (s->vm_old_state == RUN_STATE_RUNNING) {
             if (!runstate_check(RUN_STATE_SHUTDOWN)) {
                 vm_start();
             }
@@ -3117,7 +3118,7 @@ static void *bg_migration_thread(void *opaque)
      * transition in vm_stop_force_state() we need to wakeup it up.
      */
     qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
-    s->vm_was_running = runstate_is_running();
+    s->vm_old_state = runstate_get();
 
     global_state_store();
     /* Forcibly stop VM before saving state of vCPUs and devices */
diff --git a/migration/migration.h b/migration/migration.h
index 7721c7658b..a786778c9d 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -25,6 +25,7 @@
 #include "net/announce.h"
 #include "qom/object.h"
 #include "postcopy-ram.h"
+#include "sysemu/runstate.h"
 
 struct PostcopyBlocktimeContext;
 
@@ -310,12 +311,14 @@ struct MigrationState {
     int64_t expected_downtime;
     bool capabilities[MIGRATION_CAPABILITY__MAX];
     int64_t setup_time;
+
     /*
-     * Whether guest was running when we enter the completion stage.
+     * State before stopping the vm by vm_stop_force_state().
      * If migration is interrupted by any reason, we need to continue
-     * running the guest on source.
+     * running the guest on source if it was running or restore its stopped
+     * state.
      */
-    bool vm_was_running;
+    RunState vm_old_state;
 
     /* Flag set once the migration has been asked to enter postcopy */
     bool start_postcopy;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/5] migration: restore vmstate on migration failure
  2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2023-05-17 12:37 ` [PATCH 4/5] migration: switch from .vm_was_running to .vm_old_state Vladimir Sementsov-Ogievskiy
@ 2023-05-17 12:37 ` Vladimir Sementsov-Ogievskiy
  2023-05-18 11:22   ` Juan Quintela
  2023-05-18 11:23 ` [PATCH 0/5] Restore vmstate on cancelled/failed migration Juan Quintela
  5 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-17 12:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: pbonzini, leobras, peterx, quintela, vsementsov, yc-core

1. Otherwise failed migration just drops guest-panicked state, which is
   not good for management software.

2. We do keep different paused states like guest-panicked during
   migration with help of global_state state.

3. We do restore running state on source when migration is cancelled or
   failed.

4. "postmigrate" state is documented as "guest is paused following a
   successful 'migrate'", so originally it's only for successful path
   and we never documented current behavior.

Let's restore paused states like guest-panicked in case of cancel or
fail too. Allow same transitions like for inmigrate state.

This commit changes the behavior that was introduced by commit
42da5550d6 "migration: set state to post-migrate on failure" and
provides a bit different fix on related
  https://bugzilla.redhat.com/show_bug.cgi?id=1355683

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 migration/migration.c | 2 +-
 softmmu/runstate.c    | 8 +++++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4ccb9f9f3b..951897554f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2804,7 +2804,7 @@ static void migration_iteration_finish(MigrationState *s)
             }
         } else {
             if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
-                runstate_set(RUN_STATE_POSTMIGRATE);
+                runstate_set(s->vm_old_state);
             }
         }
         break;
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 0370230a5e..1957caf73f 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -121,7 +121,13 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PAUSED },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
-    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_INTERNAL_ERROR },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_IO_ERROR },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_SHUTDOWN },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_SUSPENDED },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_WATCHDOG },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_GUEST_PANICKED },
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/5] runstate: add runstate_get()
  2023-05-17 12:37 ` [PATCH 1/5] runstate: add runstate_get() Vladimir Sementsov-Ogievskiy
@ 2023-05-18 11:13   ` Juan Quintela
  0 siblings, 0 replies; 16+ messages in thread
From: Juan Quintela @ 2023-05-18 11:13 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> It's necessary to restore the state after failed/cancelled migration in
> further commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Reviewed-by: Juan Quintela <quintela@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] migration: never fail in global_state_store()
  2023-05-17 12:37 ` [PATCH 2/5] migration: never fail in global_state_store() Vladimir Sementsov-Ogievskiy
@ 2023-05-18 11:18   ` Juan Quintela
  2023-05-18 14:43     ` Vladimir Sementsov-Ogievskiy
  2023-05-26  8:28   ` Juan Quintela
  1 sibling, 1 reply; 16+ messages in thread
From: Juan Quintela @ 2023-05-18 11:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> Actually global_state_store() can never fail. Let's get rid of extra
> error paths.
>
> To make things clear, use new runstate_get() and use same approach for
> global_state_store() and global_state_store_running().
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

I don't know.

On one hand, you have removed a lot of code that "can't" happen.

On the other:

> +static void global_state_do_store(RunState state)
>  {
> -    if (!runstate_store((char *)global_state.runstate,
> -                        sizeof(global_state.runstate))) {
> -        error_report("runstate name too big: %s", global_state.runstate);
> -        trace_migrate_state_too_big();
> -        return -EINVAL;
> -    }
> -    return 0;
> +    const char *state_str = RunState_str(state);
> +    assert(strlen(state_str) < sizeof(global_state.runstate));

First: g_assert() please.

Second: We try really hard not to fail during migration and get the
whole qemu back.  One assert is bad.  Specially in a place like this
one, where we know that nothing is broken, simpli that we can't migrate.

Later, Juan.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/5] runstate: drop unused runstate_store()
  2023-05-17 12:37 ` [PATCH 3/5] runstate: drop unused runstate_store() Vladimir Sementsov-Ogievskiy
@ 2023-05-18 11:19   ` Juan Quintela
  0 siblings, 0 replies; 16+ messages in thread
From: Juan Quintela @ 2023-05-18 11:19 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> The function is unused since previous commit. Drop it.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Reviewed-by: Juan Quintela <quintela@redhat.com>

Depending on how the discussions on previous one go.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/5] migration: switch from .vm_was_running to .vm_old_state
  2023-05-17 12:37 ` [PATCH 4/5] migration: switch from .vm_was_running to .vm_old_state Vladimir Sementsov-Ogievskiy
@ 2023-05-18 11:20   ` Juan Quintela
  0 siblings, 0 replies; 16+ messages in thread
From: Juan Quintela @ 2023-05-18 11:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> No logic change here, only refactoring. That's a preparation for next
> commit where we finally restore the stopped vm state on migration
> failure or cancellation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Reviewed-by: Juan Quintela <quintela@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/5] migration: restore vmstate on migration failure
  2023-05-17 12:37 ` [PATCH 5/5] migration: restore vmstate on migration failure Vladimir Sementsov-Ogievskiy
@ 2023-05-18 11:22   ` Juan Quintela
  0 siblings, 0 replies; 16+ messages in thread
From: Juan Quintela @ 2023-05-18 11:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> 1. Otherwise failed migration just drops guest-panicked state, which is
>    not good for management software.
>
> 2. We do keep different paused states like guest-panicked during
>    migration with help of global_state state.
>
> 3. We do restore running state on source when migration is cancelled or
>    failed.
>
> 4. "postmigrate" state is documented as "guest is paused following a
>    successful 'migrate'", so originally it's only for successful path
>    and we never documented current behavior.
>
> Let's restore paused states like guest-panicked in case of cancel or
> fail too. Allow same transitions like for inmigrate state.
>
> This commit changes the behavior that was introduced by commit
> 42da5550d6 "migration: set state to post-migrate on failure" and
> provides a bit different fix on related
>   https://bugzilla.redhat.com/show_bug.cgi?id=1355683
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Reviewed-by: Juan Quintela <quintela@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/5] Restore vmstate on cancelled/failed migration
  2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2023-05-17 12:37 ` [PATCH 5/5] migration: restore vmstate on migration failure Vladimir Sementsov-Ogievskiy
@ 2023-05-18 11:23 ` Juan Quintela
  2023-05-18 14:49   ` Vladimir Sementsov-Ogievskiy
  5 siblings, 1 reply; 16+ messages in thread
From: Juan Quintela @ 2023-05-18 11:23 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> Hi all.
>
> The problem I want to solve is that guest-panicked state may be lost
> when migration is failed (or cancelled) after source stop.
>
> Still, I try to go further and restore all possible paused states in the
> same way. The key patch is the last one and others are refactoring and
> preparation.

Hi

I like and agree with the spirit of the series in general.  But I think
that we need to drop the "never fail in global_state_store()".  We
shouldn't kill a guest because we found a bug on migration.

Later, Juan.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] migration: never fail in global_state_store()
  2023-05-18 11:18   ` Juan Quintela
@ 2023-05-18 14:43     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-18 14:43 UTC (permalink / raw)
  To: quintela; +Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

On 18.05.23 14:18, Juan Quintela wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>> Actually global_state_store() can never fail. Let's get rid of extra
>> error paths.
>>
>> To make things clear, use new runstate_get() and use same approach for
>> global_state_store() and global_state_store_running().
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> 
> I don't know.
> 
> On one hand, you have removed a lot of code that "can't" happen.
> 
> On the other:
> 
>> +static void global_state_do_store(RunState state)
>>   {
>> -    if (!runstate_store((char *)global_state.runstate,
>> -                        sizeof(global_state.runstate))) {
>> -        error_report("runstate name too big: %s", global_state.runstate);
>> -        trace_migrate_state_too_big();
>> -        return -EINVAL;
>> -    }
>> -    return 0;
>> +    const char *state_str = RunState_str(state);
>> +    assert(strlen(state_str) < sizeof(global_state.runstate));
> 
> First: g_assert() please.
> 
> Second: We try really hard not to fail during migration and get the
> whole qemu back.  One assert is bad.  Specially in a place like this
> one, where we know that nothing is broken, simpli that we can't migrate.
> 

On the other hand, having runstate longer than 100 characters means memory corruption, so aborting QEMU is best we can do

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/5] Restore vmstate on cancelled/failed migration
  2023-05-18 11:23 ` [PATCH 0/5] Restore vmstate on cancelled/failed migration Juan Quintela
@ 2023-05-18 14:49   ` Vladimir Sementsov-Ogievskiy
  2023-05-26  7:59     ` Juan Quintela
  0 siblings, 1 reply; 16+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2023-05-18 14:49 UTC (permalink / raw)
  To: quintela; +Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

On 18.05.23 14:23, Juan Quintela wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>> Hi all.
>>
>> The problem I want to solve is that guest-panicked state may be lost
>> when migration is failed (or cancelled) after source stop.
>>
>> Still, I try to go further and restore all possible paused states in the
>> same way. The key patch is the last one and others are refactoring and
>> preparation.
> 
> Hi
> 
> I like and agree with the spirit of the series in general.  But I think
> that we need to drop the "never fail in global_state_store()".  We
> shouldn't kill a guest because we found a bug on migration.
> 

Why migration is better in this sense than non-migration? We have a lot of places where we just assert things instead of creating unreachable error messages. I think assert/abort is always better in such cases. Really, if we fail in this assertion it means that memory is corrupted, and stopping the execution is the best thing to do.

(Should we consider the case that in future we add 100 character length vmstate? I hope we should not)

-- 
Best regards,
Vladimir



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/5] Restore vmstate on cancelled/failed migration
  2023-05-18 14:49   ` Vladimir Sementsov-Ogievskiy
@ 2023-05-26  7:59     ` Juan Quintela
  0 siblings, 0 replies; 16+ messages in thread
From: Juan Quintela @ 2023-05-26  7:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> On 18.05.23 14:23, Juan Quintela wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>>> Hi all.
>>>
>>> The problem I want to solve is that guest-panicked state may be lost
>>> when migration is failed (or cancelled) after source stop.
>>>
>>> Still, I try to go further and restore all possible paused states in the
>>> same way. The key patch is the last one and others are refactoring and
>>> preparation.
>> Hi
>> I like and agree with the spirit of the series in general.  But I
>> think
>> that we need to drop the "never fail in global_state_store()".  We
>> shouldn't kill a guest because we found a bug on migration.
>> 
>
> Why migration is better in this sense than non-migration? We have a
> lot of places where we just assert things instead of creating
> unreachable error messages. I think assert/abort is always better in
> such cases. Really, if we fail in this assertion it means that memory
> is corrupted, and stopping the execution is the best thing to do.
>
> (Should we consider the case that in future we add 100 character length vmstate? I hope we should not)

Ok, I give up and integrate the series as they are O:-)

I agree that this is a case that shouldn't happen, so assert() is not as
out of question.

What I am trying to get migration is to really detect errors and be able
to recover from them.  My long term crusade is getting rid of
qemu_file_get_error() and just check the return value for functions that
do IO.  Yes, it is a big long term because we need to change the whole
interface to something saner.

Later, Juan.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/5] migration: never fail in global_state_store()
  2023-05-17 12:37 ` [PATCH 2/5] migration: never fail in global_state_store() Vladimir Sementsov-Ogievskiy
  2023-05-18 11:18   ` Juan Quintela
@ 2023-05-26  8:28   ` Juan Quintela
  1 sibling, 0 replies; 16+ messages in thread
From: Juan Quintela @ 2023-05-26  8:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, pbonzini, leobras, peterx, yc-core

Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
> Actually global_state_store() can never fail. Let's get rid of extra
> error paths.
>
> To make things clear, use new runstate_get() and use same approach for
> global_state_store() and global_state_store_running().
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Reviewed-by: Juan Quintela <quintela@redhat.com>



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-05-26  8:28 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-17 12:37 [PATCH 0/5] Restore vmstate on cancelled/failed migration Vladimir Sementsov-Ogievskiy
2023-05-17 12:37 ` [PATCH 1/5] runstate: add runstate_get() Vladimir Sementsov-Ogievskiy
2023-05-18 11:13   ` Juan Quintela
2023-05-17 12:37 ` [PATCH 2/5] migration: never fail in global_state_store() Vladimir Sementsov-Ogievskiy
2023-05-18 11:18   ` Juan Quintela
2023-05-18 14:43     ` Vladimir Sementsov-Ogievskiy
2023-05-26  8:28   ` Juan Quintela
2023-05-17 12:37 ` [PATCH 3/5] runstate: drop unused runstate_store() Vladimir Sementsov-Ogievskiy
2023-05-18 11:19   ` Juan Quintela
2023-05-17 12:37 ` [PATCH 4/5] migration: switch from .vm_was_running to .vm_old_state Vladimir Sementsov-Ogievskiy
2023-05-18 11:20   ` Juan Quintela
2023-05-17 12:37 ` [PATCH 5/5] migration: restore vmstate on migration failure Vladimir Sementsov-Ogievskiy
2023-05-18 11:22   ` Juan Quintela
2023-05-18 11:23 ` [PATCH 0/5] Restore vmstate on cancelled/failed migration Juan Quintela
2023-05-18 14:49   ` Vladimir Sementsov-Ogievskiy
2023-05-26  7:59     ` Juan Quintela

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).