All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V5 00/12] fix migration of suspended runstate
@ 2023-11-13 18:33 Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 01/12] cpus: refactor vm_stop Steve Sistare
                   ` (11 more replies)
  0 siblings, 12 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Migration of a guest in the suspended runstate is broken.  The incoming
migration code automatically tries to wake the guest, which is wrong;
the guest should end migration in the same runstate it started.  Further,
for a restored snapshot, the automatic wakeup fails.  The runstate is
RUNNING, but the guest is not.  See the commit messages for the details.

Changes in V2:
  * simplify "start on wakeup request"
  * fix postcopy, snapshot, and background migration
  * refactor fixes for each type of migration
  * explicitly handled suspended events and runstate in tests
  * add test for postcopy and background migration

Changes in V3:
  * rebase to tip
  * fix hang in new function migrate_wait_for_dirty_mem

Changes in V4:
  * rebase to tip
  * add patch for vm_prepare_start (thanks Peter)
  * add patch to preserve cpu ticks

Changes in V5:
  * rebase to tip
  * added patches to completely stop vm in suspended state:
      cpus: refactor vm_stop
      cpus: stop vm in suspended state
  * added patch to partially resume vm in suspended state:
      cpus: start vm in suspended state
  * modified "preserve suspended ..." patches to use the above.
  * deleted patch "preserve cpu ticks if suspended".  stop ticks in
    vm_stop_force_state instead.
  * deleted patch "add runstate function".  defined new helper function
    migrate_new_runstate in "preserve suspended runstate"
  * Added some RB's, but removed other RB's because the patches changed.

Steve Sistare (12):
  cpus: refactor vm_stop
  cpus: stop vm in suspended state
  cpus: pass runstate to vm_prepare_start
  cpus: start vm in suspended state
  migration: preserve suspended runstate
  migration: preserve suspended for snapshot
  migration: preserve suspended for bg_migration
  tests/qtest: migration events
  tests/qtest: option to suspend during migration
  tests/qtest: precopy migration with suspend
  tests/qtest: postcopy migration with suspend
  tests/qtest: background migration with suspend

 backends/tpm/tpm_emulator.c          |   2 +-
 gdbstub/system.c                     |   2 +-
 hw/usb/hcd-ehci.c                    |   2 +-
 hw/usb/redirect.c                    |   2 +-
 hw/xen/xen-hvm-common.c              |   2 +-
 include/migration/snapshot.h         |   7 ++
 include/sysemu/runstate.h            |  12 ++-
 migration/migration-hmp-cmds.c       |  12 ++-
 migration/migration.c                |  40 ++++---
 migration/migration.h                |   1 +
 migration/savevm.c                   |  41 +++----
 system/cpus.c                        |  71 ++++++------
 system/runstate.c                    |  13 +++
 system/vl.c                          |   2 +
 tests/migration/i386/Makefile        |   5 +-
 tests/migration/i386/a-b-bootblock.S |  50 ++++++++-
 tests/migration/i386/a-b-bootblock.h |  26 +++--
 tests/qtest/migration-helpers.c      |  27 ++---
 tests/qtest/migration-helpers.h      |  11 +-
 tests/qtest/migration-test.c         | 202 ++++++++++++++++++++++++++---------
 20 files changed, 362 insertions(+), 168 deletions(-)

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH V5 01/12] cpus: refactor vm_stop
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-20 13:22   ` Fabiano Rosas
  2023-11-13 18:33 ` [PATCH V5 02/12] cpus: stop vm in suspended state Steve Sistare
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Refactor the vm_stop functions into one common subroutine do_vm_stop called
with options.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 system/cpus.c | 44 +++++++++++++++++---------------------------
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/system/cpus.c b/system/cpus.c
index 0848e0d..f72c4be 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -252,10 +252,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
     }
 }
 
-static int do_vm_stop(RunState state, bool send_stop)
+static int do_vm_stop(RunState state, bool send_stop, bool force)
 {
     int ret = 0;
 
+    if (qemu_in_vcpu_thread()) {
+        qemu_system_vmstop_request_prepare();
+        qemu_system_vmstop_request(state);
+        /*
+         * FIXME: should not return to device code in case
+         * vm_stop() has been requested.
+         */
+        cpu_stop_current();
+        return 0;
+    }
+
     if (runstate_is_running()) {
         runstate_set(state);
         cpu_disable_ticks();
@@ -264,6 +275,8 @@ static int do_vm_stop(RunState state, bool send_stop)
         if (send_stop) {
             qapi_event_send_stop();
         }
+    } else if (force) {
+        runstate_set(state);
     }
 
     bdrv_drain_all();
@@ -278,7 +291,7 @@ static int do_vm_stop(RunState state, bool send_stop)
  */
 int vm_shutdown(void)
 {
-    return do_vm_stop(RUN_STATE_SHUTDOWN, false);
+    return do_vm_stop(RUN_STATE_SHUTDOWN, false, false);
 }
 
 bool cpu_can_run(CPUState *cpu)
@@ -656,18 +669,7 @@ void cpu_stop_current(void)
 
 int vm_stop(RunState state)
 {
-    if (qemu_in_vcpu_thread()) {
-        qemu_system_vmstop_request_prepare();
-        qemu_system_vmstop_request(state);
-        /*
-         * FIXME: should not return to device code in case
-         * vm_stop() has been requested.
-         */
-        cpu_stop_current();
-        return 0;
-    }
-
-    return do_vm_stop(state, true);
+    return do_vm_stop(state, true, false);
 }
 
 /**
@@ -723,19 +725,7 @@ void vm_start(void)
    current state is forgotten forever */
 int vm_stop_force_state(RunState state)
 {
-    if (runstate_is_running()) {
-        return vm_stop(state);
-    } else {
-        int ret;
-        runstate_set(state);
-
-        bdrv_drain_all();
-        /* Make sure to return an error if the flush in a previous vm_stop()
-         * failed. */
-        ret = bdrv_flush_all();
-        trace_vm_stop_flush_all(ret);
-        return ret;
-    }
+    return do_vm_stop(state, true, true);
 }
 
 void qmp_memsave(int64_t addr, int64_t size, const char *filename,
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 01/12] cpus: refactor vm_stop Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-20 14:15   ` Fabiano Rosas
  2023-11-20 19:59   ` Peter Xu
  2023-11-13 18:33 ` [PATCH V5 03/12] cpus: pass runstate to vm_prepare_start Steve Sistare
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

A vm in the suspended state is not completely stopped.  The VCPUs have been
paused, but the cpu clock still runs, and runstate notifiers for the
transition to stopped have not been called.  Modify vm_stop_force_state to
completely stop the vm if the current state is suspended, to be called for
live migration and snapshots.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 system/cpus.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/system/cpus.c b/system/cpus.c
index f72c4be..c772708 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
 static int do_vm_stop(RunState state, bool send_stop, bool force)
 {
     int ret = 0;
+    bool running = runstate_is_running();
+    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
 
     if (qemu_in_vcpu_thread()) {
         qemu_system_vmstop_request_prepare();
@@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
         return 0;
     }
 
-    if (runstate_is_running()) {
+    if (running || (suspended && force)) {
         runstate_set(state);
         cpu_disable_ticks();
-        pause_all_vcpus();
+        if (running) {
+            pause_all_vcpus();
+        }
         vm_state_notify(0, state);
         if (send_stop) {
             qapi_event_send_stop();
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 03/12] cpus: pass runstate to vm_prepare_start
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 01/12] cpus: refactor vm_stop Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 02/12] cpus: stop vm in suspended state Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 04/12] cpus: start vm in suspended state Steve Sistare
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

When a vm in the suspended state is migrated, we must call vm_prepare_start
on the destination, so a later system_wakeup properly resumes the guest,
when main_loop_should_exit calls resume_all_vcpus.  However, the runstate
should remain suspended until system_wakeup is called, so allow the caller
to pass the new state to vm_prepare_start, rather than assume the new state
is RUN_STATE_RUNNING.  Modify vm state change handlers that check
RUN_STATE_RUNNING to instead use the running parameter.

No functional change.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
 backends/tpm/tpm_emulator.c | 2 +-
 gdbstub/system.c            | 2 +-
 hw/usb/hcd-ehci.c           | 2 +-
 hw/usb/redirect.c           | 2 +-
 hw/xen/xen-hvm-common.c     | 2 +-
 include/sysemu/runstate.h   | 4 +++-
 system/cpus.c               | 8 ++++----
 7 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c
index f7f1b4a..254fce7 100644
--- a/backends/tpm/tpm_emulator.c
+++ b/backends/tpm/tpm_emulator.c
@@ -904,7 +904,7 @@ static void tpm_emulator_vm_state_change(void *opaque, bool running,
 
     trace_tpm_emulator_vm_state_change(running, state);
 
-    if (!running || state != RUN_STATE_RUNNING || !tpm_emu->relock_storage) {
+    if (!running || !tpm_emu->relock_storage) {
         return;
     }
 
diff --git a/gdbstub/system.c b/gdbstub/system.c
index 783ac14..7ab9f82 100644
--- a/gdbstub/system.c
+++ b/gdbstub/system.c
@@ -570,7 +570,7 @@ int gdb_continue_partial(char *newstates)
             }
         }
 
-        if (vm_prepare_start(step_requested)) {
+        if (vm_prepare_start(step_requested, RUN_STATE_RUNNING)) {
             return 0;
         }
 
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index 19b4534..10c82ce 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -2451,7 +2451,7 @@ static void usb_ehci_vm_state_change(void *opaque, bool running, RunState state)
      * USB-devices which have async handled packages have a packet in the
      * ep queue to match the completion with.
      */
-    if (state == RUN_STATE_RUNNING) {
+    if (running) {
         ehci_advance_async_state(ehci);
     }
 
diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index c9893df..3785bb0 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1403,7 +1403,7 @@ static void usbredir_vm_state_change(void *priv, bool running, RunState state)
 {
     USBRedirDevice *dev = priv;
 
-    if (state == RUN_STATE_RUNNING && dev->parser != NULL) {
+    if (running && dev->parser != NULL) {
         usbredirparser_do_write(dev->parser); /* Flush any pending writes */
     }
 }
diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index 565dc39..47e6cb1 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -623,7 +623,7 @@ void xen_hvm_change_state_handler(void *opaque, bool running,
 
     xen_set_ioreq_server_state(xen_domid,
                                state->ioservid,
-                               (rstate == RUN_STATE_RUNNING));
+                               running);
 }
 
 void xen_exit_notifier(Notifier *n, void *data)
diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index c8c2bd8..9e78c7f 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -46,8 +46,10 @@ void vm_start(void);
  * vm_prepare_start: Prepare for starting/resuming the VM
  *
  * @step_pending: whether any of the CPUs is about to be single-stepped by gdb
+ * @state: the vm state to setup
  */
-int vm_prepare_start(bool step_pending);
+int vm_prepare_start(bool step_pending, RunState state);
+
 int vm_stop(RunState state);
 int vm_stop_force_state(RunState state);
 int vm_shutdown(void);
diff --git a/system/cpus.c b/system/cpus.c
index c772708..4d05d83 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -681,7 +681,7 @@ int vm_stop(RunState state)
  * Returns -1 if the vCPUs are not to be restarted (e.g. if they are already
  * running or in case of an error condition), 0 otherwise.
  */
-int vm_prepare_start(bool step_pending)
+int vm_prepare_start(bool step_pending, RunState state)
 {
     RunState requested;
 
@@ -713,14 +713,14 @@ int vm_prepare_start(bool step_pending)
     qapi_event_send_resume();
 
     cpu_enable_ticks();
-    runstate_set(RUN_STATE_RUNNING);
-    vm_state_notify(1, RUN_STATE_RUNNING);
+    runstate_set(state);
+    vm_state_notify(1, state);
     return 0;
 }
 
 void vm_start(void)
 {
-    if (!vm_prepare_start(false)) {
+    if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
         resume_all_vcpus();
     }
 }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 04/12] cpus: start vm in suspended state
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (2 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 03/12] cpus: pass runstate to vm_prepare_start Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-20 17:20   ` Fabiano Rosas
  2023-11-13 18:33 ` [PATCH V5 05/12] migration: preserve suspended runstate Steve Sistare
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Provide the vm_resume helper which resumes the vm in the requested state,
correctly restoring the suspended state.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/sysemu/runstate.h |  8 ++++++++
 system/cpus.c             | 11 +++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 9e78c7f..3fa28a4 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -50,6 +50,14 @@ void vm_start(void);
  */
 int vm_prepare_start(bool step_pending, RunState state);
 
+/**
+ * vm_resume: If @state is a running state, start the vm and set the state,
+ * else just set the state.
+ *
+ * @state: the state to restore
+ */
+void vm_resume(RunState state);
+
 int vm_stop(RunState state);
 int vm_stop_force_state(RunState state);
 int vm_shutdown(void);
diff --git a/system/cpus.c b/system/cpus.c
index 4d05d83..1d867bb 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -725,6 +725,17 @@ void vm_start(void)
     }
 }
 
+void vm_resume(RunState state)
+{
+    if (state == RUN_STATE_RUNNING) {
+        vm_start();
+    } else if (state == RUN_STATE_SUSPENDED) {
+        vm_prepare_start(false, state);
+    } else {
+        runstate_set(state);
+    }
+}
+
 /* does a state transition even if the VM is already stopped,
    current state is forgotten forever */
 int vm_stop_force_state(RunState state)
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 05/12] migration: preserve suspended runstate
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (3 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 04/12] cpus: start vm in suspended state Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-20 17:30   ` Fabiano Rosas
  2023-11-13 18:33 ` [PATCH V5 06/12] migration: preserve suspended for snapshot Steve Sistare
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

A guest that is migrated in the suspended state automaticaly wakes and
continues execution.  This is wrong; the guest should end migration in
the same state it started.  The root cause is that the outgoing migration
code automatically wakes the guest, then saves the RUNNING runstate in
global_state_store(), hence the incoming migration code thinks the guest is
running and continues the guest if autostart is true.

Simply deleting the call to qemu_system_wakeup_request() on the outgoing
side, to migrate the vm in state suspended, does not solve the problem.
The old vm_stop_force_state does little if the vm is suspended, so the cpu
clock remains running, and runstate notifiers for the stop transition are
not called (and were not called on transition to suspended). Stale cpu
timers_state is saved to the migration stream, causing time errors in the
guest when it wakes from suspend.  State that would have been modified by
runstate notifiers is wrong.

The new version of vm_stop_force_state solves the outgoing problems, by
completely stopping a vm in the suspended state.

On the incoming side for precopy, compute the desired new state from global
state received, and call runstate_restore, which will partially
resume the vm if the state is suspended.  A future system_wakeup monitor
request will cause the vm to resume running.

On the incoming side for postcopy, apply the the same restore logic found
in precopy.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 migration/migration.c | 33 ++++++++++++++++++---------------
 migration/migration.h |  1 +
 migration/savevm.c    |  8 +-------
 3 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 28a34c9..29ed901 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -592,6 +592,7 @@ static void process_incoming_migration_bh(void *opaque)
 {
     Error *local_err = NULL;
     MigrationIncomingState *mis = opaque;
+    RunState state = migrate_new_runstate();
 
     trace_vmstate_downtime_checkpoint("dst-precopy-bh-enter");
 
@@ -602,8 +603,7 @@ static void process_incoming_migration_bh(void *opaque)
      * unless we really are starting the VM.
      */
     if (!migrate_late_block_activate() ||
-         (autostart && (!global_state_received() ||
-            global_state_get_runstate() == RUN_STATE_RUNNING))) {
+        state == RUN_STATE_RUNNING) {
         /* Make sure all file formats throw away their mutable metadata.
          * If we get an error here, just don't restart the VM yet. */
         bdrv_activate_all(&local_err);
@@ -626,19 +626,13 @@ static void process_incoming_migration_bh(void *opaque)
 
     dirty_bitmap_mig_before_vm_start();
 
-    if (!global_state_received() ||
-        global_state_get_runstate() == RUN_STATE_RUNNING) {
-        if (autostart) {
-            vm_start();
-        } else {
-            runstate_set(RUN_STATE_PAUSED);
-        }
-    } else if (migration_incoming_colo_enabled()) {
+    if (migration_incoming_colo_enabled()) {
         migration_incoming_disable_colo();
         vm_start();
     } else {
-        runstate_set(global_state_get_runstate());
+        vm_resume(state);
     }
+
     trace_vmstate_downtime_checkpoint("dst-precopy-bh-vm-started");
     /*
      * This must happen after any state changes since as soon as an external
@@ -1277,6 +1271,16 @@ void migrate_set_state(int *state, int old_state, int new_state)
     }
 }
 
+RunState migrate_new_runstate(void)
+{
+    RunState state = global_state_get_runstate();
+
+    if (!global_state_received() || state == RUN_STATE_RUNNING) {
+        state = autostart ? RUN_STATE_RUNNING : RUN_STATE_PAUSED;
+    }
+    return state;
+}
+
 static void migrate_fd_cleanup(MigrationState *s)
 {
     qemu_bh_delete(s->cleanup_bh);
@@ -2415,7 +2419,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
 
     migration_downtime_start(ms);
 
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     global_state_store();
     ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
     if (ret < 0) {
@@ -2614,7 +2617,6 @@ static int migration_completion_precopy(MigrationState *s,
 
     qemu_mutex_lock_iothread();
     migration_downtime_start(s);
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 
     s->vm_old_state = runstate_get();
     global_state_store();
@@ -3135,9 +3137,10 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
-        if (s->vm_old_state == RUN_STATE_RUNNING) {
+        if (s->vm_old_state == RUN_STATE_RUNNING ||
+            s->vm_old_state == RUN_STATE_SUSPENDED) {
             if (!runstate_check(RUN_STATE_SHUTDOWN)) {
-                vm_start();
+                vm_resume(s->vm_old_state);
             }
         } else {
             if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
diff --git a/migration/migration.h b/migration/migration.h
index cf2c9c8..5b7aa89 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -473,6 +473,7 @@ struct MigrationState {
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
+RunState migrate_new_runstate(void);
 
 void migration_fd_process_incoming(QEMUFile *f, Error **errp);
 void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
diff --git a/migration/savevm.c b/migration/savevm.c
index eec5503..78ac2bd 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2163,13 +2163,7 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
 
     dirty_bitmap_mig_before_vm_start();
 
-    if (autostart) {
-        /* Hold onto your hats, starting the CPU */
-        vm_start();
-    } else {
-        /* leave it paused and let management decide when to start the CPU */
-        runstate_set(RUN_STATE_PAUSED);
-    }
+    vm_resume(migrate_new_runstate());
 
     qemu_bh_delete(mis->bh);
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 06/12] migration: preserve suspended for snapshot
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (4 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 05/12] migration: preserve suspended runstate Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-20 18:13   ` Fabiano Rosas
  2023-11-13 18:33 ` [PATCH V5 07/12] migration: preserve suspended for bg_migration Steve Sistare
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Restoring a snapshot can break a suspended guest.  Snapshots suffer from
the same suspended-state issues that affect live migration, plus they must
handle an additional problematic scenario, which is that a running vm must
remain running if it loads a suspended snapshot.

To save, call vm_stop_force_state to completely stop a vm in the suspended
state, and restore the suspended state using runstate_restore.  This
produces a correct vmstate file and leaves the vm in the state it had prior
to the save.

To load, if the snapshot is not suspended, then vm_stop_force_state +
runstate_restore correctly handles all states, and leaves the vm in the
state it had prior to the load.  However, if the snapshot is suspended,
restoration is trickier.  First restore the state to suspended so the
current state matches the saved state.  Then, if the pre-load state is
running, wakeup to resume running.

Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
RUN_STATE_RESTORE_VM did not change runstate if the current state was
paused, suspended, or prelaunch, but now vm_stop_force_state forces these
transitions, so allow them.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/migration/snapshot.h   |  7 +++++++
 migration/migration-hmp-cmds.c | 12 ++++++++----
 migration/savevm.c             | 33 +++++++++++++++++++++------------
 system/runstate.c              | 10 ++++++++++
 system/vl.c                    |  2 ++
 5 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index e72083b..9e4dcaa 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -16,6 +16,7 @@
 #define QEMU_MIGRATION_SNAPSHOT_H
 
 #include "qapi/qapi-builtin-types.h"
+#include "qapi/qapi-types-run-state.h"
 
 /**
  * save_snapshot: Save an internal snapshot.
@@ -61,4 +62,10 @@ bool delete_snapshot(const char *name,
                     bool has_devices, strList *devices,
                     Error **errp);
 
+/**
+ * load_snapshot_resume: Restore runstate after loading snapshot.
+ * @state: state to restore
+ */
+void load_snapshot_resume(RunState state);
+
 #endif
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 86ae832..c31cdc7 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -399,15 +399,19 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
 
 void hmp_loadvm(Monitor *mon, const QDict *qdict)
 {
-    int saved_vm_running  = runstate_is_running();
+    RunState saved_state = runstate_get();
+
     const char *name = qdict_get_str(qdict, "name");
     Error *err = NULL;
 
-    vm_stop(RUN_STATE_RESTORE_VM);
+    vm_stop_force_state(RUN_STATE_RESTORE_VM);
 
-    if (load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
-        vm_start();
+    if (load_snapshot(name, NULL, false, NULL, &err)) {
+        load_snapshot_resume(saved_state);
+    } else {
+        vm_resume(saved_state);
     }
+
     hmp_handle_error(mon, err);
 }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index 78ac2bd..b4b49bb 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3040,7 +3040,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
     QEMUSnapshotInfo sn1, *sn = &sn1;
     int ret = -1, ret2;
     QEMUFile *f;
-    int saved_vm_running;
+    RunState saved_state = runstate_get();
     uint64_t vm_state_size;
     g_autoptr(GDateTime) now = g_date_time_new_now_local();
     AioContext *aio_context;
@@ -3088,10 +3088,8 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
     }
     aio_context = bdrv_get_aio_context(bs);
 
-    saved_vm_running = runstate_is_running();
-
     global_state_store();
-    vm_stop(RUN_STATE_SAVE_VM);
+    vm_stop_force_state(RUN_STATE_SAVE_VM);
 
     bdrv_drain_all_begin();
 
@@ -3157,9 +3155,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 
     bdrv_drain_all_end();
 
-    if (saved_vm_running) {
-        vm_start();
-    }
+    vm_resume(saved_state);
     return ret == 0;
 }
 
@@ -3333,6 +3329,20 @@ err_drain:
     return false;
 }
 
+void load_snapshot_resume(RunState state)
+{
+    if (global_state_received() &&
+        global_state_get_runstate() == RUN_STATE_SUSPENDED) {
+
+        vm_resume(RUN_STATE_SUSPENDED);
+        if (state == RUN_STATE_RUNNING) {
+            qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
+        }
+    } else {
+        vm_resume(state);
+    }
+}
+
 bool delete_snapshot(const char *name, bool has_devices,
                      strList *devices, Error **errp)
 {
@@ -3397,16 +3407,15 @@ static void snapshot_load_job_bh(void *opaque)
 {
     Job *job = opaque;
     SnapshotJob *s = container_of(job, SnapshotJob, common);
-    int orig_vm_running;
+    RunState orig_state = runstate_get();
 
     job_progress_set_remaining(&s->common, 1);
 
-    orig_vm_running = runstate_is_running();
-    vm_stop(RUN_STATE_RESTORE_VM);
+    vm_stop_force_state(RUN_STATE_RESTORE_VM);
 
     s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
-    if (s->ret && orig_vm_running) {
-        vm_start();
+    if (s->ret) {
+        load_snapshot_resume(orig_state);
     }
 
     job_progress_update(&s->common, 1);
diff --git a/system/runstate.c b/system/runstate.c
index ea9d6c2..f1d4bc7 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -77,6 +77,8 @@ typedef struct {
 
 static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
+    { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
+    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
     { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
@@ -108,6 +110,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
     { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
     { RUN_STATE_PAUSED, RUN_STATE_COLO},
+    { RUN_STATE_PAUSED, RUN_STATE_SAVE_VM},
+    { RUN_STATE_PAUSED, RUN_STATE_RESTORE_VM},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -131,6 +135,8 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_RESTORE_VM, RUN_STATE_PAUSED },
+    { RUN_STATE_RESTORE_VM, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_COLO, RUN_STATE_RUNNING },
     { RUN_STATE_COLO, RUN_STATE_PRELAUNCH },
@@ -149,6 +155,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_COLO},
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
+    { RUN_STATE_SAVE_VM, RUN_STATE_PAUSED },
+    { RUN_STATE_SAVE_VM, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_SHUTDOWN, RUN_STATE_PAUSED },
     { RUN_STATE_SHUTDOWN, RUN_STATE_FINISH_MIGRATE },
@@ -161,6 +169,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
     { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
     { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
+    { RUN_STATE_SUSPENDED, RUN_STATE_SAVE_VM },
+    { RUN_STATE_SUSPENDED, RUN_STATE_RESTORE_VM },
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
diff --git a/system/vl.c b/system/vl.c
index bd7fad7..082a45a 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2702,7 +2702,9 @@ void qmp_x_exit_preconfig(Error **errp)
     qemu_machine_creation_done();
 
     if (loadvm) {
+        RunState state = autostart ? RUN_STATE_RUNNING : runstate_get();
         load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
+        load_snapshot_resume(state);
     }
     if (replay_mode != REPLAY_MODE_NONE) {
         replay_vmstate_init();
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 07/12] migration: preserve suspended for bg_migration
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (5 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 06/12] migration: preserve suspended for snapshot Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-20 18:18   ` Fabiano Rosas
  2023-11-13 18:33 ` [PATCH V5 08/12] tests/qtest: migration events Steve Sistare
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Do not wake a suspended guest during bg_migration, and restore the prior
state at finish rather than unconditionally running.  Allow the additional
state transitions that occur because bg migration forces RUN_STATE_PAUSED to
save the precopy device state.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 migration/migration.c | 7 +------
 system/runstate.c     | 3 +++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 29ed901..49ec836 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3394,7 +3394,7 @@ static void bg_migration_vm_start_bh(void *opaque)
     qemu_bh_delete(s->vm_start_bh);
     s->vm_start_bh = NULL;
 
-    vm_start();
+    vm_resume(s->vm_old_state);
     migration_downtime_end(s);
 }
 
@@ -3466,11 +3466,6 @@ static void *bg_migration_thread(void *opaque)
 
     qemu_mutex_lock_iothread();
 
-    /*
-     * If VM is currently in suspended state, then, to make a valid runstate
-     * transition in vm_stop_force_state() we need to wakeup it up.
-     */
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     s->vm_old_state = runstate_get();
 
     global_state_store();
diff --git a/system/runstate.c b/system/runstate.c
index f1d4bc7..68cc485 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -112,6 +112,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PAUSED, RUN_STATE_COLO},
     { RUN_STATE_PAUSED, RUN_STATE_SAVE_VM},
     { RUN_STATE_PAUSED, RUN_STATE_RESTORE_VM},
+    { RUN_STATE_PAUSED, RUN_STATE_SUSPENDED},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -171,6 +172,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
     { RUN_STATE_SUSPENDED, RUN_STATE_SAVE_VM },
     { RUN_STATE_SUSPENDED, RUN_STATE_RESTORE_VM },
+    { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
+    { RUN_STATE_SUSPENDED, RUN_STATE_SHUTDOWN },
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 08/12] tests/qtest: migration events
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (6 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 07/12] migration: preserve suspended for bg_migration Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 09/12] tests/qtest: option to suspend during migration Steve Sistare
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Define a state object to capture events seen by migration tests, to allow
more events to be captured in a subsequent patch, and simplify event
checking in wait_for_migration_pass.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-helpers.c | 24 ++++-------
 tests/qtest/migration-helpers.h |  9 ++--
 tests/qtest/migration-test.c    | 91 +++++++++++++++++++----------------------
 3 files changed, 56 insertions(+), 68 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 24fb7b3..fd3b94e 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -24,26 +24,16 @@
  */
 #define MIGRATION_STATUS_WAIT_TIMEOUT 120
 
-bool migrate_watch_for_stop(QTestState *who, const char *name,
-                            QDict *event, void *opaque)
-{
-    bool *seen = opaque;
-
-    if (g_str_equal(name, "STOP")) {
-        *seen = true;
-        return true;
-    }
-
-    return false;
-}
-
-bool migrate_watch_for_resume(QTestState *who, const char *name,
+bool migrate_watch_for_events(QTestState *who, const char *name,
                               QDict *event, void *opaque)
 {
-    bool *seen = opaque;
+    QTestMigrationState *state = opaque;
 
-    if (g_str_equal(name, "RESUME")) {
-        *seen = true;
+    if (g_str_equal(name, "STOP")) {
+        state->stop_seen = true;
+        return true;
+    } else if (g_str_equal(name, "RESUME")) {
+        state->resume_seen = true;
         return true;
     }
 
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index e31dc85..3d32699 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -15,9 +15,12 @@
 
 #include "libqtest.h"
 
-bool migrate_watch_for_stop(QTestState *who, const char *name,
-                            QDict *event, void *opaque);
-bool migrate_watch_for_resume(QTestState *who, const char *name,
+typedef struct QTestMigrationState {
+    bool stop_seen;
+    bool resume_seen;
+} QTestMigrationState;
+
+bool migrate_watch_for_events(QTestState *who, const char *name,
                               QDict *event, void *opaque);
 
 G_GNUC_PRINTF(3, 4)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 5752412..59fbbef 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -43,8 +43,8 @@
 unsigned start_address;
 unsigned end_address;
 static bool uffd_feature_thread_id;
-static bool got_src_stop;
-static bool got_dst_resume;
+static QTestMigrationState src_state;
+static QTestMigrationState dst_state;
 
 /*
  * An initial 3 MB offset is used as that corresponds
@@ -230,6 +230,20 @@ static void wait_for_serial(const char *side)
     } while (true);
 }
 
+static void wait_for_stop(QTestState *who, QTestMigrationState *state)
+{
+    if (!state->stop_seen) {
+        qtest_qmp_eventwait(who, "STOP");
+    }
+}
+
+static void wait_for_resume(QTestState *who, QTestMigrationState *state)
+{
+    if (!state->resume_seen) {
+        qtest_qmp_eventwait(who, "RESUME");
+    }
+}
+
 /*
  * It's tricky to use qemu's migration event capability with qtest,
  * events suddenly appearing confuse the qmp()/hmp() responses.
@@ -277,21 +291,19 @@ static void read_blocktime(QTestState *who)
     qobject_unref(rsp_return);
 }
 
+/*
+ * Wait for two changes in the migration pass count, but bail if we stop.
+ */
 static void wait_for_migration_pass(QTestState *who)
 {
-    uint64_t initial_pass = get_migration_pass(who);
-    uint64_t pass;
+    uint64_t pass, prev_pass = 0, changes = 0;
 
-    /* Wait for the 1st sync */
-    while (!got_src_stop && !initial_pass) {
-        usleep(1000);
-        initial_pass = get_migration_pass(who);
-    }
-
-    do {
+    while (changes < 2 && !src_state.stop_seen) {
         usleep(1000);
         pass = get_migration_pass(who);
-    } while (pass == initial_pass && !got_src_stop);
+        changes += (pass != prev_pass);
+        prev_pass = pass;
+    }
 }
 
 static void check_guests_ram(QTestState *who)
@@ -617,10 +629,7 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
 {
     qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
 
-    if (!got_src_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
-
+    wait_for_stop(from, &src_state);
     qtest_qmp_eventwait(to, "RESUME");
 }
 
@@ -756,8 +765,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         }
     }
 
-    got_src_stop = false;
-    got_dst_resume = false;
+    dst_state = (QTestMigrationState) { };
+    src_state = (QTestMigrationState) { };
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         memory_size = "150M";
 
@@ -848,8 +857,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     if (!args->only_target) {
         *from = qtest_init_with_env(QEMU_ENV_SRC, cmd_source);
         qtest_qmp_set_event_callback(*from,
-                                     migrate_watch_for_stop,
-                                     &got_src_stop);
+                                     migrate_watch_for_events,
+                                     &src_state);
     }
 
     cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
@@ -869,8 +878,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                  ignore_stderr);
     *to = qtest_init_with_env(QEMU_ENV_DST, cmd_target);
     qtest_qmp_set_event_callback(*to,
-                                 migrate_watch_for_resume,
-                                 &got_dst_resume);
+                                 migrate_watch_for_events,
+                                 &dst_state);
 
     /*
      * Remove shmem file immediately to avoid memory leak in test failed case.
@@ -1717,9 +1726,7 @@ static void test_precopy_common(MigrateCommon *args)
          */
         if (args->result == MIG_TEST_SUCCEED) {
             qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
-            if (!got_src_stop) {
-                qtest_qmp_eventwait(from, "STOP");
-            }
+            wait_for_stop(from, &src_state);
             migrate_ensure_converge(from);
         }
     }
@@ -1765,9 +1772,8 @@ static void test_precopy_common(MigrateCommon *args)
              */
             wait_for_migration_complete(from);
 
-            if (!got_src_stop) {
-                qtest_qmp_eventwait(from, "STOP");
-            }
+            wait_for_stop(from, &src_state);
+
         } else {
             wait_for_migration_complete(from);
             /*
@@ -1780,9 +1786,7 @@ static void test_precopy_common(MigrateCommon *args)
             qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
         }
 
-        if (!got_dst_resume) {
-            qtest_qmp_eventwait(to, "RESUME");
-        }
+        wait_for_resume(to, &dst_state);
 
         wait_for_serial("dest_serial");
     }
@@ -1821,9 +1825,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
 
     if (stop_src) {
         qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
-        if (!got_src_stop) {
-            qtest_qmp_eventwait(from, "STOP");
-        }
+        wait_for_stop(from, &src_state);
     }
 
     if (args->result == MIG_TEST_QMP_ERROR) {
@@ -1844,10 +1846,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
     if (stop_src) {
         qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
     }
-
-    if (!got_dst_resume) {
-        qtest_qmp_eventwait(to, "RESUME");
-    }
+    wait_for_resume(to, &dst_state);
 
     wait_for_serial("dest_serial");
 
@@ -1966,9 +1965,7 @@ static void test_ignore_shared(void)
 
     migrate_wait_for_dirty_mem(from, to);
 
-    if (!got_src_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
+    wait_for_stop(from, &src_state);
 
     qtest_qmp_eventwait(to, "RESUME");
 
@@ -2503,7 +2500,7 @@ static void test_migrate_auto_converge(void)
             break;
         }
         usleep(20);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     } while (true);
     /* The first percentage of throttling should be at least init_pct */
     g_assert_cmpint(percentage, >=, init_pct);
@@ -2842,9 +2839,7 @@ static void test_multifd_tcp_cancel(void)
 
     migrate_ensure_converge(from);
 
-    if (!got_src_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
+    wait_for_stop(from, &src_state);
     qtest_qmp_eventwait(to2, "RESUME");
 
     wait_for_serial("dest_serial");
@@ -3177,7 +3172,7 @@ static void test_migrate_dirty_limit(void)
         throttle_us_per_full =
         read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
         usleep(100);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     }
 
     /* Now cancel migrate and wait for dirty limit throttle switch off */
@@ -3189,7 +3184,7 @@ static void test_migrate_dirty_limit(void)
         throttle_us_per_full =
         read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
         usleep(100);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     } while (throttle_us_per_full != 0 && --max_try_count);
 
     /* Assert dirty limit is not in service */
@@ -3218,7 +3213,7 @@ static void test_migrate_dirty_limit(void)
         throttle_us_per_full =
         read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
         usleep(100);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     }
 
     /*
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 09/12] tests/qtest: option to suspend during migration
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (7 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 08/12] tests/qtest: migration events Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 10/12] tests/qtest: precopy migration with suspend Steve Sistare
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add an option to suspend the src in a-b-bootblock.S, which puts the guest
in S3 state after one round of writing to memory.  The option is enabled by
poking a 1 into the suspend_me word in the boot block prior to starting the
src vm.  Generate symbol offsets in a-b-bootblock.h so that the suspend_me
offset is known.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
---
 tests/migration/i386/Makefile        |  5 ++--
 tests/migration/i386/a-b-bootblock.S | 50 +++++++++++++++++++++++++++++++++---
 tests/migration/i386/a-b-bootblock.h | 26 +++++++++++++------
 tests/qtest/migration-test.c         |  8 ++++--
 4 files changed, 74 insertions(+), 15 deletions(-)

diff --git a/tests/migration/i386/Makefile b/tests/migration/i386/Makefile
index 5c03241..37a72ae 100644
--- a/tests/migration/i386/Makefile
+++ b/tests/migration/i386/Makefile
@@ -4,9 +4,10 @@
 .PHONY: all clean
 all: a-b-bootblock.h
 
-a-b-bootblock.h: x86.bootsect
+a-b-bootblock.h: x86.bootsect x86.o
 	echo "$$__note" > header.tmp
 	xxd -i $< | sed -e 's/.*int.*//' >> header.tmp
+	nm x86.o | awk '{print "#define SYM_"$$3" 0x"$$1}' >> header.tmp
 	mv header.tmp $@
 
 x86.bootsect: x86.boot
@@ -16,7 +17,7 @@ x86.boot: x86.o
 	$(CROSS_PREFIX)objcopy -O binary $< $@
 
 x86.o: a-b-bootblock.S
-	$(CROSS_PREFIX)gcc -m32 -march=i486 -c $< -o $@
+	$(CROSS_PREFIX)gcc -I.. -m32 -march=i486 -c $< -o $@
 
 clean:
 	@rm -rf *.boot *.o *.bootsect
diff --git a/tests/migration/i386/a-b-bootblock.S b/tests/migration/i386/a-b-bootblock.S
index 6bb9999..6f39eb6 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -9,6 +9,23 @@
 #
 # Author: dgilbert@redhat.com
 
+#include "migration-test.h"
+
+#define ACPI_ENABLE         0xf1
+#define ACPI_PORT_SMI_CMD   0xb2
+#define ACPI_PM_BASE        0x600
+#define PM1A_CNT_OFFSET     4
+
+#define ACPI_SCI_ENABLE     0x0001
+#define ACPI_SLEEP_TYPE     0x0400
+#define ACPI_SLEEP_ENABLE   0x2000
+#define SLEEP (ACPI_SCI_ENABLE + ACPI_SLEEP_TYPE + ACPI_SLEEP_ENABLE)
+
+#define LOW_ADDR            X86_TEST_MEM_START
+#define HIGH_ADDR           X86_TEST_MEM_END
+
+/* Save the suspended status at an address that is not written in the loop. */
+#define suspended           (X86_TEST_MEM_START + 4)
 
 .code16
 .org 0x7c00
@@ -35,8 +52,8 @@ start:             # at 0x7c00 ?
         mov %eax,%ds
 
 # Start from 1MB
-.set TEST_MEM_START, (1024*1024)
-.set TEST_MEM_END, (100*1024*1024)
+.set TEST_MEM_START, X86_TEST_MEM_START
+.set TEST_MEM_END, X86_TEST_MEM_END
 
         mov $65,%ax
         mov $0x3f8,%dx
@@ -69,7 +86,30 @@ innerloop:
         mov $0x3f8,%dx
         outb %al,%dx
 
-        jmp mainloop
+        # should this test suspend?
+        mov (suspend_me),%eax
+        cmp $0,%eax
+        je mainloop
+
+        # are we waking after suspend?  do not suspend again.
+        mov $suspended,%eax
+        mov (%eax),%eax
+        cmp $1,%eax
+        je mainloop
+
+        # enable acpi
+        mov $ACPI_ENABLE,%al
+        outb %al,$ACPI_PORT_SMI_CMD
+
+        # suspend to ram
+        mov $suspended,%eax
+        movl $1,(%eax)
+        mov $SLEEP,%ax
+        mov $(ACPI_PM_BASE + PM1A_CNT_OFFSET),%dx
+        outw %ax,%dx
+        # not reached.  The wakeup causes reset and restart at 0x7c00, and we
+        # do not save and restore registers as a real kernel would do.
+
 
         # GDT magic from old (GPLv2)  Grub startup.S
         .p2align        2       /* force 4-byte alignment */
@@ -95,6 +135,10 @@ gdtdesc:
         .word   0x27                    /* limit */
         .long   gdt                     /* addr */
 
+        /* test launcher can poke a 1 here to exercise suspend */
+suspend_me:
+        .int  0
+
 /* I'm a bootable disk */
 .org 0x7dfe
         .byte 0x55
diff --git a/tests/migration/i386/a-b-bootblock.h b/tests/migration/i386/a-b-bootblock.h
index 5b52391..c83f871 100644
--- a/tests/migration/i386/a-b-bootblock.h
+++ b/tests/migration/i386/a-b-bootblock.h
@@ -4,7 +4,7 @@
  * the header and the assembler differences in your patch submission.
  */
 unsigned char x86_bootsect[] = {
-  0xfa, 0x0f, 0x01, 0x16, 0x8c, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
+  0xfa, 0x0f, 0x01, 0x16, 0xb8, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
   0x0f, 0x22, 0xc0, 0x66, 0xea, 0x20, 0x7c, 0x00, 0x00, 0x08, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe4, 0x92, 0x0c, 0x02,
   0xe6, 0x92, 0xb8, 0x10, 0x00, 0x00, 0x00, 0x8e, 0xd8, 0x66, 0xb8, 0x41,
@@ -13,13 +13,13 @@ unsigned char x86_bootsect[] = {
   0x40, 0x06, 0x7c, 0xf1, 0xb8, 0x00, 0x00, 0x10, 0x00, 0xfe, 0x00, 0x05,
   0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40, 0x06, 0x7c, 0xf2, 0xfe,
   0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8, 0x42, 0x00, 0x66, 0xba,
-  0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00,
-  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0x74, 0x7c,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0xf8, 0x03, 0xee, 0xa1, 0xbe, 0x7c, 0x00, 0x00, 0x83, 0xf8, 0x00, 0x74,
+  0xd3, 0xb8, 0x04, 0x00, 0x10, 0x00, 0x8b, 0x00, 0x83, 0xf8, 0x01, 0x74,
+  0xc7, 0xb0, 0xf1, 0xe6, 0xb2, 0xb8, 0x04, 0x00, 0x10, 0x00, 0xc7, 0x00,
+  0x01, 0x00, 0x00, 0x00, 0x66, 0xb8, 0x01, 0x24, 0x66, 0xba, 0x04, 0x06,
+  0x66, 0xef, 0x66, 0x90, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00, 0xff, 0xff, 0x00, 0x00,
+  0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0xa0, 0x7c, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -49,3 +49,13 @@ unsigned char x86_bootsect[] = {
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0xaa
 };
 
+#define SYM_do_zero 0x00007c3d
+#define SYM_gdt 0x00007ca0
+#define SYM_gdtdesc 0x00007cb8
+#define SYM_innerloop 0x00007c51
+#define SYM_mainloop 0x00007c4c
+#define SYM_pre_zero 0x00007c38
+#define SYM_start 0x00007c00
+#define SYM_suspend_me 0x00007cbe
+#define SYM_TEST_MEM_END 0x06400000
+#define SYM_TEST_MEM_START 0x00100000
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 59fbbef..bef1430 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -133,7 +133,7 @@ static char *bootpath;
 #include "tests/migration/aarch64/a-b-kernel.h"
 #include "tests/migration/s390x/a-b-bios.h"
 
-static void bootfile_create(char *dir)
+static void bootfile_create(char *dir, bool suspend_me)
 {
     const char *arch = qtest_get_arch();
     unsigned char *content;
@@ -143,6 +143,7 @@ static void bootfile_create(char *dir)
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         /* the assembled x86 boot sector should be exactly one sector large */
         g_assert(sizeof(x86_bootsect) == 512);
+        x86_bootsect[SYM_suspend_me - SYM_start] = suspend_me;
         content = x86_bootsect;
         len = sizeof(x86_bootsect);
     } else if (g_str_equal(arch, "s390x")) {
@@ -646,6 +647,8 @@ typedef struct {
     bool use_dirty_ring;
     const char *opts_source;
     const char *opts_target;
+    /* suspend the src before migrating to dest. */
+    bool suspend_me;
 } MigrateStart;
 
 /*
@@ -767,6 +770,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
 
     dst_state = (QTestMigrationState) { };
     src_state = (QTestMigrationState) { };
+    bootfile_create(tmpfs, args->suspend_me);
+
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         memory_size = "150M";
 
@@ -3329,7 +3334,6 @@ int main(int argc, char **argv)
                        g_get_tmp_dir(), err->message);
     }
     g_assert(tmpfs);
-    bootfile_create(tmpfs);
 
     module_call_init(MODULE_INIT_QOM);
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 10/12] tests/qtest: precopy migration with suspend
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (8 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 09/12] tests/qtest: option to suspend during migration Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-13 18:33 ` [PATCH V5 11/12] tests/qtest: postcopy " Steve Sistare
  2023-11-13 18:34 ` [PATCH V5 12/12] tests/qtest: background " Steve Sistare
  11 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a test case to verify that the suspended state is handled correctly
during live migration precopy.  The test suspends the src, migrates, then
wakes the dest.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-helpers.c |  3 ++
 tests/qtest/migration-helpers.h |  2 ++
 tests/qtest/migration-test.c    | 69 +++++++++++++++++++++++++++++++++++++----
 3 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index fd3b94e..37e8e81 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -32,6 +32,9 @@ bool migrate_watch_for_events(QTestState *who, const char *name,
     if (g_str_equal(name, "STOP")) {
         state->stop_seen = true;
         return true;
+    } else if (g_str_equal(name, "SUSPEND")) {
+        state->suspend_seen = true;
+        return true;
     } else if (g_str_equal(name, "RESUME")) {
         state->resume_seen = true;
         return true;
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 3d32699..b478549 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -18,6 +18,8 @@
 typedef struct QTestMigrationState {
     bool stop_seen;
     bool resume_seen;
+    bool suspend_seen;
+    bool suspend_me;
 } QTestMigrationState;
 
 bool migrate_watch_for_events(QTestState *who, const char *name,
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index bef1430..42706c9 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -178,7 +178,7 @@ static void bootfile_delete(void)
 /*
  * Wait for some output in the serial output file,
  * we get an 'A' followed by an endless string of 'B's
- * but on the destination we won't have the A.
+ * but on the destination we won't have the A (unless we enabled suspend/resume)
  */
 static void wait_for_serial(const char *side)
 {
@@ -245,6 +245,13 @@ static void wait_for_resume(QTestState *who, QTestMigrationState *state)
     }
 }
 
+static void wait_for_suspend(QTestState *who, QTestMigrationState *state)
+{
+    if (!state->suspend_seen) {
+        qtest_qmp_eventwait(who, "SUSPEND");
+    }
+}
+
 /*
  * It's tricky to use qemu's migration event capability with qtest,
  * events suddenly appearing confuse the qmp()/hmp() responses.
@@ -299,7 +306,7 @@ static void wait_for_migration_pass(QTestState *who)
 {
     uint64_t pass, prev_pass = 0, changes = 0;
 
-    while (changes < 2 && !src_state.stop_seen) {
+    while (changes < 2 && !src_state.stop_seen && !src_state.suspend_seen) {
         usleep(1000);
         pass = get_migration_pass(who);
         changes += (pass != prev_pass);
@@ -595,7 +602,8 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
     watch_byte = qtest_readb(from, watch_address);
     do {
         usleep(1000 * 10);
-    } while (qtest_readb(from, watch_address) == watch_byte);
+    } while (qtest_readb(from, watch_address) == watch_byte &&
+             !src_state.suspend_seen);
 }
 
 
@@ -771,6 +779,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     dst_state = (QTestMigrationState) { };
     src_state = (QTestMigrationState) { };
     bootfile_create(tmpfs, args->suspend_me);
+    src_state.suspend_me = args->suspend_me;
 
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         memory_size = "150M";
@@ -1730,8 +1739,12 @@ static void test_precopy_common(MigrateCommon *args)
          * change anything.
          */
         if (args->result == MIG_TEST_SUCCEED) {
-            qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
-            wait_for_stop(from, &src_state);
+            if (src_state.suspend_me) {
+                wait_for_suspend(from, &src_state);
+            } else {
+                qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
+                wait_for_stop(from, &src_state);
+            }
             migrate_ensure_converge(from);
         }
     }
@@ -1777,6 +1790,9 @@ static void test_precopy_common(MigrateCommon *args)
              */
             wait_for_migration_complete(from);
 
+            if (src_state.suspend_me) {
+                wait_for_suspend(from, &src_state);
+            }
             wait_for_stop(from, &src_state);
 
         } else {
@@ -1793,6 +1809,11 @@ static void test_precopy_common(MigrateCommon *args)
 
         wait_for_resume(to, &dst_state);
 
+        if (args->start.suspend_me) {
+            /* wakeup succeeds only if guest is suspended */
+            qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
+        }
+
         wait_for_serial("dest_serial");
     }
 
@@ -1879,6 +1900,34 @@ static void test_precopy_unix_plain(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_unix_suspend_live(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+        /*
+         * despite being live, the test is fast because the src
+         * suspends immediately.
+         */
+        .live = true,
+        .start.suspend_me = true,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_unix_suspend_notlive(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+        .start.suspend_me = true,
+    };
+
+    test_precopy_common(&args);
+}
 
 static void test_precopy_unix_dirty_ring(void)
 {
@@ -3277,7 +3326,7 @@ static bool kvm_dirty_ring_supported(void)
 int main(int argc, char **argv)
 {
     bool has_kvm, has_tcg;
-    bool has_uffd;
+    bool has_uffd, is_x86;
     const char *arch;
     g_autoptr(GError) err = NULL;
     const char *qemu_src = getenv(QEMU_ENV_SRC);
@@ -3307,6 +3356,7 @@ int main(int argc, char **argv)
 
     has_uffd = ufd_version_check();
     arch = qtest_get_arch();
+    is_x86 = !strcmp(arch, "i386") || !strcmp(arch, "x86_64");
 
     /*
      * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
@@ -3337,6 +3387,13 @@ int main(int argc, char **argv)
 
     module_call_init(MODULE_INIT_QOM);
 
+    if (is_x86) {
+        qtest_add_func("/migration/precopy/unix/suspend/live",
+                       test_precopy_unix_suspend_live);
+        qtest_add_func("/migration/precopy/unix/suspend/notlive",
+                       test_precopy_unix_suspend_notlive);
+    }
+
     if (has_uffd) {
         qtest_add_func("/migration/postcopy/plain", test_postcopy);
         qtest_add_func("/migration/postcopy/recovery/plain",
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 11/12] tests/qtest: postcopy migration with suspend
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (9 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 10/12] tests/qtest: precopy migration with suspend Steve Sistare
@ 2023-11-13 18:33 ` Steve Sistare
  2023-11-13 18:34 ` [PATCH V5 12/12] tests/qtest: background " Steve Sistare
  11 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a test case to verify that the suspended state is handled correctly by
live migration postcopy.  The test suspends the src, migrates, then wakes
the dest.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-test.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 42706c9..0471f92 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -638,8 +638,12 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
 {
     qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
 
-    wait_for_stop(from, &src_state);
-    qtest_qmp_eventwait(to, "RESUME");
+    if (src_state.suspend_me) {
+        wait_for_suspend(from, &src_state);
+    } else {
+        wait_for_stop(from, &src_state);
+        qtest_qmp_eventwait(to, "RESUME");
+    }
 }
 
 typedef struct {
@@ -1359,6 +1363,11 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to,
 {
     wait_for_migration_complete(from);
 
+    if (args->start.suspend_me) {
+        /* wakeup succeeds only if guest is suspended */
+        qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
+    }
+
     /* Make sure we get at least one "B" on destination */
     wait_for_serial("dest_serial");
 
@@ -1392,6 +1401,15 @@ static void test_postcopy(void)
     test_postcopy_common(&args);
 }
 
+static void test_postcopy_suspend(void)
+{
+    MigrateCommon args = {
+        .start.suspend_me = true,
+    };
+
+    test_postcopy_common(&args);
+}
+
 static void test_postcopy_compress(void)
 {
     MigrateCommon args = {
@@ -3411,7 +3429,10 @@ int main(int argc, char **argv)
         qtest_add_func("/migration/postcopy/recovery/double-failures",
                        test_postcopy_recovery_double_fail);
 #endif /* _WIN32 */
-
+        if (is_x86) {
+            qtest_add_func("/migration/postcopy/suspend",
+                           test_postcopy_suspend);
+        }
     }
 
     qtest_add_func("/migration/bad_dest", test_baddest);
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH V5 12/12] tests/qtest: background migration with suspend
  2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
                   ` (10 preceding siblings ...)
  2023-11-13 18:33 ` [PATCH V5 11/12] tests/qtest: postcopy " Steve Sistare
@ 2023-11-13 18:34 ` Steve Sistare
  11 siblings, 0 replies; 35+ messages in thread
From: Steve Sistare @ 2023-11-13 18:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a test case to verify that the suspended state is handled correctly by
a background migration.  The test suspends the src, migrates, then wakes
the dest.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-test.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 0471f92..88a14c5 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1947,6 +1947,26 @@ static void test_precopy_unix_suspend_notlive(void)
     test_precopy_common(&args);
 }
 
+static void *test_bg_suspend_start(QTestState *from, QTestState *to)
+{
+    migrate_set_capability(from, "background-snapshot", true);
+    return NULL;
+}
+
+static void test_bg_suspend(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+        .live = true,       /* runs fast, the src suspends immediately. */
+        .start.suspend_me = true,
+        .start_hook = test_bg_suspend_start
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_unix_dirty_ring(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -3432,6 +3452,7 @@ int main(int argc, char **argv)
         if (is_x86) {
             qtest_add_func("/migration/postcopy/suspend",
                            test_postcopy_suspend);
+            qtest_add_func("/migration/bg/suspend", test_bg_suspend);
         }
     }
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 01/12] cpus: refactor vm_stop
  2023-11-13 18:33 ` [PATCH V5 01/12] cpus: refactor vm_stop Steve Sistare
@ 2023-11-20 13:22   ` Fabiano Rosas
  2023-11-20 19:09     ` Steven Sistare
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 13:22 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Refactor the vm_stop functions into one common subroutine do_vm_stop called
> with options.  No functional change.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  system/cpus.c | 44 +++++++++++++++++---------------------------
>  1 file changed, 17 insertions(+), 27 deletions(-)
>
> diff --git a/system/cpus.c b/system/cpus.c
> index 0848e0d..f72c4be 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -252,10 +252,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
>      }
>  }
>  
> -static int do_vm_stop(RunState state, bool send_stop)
> +static int do_vm_stop(RunState state, bool send_stop, bool force)
>  {
>      int ret = 0;
>  
> +    if (qemu_in_vcpu_thread()) {
> +        qemu_system_vmstop_request_prepare();
> +        qemu_system_vmstop_request(state);
> +        /*
> +         * FIXME: should not return to device code in case
> +         * vm_stop() has been requested.
> +         */
> +        cpu_stop_current();
> +        return 0;
> +    }

At vm_stop_force_state(), this block used to be under
runstate_is_running(), but now it runs unconditionally.

At vm_shutdown(), this block was not executed at all, but now it is.

We might need some words to explain why this patch doesn't affect
functionality.

> +
>      if (runstate_is_running()) {
>          runstate_set(state);
>          cpu_disable_ticks();
> @@ -264,6 +275,8 @@ static int do_vm_stop(RunState state, bool send_stop)
>          if (send_stop) {
>              qapi_event_send_stop();
>          }
> +    } else if (force) {
> +        runstate_set(state);
>      }
>  
>      bdrv_drain_all();
> @@ -278,7 +291,7 @@ static int do_vm_stop(RunState state, bool send_stop)
>   */
>  int vm_shutdown(void)
>  {
> -    return do_vm_stop(RUN_STATE_SHUTDOWN, false);
> +    return do_vm_stop(RUN_STATE_SHUTDOWN, false, false);
>  }
>  
>  bool cpu_can_run(CPUState *cpu)
> @@ -656,18 +669,7 @@ void cpu_stop_current(void)
>  
>  int vm_stop(RunState state)
>  {
> -    if (qemu_in_vcpu_thread()) {
> -        qemu_system_vmstop_request_prepare();
> -        qemu_system_vmstop_request(state);
> -        /*
> -         * FIXME: should not return to device code in case
> -         * vm_stop() has been requested.
> -         */
> -        cpu_stop_current();
> -        return 0;
> -    }
> -
> -    return do_vm_stop(state, true);
> +    return do_vm_stop(state, true, false);
>  }
>  
>  /**
> @@ -723,19 +725,7 @@ void vm_start(void)
>     current state is forgotten forever */
>  int vm_stop_force_state(RunState state)
>  {
> -    if (runstate_is_running()) {
> -        return vm_stop(state);
> -    } else {
> -        int ret;
> -        runstate_set(state);
> -
> -        bdrv_drain_all();
> -        /* Make sure to return an error if the flush in a previous vm_stop()
> -         * failed. */
> -        ret = bdrv_flush_all();
> -        trace_vm_stop_flush_all(ret);
> -        return ret;
> -    }
> +    return do_vm_stop(state, true, true);
>  }
>  
>  void qmp_memsave(int64_t addr, int64_t size, const char *filename,


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-13 18:33 ` [PATCH V5 02/12] cpus: stop vm in suspended state Steve Sistare
@ 2023-11-20 14:15   ` Fabiano Rosas
  2023-11-20 19:10     ` Steven Sistare
  2023-11-20 19:59   ` Peter Xu
  1 sibling, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 14:15 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> A vm in the suspended state is not completely stopped.  

Is this a statement of a fact about VMs in the suspended state in
general or is this describing what this patch is trying to fix?

> The VCPUs have been paused, but the cpu clock still runs, and runstate
> notifiers for the transition to stopped have not been called.

...it reads like the latter, but then why aren't we fixing this at the
moment we put the VM in the suspend state?

> Modify vm_stop_force_state to completely stop the vm if the current
> state is suspended, to be called for live migration and snapshots.

Hm, this changes the meaning of the "force" from:

"force a state even if already stopped"

into:

"force a complete stop if already suspended, otherwise just set the
state"

I don't know what to make of this, shouldn't all vm_stops cause a
complete stop?

We need to at least resolve the overloading of the 'force' term.

>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  system/cpus.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/system/cpus.c b/system/cpus.c
> index f72c4be..c772708 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
>  static int do_vm_stop(RunState state, bool send_stop, bool force)
>  {
>      int ret = 0;
> +    bool running = runstate_is_running();
> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
>  
>      if (qemu_in_vcpu_thread()) {
>          qemu_system_vmstop_request_prepare();
> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
>          return 0;
>      }
>  
> -    if (runstate_is_running()) {
> +    if (running || (suspended && force)) {
>          runstate_set(state);
>          cpu_disable_ticks();
> -        pause_all_vcpus();
> +        if (running) {
> +            pause_all_vcpus();
> +        }
>          vm_state_notify(0, state);
>          if (send_stop) {
>              qapi_event_send_stop();


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 04/12] cpus: start vm in suspended state
  2023-11-13 18:33 ` [PATCH V5 04/12] cpus: start vm in suspended state Steve Sistare
@ 2023-11-20 17:20   ` Fabiano Rosas
  0 siblings, 0 replies; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 17:20 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Provide the vm_resume helper which resumes the vm in the requested state,
> correctly restoring the suspended state.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  include/sysemu/runstate.h |  8 ++++++++
>  system/cpus.c             | 11 +++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
> index 9e78c7f..3fa28a4 100644
> --- a/include/sysemu/runstate.h
> +++ b/include/sysemu/runstate.h
> @@ -50,6 +50,14 @@ void vm_start(void);
>   */
>  int vm_prepare_start(bool step_pending, RunState state);
>  
> +/**
> + * vm_resume: If @state is a running state, start the vm and set the state,
> + * else just set the state.
> + *
> + * @state: the state to restore
> + */
> +void vm_resume(RunState state);
> +
>  int vm_stop(RunState state);
>  int vm_stop_force_state(RunState state);
>  int vm_shutdown(void);
> diff --git a/system/cpus.c b/system/cpus.c
> index 4d05d83..1d867bb 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -725,6 +725,17 @@ void vm_start(void)
>      }
>  }
>  
> +void vm_resume(RunState state)
> +{
> +    if (state == RUN_STATE_RUNNING) {
> +        vm_start();
> +    } else if (state == RUN_STATE_SUSPENDED) {
> +        vm_prepare_start(false, state);
> +    } else {
> +        runstate_set(state);
> +    }
> +}
> +
>  /* does a state transition even if the VM is already stopped,
>     current state is forgotten forever */
>  int vm_stop_force_state(RunState state)

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 05/12] migration: preserve suspended runstate
  2023-11-13 18:33 ` [PATCH V5 05/12] migration: preserve suspended runstate Steve Sistare
@ 2023-11-20 17:30   ` Fabiano Rosas
  0 siblings, 0 replies; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 17:30 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> A guest that is migrated in the suspended state automaticaly wakes and
> continues execution.  This is wrong; the guest should end migration in
> the same state it started.  The root cause is that the outgoing migration
> code automatically wakes the guest, then saves the RUNNING runstate in
> global_state_store(), hence the incoming migration code thinks the guest is
> running and continues the guest if autostart is true.
>
> Simply deleting the call to qemu_system_wakeup_request() on the outgoing
> side, to migrate the vm in state suspended, does not solve the problem.
> The old vm_stop_force_state does little if the vm is suspended, so the cpu
> clock remains running, and runstate notifiers for the stop transition are
> not called (and were not called on transition to suspended). Stale cpu
> timers_state is saved to the migration stream, causing time errors in the
> guest when it wakes from suspend.  State that would have been modified by
> runstate notifiers is wrong.
>
> The new version of vm_stop_force_state solves the outgoing problems, by
> completely stopping a vm in the suspended state.
>
> On the incoming side for precopy, compute the desired new state from global
> state received, and call runstate_restore, which will partially
> resume the vm if the state is suspended.  A future system_wakeup monitor
> request will cause the vm to resume running.
>
> On the incoming side for postcopy, apply the the same restore logic found
> in precopy.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 06/12] migration: preserve suspended for snapshot
  2023-11-13 18:33 ` [PATCH V5 06/12] migration: preserve suspended for snapshot Steve Sistare
@ 2023-11-20 18:13   ` Fabiano Rosas
  2023-11-20 19:10     ` Steven Sistare
  0 siblings, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 18:13 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Restoring a snapshot can break a suspended guest.  Snapshots suffer from
> the same suspended-state issues that affect live migration, plus they must
> handle an additional problematic scenario, which is that a running vm must
> remain running if it loads a suspended snapshot.
>
> To save, call vm_stop_force_state to completely stop a vm in the suspended
> state, and restore the suspended state using runstate_restore.  This
> produces a correct vmstate file and leaves the vm in the state it had prior
> to the save.
>
> To load, if the snapshot is not suspended, then vm_stop_force_state +
> runstate_restore correctly handles all states, and leaves the vm in the
> state it had prior to the load.  However, if the snapshot is suspended,
> restoration is trickier.  First restore the state to suspended so the
> current state matches the saved state.  Then, if the pre-load state is
> running, wakeup to resume running.
>
> Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
> RUN_STATE_RESTORE_VM did not change runstate if the current state was
> paused, suspended, or prelaunch, but now vm_stop_force_state forces these
> transitions, so allow them.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  include/migration/snapshot.h   |  7 +++++++
>  migration/migration-hmp-cmds.c | 12 ++++++++----
>  migration/savevm.c             | 33 +++++++++++++++++++++------------
>  system/runstate.c              | 10 ++++++++++
>  system/vl.c                    |  2 ++
>  5 files changed, 48 insertions(+), 16 deletions(-)
>
> diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
> index e72083b..9e4dcaa 100644
> --- a/include/migration/snapshot.h
> +++ b/include/migration/snapshot.h
> @@ -16,6 +16,7 @@
>  #define QEMU_MIGRATION_SNAPSHOT_H
>  
>  #include "qapi/qapi-builtin-types.h"
> +#include "qapi/qapi-types-run-state.h"
>  
>  /**
>   * save_snapshot: Save an internal snapshot.
> @@ -61,4 +62,10 @@ bool delete_snapshot(const char *name,
>                      bool has_devices, strList *devices,
>                      Error **errp);
>  
> +/**
> + * load_snapshot_resume: Restore runstate after loading snapshot.
> + * @state: state to restore
> + */
> +void load_snapshot_resume(RunState state);
> +
>  #endif
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index 86ae832..c31cdc7 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c
> @@ -399,15 +399,19 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>  
>  void hmp_loadvm(Monitor *mon, const QDict *qdict)
>  {
> -    int saved_vm_running  = runstate_is_running();
> +    RunState saved_state = runstate_get();
> +
>      const char *name = qdict_get_str(qdict, "name");
>      Error *err = NULL;
>  
> -    vm_stop(RUN_STATE_RESTORE_VM);
> +    vm_stop_force_state(RUN_STATE_RESTORE_VM);
>  
> -    if (load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
> -        vm_start();
> +    if (load_snapshot(name, NULL, false, NULL, &err)) {
> +        load_snapshot_resume(saved_state);
> +    } else {
> +        vm_resume(saved_state);

Here we're starting the VM if load_snapshot() fails. Is that
intentional?

>      }
> +
>      hmp_handle_error(mon, err);
>  }
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 78ac2bd..b4b49bb 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -3040,7 +3040,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>      QEMUSnapshotInfo sn1, *sn = &sn1;
>      int ret = -1, ret2;
>      QEMUFile *f;
> -    int saved_vm_running;
> +    RunState saved_state = runstate_get();
>      uint64_t vm_state_size;
>      g_autoptr(GDateTime) now = g_date_time_new_now_local();
>      AioContext *aio_context;
> @@ -3088,10 +3088,8 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>      }
>      aio_context = bdrv_get_aio_context(bs);
>  
> -    saved_vm_running = runstate_is_running();
> -
>      global_state_store();
> -    vm_stop(RUN_STATE_SAVE_VM);
> +    vm_stop_force_state(RUN_STATE_SAVE_VM);
>  
>      bdrv_drain_all_begin();
>  
> @@ -3157,9 +3155,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>  
>      bdrv_drain_all_end();
>  
> -    if (saved_vm_running) {
> -        vm_start();
> -    }
> +    vm_resume(saved_state);
>      return ret == 0;
>  }
>  
> @@ -3333,6 +3329,20 @@ err_drain:
>      return false;
>  }
>  
> +void load_snapshot_resume(RunState state)
> +{
> +    if (global_state_received() &&
> +        global_state_get_runstate() == RUN_STATE_SUSPENDED) {
> +
> +        vm_resume(RUN_STATE_SUSPENDED);
> +        if (state == RUN_STATE_RUNNING) {
> +            qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> +        }
> +    } else {
> +        vm_resume(state);
> +    }
> +}
> +
>  bool delete_snapshot(const char *name, bool has_devices,
>                       strList *devices, Error **errp)
>  {
> @@ -3397,16 +3407,15 @@ static void snapshot_load_job_bh(void *opaque)
>  {
>      Job *job = opaque;
>      SnapshotJob *s = container_of(job, SnapshotJob, common);
> -    int orig_vm_running;
> +    RunState orig_state = runstate_get();
>  
>      job_progress_set_remaining(&s->common, 1);
>  
> -    orig_vm_running = runstate_is_running();
> -    vm_stop(RUN_STATE_RESTORE_VM);
> +    vm_stop_force_state(RUN_STATE_RESTORE_VM);
>  
>      s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
> -    if (s->ret && orig_vm_running) {
> -        vm_start();
> +    if (s->ret) {
> +        load_snapshot_resume(orig_state);

Same here, we used to not start the VM if load_snapshot() failed.

>      }
>  
>      job_progress_update(&s->common, 1);
> diff --git a/system/runstate.c b/system/runstate.c
> index ea9d6c2..f1d4bc7 100644
> --- a/system/runstate.c
> +++ b/system/runstate.c
> @@ -77,6 +77,8 @@ typedef struct {
>  
>  static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
> +    { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
> +    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
>  
>      { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
>      { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
> @@ -108,6 +110,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
>      { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_PAUSED, RUN_STATE_COLO},
> +    { RUN_STATE_PAUSED, RUN_STATE_SAVE_VM},
> +    { RUN_STATE_PAUSED, RUN_STATE_RESTORE_VM},
>  
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
> @@ -131,6 +135,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>  
>      { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> +    { RUN_STATE_RESTORE_VM, RUN_STATE_PAUSED },
> +    { RUN_STATE_RESTORE_VM, RUN_STATE_SUSPENDED },
>  
>      { RUN_STATE_COLO, RUN_STATE_RUNNING },
>      { RUN_STATE_COLO, RUN_STATE_PRELAUNCH },
> @@ -149,6 +155,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_RUNNING, RUN_STATE_COLO},
>  
>      { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
> +    { RUN_STATE_SAVE_VM, RUN_STATE_PAUSED },
> +    { RUN_STATE_SAVE_VM, RUN_STATE_SUSPENDED },
>  
>      { RUN_STATE_SHUTDOWN, RUN_STATE_PAUSED },
>      { RUN_STATE_SHUTDOWN, RUN_STATE_FINISH_MIGRATE },
> @@ -161,6 +169,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>      { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
> +    { RUN_STATE_SUSPENDED, RUN_STATE_SAVE_VM },
> +    { RUN_STATE_SUSPENDED, RUN_STATE_RESTORE_VM },
>  
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> diff --git a/system/vl.c b/system/vl.c
> index bd7fad7..082a45a 100644
> --- a/system/vl.c
> +++ b/system/vl.c
> @@ -2702,7 +2702,9 @@ void qmp_x_exit_preconfig(Error **errp)
>      qemu_machine_creation_done();
>  
>      if (loadvm) {
> +        RunState state = autostart ? RUN_STATE_RUNNING : runstate_get();
>          load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
> +        load_snapshot_resume(state);

Here it's using error_fatal, so it won't start the VM.

>      }
>      if (replay_mode != REPLAY_MODE_NONE) {
>          replay_vmstate_init();


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 07/12] migration: preserve suspended for bg_migration
  2023-11-13 18:33 ` [PATCH V5 07/12] migration: preserve suspended for bg_migration Steve Sistare
@ 2023-11-20 18:18   ` Fabiano Rosas
  0 siblings, 0 replies; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 18:18 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Do not wake a suspended guest during bg_migration, and restore the prior
> state at finish rather than unconditionally running.  Allow the additional
> state transitions that occur because bg migration forces RUN_STATE_PAUSED to
> save the precopy device state.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 01/12] cpus: refactor vm_stop
  2023-11-20 13:22   ` Fabiano Rosas
@ 2023-11-20 19:09     ` Steven Sistare
  2023-11-20 19:46       ` Peter Xu
  2023-11-20 20:01       ` Fabiano Rosas
  0 siblings, 2 replies; 35+ messages in thread
From: Steven Sistare @ 2023-11-20 19:09 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 11/20/2023 8:22 AM, Fabiano Rosas wrote:
> Steve Sistare <steven.sistare@oracle.com> writes:
>> Refactor the vm_stop functions into one common subroutine do_vm_stop called
>> with options.  No functional change.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  system/cpus.c | 44 +++++++++++++++++---------------------------
>>  1 file changed, 17 insertions(+), 27 deletions(-)
>>
>> diff --git a/system/cpus.c b/system/cpus.c
>> index 0848e0d..f72c4be 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -252,10 +252,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>      }
>>  }
>>  
>> -static int do_vm_stop(RunState state, bool send_stop)
>> +static int do_vm_stop(RunState state, bool send_stop, bool force)
>>  {
>>      int ret = 0;
>>  
>> +    if (qemu_in_vcpu_thread()) {
>> +        qemu_system_vmstop_request_prepare();
>> +        qemu_system_vmstop_request(state);
>> +        /*
>> +         * FIXME: should not return to device code in case
>> +         * vm_stop() has been requested.
>> +         */
>> +        cpu_stop_current();
>> +        return 0;
>> +    }
> 
> At vm_stop_force_state(), this block used to be under
> runstate_is_running(), but now it runs unconditionally.

vm_stop_force_state callers should never be called in a vcpu thread, so this block
is never executed for them.  I could assert that.

> At vm_shutdown(), this block was not executed at all, but now it is.

vm_shutdown should never be called from a vcpu thread.
I could assert that.

- Steve

> We might need some words to explain why this patch doesn't affect
> functionality.
>> +
>>      if (runstate_is_running()) {
>>          runstate_set(state);
>>          cpu_disable_ticks();
>> @@ -264,6 +275,8 @@ static int do_vm_stop(RunState state, bool send_stop)
>>          if (send_stop) {
>>              qapi_event_send_stop();
>>          }
>> +    } else if (force) {
>> +        runstate_set(state);
>>      }
>>  
>>      bdrv_drain_all();
>> @@ -278,7 +291,7 @@ static int do_vm_stop(RunState state, bool send_stop)
>>   */
>>  int vm_shutdown(void)
>>  {
>> -    return do_vm_stop(RUN_STATE_SHUTDOWN, false);
>> +    return do_vm_stop(RUN_STATE_SHUTDOWN, false, false);
>>  }
>>  
>>  bool cpu_can_run(CPUState *cpu)
>> @@ -656,18 +669,7 @@ void cpu_stop_current(void)
>>  
>>  int vm_stop(RunState state)
>>  {
>> -    if (qemu_in_vcpu_thread()) {
>> -        qemu_system_vmstop_request_prepare();
>> -        qemu_system_vmstop_request(state);
>> -        /*
>> -         * FIXME: should not return to device code in case
>> -         * vm_stop() has been requested.
>> -         */
>> -        cpu_stop_current();
>> -        return 0;
>> -    }
>> -
>> -    return do_vm_stop(state, true);
>> +    return do_vm_stop(state, true, false);
>>  }
>>  
>>  /**
>> @@ -723,19 +725,7 @@ void vm_start(void)
>>     current state is forgotten forever */
>>  int vm_stop_force_state(RunState state)
>>  {
>> -    if (runstate_is_running()) {
>> -        return vm_stop(state);
>> -    } else {
>> -        int ret;
>> -        runstate_set(state);
>> -
>> -        bdrv_drain_all();
>> -        /* Make sure to return an error if the flush in a previous vm_stop()
>> -         * failed. */
>> -        ret = bdrv_flush_all();
>> -        trace_vm_stop_flush_all(ret);
>> -        return ret;
>> -    }
>> +    return do_vm_stop(state, true, true);
>>  }
>>  
>>  void qmp_memsave(int64_t addr, int64_t size, const char *filename,


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 14:15   ` Fabiano Rosas
@ 2023-11-20 19:10     ` Steven Sistare
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2023-11-20 19:10 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 11/20/2023 9:15 AM, Fabiano Rosas wrote:
> Steve Sistare <steven.sistare@oracle.com> writes:
> 
>> A vm in the suspended state is not completely stopped.  
> 
> Is this a statement of a fact about VMs in the suspended state in
> general or is this describing what this patch is trying to fix?

The former.

>> The VCPUs have been paused, but the cpu clock still runs, and runstate
>> notifiers for the transition to stopped have not been called.
> 
> ...it reads like the latter, but then why aren't we fixing this at the
> moment we put the VM in the suspend state?

cpu_get_ticks() must continue to tick while the guest is suspended, so that
QEMU_CLOCK_VIRTUAL continues to tick, so that timeouts based on that clock
will fire.  One example is timed wake from suspend,  acpi_pm_tmr_timer.

>> Modify vm_stop_force_state to completely stop the vm if the current
>> state is suspended, to be called for live migration and snapshots.
> 
> Hm, this changes the meaning of the "force" from:
> 
> "force a state even if already stopped"
> 
> into:
> 
> "force a complete stop if already suspended, otherwise just set the
> state"

vm_stop_force_state has the same behavior as before for all states
except suspended.  If suspended, it also:
  - stops cpu ticks
  - calls runstate stopped handlers
  - sets a new runstate

> I don't know what to make of this, shouldn't all vm_stops cause a
> complete stop?

We cannot stop cpu_get_ticks.  We could maybe call the runstate stop handlers,
but that requires a careful examination of every handler, and there is no obvious 
correctness or cleanliness reason to stop them immediately on vm_stop(), since cpu 
ticks still needs special handling later.

> We need to at least resolve the overloading of the 'force' term.

How about a more complete function header comment:

/*
 * If the machine is running or suspended, completely stop it.
 * Force the new runstate to @state.
 * The current state is forgotten forever.
 */

- Steve

>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  system/cpus.c | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/system/cpus.c b/system/cpus.c
>> index f72c4be..c772708 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>  static int do_vm_stop(RunState state, bool send_stop, bool force)
>>  {
>>      int ret = 0;
>> +    bool running = runstate_is_running();
>> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
>>  
>>      if (qemu_in_vcpu_thread()) {
>>          qemu_system_vmstop_request_prepare();
>> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
>>          return 0;
>>      }
>>  
>> -    if (runstate_is_running()) {
>> +    if (running || (suspended && force)) {
>>          runstate_set(state);
>>          cpu_disable_ticks();
>> -        pause_all_vcpus();
>> +        if (running) {
>> +            pause_all_vcpus();
>> +        }
>>          vm_state_notify(0, state);
>>          if (send_stop) {
>>              qapi_event_send_stop();


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 06/12] migration: preserve suspended for snapshot
  2023-11-20 18:13   ` Fabiano Rosas
@ 2023-11-20 19:10     ` Steven Sistare
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2023-11-20 19:10 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 11/20/2023 1:13 PM, Fabiano Rosas wrote:
> Steve Sistare <steven.sistare@oracle.com> writes:
> 
>> Restoring a snapshot can break a suspended guest.  Snapshots suffer from
>> the same suspended-state issues that affect live migration, plus they must
>> handle an additional problematic scenario, which is that a running vm must
>> remain running if it loads a suspended snapshot.
>>
>> To save, call vm_stop_force_state to completely stop a vm in the suspended
>> state, and restore the suspended state using runstate_restore.  This
>> produces a correct vmstate file and leaves the vm in the state it had prior
>> to the save.
>>
>> To load, if the snapshot is not suspended, then vm_stop_force_state +
>> runstate_restore correctly handles all states, and leaves the vm in the
>> state it had prior to the load.  However, if the snapshot is suspended,
>> restoration is trickier.  First restore the state to suspended so the
>> current state matches the saved state.  Then, if the pre-load state is
>> running, wakeup to resume running.
>>
>> Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
>> RUN_STATE_RESTORE_VM did not change runstate if the current state was
>> paused, suspended, or prelaunch, but now vm_stop_force_state forces these
>> transitions, so allow them.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  include/migration/snapshot.h   |  7 +++++++
>>  migration/migration-hmp-cmds.c | 12 ++++++++----
>>  migration/savevm.c             | 33 +++++++++++++++++++++------------
>>  system/runstate.c              | 10 ++++++++++
>>  system/vl.c                    |  2 ++
>>  5 files changed, 48 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
>> index e72083b..9e4dcaa 100644
>> --- a/include/migration/snapshot.h
>> +++ b/include/migration/snapshot.h
>> @@ -16,6 +16,7 @@
>>  #define QEMU_MIGRATION_SNAPSHOT_H
>>  
>>  #include "qapi/qapi-builtin-types.h"
>> +#include "qapi/qapi-types-run-state.h"
>>  
>>  /**
>>   * save_snapshot: Save an internal snapshot.
>> @@ -61,4 +62,10 @@ bool delete_snapshot(const char *name,
>>                      bool has_devices, strList *devices,
>>                      Error **errp);
>>  
>> +/**
>> + * load_snapshot_resume: Restore runstate after loading snapshot.
>> + * @state: state to restore
>> + */
>> +void load_snapshot_resume(RunState state);
>> +
>>  #endif
>> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
>> index 86ae832..c31cdc7 100644
>> --- a/migration/migration-hmp-cmds.c
>> +++ b/migration/migration-hmp-cmds.c
>> @@ -399,15 +399,19 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>>  
>>  void hmp_loadvm(Monitor *mon, const QDict *qdict)
>>  {
>> -    int saved_vm_running  = runstate_is_running();
>> +    RunState saved_state = runstate_get();
>> +
>>      const char *name = qdict_get_str(qdict, "name");
>>      Error *err = NULL;
>>  
>> -    vm_stop(RUN_STATE_RESTORE_VM);
>> +    vm_stop_force_state(RUN_STATE_RESTORE_VM);
>>  
>> -    if (load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
>> -        vm_start();
>> +    if (load_snapshot(name, NULL, false, NULL, &err)) {
>> +        load_snapshot_resume(saved_state);
>> +    } else {
>> +        vm_resume(saved_state);
> 
> Here we're starting the VM if load_snapshot() fails. Is that
> intentional?

My bad, good catch, I will delete the else clause.

>>      }
>> +
>>      hmp_handle_error(mon, err);
>>  }
>>  
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 78ac2bd..b4b49bb 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -3040,7 +3040,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>>      QEMUSnapshotInfo sn1, *sn = &sn1;
>>      int ret = -1, ret2;
>>      QEMUFile *f;
>> -    int saved_vm_running;
>> +    RunState saved_state = runstate_get();
>>      uint64_t vm_state_size;
>>      g_autoptr(GDateTime) now = g_date_time_new_now_local();
>>      AioContext *aio_context;
>> @@ -3088,10 +3088,8 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>>      }
>>      aio_context = bdrv_get_aio_context(bs);
>>  
>> -    saved_vm_running = runstate_is_running();
>> -
>>      global_state_store();
>> -    vm_stop(RUN_STATE_SAVE_VM);
>> +    vm_stop_force_state(RUN_STATE_SAVE_VM);
>>  
>>      bdrv_drain_all_begin();
>>  
>> @@ -3157,9 +3155,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>>  
>>      bdrv_drain_all_end();
>>  
>> -    if (saved_vm_running) {
>> -        vm_start();
>> -    }
>> +    vm_resume(saved_state);
>>      return ret == 0;
>>  }
>>  
>> @@ -3333,6 +3329,20 @@ err_drain:
>>      return false;
>>  }
>>  
>> +void load_snapshot_resume(RunState state)
>> +{
>> +    if (global_state_received() &&
>> +        global_state_get_runstate() == RUN_STATE_SUSPENDED) {
>> +
>> +        vm_resume(RUN_STATE_SUSPENDED);
>> +        if (state == RUN_STATE_RUNNING) {
>> +            qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>> +        }
>> +    } else {
>> +        vm_resume(state);
>> +    }
>> +}
>> +
>>  bool delete_snapshot(const char *name, bool has_devices,
>>                       strList *devices, Error **errp)
>>  {
>> @@ -3397,16 +3407,15 @@ static void snapshot_load_job_bh(void *opaque)
>>  {
>>      Job *job = opaque;
>>      SnapshotJob *s = container_of(job, SnapshotJob, common);
>> -    int orig_vm_running;
>> +    RunState orig_state = runstate_get();
>>  
>>      job_progress_set_remaining(&s->common, 1);
>>  
>> -    orig_vm_running = runstate_is_running();
>> -    vm_stop(RUN_STATE_RESTORE_VM);
>> +    vm_stop_force_state(RUN_STATE_RESTORE_VM);
>>  
>>      s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
>> -    if (s->ret && orig_vm_running) {
>> -        vm_start();
>> +    if (s->ret) {
>> +        load_snapshot_resume(orig_state);
> 
> Same here, we used to not start the VM if load_snapshot() failed.

Here the behavior is the same as the old code.  ret=1 means success.
That inverted return code has misled us both :)

>>      }
>>  
>>      job_progress_update(&s->common, 1);
>> diff --git a/system/runstate.c b/system/runstate.c
>> index ea9d6c2..f1d4bc7 100644
>> --- a/system/runstate.c
>> +++ b/system/runstate.c
>> @@ -77,6 +77,8 @@ typedef struct {
>>  
>>  static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
>> +    { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
>> +    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
>>  
>>      { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
>>      { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
>> @@ -108,6 +110,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
>>      { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
>>      { RUN_STATE_PAUSED, RUN_STATE_COLO},
>> +    { RUN_STATE_PAUSED, RUN_STATE_SAVE_VM},
>> +    { RUN_STATE_PAUSED, RUN_STATE_RESTORE_VM},
>>  
>>      { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
>>      { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
>> @@ -131,6 +135,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>  
>>      { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>> +    { RUN_STATE_RESTORE_VM, RUN_STATE_PAUSED },
>> +    { RUN_STATE_RESTORE_VM, RUN_STATE_SUSPENDED },
>>  
>>      { RUN_STATE_COLO, RUN_STATE_RUNNING },
>>      { RUN_STATE_COLO, RUN_STATE_PRELAUNCH },
>> @@ -149,6 +155,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_RUNNING, RUN_STATE_COLO},
>>  
>>      { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
>> +    { RUN_STATE_SAVE_VM, RUN_STATE_PAUSED },
>> +    { RUN_STATE_SAVE_VM, RUN_STATE_SUSPENDED },
>>  
>>      { RUN_STATE_SHUTDOWN, RUN_STATE_PAUSED },
>>      { RUN_STATE_SHUTDOWN, RUN_STATE_FINISH_MIGRATE },
>> @@ -161,6 +169,8 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>>      { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>>      { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>> +    { RUN_STATE_SUSPENDED, RUN_STATE_SAVE_VM },
>> +    { RUN_STATE_SUSPENDED, RUN_STATE_RESTORE_VM },
>>  
>>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>> diff --git a/system/vl.c b/system/vl.c
>> index bd7fad7..082a45a 100644
>> --- a/system/vl.c
>> +++ b/system/vl.c
>> @@ -2702,7 +2702,9 @@ void qmp_x_exit_preconfig(Error **errp)
>>      qemu_machine_creation_done();
>>  
>>      if (loadvm) {
>> +        RunState state = autostart ? RUN_STATE_RUNNING : runstate_get();
>>          load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
>> +        load_snapshot_resume(state);
> 
> Here it's using error_fatal, so it won't start the VM.

Yes, same as the old code.

- Steve




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 01/12] cpus: refactor vm_stop
  2023-11-20 19:09     ` Steven Sistare
@ 2023-11-20 19:46       ` Peter Xu
  2023-11-20 19:49         ` Steven Sistare
  2023-11-20 20:01       ` Fabiano Rosas
  1 sibling, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-11-20 19:46 UTC (permalink / raw)
  To: Steven Sistare
  Cc: Fabiano Rosas, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On Mon, Nov 20, 2023 at 02:09:31PM -0500, Steven Sistare wrote:
> On 11/20/2023 8:22 AM, Fabiano Rosas wrote:
> > Steve Sistare <steven.sistare@oracle.com> writes:
> >> Refactor the vm_stop functions into one common subroutine do_vm_stop called
> >> with options.  No functional change.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >>  system/cpus.c | 44 +++++++++++++++++---------------------------
> >>  1 file changed, 17 insertions(+), 27 deletions(-)
> >>
> >> diff --git a/system/cpus.c b/system/cpus.c
> >> index 0848e0d..f72c4be 100644
> >> --- a/system/cpus.c
> >> +++ b/system/cpus.c
> >> @@ -252,10 +252,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
> >>      }
> >>  }
> >>  
> >> -static int do_vm_stop(RunState state, bool send_stop)
> >> +static int do_vm_stop(RunState state, bool send_stop, bool force)
> >>  {
> >>      int ret = 0;
> >>  
> >> +    if (qemu_in_vcpu_thread()) {
> >> +        qemu_system_vmstop_request_prepare();
> >> +        qemu_system_vmstop_request(state);
> >> +        /*
> >> +         * FIXME: should not return to device code in case
> >> +         * vm_stop() has been requested.
> >> +         */
> >> +        cpu_stop_current();
> >> +        return 0;
> >> +    }
> > 
> > At vm_stop_force_state(), this block used to be under
> > runstate_is_running(), but now it runs unconditionally.
> 
> vm_stop_force_state callers should never be called in a vcpu thread, so this block
> is never executed for them.  I could assert that.
> 
> > At vm_shutdown(), this block was not executed at all, but now it is.
> 
> vm_shutdown should never be called from a vcpu thread.
> I could assert that.

Would you split the patch into two?  Moving qemu_in_vcpu_thread() is one,
the rest can be put into another, IMHO.  That may also help to make the
review easier.  OTOH the code changes look all correct here.  Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 01/12] cpus: refactor vm_stop
  2023-11-20 19:46       ` Peter Xu
@ 2023-11-20 19:49         ` Steven Sistare
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2023-11-20 19:49 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On 11/20/2023 2:46 PM, Peter Xu wrote:
> On Mon, Nov 20, 2023 at 02:09:31PM -0500, Steven Sistare wrote:
>> On 11/20/2023 8:22 AM, Fabiano Rosas wrote:
>>> Steve Sistare <steven.sistare@oracle.com> writes:
>>>> Refactor the vm_stop functions into one common subroutine do_vm_stop called
>>>> with options.  No functional change.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>>  system/cpus.c | 44 +++++++++++++++++---------------------------
>>>>  1 file changed, 17 insertions(+), 27 deletions(-)
>>>>
>>>> diff --git a/system/cpus.c b/system/cpus.c
>>>> index 0848e0d..f72c4be 100644
>>>> --- a/system/cpus.c
>>>> +++ b/system/cpus.c
>>>> @@ -252,10 +252,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>>>      }
>>>>  }
>>>>  
>>>> -static int do_vm_stop(RunState state, bool send_stop)
>>>> +static int do_vm_stop(RunState state, bool send_stop, bool force)
>>>>  {
>>>>      int ret = 0;
>>>>  
>>>> +    if (qemu_in_vcpu_thread()) {
>>>> +        qemu_system_vmstop_request_prepare();
>>>> +        qemu_system_vmstop_request(state);
>>>> +        /*
>>>> +         * FIXME: should not return to device code in case
>>>> +         * vm_stop() has been requested.
>>>> +         */
>>>> +        cpu_stop_current();
>>>> +        return 0;
>>>> +    }
>>>
>>> At vm_stop_force_state(), this block used to be under
>>> runstate_is_running(), but now it runs unconditionally.
>>
>> vm_stop_force_state callers should never be called in a vcpu thread, so this block
>> is never executed for them.  I could assert that.
>>
>>> At vm_shutdown(), this block was not executed at all, but now it is.
>>
>> vm_shutdown should never be called from a vcpu thread.
>> I could assert that.
> 
> Would you split the patch into two?  Moving qemu_in_vcpu_thread() is one,
> the rest can be put into another, IMHO.  That may also help to make the
> review easier.  OTOH the code changes look all correct here.  Thanks,

Will do, thanks - steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-13 18:33 ` [PATCH V5 02/12] cpus: stop vm in suspended state Steve Sistare
  2023-11-20 14:15   ` Fabiano Rosas
@ 2023-11-20 19:59   ` Peter Xu
  2023-11-20 20:47     ` Fabiano Rosas
  2023-11-20 20:55     ` Steven Sistare
  1 sibling, 2 replies; 35+ messages in thread
From: Peter Xu @ 2023-11-20 19:59 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Mon, Nov 13, 2023 at 10:33:50AM -0800, Steve Sistare wrote:
> A vm in the suspended state is not completely stopped.  The VCPUs have been
> paused, but the cpu clock still runs, and runstate notifiers for the
> transition to stopped have not been called.  Modify vm_stop_force_state to
> completely stop the vm if the current state is suspended, to be called for
> live migration and snapshots.
> 
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  system/cpus.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/system/cpus.c b/system/cpus.c
> index f72c4be..c772708 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
>  static int do_vm_stop(RunState state, bool send_stop, bool force)
>  {
>      int ret = 0;
> +    bool running = runstate_is_running();
> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
>  
>      if (qemu_in_vcpu_thread()) {
>          qemu_system_vmstop_request_prepare();
> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
>          return 0;
>      }
>  
> -    if (runstate_is_running()) {
> +    if (running || (suspended && force)) {
>          runstate_set(state);
>          cpu_disable_ticks();

Not directly relevant, but this is weird that I just notice.

If we disable ticks before stopping vCPUs, IIUC it means vcpus can see
stall ticks.  I checked the vm_start() and indeed that one did it in the
other way round: we'll stop vCPUs before stopping the ticks.

> -        pause_all_vcpus();
> +        if (running) {
> +            pause_all_vcpus();
> +        }
>          vm_state_notify(0, state);
>          if (send_stop) {
>              qapi_event_send_stop();

IIUC the "force" is not actually needed.  It's only used when SUSPENDED,
right?

In general, considering all above, I'm wondering something like this would
be much cleaner (and dropping force)?

===8<===
static int do_vm_stop(RunState state, bool send_stop)
 {
+    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
+    bool running = runstate_is_running();
     int ret = 0;
 
-    if (runstate_is_running()) {
+    /*
+     * RUNNING:   VM and vCPUs are all running
+     * SUSPENDED: VM is running, VCPUs are stopped
+     * Others:    VM and vCPUs are all stopped
+     */
+
+    /* Whether do we need to stop vCPUs? */
+    if (running) {
+        pause_all_vcpus();
+    }
+
+    /* Whether do we need to stop the VM in general? */
+    if (running || suspended) {
         runstate_set(state);
         cpu_disable_ticks();
-        pause_all_vcpus();
         vm_state_notify(0, state);
         if (send_stop) {
             qapi_event_send_stop();

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 01/12] cpus: refactor vm_stop
  2023-11-20 19:09     ` Steven Sistare
  2023-11-20 19:46       ` Peter Xu
@ 2023-11-20 20:01       ` Fabiano Rosas
  1 sibling, 0 replies; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 20:01 UTC (permalink / raw)
  To: Steven Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

Steven Sistare <steven.sistare@oracle.com> writes:

> On 11/20/2023 8:22 AM, Fabiano Rosas wrote:
>> Steve Sistare <steven.sistare@oracle.com> writes:
>>> Refactor the vm_stop functions into one common subroutine do_vm_stop called
>>> with options.  No functional change.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>>  system/cpus.c | 44 +++++++++++++++++---------------------------
>>>  1 file changed, 17 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/system/cpus.c b/system/cpus.c
>>> index 0848e0d..f72c4be 100644
>>> --- a/system/cpus.c
>>> +++ b/system/cpus.c
>>> @@ -252,10 +252,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>>      }
>>>  }
>>>  
>>> -static int do_vm_stop(RunState state, bool send_stop)
>>> +static int do_vm_stop(RunState state, bool send_stop, bool force)
>>>  {
>>>      int ret = 0;
>>>  
>>> +    if (qemu_in_vcpu_thread()) {
>>> +        qemu_system_vmstop_request_prepare();
>>> +        qemu_system_vmstop_request(state);
>>> +        /*
>>> +         * FIXME: should not return to device code in case
>>> +         * vm_stop() has been requested.
>>> +         */
>>> +        cpu_stop_current();
>>> +        return 0;
>>> +    }
>> 
>> At vm_stop_force_state(), this block used to be under
>> runstate_is_running(), but now it runs unconditionally.
>
> vm_stop_force_state callers should never be called in a vcpu thread, so this block
> is never executed for them.  I could assert that.
>
>> At vm_shutdown(), this block was not executed at all, but now it is.
>
> vm_shutdown should never be called from a vcpu thread.
> I could assert that.

Yes, this is an assumption that will get lost to time unless we document
it or have code to enforce.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 19:59   ` Peter Xu
@ 2023-11-20 20:47     ` Fabiano Rosas
  2023-11-20 21:26       ` Steven Sistare
  2023-11-20 20:55     ` Steven Sistare
  1 sibling, 1 reply; 35+ messages in thread
From: Fabiano Rosas @ 2023-11-20 20:47 UTC (permalink / raw)
  To: Peter Xu, Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

Peter Xu <peterx@redhat.com> writes:

> On Mon, Nov 13, 2023 at 10:33:50AM -0800, Steve Sistare wrote:
>> A vm in the suspended state is not completely stopped.  The VCPUs have been
>> paused, but the cpu clock still runs, and runstate notifiers for the
>> transition to stopped have not been called.  Modify vm_stop_force_state to
>> completely stop the vm if the current state is suspended, to be called for
>> live migration and snapshots.
>> 
>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  system/cpus.c | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>> 
>> diff --git a/system/cpus.c b/system/cpus.c
>> index f72c4be..c772708 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>  static int do_vm_stop(RunState state, bool send_stop, bool force)
>>  {
>>      int ret = 0;
>> +    bool running = runstate_is_running();
>> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
>>  
>>      if (qemu_in_vcpu_thread()) {
>>          qemu_system_vmstop_request_prepare();
>> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
>>          return 0;
>>      }
>>  
>> -    if (runstate_is_running()) {
>> +    if (running || (suspended && force)) {
>>          runstate_set(state);
>>          cpu_disable_ticks();
>
> Not directly relevant, but this is weird that I just notice.
>
> If we disable ticks before stopping vCPUs, IIUC it means vcpus can see
> stall ticks.  I checked the vm_start() and indeed that one did it in the
> other way round: we'll stop vCPUs before stopping the ticks.
>
>> -        pause_all_vcpus();
>> +        if (running) {
>> +            pause_all_vcpus();
>> +        }
>>          vm_state_notify(0, state);
>>          if (send_stop) {
>>              qapi_event_send_stop();
>
> IIUC the "force" is not actually needed.  It's only used when SUSPENDED,
> right?

That's the overloading I'm complaining about. We're using "force" to say
both: "include suspended" and: "set the state". This is basically taking
knowledge from the callsite being the migration code and encoding it in
that flag.

I'd prefer something like:

static int do_vm_stop(RunState state, bool send_stop, bool set_state,
                      bool include_suspended);



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 19:59   ` Peter Xu
  2023-11-20 20:47     ` Fabiano Rosas
@ 2023-11-20 20:55     ` Steven Sistare
  2023-11-20 21:44       ` Peter Xu
  1 sibling, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2023-11-20 20:55 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On 11/20/2023 2:59 PM, Peter Xu wrote:
> On Mon, Nov 13, 2023 at 10:33:50AM -0800, Steve Sistare wrote:
>> A vm in the suspended state is not completely stopped.  The VCPUs have been
>> paused, but the cpu clock still runs, and runstate notifiers for the
>> transition to stopped have not been called.  Modify vm_stop_force_state to
>> completely stop the vm if the current state is suspended, to be called for
>> live migration and snapshots.
>>
>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  system/cpus.c | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/system/cpus.c b/system/cpus.c
>> index f72c4be..c772708 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>  static int do_vm_stop(RunState state, bool send_stop, bool force)
>>  {
>>      int ret = 0;
>> +    bool running = runstate_is_running();
>> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
>>  
>>      if (qemu_in_vcpu_thread()) {
>>          qemu_system_vmstop_request_prepare();
>> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
>>          return 0;
>>      }
>>  
>> -    if (runstate_is_running()) {
>> +    if (running || (suspended && force)) {
>>          runstate_set(state);
>>          cpu_disable_ticks();
> 
> Not directly relevant, but this is weird that I just notice.
> 
> If we disable ticks before stopping vCPUs, IIUC it means vcpus can see
> stall ticks.  I checked the vm_start() and indeed that one did it in the
> other way round: we'll stop vCPUs before stopping the ticks.
> 
>> -        pause_all_vcpus();
>> +        if (running) {
>> +            pause_all_vcpus();
>> +        }
>>          vm_state_notify(0, state);
>>          if (send_stop) {
>>              qapi_event_send_stop();
> 
> IIUC the "force" is not actually needed.  It's only used when SUSPENDED,
> right?
> 
> In general, considering all above, I'm wondering something like this would
> be much cleaner (and dropping force)?

If we drop force, then all calls to vm_stop will completely stop the suspended
state, eg an hmp "stop" command. This causes two problems.  First, that is a change
in user-visible behavior for something that currently works, vs the migration code
where we are fixing brokenness.  Second, it does not quite work, because the state 
becomes RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont" 
will try to set the running state.  I could fix that by introducing a new state
RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change in existing
behavior.  (I even implemented that while developing, then I realized it was not 
needed to fix the migration bugs.)

- Steve

> ===8<===
> static int do_vm_stop(RunState state, bool send_stop)
>  {
> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
> +    bool running = runstate_is_running();
>      int ret = 0;
>  
> -    if (runstate_is_running()) {
> +    /*
> +     * RUNNING:   VM and vCPUs are all running
> +     * SUSPENDED: VM is running, VCPUs are stopped
> +     * Others:    VM and vCPUs are all stopped
> +     */
> +
> +    /* Whether do we need to stop vCPUs? */
> +    if (running) {
> +        pause_all_vcpus();
> +    }
> +
> +    /* Whether do we need to stop the VM in general? */
> +    if (running || suspended) {
>          runstate_set(state);
>          cpu_disable_ticks();
> -        pause_all_vcpus();
>          vm_state_notify(0, state);
>          if (send_stop) {
>              qapi_event_send_stop();
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 20:47     ` Fabiano Rosas
@ 2023-11-20 21:26       ` Steven Sistare
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2023-11-20 21:26 UTC (permalink / raw)
  To: Fabiano Rosas, Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 11/20/2023 3:47 PM, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>> On Mon, Nov 13, 2023 at 10:33:50AM -0800, Steve Sistare wrote:
>>> A vm in the suspended state is not completely stopped.  The VCPUs have been
>>> paused, but the cpu clock still runs, and runstate notifiers for the
>>> transition to stopped have not been called.  Modify vm_stop_force_state to
>>> completely stop the vm if the current state is suspended, to be called for
>>> live migration and snapshots.
>>>
>>> Suggested-by: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>>  system/cpus.c | 8 ++++++--
>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/system/cpus.c b/system/cpus.c
>>> index f72c4be..c772708 100644
>>> --- a/system/cpus.c
>>> +++ b/system/cpus.c
>>> @@ -255,6 +255,8 @@ void cpu_interrupt(CPUState *cpu, int mask)
>>>  static int do_vm_stop(RunState state, bool send_stop, bool force)
>>>  {
>>>      int ret = 0;
>>> +    bool running = runstate_is_running();
>>> +    bool suspended = runstate_check(RUN_STATE_SUSPENDED);
>>>  
>>>      if (qemu_in_vcpu_thread()) {
>>>          qemu_system_vmstop_request_prepare();
>>> @@ -267,10 +269,12 @@ static int do_vm_stop(RunState state, bool send_stop, bool force)
>>>          return 0;
>>>      }
>>>  
>>> -    if (runstate_is_running()) {
>>> +    if (running || (suspended && force)) {
>>>          runstate_set(state);
>>>          cpu_disable_ticks();
>>
>> Not directly relevant, but this is weird that I just notice.
>>
>> If we disable ticks before stopping vCPUs, IIUC it means vcpus can see
>> stall ticks.  I checked the vm_start() and indeed that one did it in the
>> other way round: we'll stop vCPUs before stopping the ticks.
>>
>>> -        pause_all_vcpus();
>>> +        if (running) {
>>> +            pause_all_vcpus();
>>> +        }
>>>          vm_state_notify(0, state);
>>>          if (send_stop) {
>>>              qapi_event_send_stop();
>>
>> IIUC the "force" is not actually needed.  It's only used when SUSPENDED,
>> right?

When not suspended, the force flag causes a stopped state to be forced even
if current is a different stopped state.

> That's the overloading I'm complaining about. We're using "force" to say
> both: "include suspended" and: "set the state". This is basically taking
> knowledge from the callsite being the migration code and encoding it in
> that flag.
> 
> I'd prefer something like:
> 
> static int do_vm_stop(RunState state, bool send_stop, bool set_state,
>                       bool include_suspended);

This function has always been tailored for use by migration code and no other
callers.  Migration would always pass set_state=true and include_suspended=true.
We have no use case for other combinations and no test for them.

To my mind, "force" naturally implies both behaviors.  We force the machine into the
specified stop state, completely stopping suspended execution.

Perhaps renaming vm_stop_force_state would erase the old association of "force" with 
only forcing runstate, such as vm_stop_all().

- Steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 20:55     ` Steven Sistare
@ 2023-11-20 21:44       ` Peter Xu
  2023-11-21 21:21         ` Steven Sistare
  2023-11-22  9:38         ` Daniel P. Berrangé
  0 siblings, 2 replies; 35+ messages in thread
From: Peter Xu @ 2023-11-20 21:44 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
> If we drop force, then all calls to vm_stop will completely stop the
> suspended state, eg an hmp "stop" command. This causes two problems.
> First, that is a change in user-visible behavior for something that
> currently works,

IMHO it depends on what should be the correct behavior.  IOW, when VM is in
SUSPENDED state and then the user sends "stop" QMP command, what should we
expect?

My understanding is we should expect to fully stop the VM, including the
ticks, for example.  Keeping the ticks running even after QMP "stop"
doesn't sound right, isn't it?

> vs the migration code where we are fixing brokenness.

This is not a migration-only bug if above holds, IMO.

> Second, it does not quite work, because the state becomes
> RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
> will try to set the running state.  I could fix that by introducing a new
> state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
> in existing behavior.  (I even implemented that while developing, then I
> realized it was not needed to fix the migration bugs.)

Good point.

Now with above comments, what's your thoughts on whether we should change
the user behavior?  My answer is still a yes.

Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
behavior, while something like QMP "stop" is not guest visible.  Maybe we
should remember it separately?

It means qemu_system_suspend() could remember that in a separate field (as
part of guest state), then when wakeup we should conditionally go back
with/without vcpus running depending on the new "suspended" state.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 21:44       ` Peter Xu
@ 2023-11-21 21:21         ` Steven Sistare
  2023-11-21 22:47           ` Peter Xu
  2023-11-22  9:38         ` Daniel P. Berrangé
  1 sibling, 1 reply; 35+ messages in thread
From: Steven Sistare @ 2023-11-21 21:21 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On 11/20/2023 4:44 PM, Peter Xu wrote:
> On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
>> If we drop force, then all calls to vm_stop will completely stop the
>> suspended state, eg an hmp "stop" command. This causes two problems.
>> First, that is a change in user-visible behavior for something that
>> currently works,
> 
> IMHO it depends on what should be the correct behavior.  IOW, when VM is in
> SUSPENDED state and then the user sends "stop" QMP command, what should we
> expect?
> 
> My understanding is we should expect to fully stop the VM, including the
> ticks, for example.  Keeping the ticks running even after QMP "stop"
> doesn't sound right, isn't it?

I agree, but others may not, and this decision would require approval from 
maintainers in other areas, including upper layers such as libvirt that are
aware of the suspended state.  It is a user-visible change, and may require 
a new libvirt release.

>> vs the migration code where we are fixing brokenness.
> 
> This is not a migration-only bug if above holds, IMO.
> 
>> Second, it does not quite work, because the state becomes
>> RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
>> will try to set the running state.  I could fix that by introducing a new
>> state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
>> in existing behavior.  (I even implemented that while developing, then I
>> realized it was not needed to fix the migration bugs.)
> 
> Good point.
> 
> Now with above comments, what's your thoughts on whether we should change
> the user behavior?  My answer is still a yes.
> 
> Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
> behavior, while something like QMP "stop" is not guest visible.  Maybe we
> should remember it separately?
>
> It means qemu_system_suspend() could remember that in a separate field (as
> part of guest state), then when wakeup we should conditionally go back
> with/without vcpus running depending on the new "suspended" state.

Regardless of how we remember it, the status command must still expose the 
suspended state to the user.  The user must be able to see that a guest is 
suspended, and decide when to issue a wakeup command.

If we change the stop command to completely stop a suspended vm, then we must
allow the user to query whether a vm is suspended-running or suspended-stopped,
because the command they must issue to resume is different: wakeup for the
former, and cont for the latter.

This change is visible to libvirt.  Adding it will delay this entire patch
series, and is not necessary for fixing the migration bug.  There is no
downside to drawing the line here for this series, and possibly changing stop
semantics in the future.

- Steve


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-21 21:21         ` Steven Sistare
@ 2023-11-21 22:47           ` Peter Xu
  0 siblings, 0 replies; 35+ messages in thread
From: Peter Xu @ 2023-11-21 22:47 UTC (permalink / raw)
  To: Steven Sistare, Daniel P. Berrangé,
	Paolo Bonzini, Richard Henderson
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Tue, Nov 21, 2023 at 04:21:18PM -0500, Steven Sistare wrote:
> On 11/20/2023 4:44 PM, Peter Xu wrote:
> > On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
> >> If we drop force, then all calls to vm_stop will completely stop the
> >> suspended state, eg an hmp "stop" command. This causes two problems.
> >> First, that is a change in user-visible behavior for something that
> >> currently works,
> > 
> > IMHO it depends on what should be the correct behavior.  IOW, when VM is in
> > SUSPENDED state and then the user sends "stop" QMP command, what should we
> > expect?
> > 
> > My understanding is we should expect to fully stop the VM, including the
> > ticks, for example.  Keeping the ticks running even after QMP "stop"
> > doesn't sound right, isn't it?
> 
> I agree, but others may not, and this decision would require approval from 
> maintainers in other areas, including upper layers such as libvirt that are
> aware of the suspended state.  It is a user-visible change, and may require 
> a new libvirt release.

$ ./scripts/get_maintainer.pl -f system/cpus.c
Richard Henderson <richard.henderson@linaro.org> (maintainer:Overall TCG CPUs)
Paolo Bonzini <pbonzini@redhat.com> (reviewer:Overall TCG CPUs)
qemu-devel@nongnu.org (open list:All patches CC here)

I'm also copying Richard, while Dan/Paolo is already in the loop, so we
should have the "quorum" already.  Let's see whether we can already get
some comments from the maintainers..

> 
> >> vs the migration code where we are fixing brokenness.
> > 
> > This is not a migration-only bug if above holds, IMO.
> > 
> >> Second, it does not quite work, because the state becomes
> >> RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
> >> will try to set the running state.  I could fix that by introducing a new
> >> state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
> >> in existing behavior.  (I even implemented that while developing, then I
> >> realized it was not needed to fix the migration bugs.)
> > 
> > Good point.
> > 
> > Now with above comments, what's your thoughts on whether we should change
> > the user behavior?  My answer is still a yes.
> > 
> > Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
> > behavior, while something like QMP "stop" is not guest visible.  Maybe we
> > should remember it separately?
> >
> > It means qemu_system_suspend() could remember that in a separate field (as
> > part of guest state), then when wakeup we should conditionally go back
> > with/without vcpus running depending on the new "suspended" state.
>
> Regardless of how we remember it, the status command must still expose
> the suspended state to the user.  The user must be able to see that a
> guest is suspended, and decide when to issue a wakeup command.

Hmm, right, we may want to keep having the SUSPENDED state in RunState,
even another separate "vm_suspended" boolean might still be required.

> 
> If we change the stop command to completely stop a suspended vm, then we must
> allow the user to query whether a vm is suspended-running or suspended-stopped,
> because the command they must issue to resume is different: wakeup for the
> former, and cont for the latter.

If it's stopped, the user must need a "cont" anyway.  And then if after
"cont" the user still sees it's suspended, then would "system_wakeup" work
here if necessary, after that "cont"?

Let's consider the current QEMU with below sequence of operations:

  1) vm running
  2) guest triggers ACPI suspend -> vm suspended
  3) admin triggers "stop" cmd -> vm suspended (ignored..)
  4) admin triggers "cont" cmd -> vm suspended (ignored.. too)

AFAICT both 2) and 3) are unwanted behavior, and after noticing 3) I feel
stronger that this is not a migration issue alone.

It also means after step 1)-3) if we got a wakeup elsewhere, the VM can
actually be running!  That's definitely unexpected after admin sends "stop"
already.  Isn't that another real bug?

I'm slightly confused on why you said above that libvirt will need a new
release. Could you elaborate?  Especially on what scenario we need to
maintain compatibility that still makes sense.

> 
> This change is visible to libvirt.  Adding it will delay this entire patch
> series, and is not necessary for fixing the migration bug.  There is no
> downside to drawing the line here for this series, and possibly changing stop
> semantics in the future.

This series will need to wait for rc releases anyway until 8.2 all out:

  https://wiki.qemu.org/Planning/8.2

I think we still have time to even catch the earliest train right after 8.2
released, if we can reach a consensus soon in whatever form.

Having a partial solution merged for migration is probably doable, but that
will make the code even more complicated and harder to maintain.  So before
doing so, I'd at least like to understand better on what use case you were
describing that will start to fall apart.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-20 21:44       ` Peter Xu
  2023-11-21 21:21         ` Steven Sistare
@ 2023-11-22  9:38         ` Daniel P. Berrangé
  2023-11-22 16:21           ` Peter Xu
  1 sibling, 1 reply; 35+ messages in thread
From: Daniel P. Berrangé @ 2023-11-22  9:38 UTC (permalink / raw)
  To: Peter Xu
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Fabiano Rosas, Leonardo Bras

On Mon, Nov 20, 2023 at 04:44:50PM -0500, Peter Xu wrote:
> On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
> > If we drop force, then all calls to vm_stop will completely stop the
> > suspended state, eg an hmp "stop" command. This causes two problems.
> > First, that is a change in user-visible behavior for something that
> > currently works,
> 
> IMHO it depends on what should be the correct behavior.  IOW, when VM is in
> SUSPENDED state and then the user sends "stop" QMP command, what should we
> expect?

I would say that from a mgmtm app POV "stop" is initiating a state
transition, from RUN_STATE_RUNNING to RUN_STATE_PAUSED and "cont"
is doing the reverse from PAUSED to RUNNING.

It is a little more complicated than that as there are some other
states like INMIGRATE that are conceptually equiv to RUNNING,
and states where the transition simply doesn't make sense.


So my question is if we're in "SUSPENDED" and someone issues "stop",
what state do we go into, and perhaps more importantly what state
do we go to in a subsequent "cont".

If you say SUSPENDED ---(stop)---> PAUSED ---(cont)---> SUSPENDED
then we create a problem, because the decision for the transition
out of PAUSED needs memory of the previous state.

> My understanding is we should expect to fully stop the VM, including the
> ticks, for example.  Keeping the ticks running even after QMP "stop"
> doesn't sound right, isn't it?

The "stop" QMP command is documented as

    "Stop all guest VCPU execution"

the devil is in the detail though, and we've not documented any detail.

Whether or not timers keep running across stop/cont I think can be
argued to be an impl detail, as long as the headline goal "vcpus
don't execute" is satisfied.

> > vs the migration code where we are fixing brokenness.
> 
> This is not a migration-only bug if above holds, IMO.
> 
> > Second, it does not quite work, because the state becomes
> > RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
> > will try to set the running state.  I could fix that by introducing a new
> > state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
> > in existing behavior.  (I even implemented that while developing, then I
> > realized it was not needed to fix the migration bugs.)
> 
> Good point.

We have added new guest states periodically. It is a user visible
change, but you could argue that it is implementing a new feature
ie the ability to "stop" a "suspended" guest, and so is justified.

S3 is so little used in virt, so I'm not surprised we're finding
long standing edge cases that have never been thought about before.

> Now with above comments, what's your thoughts on whether we should change
> the user behavior?  My answer is still a yes.
> 
> Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
> behavior, while something like QMP "stop" is not guest visible.  Maybe we
> should remember it separately?

Yes, every time I look at this area I come away thinking that
the RunState enum is a mess, overloading too many different
concepts onto the same single field.

Specifically "SUSPENDED" vs "RUNNING" is a reflection of guest
state (ie whether or not the VM is in S3), but pretty much all
the others are a reflection of QEMU host state. I kind of feel
that SUSPENDED (S3) probably shouldn't have been a RunState at
all. I'd probably put guest-panicked into a separate thing too.

But we're stuck with what we have.

> It means qemu_system_suspend() could remember that in a separate field (as
> part of guest state), then when wakeup we should conditionally go back
> with/without vcpus running depending on the new "suspended" state.



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-22  9:38         ` Daniel P. Berrangé
@ 2023-11-22 16:21           ` Peter Xu
  2023-11-28 13:26             ` Steven Sistare
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Xu @ 2023-11-22 16:21 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Fabiano Rosas, Leonardo Bras

On Wed, Nov 22, 2023 at 09:38:06AM +0000, Daniel P. Berrangé wrote:
> On Mon, Nov 20, 2023 at 04:44:50PM -0500, Peter Xu wrote:
> > On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
> > > If we drop force, then all calls to vm_stop will completely stop the
> > > suspended state, eg an hmp "stop" command. This causes two problems.
> > > First, that is a change in user-visible behavior for something that
> > > currently works,
> > 
> > IMHO it depends on what should be the correct behavior.  IOW, when VM is in
> > SUSPENDED state and then the user sends "stop" QMP command, what should we
> > expect?
> 
> I would say that from a mgmtm app POV "stop" is initiating a state
> transition, from RUN_STATE_RUNNING to RUN_STATE_PAUSED and "cont"
> is doing the reverse from PAUSED to RUNNING.
> 
> It is a little more complicated than that as there are some other
> states like INMIGRATE that are conceptually equiv to RUNNING,
> and states where the transition simply doesn't make sense.

In the qemu impl, INMIGRATE is, imho, more equiv of PAUSED - do_vm_stop()
mostly ignores every state except RUNNING (putting bdrv operations aside).
IOW, anything besides "running" is treated as "not running".

But then Paolo fixed that in 1e9981465f ("qmp: handle stop/cont in
INMIGRATE state"), wiring that to autostart.

Now we seem to find that "suspended" should actually fall within (where
"vm" is running, but "vcpu" is not), and it seems we should treat "vm" and
"vcpu" differently.

> 
> So my question is if we're in "SUSPENDED" and someone issues "stop",
> what state do we go into, and perhaps more importantly what state
> do we go to in a subsequent "cont".

I think we must stop the "vm", not only the "vcpu".  I discussed this bit
in the other thread more or less: it's because qemu_system_wakeup_request()
can be called in many places, e.g. acpi_pm_tmr_timer().

It means even after the VM is "stop"ped by the admin QMP (where qmp_stop()
ignores SUSPENDED, keep the "vm" running), it can silently got waken up
without admin even noticing it.  I'm not sure what Libvirt will behave if
it suddenly receives a QAPI_EVENT_WAKEUP randomly after a "stop".

> 
> If you say SUSPENDED ---(stop)---> PAUSED ---(cont)---> SUSPENDED
> then we create a problem, because the decision for the transition
> out of PAUSED needs memory of the previous state.

Right, that's why I think we at least need one more boolean to remember the
suspended state, then when we switch from any STOP states into any RUN
states, we know where to go.  Here STOP states I meant anything except
RUNNING and SUSPENDED, while RUN -> RUNNING or SUSPENDED.

> 
> > My understanding is we should expect to fully stop the VM, including the
> > ticks, for example.  Keeping the ticks running even after QMP "stop"
> > doesn't sound right, isn't it?
> 
> The "stop" QMP command is documented as
> 
>     "Stop all guest VCPU execution"
> 
> the devil is in the detail though, and we've not documented any detail.
> 
> Whether or not timers keep running across stop/cont I think can be
> argued to be an impl detail, as long as the headline goal "vcpus
> don't execute" is satisfied.

"stop" was since qemu v0.14, so I guess we can't blame the missing of
details or any form of inaccuracy..  Obviously we do more than "stop vCPU
executions" in the current implementation.

But after we reach a consensus on how we should fix the current suspended
problem, we may want to update the documentation to start containing more
information.

> 
> > > vs the migration code where we are fixing brokenness.
> > 
> > This is not a migration-only bug if above holds, IMO.
> > 
> > > Second, it does not quite work, because the state becomes
> > > RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
> > > will try to set the running state.  I could fix that by introducing a new
> > > state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
> > > in existing behavior.  (I even implemented that while developing, then I
> > > realized it was not needed to fix the migration bugs.)
> > 
> > Good point.
> 
> We have added new guest states periodically. It is a user visible
> change, but you could argue that it is implementing a new feature
> ie the ability to "stop" a "suspended" guest, and so is justified.
> 
> S3 is so little used in virt, so I'm not surprised we're finding
> long standing edge cases that have never been thought about before.
> 
> > Now with above comments, what's your thoughts on whether we should change
> > the user behavior?  My answer is still a yes.
> > 
> > Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
> > behavior, while something like QMP "stop" is not guest visible.  Maybe we
> > should remember it separately?
> 
> Yes, every time I look at this area I come away thinking that
> the RunState enum is a mess, overloading too many different
> concepts onto the same single field.
> 
> Specifically "SUSPENDED" vs "RUNNING" is a reflection of guest
> state (ie whether or not the VM is in S3), but pretty much all
> the others are a reflection of QEMU host state. I kind of feel
> that SUSPENDED (S3) probably shouldn't have been a RunState at
> all. I'd probably put guest-panicked into a separate thing too.
> 
> But we're stuck with what we have.

IMO compatibility is only necessary if at least the existing code is
running well.  But now I see it a major flaw in suspended state and I can't
see how it can go right if with current code base..  My concern is instead
that after suspended will be used more (e.g., assuming CPR will rely on it)
we can have more chance to confuse/oops a mgmt app like Libvirt, like I
described above.

In summary, I think a current solution to me is only to fix at least
suspended state for good, by:

  - adding vm_suspended boolean to remember machine RUNNING / SUSPENDED
    state.  When "cont" we need to switch to either RUNNING / SUSPENDED
    depending on the boolean.

  - keeping SUSPENDED state as RunState (for compatibility, otherwise we'll
    need another interface to fetch that boolean anyway), even though not
    query-able during any !RUNNING && !SUSPENDED state.. hopefully not a
    big deal

  - enrich ducumentation of qmp_stop/qmp_cont to describe what they really do

  - (with suspended working all right...) fix migration of SUSPENDED state

I don't expect a lot of code changes is needed, maybe even less than the
current series (because we don't need special knob to differenciate
migration or non-migration callers of do_vm_stop()). IMHO this series is
already doing some of that but just decided to ignore outside-migration
states for suspeneded.

We may want to add some test cases though to verify the suspended state
transitions (maybe easier to put into migration-test with the new ACPI
guest code), but optional.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH V5 02/12] cpus: stop vm in suspended state
  2023-11-22 16:21           ` Peter Xu
@ 2023-11-28 13:26             ` Steven Sistare
  0 siblings, 0 replies; 35+ messages in thread
From: Steven Sistare @ 2023-11-28 13:26 UTC (permalink / raw)
  To: Peter Xu, Daniel P. Berrangé
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Fabiano Rosas, Leonardo Bras

On 11/22/2023 11:21 AM, Peter Xu wrote:
> On Wed, Nov 22, 2023 at 09:38:06AM +0000, Daniel P. Berrangé wrote:
>> On Mon, Nov 20, 2023 at 04:44:50PM -0500, Peter Xu wrote:
>>> On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
>>>> If we drop force, then all calls to vm_stop will completely stop the
>>>> suspended state, eg an hmp "stop" command. This causes two problems.
>>>> First, that is a change in user-visible behavior for something that
>>>> currently works,
>>>
>>> IMHO it depends on what should be the correct behavior.  IOW, when VM is in
>>> SUSPENDED state and then the user sends "stop" QMP command, what should we
>>> expect?
>>
>> I would say that from a mgmtm app POV "stop" is initiating a state
>> transition, from RUN_STATE_RUNNING to RUN_STATE_PAUSED and "cont"
>> is doing the reverse from PAUSED to RUNNING.
>>
>> It is a little more complicated than that as there are some other
>> states like INMIGRATE that are conceptually equiv to RUNNING,
>> and states where the transition simply doesn't make sense.
> 
> In the qemu impl, INMIGRATE is, imho, more equiv of PAUSED - do_vm_stop()
> mostly ignores every state except RUNNING (putting bdrv operations aside).
> IOW, anything besides "running" is treated as "not running".
> 
> But then Paolo fixed that in 1e9981465f ("qmp: handle stop/cont in
> INMIGRATE state"), wiring that to autostart.
> 
> Now we seem to find that "suspended" should actually fall within (where
> "vm" is running, but "vcpu" is not), and it seems we should treat "vm" and
> "vcpu" differently.
> 
>>
>> So my question is if we're in "SUSPENDED" and someone issues "stop",
>> what state do we go into, and perhaps more importantly what state
>> do we go to in a subsequent "cont".
> 
> I think we must stop the "vm", not only the "vcpu".  I discussed this bit
> in the other thread more or less: it's because qemu_system_wakeup_request()
> can be called in many places, e.g. acpi_pm_tmr_timer().
> 
> It means even after the VM is "stop"ped by the admin QMP (where qmp_stop()
> ignores SUSPENDED, keep the "vm" running), it can silently got waken up
> without admin even noticing it.  I'm not sure what Libvirt will behave if
> it suddenly receives a QAPI_EVENT_WAKEUP randomly after a "stop".
> 
>>
>> If you say SUSPENDED ---(stop)---> PAUSED ---(cont)---> SUSPENDED
>> then we create a problem, because the decision for the transition
>> out of PAUSED needs memory of the previous state.
> 
> Right, that's why I think we at least need one more boolean to remember the
> suspended state, then when we switch from any STOP states into any RUN
> states, we know where to go.  Here STOP states I meant anything except
> RUNNING and SUSPENDED, while RUN -> RUNNING or SUSPENDED.
> 
>>
>>> My understanding is we should expect to fully stop the VM, including the
>>> ticks, for example.  Keeping the ticks running even after QMP "stop"
>>> doesn't sound right, isn't it?
>>
>> The "stop" QMP command is documented as
>>
>>     "Stop all guest VCPU execution"
>>
>> the devil is in the detail though, and we've not documented any detail.
>>
>> Whether or not timers keep running across stop/cont I think can be
>> argued to be an impl detail, as long as the headline goal "vcpus
>> don't execute" is satisfied.
> 
> "stop" was since qemu v0.14, so I guess we can't blame the missing of
> details or any form of inaccuracy..  Obviously we do more than "stop vCPU
> executions" in the current implementation.
> 
> But after we reach a consensus on how we should fix the current suspended
> problem, we may want to update the documentation to start containing more
> information.
> 
>>
>>>> vs the migration code where we are fixing brokenness.
>>>
>>> This is not a migration-only bug if above holds, IMO.
>>>
>>>> Second, it does not quite work, because the state becomes
>>>> RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
>>>> will try to set the running state.  I could fix that by introducing a new
>>>> state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
>>>> in existing behavior.  (I even implemented that while developing, then I
>>>> realized it was not needed to fix the migration bugs.)
>>>
>>> Good point.
>>
>> We have added new guest states periodically. It is a user visible
>> change, but you could argue that it is implementing a new feature
>> ie the ability to "stop" a "suspended" guest, and so is justified.
>>
>> S3 is so little used in virt, so I'm not surprised we're finding
>> long standing edge cases that have never been thought about before.
>>
>>> Now with above comments, what's your thoughts on whether we should change
>>> the user behavior?  My answer is still a yes.
>>>
>>> Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
>>> behavior, while something like QMP "stop" is not guest visible.  Maybe we
>>> should remember it separately?
>>
>> Yes, every time I look at this area I come away thinking that
>> the RunState enum is a mess, overloading too many different
>> concepts onto the same single field.
>>
>> Specifically "SUSPENDED" vs "RUNNING" is a reflection of guest
>> state (ie whether or not the VM is in S3), but pretty much all
>> the others are a reflection of QEMU host state. I kind of feel
>> that SUSPENDED (S3) probably shouldn't have been a RunState at
>> all. I'd probably put guest-panicked into a separate thing too.
>>
>> But we're stuck with what we have.
> 
> IMO compatibility is only necessary if at least the existing code is
> running well.  But now I see it a major flaw in suspended state and I can't
> see how it can go right if with current code base..  My concern is instead
> that after suspended will be used more (e.g., assuming CPR will rely on it)
> we can have more chance to confuse/oops a mgmt app like Libvirt, like I
> described above.
> 
> In summary, I think a current solution to me is only to fix at least
> suspended state for good, by:
> 
>   - adding vm_suspended boolean to remember machine RUNNING / SUSPENDED
>     state.  When "cont" we need to switch to either RUNNING / SUSPENDED
>     depending on the boolean.
> 
>   - keeping SUSPENDED state as RunState (for compatibility, otherwise we'll
>     need another interface to fetch that boolean anyway), even though not
>     query-able during any !RUNNING && !SUSPENDED state.. hopefully not a
>     big deal
> 
>   - enrich ducumentation of qmp_stop/qmp_cont to describe what they really do
> 
>   - (with suspended working all right...) fix migration of SUSPENDED state
> 
> I don't expect a lot of code changes is needed, maybe even less than the
> current series (because we don't need special knob to differenciate
> migration or non-migration callers of do_vm_stop()). IMHO this series is
> already doing some of that but just decided to ignore outside-migration
> states for suspeneded.
> 
> We may want to add some test cases though to verify the suspended state
> transitions (maybe easier to put into migration-test with the new ACPI
> guest code), but optional.

FYI, here is a brief update before today's meeting.  I have implemented this and
I am testing libvirt and its various save + restore commands, when the guest is
suspended running (RUN_STATE_SUSPENDED), and suspended stopped (RUN_STATE_PAUSED
with vm_was_suspended = true). There are a few failures, and I am still investigating 
to see whether they can be fixed in qemu, or need a fix in libvirt.

I will send more details later.

- Steve



^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2023-11-28 13:27 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
2023-11-13 18:33 ` [PATCH V5 01/12] cpus: refactor vm_stop Steve Sistare
2023-11-20 13:22   ` Fabiano Rosas
2023-11-20 19:09     ` Steven Sistare
2023-11-20 19:46       ` Peter Xu
2023-11-20 19:49         ` Steven Sistare
2023-11-20 20:01       ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 02/12] cpus: stop vm in suspended state Steve Sistare
2023-11-20 14:15   ` Fabiano Rosas
2023-11-20 19:10     ` Steven Sistare
2023-11-20 19:59   ` Peter Xu
2023-11-20 20:47     ` Fabiano Rosas
2023-11-20 21:26       ` Steven Sistare
2023-11-20 20:55     ` Steven Sistare
2023-11-20 21:44       ` Peter Xu
2023-11-21 21:21         ` Steven Sistare
2023-11-21 22:47           ` Peter Xu
2023-11-22  9:38         ` Daniel P. Berrangé
2023-11-22 16:21           ` Peter Xu
2023-11-28 13:26             ` Steven Sistare
2023-11-13 18:33 ` [PATCH V5 03/12] cpus: pass runstate to vm_prepare_start Steve Sistare
2023-11-13 18:33 ` [PATCH V5 04/12] cpus: start vm in suspended state Steve Sistare
2023-11-20 17:20   ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 05/12] migration: preserve suspended runstate Steve Sistare
2023-11-20 17:30   ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 06/12] migration: preserve suspended for snapshot Steve Sistare
2023-11-20 18:13   ` Fabiano Rosas
2023-11-20 19:10     ` Steven Sistare
2023-11-13 18:33 ` [PATCH V5 07/12] migration: preserve suspended for bg_migration Steve Sistare
2023-11-20 18:18   ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 08/12] tests/qtest: migration events Steve Sistare
2023-11-13 18:33 ` [PATCH V5 09/12] tests/qtest: option to suspend during migration Steve Sistare
2023-11-13 18:33 ` [PATCH V5 10/12] tests/qtest: precopy migration with suspend Steve Sistare
2023-11-13 18:33 ` [PATCH V5 11/12] tests/qtest: postcopy " Steve Sistare
2023-11-13 18:34 ` [PATCH V5 12/12] tests/qtest: background " Steve Sistare

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.