All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V6 00/14] fix migration of suspended runstate
@ 2023-11-30 21:37 Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 01/14] cpus: pass runstate to vm_prepare_start Steve Sistare
                   ` (14 more replies)
  0 siblings, 15 replies; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Migration of a guest in the suspended runstate is broken.  The incoming
migration code automatically tries to wake the guest, which is wrong;
the guest should end migration in the same runstate it started.  Further,
after saving a snapshot in the suspended state and loading it, the vm_start
fails.  The runstate is RUNNING, but the guest is not.

See the commit messages for the details.

Changes in V2:
  * simplify "start on wakeup request"
  * fix postcopy, snapshot, and background migration
  * refactor fixes for each type of migration
  * explicitly handled suspended events and runstate in tests
  * add test for postcopy and background migration

Changes in V3:
  * rebase to tip
  * fix hang in new function migrate_wait_for_dirty_mem

Changes in V4:
  * rebase to tip
  * add patch for vm_prepare_start (thanks Peter)
  * add patch to preserve cpu ticks

Changes in V5:
  * rebase to tip
  * added patches to completely stop vm in suspended state:
      cpus: refactor vm_stop
      cpus: stop vm in suspended state
  * added patch to partially resume vm in suspended state:
      cpus: start vm in suspended state
  * modified "preserve suspended ..." patches to use the above.
  * deleted patch "preserve cpu ticks if suspended".  stop ticks in
    vm_stop_force_state instead.
  * deleted patch "add runstate function".  defined new helper function
    migrate_new_runstate in "preserve suspended runstate"
  * Added some RB's, but removed other RB's because the patches changed.

Changes in V6:
  * all vm_stop calls completely stop the suspended state
  * refactored and updated the "cpus" patches
  * simplified the "preserve suspended" patches
  * added patch "bootfile per vm"

Steve Sistare (14):
  cpus: pass runstate to vm_prepare_start
  cpus: vm_was_suspended
  cpus: stop vm in suspended runstate
  cpus: vm_resume
  migration: propagate suspended runstate
  migration: preserve suspended runstate
  migration: preserve suspended for snapshot
  migration: preserve suspended for bg_migration
  tests/qtest: migration events
  tests/qtest: option to suspend during migration
  tests/qtest: precopy migration with suspend
  tests/qtest: postcopy migration with suspend
  tests/qtest: bootfile per vm
  tests/qtest: background migration with suspend

 backends/tpm/tpm_emulator.c          |   2 +-
 gdbstub/system.c                     |   2 +-
 hw/usb/hcd-ehci.c                    |   2 +-
 hw/usb/redirect.c                    |   2 +-
 hw/xen/xen-hvm-common.c              |   2 +-
 include/migration/snapshot.h         |   7 +
 include/sysemu/runstate.h            |  19 ++-
 migration/global_state.c             |  10 ++
 migration/migration-hmp-cmds.c       |   8 +-
 migration/migration.c                |  15 +--
 migration/savevm.c                   |  23 ++--
 qapi/misc.json                       |  10 +-
 system/cpus.c                        |  49 +++++--
 system/runstate.c                    |   9 ++
 system/vl.c                          |   2 +
 tests/migration/i386/Makefile        |   5 +-
 tests/migration/i386/a-b-bootblock.S |  50 +++++++-
 tests/migration/i386/a-b-bootblock.h |  26 ++--
 tests/qtest/migration-helpers.c      |  27 ++--
 tests/qtest/migration-helpers.h      |  11 +-
 tests/qtest/migration-test.c         | 240 +++++++++++++++++++++++++----------
 21 files changed, 382 insertions(+), 139 deletions(-)

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH V6 01/14] cpus: pass runstate to vm_prepare_start
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 02/14] cpus: vm_was_suspended Steve Sistare
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

When a vm in the suspended state is migrated, we must call vm_prepare_start
on the destination, so a later system_wakeup properly resumes the guest,
when main_loop_should_exit calls resume_all_vcpus.  However, the runstate
should remain suspended until system_wakeup is called, so allow the caller
to pass the new state to vm_prepare_start, rather than assume the new state
is RUN_STATE_RUNNING.  Modify vm state change handlers that check
RUN_STATE_RUNNING to instead use the running parameter.

No functional change.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
 backends/tpm/tpm_emulator.c | 2 +-
 gdbstub/system.c            | 2 +-
 hw/usb/hcd-ehci.c           | 2 +-
 hw/usb/redirect.c           | 2 +-
 hw/xen/xen-hvm-common.c     | 2 +-
 include/sysemu/runstate.h   | 4 +++-
 system/cpus.c               | 8 ++++----
 7 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c
index f7f1b4a..254fce7 100644
--- a/backends/tpm/tpm_emulator.c
+++ b/backends/tpm/tpm_emulator.c
@@ -904,7 +904,7 @@ static void tpm_emulator_vm_state_change(void *opaque, bool running,
 
     trace_tpm_emulator_vm_state_change(running, state);
 
-    if (!running || state != RUN_STATE_RUNNING || !tpm_emu->relock_storage) {
+    if (!running || !tpm_emu->relock_storage) {
         return;
     }
 
diff --git a/gdbstub/system.c b/gdbstub/system.c
index 783ac14..7ab9f82 100644
--- a/gdbstub/system.c
+++ b/gdbstub/system.c
@@ -570,7 +570,7 @@ int gdb_continue_partial(char *newstates)
             }
         }
 
-        if (vm_prepare_start(step_requested)) {
+        if (vm_prepare_start(step_requested, RUN_STATE_RUNNING)) {
             return 0;
         }
 
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index 19b4534..10c82ce 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -2451,7 +2451,7 @@ static void usb_ehci_vm_state_change(void *opaque, bool running, RunState state)
      * USB-devices which have async handled packages have a packet in the
      * ep queue to match the completion with.
      */
-    if (state == RUN_STATE_RUNNING) {
+    if (running) {
         ehci_advance_async_state(ehci);
     }
 
diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index c9893df..3785bb0 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1403,7 +1403,7 @@ static void usbredir_vm_state_change(void *priv, bool running, RunState state)
 {
     USBRedirDevice *dev = priv;
 
-    if (state == RUN_STATE_RUNNING && dev->parser != NULL) {
+    if (running && dev->parser != NULL) {
         usbredirparser_do_write(dev->parser); /* Flush any pending writes */
     }
 }
diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index 565dc39..47e6cb1 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -623,7 +623,7 @@ void xen_hvm_change_state_handler(void *opaque, bool running,
 
     xen_set_ioreq_server_state(xen_domid,
                                state->ioservid,
-                               (rstate == RUN_STATE_RUNNING));
+                               running);
 }
 
 void xen_exit_notifier(Notifier *n, void *data)
diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index c8c2bd8..9e78c7f 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -46,8 +46,10 @@ void vm_start(void);
  * vm_prepare_start: Prepare for starting/resuming the VM
  *
  * @step_pending: whether any of the CPUs is about to be single-stepped by gdb
+ * @state: the vm state to setup
  */
-int vm_prepare_start(bool step_pending);
+int vm_prepare_start(bool step_pending, RunState state);
+
 int vm_stop(RunState state);
 int vm_stop_force_state(RunState state);
 int vm_shutdown(void);
diff --git a/system/cpus.c b/system/cpus.c
index a444a74..0c60d7a 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -682,7 +682,7 @@ int vm_stop(RunState state)
  * Returns -1 if the vCPUs are not to be restarted (e.g. if they are already
  * running or in case of an error condition), 0 otherwise.
  */
-int vm_prepare_start(bool step_pending)
+int vm_prepare_start(bool step_pending, RunState state)
 {
     RunState requested;
 
@@ -714,14 +714,14 @@ int vm_prepare_start(bool step_pending)
     qapi_event_send_resume();
 
     cpu_enable_ticks();
-    runstate_set(RUN_STATE_RUNNING);
-    vm_state_notify(1, RUN_STATE_RUNNING);
+    runstate_set(state);
+    vm_state_notify(1, state);
     return 0;
 }
 
 void vm_start(void)
 {
-    if (!vm_prepare_start(false)) {
+    if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
         resume_all_vcpus();
     }
 }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 02/14] cpus: vm_was_suspended
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 01/14] cpus: pass runstate to vm_prepare_start Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-11-30 22:03   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 03/14] cpus: stop vm in suspended runstate Steve Sistare
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a state variable to remember if a vm previously transitioned into a
suspended state.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/sysemu/runstate.h |  2 ++
 system/cpus.c             | 15 +++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 9e78c7f..f6a337b 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -53,6 +53,8 @@ int vm_prepare_start(bool step_pending, RunState state);
 int vm_stop(RunState state);
 int vm_stop_force_state(RunState state);
 int vm_shutdown(void);
+void vm_set_suspended(bool suspended);
+bool vm_get_suspended(void);
 
 typedef enum WakeupReason {
     /* Always keep QEMU_WAKEUP_REASON_NONE = 0 */
diff --git a/system/cpus.c b/system/cpus.c
index 0c60d7a..ef7a0d3 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -259,6 +259,21 @@ void cpu_interrupt(CPUState *cpu, int mask)
     }
 }
 
+/*
+ * True if the vm was previously suspended, and has not been woken or reset.
+ */
+static int vm_was_suspended;
+
+void vm_set_suspended(bool suspended)
+{
+    vm_was_suspended = suspended;
+}
+
+bool vm_get_suspended(void)
+{
+    return vm_was_suspended;
+}
+
 static int do_vm_stop(RunState state, bool send_stop)
 {
     int ret = 0;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 01/14] cpus: pass runstate to vm_prepare_start Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 02/14] cpus: vm_was_suspended Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-11-30 22:10   ` Peter Xu
  2023-12-22 12:20   ` Markus Armbruster
  2023-11-30 21:37 ` [PATCH V6 04/14] cpus: vm_resume Steve Sistare
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Currently, a vm in the suspended state is not completely stopped.  The VCPUs
have been paused, but the cpu clock still runs, and runstate notifiers for
the transition to stopped have not been called.  This causes problems for
live migration.  Stale cpu timers_state is saved to the migration stream,
causing time errors in the guest when it wakes from suspend, and state that
would have been modified by runstate notifiers is wrong.

Modify vm_stop to completely stop the vm if the current state is suspended,
transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
Modify vm_start to restore the suspended state.

This affects all callers of vm_stop and vm_start, notably, the qapi stop and
cont commands.  For example:

    (qemu) info status
    VM status: paused (suspended)

    (qemu) stop
    (qemu) info status
    VM status: paused

    (qemu) cont
    (qemu) info status
    VM status: paused (suspended)

    (qemu) system_wakeup
    (qemu) info status
    VM status: running

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/sysemu/runstate.h |  5 +++++
 qapi/misc.json            | 10 ++++++++--
 system/cpus.c             | 19 ++++++++++++++-----
 system/runstate.c         |  3 +++
 4 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index f6a337b..1d6828f 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
     return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
 }
 
+static inline bool runstate_is_started(RunState state)
+{
+    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
+}
+
 void vm_start(void);
 
 /**
diff --git a/qapi/misc.json b/qapi/misc.json
index cda2eff..efb8d44 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -134,7 +134,7 @@
 ##
 # @stop:
 #
-# Stop all guest VCPU execution.
+# Stop all guest VCPU and VM execution.
 #
 # Since: 0.14
 #
@@ -143,6 +143,9 @@
 #     the guest remains paused once migration finishes, as if the -S
 #     option was passed on the command line.
 #
+#     In the "suspended" state, it will completely stop the VM and
+#     cause a transition to the "paused" state. (Since 9.0)
+#
 # Example:
 #
 # -> { "execute": "stop" }
@@ -153,7 +156,7 @@
 ##
 # @cont:
 #
-# Resume guest VCPU execution.
+# Resume guest VCPU and VM execution.
 #
 # Since: 0.14
 #
@@ -165,6 +168,9 @@
 #     guest starts once migration finishes, removing the effect of the
 #     -S command line option if it was passed.
 #
+#     If the VM was previously suspended, and not been reset or woken,
+#     this command will transition back to the "suspended" state. (Since 9.0)
+#
 # Example:
 #
 # -> { "execute": "cont" }
diff --git a/system/cpus.c b/system/cpus.c
index ef7a0d3..cbc6d6d 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -277,11 +277,15 @@ bool vm_get_suspended(void)
 static int do_vm_stop(RunState state, bool send_stop)
 {
     int ret = 0;
+    RunState oldstate = runstate_get();
 
-    if (runstate_is_running()) {
+    if (runstate_is_started(oldstate)) {
+        vm_was_suspended = (oldstate == RUN_STATE_SUSPENDED);
         runstate_set(state);
         cpu_disable_ticks();
-        pause_all_vcpus();
+        if (oldstate == RUN_STATE_RUNNING) {
+            pause_all_vcpus();
+        }
         vm_state_notify(0, state);
         if (send_stop) {
             qapi_event_send_stop();
@@ -736,8 +740,13 @@ int vm_prepare_start(bool step_pending, RunState state)
 
 void vm_start(void)
 {
-    if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
-        resume_all_vcpus();
+    RunState state = vm_was_suspended ? RUN_STATE_SUSPENDED : RUN_STATE_RUNNING;
+
+    if (!vm_prepare_start(false, state)) {
+        if (state == RUN_STATE_RUNNING) {
+            resume_all_vcpus();
+        }
+        vm_was_suspended = false;
     }
 }
 
@@ -745,7 +754,7 @@ void vm_start(void)
    current state is forgotten forever */
 int vm_stop_force_state(RunState state)
 {
-    if (runstate_is_running()) {
+    if (runstate_is_started(runstate_get())) {
         return vm_stop(state);
     } else {
         int ret;
diff --git a/system/runstate.c b/system/runstate.c
index ea9d6c2..e2fa204 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -108,6 +108,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
     { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
     { RUN_STATE_PAUSED, RUN_STATE_COLO},
+    { RUN_STATE_PAUSED, RUN_STATE_SUSPENDED},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -161,6 +162,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
     { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
     { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
+    { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED},
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
@@ -502,6 +504,7 @@ void qemu_system_reset(ShutdownCause reason)
         qapi_event_send_reset(shutdown_caused_by_guest(reason), reason);
     }
     cpu_synchronize_all_post_reset();
+    vm_set_suspended(false);
 }
 
 /*
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 04/14] cpus: vm_resume
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (2 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 03/14] cpus: stop vm in suspended runstate Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-05 21:36   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 05/14] migration: propagate suspended runstate Steve Sistare
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Define the vm_resume helper, for use in subsequent patches.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/sysemu/runstate.h | 8 ++++++++
 system/cpus.c             | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 1d6828f..a900cec 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -55,6 +55,14 @@ void vm_start(void);
  */
 int vm_prepare_start(bool step_pending, RunState state);
 
+/**
+ * vm_resume: If @state is a startable state, start the vm and set the state,
+ * else just set the state.
+ *
+ * @state: the state to restore
+ */
+void vm_resume(RunState state);
+
 int vm_stop(RunState state);
 int vm_stop_force_state(RunState state);
 int vm_shutdown(void);
diff --git a/system/cpus.c b/system/cpus.c
index cbc6d6d..63cf356 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -750,6 +750,15 @@ void vm_start(void)
     }
 }
 
+void vm_resume(RunState state)
+{
+    if (runstate_is_started(state)) {
+        vm_start();
+    } else {
+        runstate_set(state);
+    }
+}
+
 /* does a state transition even if the VM is already stopped,
    current state is forgotten forever */
 int vm_stop_force_state(RunState state)
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 05/14] migration: propagate suspended runstate
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (3 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 04/14] cpus: vm_resume Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-11-30 23:06   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 06/14] migration: preserve " Steve Sistare
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

If the outgoing machine was previously suspended, propagate that to the
incoming side via global_state, so a subsequent vm_start restores the
suspended state.  To maintain backward and forward compatibility, define
the new field in a zero'd hole in the GlobalState struct.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 migration/global_state.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/migration/global_state.c b/migration/global_state.c
index 4e2a9d8..de2532c 100644
--- a/migration/global_state.c
+++ b/migration/global_state.c
@@ -25,6 +25,7 @@ typedef struct {
     uint8_t runstate[100];
     RunState state;
     bool received;
+    bool vm_was_suspended;
 } GlobalState;
 
 static GlobalState global_state;
@@ -35,6 +36,7 @@ static void global_state_do_store(RunState state)
     assert(strlen(state_str) < sizeof(global_state.runstate));
     strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
               state_str, '\0');
+    global_state.vm_was_suspended = vm_get_suspended();
 }
 
 void global_state_store(void)
@@ -68,6 +70,12 @@ static bool global_state_needed(void *opaque)
         return true;
     }
 
+    /* If the suspended state must be remembered, it is needed */
+
+    if (vm_get_suspended()) {
+        return true;
+    }
+
     /* If state is running or paused, it is not needed */
 
     if (strcmp(runstate, "running") == 0 ||
@@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
         return -EINVAL;
     }
     s->state = r;
+    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
 
     return 0;
 }
@@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(size, GlobalState),
         VMSTATE_BUFFER(runstate, GlobalState),
+        VMSTATE_BOOL(vm_was_suspended, GlobalState),
         VMSTATE_END_OF_LIST()
     },
 };
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 06/14] migration: preserve suspended runstate
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (4 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 05/14] migration: propagate suspended runstate Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-05 21:34   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 07/14] migration: preserve suspended for snapshot Steve Sistare
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

A guest that is migrated in the suspended state automaticaly wakes and
continues execution.  This is wrong; the guest should end migration in
the same state it started.  The root cause is that the outgoing migration
code automatically wakes the guest, then saves the RUNNING runstate in
global_state_store(), hence the incoming migration code thinks the guest is
running and continues the guest if autostart is true.

On the outgoing side, delete the call to qemu_system_wakeup_request().
Now that vm_stop completely stops a vm in the suspended state (from the
preceding patches), the existing call to vm_stop_force_state is sufficient
to correctly migrate all vmstate.

On the incoming side, call vm_start if the pre-migration state was running
or suspended.  For the latter, vm_start correctly restores the suspended
state, and a future system_wakeup monitor request will cause the vm to
resume running.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 migration/migration.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 28a34c9..d1d94c4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -603,7 +603,7 @@ static void process_incoming_migration_bh(void *opaque)
      */
     if (!migrate_late_block_activate() ||
          (autostart && (!global_state_received() ||
-            global_state_get_runstate() == RUN_STATE_RUNNING))) {
+            runstate_is_started(global_state_get_runstate())))) {
         /* Make sure all file formats throw away their mutable metadata.
          * If we get an error here, just don't restart the VM yet. */
         bdrv_activate_all(&local_err);
@@ -627,7 +627,7 @@ static void process_incoming_migration_bh(void *opaque)
     dirty_bitmap_mig_before_vm_start();
 
     if (!global_state_received() ||
-        global_state_get_runstate() == RUN_STATE_RUNNING) {
+        runstate_is_started(global_state_get_runstate())) {
         if (autostart) {
             vm_start();
         } else {
@@ -2415,7 +2415,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
 
     migration_downtime_start(ms);
 
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     global_state_store();
     ret = migration_stop_vm(RUN_STATE_FINISH_MIGRATE);
     if (ret < 0) {
@@ -2614,7 +2613,6 @@ static int migration_completion_precopy(MigrationState *s,
 
     qemu_mutex_lock_iothread();
     migration_downtime_start(s);
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
 
     s->vm_old_state = runstate_get();
     global_state_store();
@@ -3135,7 +3133,7 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
-        if (s->vm_old_state == RUN_STATE_RUNNING) {
+        if (runstate_is_started(s->vm_old_state)) {
             if (!runstate_check(RUN_STATE_SHUTDOWN)) {
                 vm_start();
             }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 07/14] migration: preserve suspended for snapshot
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (5 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 06/14] migration: preserve " Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-05 21:35   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 08/14] migration: preserve suspended for bg_migration Steve Sistare
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Restoring a snapshot can break a suspended guest.  Snapshots suffer from
the same suspended-state issues that affect live migration, plus they must
handle an additional problematic scenario, which is that a running vm must
remain running if it loads a suspended snapshot.  Currently, after loading
such a snapshot, the vm_start fails.  The runstate is RUNNING, but the guest
is not.

To save, the vm_stop call now completely stops the suspended state, courtesy
of a recent patch.  Finish with vm_resume to leave the vm in the state it had
prior to the save, correctly restoring the suspended state.

To load, if the snapshot is not suspended, then vm_stop + vm_resume
correctly handles all states, and leaves the vm in the state it had prior
to the load.  However, if the snapshot is suspended, restoration is
trickier.  First, call vm_resume to restore the state to suspended so the
current state matches the saved state.  Then, if the pre-load state is
running, call wakeup to resume running.

Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
RUN_STATE_RESTORE_VM did not change runstate if the current state was
paused, suspended, or prelaunch, but now it does, so allow these
transitions.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 include/migration/snapshot.h   |  7 +++++++
 migration/migration-hmp-cmds.c |  8 +++++---
 migration/savevm.c             | 23 +++++++++++++----------
 system/runstate.c              |  5 +++++
 system/vl.c                    |  2 ++
 5 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index e72083b..9e4dcaa 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -16,6 +16,7 @@
 #define QEMU_MIGRATION_SNAPSHOT_H
 
 #include "qapi/qapi-builtin-types.h"
+#include "qapi/qapi-types-run-state.h"
 
 /**
  * save_snapshot: Save an internal snapshot.
@@ -61,4 +62,10 @@ bool delete_snapshot(const char *name,
                     bool has_devices, strList *devices,
                     Error **errp);
 
+/**
+ * load_snapshot_resume: Restore runstate after loading snapshot.
+ * @state: state to restore
+ */
+void load_snapshot_resume(RunState state);
+
 #endif
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 86ae832..c8d70bc 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -399,15 +399,17 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
 
 void hmp_loadvm(Monitor *mon, const QDict *qdict)
 {
-    int saved_vm_running  = runstate_is_running();
+    RunState saved_state = runstate_get();
+
     const char *name = qdict_get_str(qdict, "name");
     Error *err = NULL;
 
     vm_stop(RUN_STATE_RESTORE_VM);
 
-    if (load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
-        vm_start();
+    if (load_snapshot(name, NULL, false, NULL, &err)) {
+        load_snapshot_resume(saved_state);
     }
+
     hmp_handle_error(mon, err);
 }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index eec5503..78697c0 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3046,7 +3046,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
     QEMUSnapshotInfo sn1, *sn = &sn1;
     int ret = -1, ret2;
     QEMUFile *f;
-    int saved_vm_running;
+    RunState saved_state = runstate_get();
     uint64_t vm_state_size;
     g_autoptr(GDateTime) now = g_date_time_new_now_local();
     AioContext *aio_context;
@@ -3094,8 +3094,6 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
     }
     aio_context = bdrv_get_aio_context(bs);
 
-    saved_vm_running = runstate_is_running();
-
     global_state_store();
     vm_stop(RUN_STATE_SAVE_VM);
 
@@ -3163,9 +3161,7 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
 
     bdrv_drain_all_end();
 
-    if (saved_vm_running) {
-        vm_start();
-    }
+    vm_resume(saved_state);
     return ret == 0;
 }
 
@@ -3339,6 +3335,14 @@ err_drain:
     return false;
 }
 
+void load_snapshot_resume(RunState state)
+{
+    vm_resume(state);
+    if (state == RUN_STATE_RUNNING && runstate_get() == RUN_STATE_SUSPENDED) {
+        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
+    }
+}
+
 bool delete_snapshot(const char *name, bool has_devices,
                      strList *devices, Error **errp)
 {
@@ -3403,16 +3407,15 @@ static void snapshot_load_job_bh(void *opaque)
 {
     Job *job = opaque;
     SnapshotJob *s = container_of(job, SnapshotJob, common);
-    int orig_vm_running;
+    RunState orig_state = runstate_get();
 
     job_progress_set_remaining(&s->common, 1);
 
-    orig_vm_running = runstate_is_running();
     vm_stop(RUN_STATE_RESTORE_VM);
 
     s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
-    if (s->ret && orig_vm_running) {
-        vm_start();
+    if (s->ret) {
+        load_snapshot_resume(orig_state);
     }
 
     job_progress_update(&s->common, 1);
diff --git a/system/runstate.c b/system/runstate.c
index e2fa204..ca9eb54 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -77,6 +77,7 @@ typedef struct {
 
 static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
+    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
     { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
@@ -132,6 +133,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_RESTORE_VM, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_COLO, RUN_STATE_RUNNING },
     { RUN_STATE_COLO, RUN_STATE_PRELAUNCH },
@@ -150,6 +152,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_COLO},
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
+    { RUN_STATE_SAVE_VM, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_SHUTDOWN, RUN_STATE_PAUSED },
     { RUN_STATE_SHUTDOWN, RUN_STATE_FINISH_MIGRATE },
@@ -163,6 +166,8 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
     { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
     { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED},
+    { RUN_STATE_SUSPENDED, RUN_STATE_SAVE_VM },
+    { RUN_STATE_SUSPENDED, RUN_STATE_RESTORE_VM },
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
diff --git a/system/vl.c b/system/vl.c
index 2bcd9ef..2fbbbba 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -2706,7 +2706,9 @@ void qmp_x_exit_preconfig(Error **errp)
     qemu_machine_creation_done();
 
     if (loadvm) {
+        RunState state = autostart ? RUN_STATE_RUNNING : runstate_get();
         load_snapshot(loadvm, NULL, false, NULL, &error_fatal);
+        load_snapshot_resume(state);
     }
     if (replay_mode != REPLAY_MODE_NONE) {
         replay_vmstate_init();
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 08/14] migration: preserve suspended for bg_migration
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (6 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 07/14] migration: preserve suspended for snapshot Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-05 21:35   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 09/14] tests/qtest: migration events Steve Sistare
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Do not wake a suspended guest during bg_migration, and restore the prior
state at finish rather than unconditionally running.  Allow the additional
state transitions that occur.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 7 +------
 system/runstate.c     | 1 +
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index d1d94c4..63c616f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3389,7 +3389,7 @@ static void bg_migration_vm_start_bh(void *opaque)
     qemu_bh_delete(s->vm_start_bh);
     s->vm_start_bh = NULL;
 
-    vm_start();
+    vm_resume(s->vm_old_state);
     migration_downtime_end(s);
 }
 
@@ -3461,11 +3461,6 @@ static void *bg_migration_thread(void *opaque)
 
     qemu_mutex_lock_iothread();
 
-    /*
-     * If VM is currently in suspended state, then, to make a valid runstate
-     * transition in vm_stop_force_state() we need to wakeup it up.
-     */
-    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
     s->vm_old_state = runstate_get();
 
     global_state_store();
diff --git a/system/runstate.c b/system/runstate.c
index ca9eb54..621a023 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -168,6 +168,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED},
     { RUN_STATE_SUSPENDED, RUN_STATE_SAVE_VM },
     { RUN_STATE_SUSPENDED, RUN_STATE_RESTORE_VM },
+    { RUN_STATE_SUSPENDED, RUN_STATE_SHUTDOWN },
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 09/14] tests/qtest: migration events
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (7 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 08/14] migration: preserve suspended for bg_migration Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 10/14] tests/qtest: option to suspend during migration Steve Sistare
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Define a state object to capture events seen by migration tests, to allow
more events to be captured in a subsequent patch, and simplify event
checking in wait_for_migration_pass.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-helpers.c | 24 ++++-------
 tests/qtest/migration-helpers.h |  9 ++--
 tests/qtest/migration-test.c    | 91 +++++++++++++++++++----------------------
 3 files changed, 56 insertions(+), 68 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 24fb7b3..fd3b94e 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -24,26 +24,16 @@
  */
 #define MIGRATION_STATUS_WAIT_TIMEOUT 120
 
-bool migrate_watch_for_stop(QTestState *who, const char *name,
-                            QDict *event, void *opaque)
-{
-    bool *seen = opaque;
-
-    if (g_str_equal(name, "STOP")) {
-        *seen = true;
-        return true;
-    }
-
-    return false;
-}
-
-bool migrate_watch_for_resume(QTestState *who, const char *name,
+bool migrate_watch_for_events(QTestState *who, const char *name,
                               QDict *event, void *opaque)
 {
-    bool *seen = opaque;
+    QTestMigrationState *state = opaque;
 
-    if (g_str_equal(name, "RESUME")) {
-        *seen = true;
+    if (g_str_equal(name, "STOP")) {
+        state->stop_seen = true;
+        return true;
+    } else if (g_str_equal(name, "RESUME")) {
+        state->resume_seen = true;
         return true;
     }
 
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index e31dc85..3d32699 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -15,9 +15,12 @@
 
 #include "libqtest.h"
 
-bool migrate_watch_for_stop(QTestState *who, const char *name,
-                            QDict *event, void *opaque);
-bool migrate_watch_for_resume(QTestState *who, const char *name,
+typedef struct QTestMigrationState {
+    bool stop_seen;
+    bool resume_seen;
+} QTestMigrationState;
+
+bool migrate_watch_for_events(QTestState *who, const char *name,
                               QDict *event, void *opaque);
 
 G_GNUC_PRINTF(3, 4)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 0fbaa6a..05c0740 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -43,8 +43,8 @@
 unsigned start_address;
 unsigned end_address;
 static bool uffd_feature_thread_id;
-static bool got_src_stop;
-static bool got_dst_resume;
+static QTestMigrationState src_state;
+static QTestMigrationState dst_state;
 
 /*
  * An initial 3 MB offset is used as that corresponds
@@ -230,6 +230,20 @@ static void wait_for_serial(const char *side)
     } while (true);
 }
 
+static void wait_for_stop(QTestState *who, QTestMigrationState *state)
+{
+    if (!state->stop_seen) {
+        qtest_qmp_eventwait(who, "STOP");
+    }
+}
+
+static void wait_for_resume(QTestState *who, QTestMigrationState *state)
+{
+    if (!state->resume_seen) {
+        qtest_qmp_eventwait(who, "RESUME");
+    }
+}
+
 /*
  * It's tricky to use qemu's migration event capability with qtest,
  * events suddenly appearing confuse the qmp()/hmp() responses.
@@ -277,21 +291,19 @@ static void read_blocktime(QTestState *who)
     qobject_unref(rsp_return);
 }
 
+/*
+ * Wait for two changes in the migration pass count, but bail if we stop.
+ */
 static void wait_for_migration_pass(QTestState *who)
 {
-    uint64_t initial_pass = get_migration_pass(who);
-    uint64_t pass;
+    uint64_t pass, prev_pass = 0, changes = 0;
 
-    /* Wait for the 1st sync */
-    while (!got_src_stop && !initial_pass) {
-        usleep(1000);
-        initial_pass = get_migration_pass(who);
-    }
-
-    do {
+    while (changes < 2 && !src_state.stop_seen) {
         usleep(1000);
         pass = get_migration_pass(who);
-    } while (pass == initial_pass && !got_src_stop);
+        changes += (pass != prev_pass);
+        prev_pass = pass;
+    }
 }
 
 static void check_guests_ram(QTestState *who)
@@ -617,10 +629,7 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
 {
     qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
 
-    if (!got_src_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
-
+    wait_for_stop(from, &src_state);
     qtest_qmp_eventwait(to, "RESUME");
 }
 
@@ -756,8 +765,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         }
     }
 
-    got_src_stop = false;
-    got_dst_resume = false;
+    dst_state = (QTestMigrationState) { };
+    src_state = (QTestMigrationState) { };
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         memory_size = "150M";
 
@@ -848,8 +857,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     if (!args->only_target) {
         *from = qtest_init_with_env(QEMU_ENV_SRC, cmd_source);
         qtest_qmp_set_event_callback(*from,
-                                     migrate_watch_for_stop,
-                                     &got_src_stop);
+                                     migrate_watch_for_events,
+                                     &src_state);
     }
 
     cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
@@ -869,8 +878,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
                                  ignore_stderr);
     *to = qtest_init_with_env(QEMU_ENV_DST, cmd_target);
     qtest_qmp_set_event_callback(*to,
-                                 migrate_watch_for_resume,
-                                 &got_dst_resume);
+                                 migrate_watch_for_events,
+                                 &dst_state);
 
     /*
      * Remove shmem file immediately to avoid memory leak in test failed case.
@@ -1717,9 +1726,7 @@ static void test_precopy_common(MigrateCommon *args)
          */
         if (args->result == MIG_TEST_SUCCEED) {
             qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
-            if (!got_src_stop) {
-                qtest_qmp_eventwait(from, "STOP");
-            }
+            wait_for_stop(from, &src_state);
             migrate_ensure_converge(from);
         }
     }
@@ -1765,9 +1772,8 @@ static void test_precopy_common(MigrateCommon *args)
              */
             wait_for_migration_complete(from);
 
-            if (!got_src_stop) {
-                qtest_qmp_eventwait(from, "STOP");
-            }
+            wait_for_stop(from, &src_state);
+
         } else {
             wait_for_migration_complete(from);
             /*
@@ -1780,9 +1786,7 @@ static void test_precopy_common(MigrateCommon *args)
             qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
         }
 
-        if (!got_dst_resume) {
-            qtest_qmp_eventwait(to, "RESUME");
-        }
+        wait_for_resume(to, &dst_state);
 
         wait_for_serial("dest_serial");
     }
@@ -1821,9 +1825,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
 
     if (stop_src) {
         qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
-        if (!got_src_stop) {
-            qtest_qmp_eventwait(from, "STOP");
-        }
+        wait_for_stop(from, &src_state);
     }
 
     if (args->result == MIG_TEST_QMP_ERROR) {
@@ -1844,10 +1846,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
     if (stop_src) {
         qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
     }
-
-    if (!got_dst_resume) {
-        qtest_qmp_eventwait(to, "RESUME");
-    }
+    wait_for_resume(to, &dst_state);
 
     wait_for_serial("dest_serial");
 
@@ -1966,9 +1965,7 @@ static void test_ignore_shared(void)
 
     migrate_wait_for_dirty_mem(from, to);
 
-    if (!got_src_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
+    wait_for_stop(from, &src_state);
 
     qtest_qmp_eventwait(to, "RESUME");
 
@@ -2503,7 +2500,7 @@ static void test_migrate_auto_converge(void)
             break;
         }
         usleep(20);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     } while (true);
     /* The first percentage of throttling should be at least init_pct */
     g_assert_cmpint(percentage, >=, init_pct);
@@ -2842,9 +2839,7 @@ static void test_multifd_tcp_cancel(void)
 
     migrate_ensure_converge(from);
 
-    if (!got_src_stop) {
-        qtest_qmp_eventwait(from, "STOP");
-    }
+    wait_for_stop(from, &src_state);
     qtest_qmp_eventwait(to2, "RESUME");
 
     wait_for_serial("dest_serial");
@@ -3177,7 +3172,7 @@ static void test_migrate_dirty_limit(void)
         throttle_us_per_full =
         read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
         usleep(100);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     }
 
     /* Now cancel migrate and wait for dirty limit throttle switch off */
@@ -3189,7 +3184,7 @@ static void test_migrate_dirty_limit(void)
         throttle_us_per_full =
         read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
         usleep(100);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     } while (throttle_us_per_full != 0 && --max_try_count);
 
     /* Assert dirty limit is not in service */
@@ -3218,7 +3213,7 @@ static void test_migrate_dirty_limit(void)
         throttle_us_per_full =
         read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
         usleep(100);
-        g_assert_false(got_src_stop);
+        g_assert_false(src_state.stop_seen);
     }
 
     /*
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 10/14] tests/qtest: option to suspend during migration
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (8 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 09/14] tests/qtest: migration events Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-04 21:14   ` Fabiano Rosas
  2023-11-30 21:37 ` [PATCH V6 11/14] tests/qtest: precopy migration with suspend Steve Sistare
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add an option to suspend the src in a-b-bootblock.S, which puts the guest
in S3 state after one round of writing to memory.  The option is enabled by
poking a 1 into the suspend_me word in the boot block prior to starting the
src vm.  Generate symbol offsets in a-b-bootblock.h so that the suspend_me
offset is known.  Generate the bootblock for each test, because suspend_me
may differ for each.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
---
 tests/migration/i386/Makefile        |  5 ++--
 tests/migration/i386/a-b-bootblock.S | 50 +++++++++++++++++++++++++++++++++---
 tests/migration/i386/a-b-bootblock.h | 26 +++++++++++++------
 tests/qtest/migration-test.c         | 12 ++++++---
 4 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/tests/migration/i386/Makefile b/tests/migration/i386/Makefile
index 5c03241..37a72ae 100644
--- a/tests/migration/i386/Makefile
+++ b/tests/migration/i386/Makefile
@@ -4,9 +4,10 @@
 .PHONY: all clean
 all: a-b-bootblock.h
 
-a-b-bootblock.h: x86.bootsect
+a-b-bootblock.h: x86.bootsect x86.o
 	echo "$$__note" > header.tmp
 	xxd -i $< | sed -e 's/.*int.*//' >> header.tmp
+	nm x86.o | awk '{print "#define SYM_"$$3" 0x"$$1}' >> header.tmp
 	mv header.tmp $@
 
 x86.bootsect: x86.boot
@@ -16,7 +17,7 @@ x86.boot: x86.o
 	$(CROSS_PREFIX)objcopy -O binary $< $@
 
 x86.o: a-b-bootblock.S
-	$(CROSS_PREFIX)gcc -m32 -march=i486 -c $< -o $@
+	$(CROSS_PREFIX)gcc -I.. -m32 -march=i486 -c $< -o $@
 
 clean:
 	@rm -rf *.boot *.o *.bootsect
diff --git a/tests/migration/i386/a-b-bootblock.S b/tests/migration/i386/a-b-bootblock.S
index 6bb9999..6f39eb6 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -9,6 +9,23 @@
 #
 # Author: dgilbert@redhat.com
 
+#include "migration-test.h"
+
+#define ACPI_ENABLE         0xf1
+#define ACPI_PORT_SMI_CMD   0xb2
+#define ACPI_PM_BASE        0x600
+#define PM1A_CNT_OFFSET     4
+
+#define ACPI_SCI_ENABLE     0x0001
+#define ACPI_SLEEP_TYPE     0x0400
+#define ACPI_SLEEP_ENABLE   0x2000
+#define SLEEP (ACPI_SCI_ENABLE + ACPI_SLEEP_TYPE + ACPI_SLEEP_ENABLE)
+
+#define LOW_ADDR            X86_TEST_MEM_START
+#define HIGH_ADDR           X86_TEST_MEM_END
+
+/* Save the suspended status at an address that is not written in the loop. */
+#define suspended           (X86_TEST_MEM_START + 4)
 
 .code16
 .org 0x7c00
@@ -35,8 +52,8 @@ start:             # at 0x7c00 ?
         mov %eax,%ds
 
 # Start from 1MB
-.set TEST_MEM_START, (1024*1024)
-.set TEST_MEM_END, (100*1024*1024)
+.set TEST_MEM_START, X86_TEST_MEM_START
+.set TEST_MEM_END, X86_TEST_MEM_END
 
         mov $65,%ax
         mov $0x3f8,%dx
@@ -69,7 +86,30 @@ innerloop:
         mov $0x3f8,%dx
         outb %al,%dx
 
-        jmp mainloop
+        # should this test suspend?
+        mov (suspend_me),%eax
+        cmp $0,%eax
+        je mainloop
+
+        # are we waking after suspend?  do not suspend again.
+        mov $suspended,%eax
+        mov (%eax),%eax
+        cmp $1,%eax
+        je mainloop
+
+        # enable acpi
+        mov $ACPI_ENABLE,%al
+        outb %al,$ACPI_PORT_SMI_CMD
+
+        # suspend to ram
+        mov $suspended,%eax
+        movl $1,(%eax)
+        mov $SLEEP,%ax
+        mov $(ACPI_PM_BASE + PM1A_CNT_OFFSET),%dx
+        outw %ax,%dx
+        # not reached.  The wakeup causes reset and restart at 0x7c00, and we
+        # do not save and restore registers as a real kernel would do.
+
 
         # GDT magic from old (GPLv2)  Grub startup.S
         .p2align        2       /* force 4-byte alignment */
@@ -95,6 +135,10 @@ gdtdesc:
         .word   0x27                    /* limit */
         .long   gdt                     /* addr */
 
+        /* test launcher can poke a 1 here to exercise suspend */
+suspend_me:
+        .int  0
+
 /* I'm a bootable disk */
 .org 0x7dfe
         .byte 0x55
diff --git a/tests/migration/i386/a-b-bootblock.h b/tests/migration/i386/a-b-bootblock.h
index 5b52391..c83f871 100644
--- a/tests/migration/i386/a-b-bootblock.h
+++ b/tests/migration/i386/a-b-bootblock.h
@@ -4,7 +4,7 @@
  * the header and the assembler differences in your patch submission.
  */
 unsigned char x86_bootsect[] = {
-  0xfa, 0x0f, 0x01, 0x16, 0x8c, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
+  0xfa, 0x0f, 0x01, 0x16, 0xb8, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
   0x0f, 0x22, 0xc0, 0x66, 0xea, 0x20, 0x7c, 0x00, 0x00, 0x08, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe4, 0x92, 0x0c, 0x02,
   0xe6, 0x92, 0xb8, 0x10, 0x00, 0x00, 0x00, 0x8e, 0xd8, 0x66, 0xb8, 0x41,
@@ -13,13 +13,13 @@ unsigned char x86_bootsect[] = {
   0x40, 0x06, 0x7c, 0xf1, 0xb8, 0x00, 0x00, 0x10, 0x00, 0xfe, 0x00, 0x05,
   0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40, 0x06, 0x7c, 0xf2, 0xfe,
   0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8, 0x42, 0x00, 0x66, 0xba,
-  0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00,
-  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0x74, 0x7c,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0xf8, 0x03, 0xee, 0xa1, 0xbe, 0x7c, 0x00, 0x00, 0x83, 0xf8, 0x00, 0x74,
+  0xd3, 0xb8, 0x04, 0x00, 0x10, 0x00, 0x8b, 0x00, 0x83, 0xf8, 0x01, 0x74,
+  0xc7, 0xb0, 0xf1, 0xe6, 0xb2, 0xb8, 0x04, 0x00, 0x10, 0x00, 0xc7, 0x00,
+  0x01, 0x00, 0x00, 0x00, 0x66, 0xb8, 0x01, 0x24, 0x66, 0xba, 0x04, 0x06,
+  0x66, 0xef, 0x66, 0x90, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00, 0xff, 0xff, 0x00, 0x00,
+  0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0xa0, 0x7c, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -49,3 +49,13 @@ unsigned char x86_bootsect[] = {
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0xaa
 };
 
+#define SYM_do_zero 0x00007c3d
+#define SYM_gdt 0x00007ca0
+#define SYM_gdtdesc 0x00007cb8
+#define SYM_innerloop 0x00007c51
+#define SYM_mainloop 0x00007c4c
+#define SYM_pre_zero 0x00007c38
+#define SYM_start 0x00007c00
+#define SYM_suspend_me 0x00007cbe
+#define SYM_TEST_MEM_END 0x06400000
+#define SYM_TEST_MEM_START 0x00100000
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 05c0740..e10d5a4 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -133,7 +133,7 @@ static char *bootpath;
 #include "tests/migration/aarch64/a-b-kernel.h"
 #include "tests/migration/s390x/a-b-bios.h"
 
-static void bootfile_create(char *dir)
+static void bootfile_create(char *dir, bool suspend_me)
 {
     const char *arch = qtest_get_arch();
     unsigned char *content;
@@ -143,6 +143,7 @@ static void bootfile_create(char *dir)
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         /* the assembled x86 boot sector should be exactly one sector large */
         g_assert(sizeof(x86_bootsect) == 512);
+        x86_bootsect[SYM_suspend_me - SYM_start] = suspend_me;
         content = x86_bootsect;
         len = sizeof(x86_bootsect);
     } else if (g_str_equal(arch, "s390x")) {
@@ -646,6 +647,8 @@ typedef struct {
     bool use_dirty_ring;
     const char *opts_source;
     const char *opts_target;
+    /* suspend the src before migrating to dest. */
+    bool suspend_me;
 } MigrateStart;
 
 /*
@@ -767,6 +770,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
 
     dst_state = (QTestMigrationState) { };
     src_state = (QTestMigrationState) { };
+    bootfile_create(tmpfs, args->suspend_me);
+
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         memory_size = "150M";
 
@@ -2980,7 +2985,9 @@ static int64_t get_limit_rate(QTestState *who)
 static QTestState *dirtylimit_start_vm(void)
 {
     QTestState *vm = NULL;
-    g_autofree gchar *
+    g_autofree gchar *cmd = NULL;
+
+    bootfile_create(tmpfs, false);
     cmd = g_strdup_printf("-accel kvm,dirty-ring-size=4096 "
                           "-name dirtylimit-test,debug-threads=on "
                           "-m 150M -smp 1 "
@@ -3329,7 +3336,6 @@ int main(int argc, char **argv)
                        g_get_tmp_dir(), err->message);
     }
     g_assert(tmpfs);
-    bootfile_create(tmpfs);
 
     module_call_init(MODULE_INIT_QOM);
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 11/14] tests/qtest: precopy migration with suspend
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (9 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 10/14] tests/qtest: option to suspend during migration Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-04 20:49   ` Peter Xu
  2023-11-30 21:37 ` [PATCH V6 12/14] tests/qtest: postcopy " Steve Sistare
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a test case to verify that the suspended state is handled correctly
during live migration precopy.  The test suspends the src, migrates, then
wakes the dest.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-helpers.c |  3 ++
 tests/qtest/migration-helpers.h |  2 ++
 tests/qtest/migration-test.c    | 64 ++++++++++++++++++++++++++++++++++++++---
 3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index fd3b94e..37e8e81 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -32,6 +32,9 @@ bool migrate_watch_for_events(QTestState *who, const char *name,
     if (g_str_equal(name, "STOP")) {
         state->stop_seen = true;
         return true;
+    } else if (g_str_equal(name, "SUSPEND")) {
+        state->suspend_seen = true;
+        return true;
     } else if (g_str_equal(name, "RESUME")) {
         state->resume_seen = true;
         return true;
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 3d32699..b478549 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -18,6 +18,8 @@
 typedef struct QTestMigrationState {
     bool stop_seen;
     bool resume_seen;
+    bool suspend_seen;
+    bool suspend_me;
 } QTestMigrationState;
 
 bool migrate_watch_for_events(QTestState *who, const char *name,
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index e10d5a4..200f023 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -178,7 +178,7 @@ static void bootfile_delete(void)
 /*
  * Wait for some output in the serial output file,
  * we get an 'A' followed by an endless string of 'B's
- * but on the destination we won't have the A.
+ * but on the destination we won't have the A (unless we enabled suspend/resume)
  */
 static void wait_for_serial(const char *side)
 {
@@ -245,6 +245,13 @@ static void wait_for_resume(QTestState *who, QTestMigrationState *state)
     }
 }
 
+static void wait_for_suspend(QTestState *who, QTestMigrationState *state)
+{
+    if (!state->suspend_seen) {
+        qtest_qmp_eventwait(who, "SUSPEND");
+    }
+}
+
 /*
  * It's tricky to use qemu's migration event capability with qtest,
  * events suddenly appearing confuse the qmp()/hmp() responses.
@@ -299,7 +306,7 @@ static void wait_for_migration_pass(QTestState *who)
 {
     uint64_t pass, prev_pass = 0, changes = 0;
 
-    while (changes < 2 && !src_state.stop_seen) {
+    while (changes < 2 && !src_state.stop_seen && !src_state.suspend_seen) {
         usleep(1000);
         pass = get_migration_pass(who);
         changes += (pass != prev_pass);
@@ -595,7 +602,8 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
     watch_byte = qtest_readb(from, watch_address);
     do {
         usleep(1000 * 10);
-    } while (qtest_readb(from, watch_address) == watch_byte);
+    } while (qtest_readb(from, watch_address) == watch_byte &&
+             !src_state.suspend_seen);
 }
 
 
@@ -771,6 +779,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     dst_state = (QTestMigrationState) { };
     src_state = (QTestMigrationState) { };
     bootfile_create(tmpfs, args->suspend_me);
+    src_state.suspend_me = args->suspend_me;
 
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         memory_size = "150M";
@@ -1730,6 +1739,9 @@ static void test_precopy_common(MigrateCommon *args)
          * change anything.
          */
         if (args->result == MIG_TEST_SUCCEED) {
+            if (src_state.suspend_me) {
+                wait_for_suspend(from, &src_state);
+            }
             qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
             wait_for_stop(from, &src_state);
             migrate_ensure_converge(from);
@@ -1777,6 +1789,9 @@ static void test_precopy_common(MigrateCommon *args)
              */
             wait_for_migration_complete(from);
 
+            if (src_state.suspend_me) {
+                wait_for_suspend(from, &src_state);
+            }
             wait_for_stop(from, &src_state);
 
         } else {
@@ -1793,6 +1808,11 @@ static void test_precopy_common(MigrateCommon *args)
 
         wait_for_resume(to, &dst_state);
 
+        if (args->start.suspend_me) {
+            /* wakeup succeeds only if guest is suspended */
+            qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
+        }
+
         wait_for_serial("dest_serial");
     }
 
@@ -1879,6 +1899,34 @@ static void test_precopy_unix_plain(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_unix_suspend_live(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+        /*
+         * despite being live, the test is fast because the src
+         * suspends immediately.
+         */
+        .live = true,
+        .start.suspend_me = true,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_precopy_unix_suspend_notlive(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+        .start.suspend_me = true,
+    };
+
+    test_precopy_common(&args);
+}
 
 static void test_precopy_unix_dirty_ring(void)
 {
@@ -3279,7 +3327,7 @@ static bool kvm_dirty_ring_supported(void)
 int main(int argc, char **argv)
 {
     bool has_kvm, has_tcg;
-    bool has_uffd;
+    bool has_uffd, is_x86;
     const char *arch;
     g_autoptr(GError) err = NULL;
     const char *qemu_src = getenv(QEMU_ENV_SRC);
@@ -3309,6 +3357,7 @@ int main(int argc, char **argv)
 
     has_uffd = ufd_version_check();
     arch = qtest_get_arch();
+    is_x86 = !strcmp(arch, "i386") || !strcmp(arch, "x86_64");
 
     /*
      * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
@@ -3339,6 +3388,13 @@ int main(int argc, char **argv)
 
     module_call_init(MODULE_INIT_QOM);
 
+    if (is_x86) {
+        qtest_add_func("/migration/precopy/unix/suspend/live",
+                       test_precopy_unix_suspend_live);
+        qtest_add_func("/migration/precopy/unix/suspend/notlive",
+                       test_precopy_unix_suspend_notlive);
+    }
+
     if (has_uffd) {
         qtest_add_func("/migration/postcopy/plain", test_postcopy);
         qtest_add_func("/migration/postcopy/recovery/plain",
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 12/14] tests/qtest: postcopy migration with suspend
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (10 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 11/14] tests/qtest: precopy migration with suspend Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-11-30 21:37 ` [PATCH V6 13/14] tests/qtest: bootfile per vm Steve Sistare
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a test case to verify that the suspended state is handled correctly by
live migration postcopy.  The test suspends the src, migrates, then wakes
the dest.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-test.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 200f023..af661f8 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -638,6 +638,9 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
 {
     qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
 
+    if (src_state.suspend_me) {
+        wait_for_suspend(from, &src_state);
+    }
     wait_for_stop(from, &src_state);
     qtest_qmp_eventwait(to, "RESUME");
 }
@@ -1359,6 +1362,11 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to,
 {
     wait_for_migration_complete(from);
 
+    if (args->start.suspend_me) {
+        /* wakeup succeeds only if guest is suspended */
+        qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
+    }
+
     /* Make sure we get at least one "B" on destination */
     wait_for_serial("dest_serial");
 
@@ -1392,6 +1400,15 @@ static void test_postcopy(void)
     test_postcopy_common(&args);
 }
 
+static void test_postcopy_suspend(void)
+{
+    MigrateCommon args = {
+        .start.suspend_me = true,
+    };
+
+    test_postcopy_common(&args);
+}
+
 static void test_postcopy_compress(void)
 {
     MigrateCommon args = {
@@ -3412,7 +3429,10 @@ int main(int argc, char **argv)
         qtest_add_func("/migration/postcopy/recovery/double-failures",
                        test_postcopy_recovery_double_fail);
 #endif /* _WIN32 */
-
+        if (is_x86) {
+            qtest_add_func("/migration/postcopy/suspend",
+                           test_postcopy_suspend);
+        }
     }
 
     qtest_add_func("/migration/bad_dest", test_baddest);
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 13/14] tests/qtest: bootfile per vm
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (11 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 12/14] tests/qtest: postcopy " Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-04 21:13   ` Fabiano Rosas
  2023-11-30 21:37 ` [PATCH V6 14/14] tests/qtest: background migration with suspend Steve Sistare
  2023-12-05 18:52 ` [PATCH V6 00/14] fix migration of suspended runstate Steven Sistare
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Create a separate bootfile for the outgoing and incoming vm, so the block
layer can lock the file during the background migration test.  Otherwise,
the test fails with:
  "Failed to get "write" lock.  Is another process using the image
   [/tmp/migration-test-WAKPD2/bootsect]?"

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-test.c | 44 +++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index af661f8..e16710f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -124,7 +124,8 @@ static bool ufd_version_check(void)
 #endif
 
 static char *tmpfs;
-static char *bootpath;
+static char *src_bootpath;
+static char *dst_bootpath;
 
 /* The boot file modifies memory area in [start_address, end_address)
  * repeatedly. It outputs a 'B' at a fixed rate while it's still running.
@@ -133,13 +134,13 @@ static char *bootpath;
 #include "tests/migration/aarch64/a-b-kernel.h"
 #include "tests/migration/s390x/a-b-bios.h"
 
-static void bootfile_create(char *dir, bool suspend_me)
+static char *bootfile_create(char *dir, const char *prefix, bool suspend_me)
 {
     const char *arch = qtest_get_arch();
     unsigned char *content;
     size_t len;
+    char *bootpath = g_strdup_printf("%s/%s-bootsect", dir, prefix);
 
-    bootpath = g_strdup_printf("%s/bootsect", dir);
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         /* the assembled x86 boot sector should be exactly one sector large */
         g_assert(sizeof(x86_bootsect) == 512);
@@ -153,7 +154,7 @@ static void bootfile_create(char *dir, bool suspend_me)
         /*
          * sane architectures can be programmed at the boot prompt
          */
-        return;
+        return NULL;
     } else if (strcmp(arch, "aarch64") == 0) {
         content = aarch64_kernel;
         len = sizeof(aarch64_kernel);
@@ -166,13 +167,15 @@ static void bootfile_create(char *dir, bool suspend_me)
 
     g_assert_cmpint(fwrite(content, len, 1, bootfile), ==, 1);
     fclose(bootfile);
+    return bootpath;
 }
 
-static void bootfile_delete(void)
+static void bootfile_delete(char *bootpath)
 {
-    unlink(bootpath);
-    g_free(bootpath);
-    bootpath = NULL;
+    if (bootpath) {
+        unlink(bootpath);
+        g_free(bootpath);
+    }
 }
 
 /*
@@ -766,6 +769,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     const gchar *ignore_stderr;
     g_autofree char *shmem_opts = NULL;
     g_autofree char *shmem_path = NULL;
+    const char *arch_boot_fmt = NULL;
     const char *kvm_opts = NULL;
     const char *arch = qtest_get_arch();
     const char *memory_size;
@@ -781,7 +785,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
 
     dst_state = (QTestMigrationState) { };
     src_state = (QTestMigrationState) { };
-    bootfile_create(tmpfs, args->suspend_me);
+    src_bootpath = bootfile_create(tmpfs, "src", args->suspend_me);
+    dst_bootpath = bootfile_create(tmpfs, "dst", args->suspend_me);
     src_state.suspend_me = args->suspend_me;
 
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
@@ -792,15 +797,14 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         } else {
             machine_alias = "q35";
         }
-        arch_opts = g_strdup_printf(
-            "-drive if=none,id=d0,file=%s,format=raw "
-            "-device ide-hd,drive=d0,secs=1,cyls=1,heads=1", bootpath);
+        arch_boot_fmt = "-drive if=none,id=d0,file=%s,format=raw "
+                        "-device ide-hd,drive=d0,secs=1,cyls=1,heads=1";
         start_address = X86_TEST_MEM_START;
         end_address = X86_TEST_MEM_END;
     } else if (g_str_equal(arch, "s390x")) {
         memory_size = "128M";
         machine_alias = "s390-ccw-virtio";
-        arch_opts = g_strdup_printf("-bios %s", bootpath);
+        arch_boot_fmt = "-bios %s";
         start_address = S390_TEST_MEM_START;
         end_address = S390_TEST_MEM_END;
     } else if (strcmp(arch, "ppc64") == 0) {
@@ -818,13 +822,18 @@ static int test_migrate_start(QTestState **from, QTestState **to,
         memory_size = "150M";
         machine_alias = "virt";
         machine_opts = "gic-version=max";
-        arch_opts = g_strdup_printf("-cpu max -kernel %s", bootpath);
+        arch_boot_fmt = "-cpu max -kernel %s";
         start_address = ARM_TEST_MEM_START;
         end_address = ARM_TEST_MEM_END;
     } else {
         g_assert_not_reached();
     }
 
+    if (arch_boot_fmt) {
+        arch_source = g_strdup_printf(arch_boot_fmt, src_bootpath);
+        arch_target = g_strdup_printf(arch_boot_fmt, dst_bootpath);
+    }
+
     if (!getenv("QTEST_LOG") && args->hide_stderr) {
 #ifndef _WIN32
         ignore_stderr = "2>/dev/null";
@@ -3052,13 +3061,13 @@ static QTestState *dirtylimit_start_vm(void)
     QTestState *vm = NULL;
     g_autofree gchar *cmd = NULL;
 
-    bootfile_create(tmpfs, false);
+    src_bootpath = bootfile_create(tmpfs, "src", false);
     cmd = g_strdup_printf("-accel kvm,dirty-ring-size=4096 "
                           "-name dirtylimit-test,debug-threads=on "
                           "-m 150M -smp 1 "
                           "-serial file:%s/vm_serial "
                           "-drive file=%s,format=raw ",
-                          tmpfs, bootpath);
+                          tmpfs, src_bootpath);
 
     vm = qtest_init(cmd);
     return vm;
@@ -3589,7 +3598,8 @@ int main(int argc, char **argv)
 
     g_assert_cmpint(ret, ==, 0);
 
-    bootfile_delete();
+    bootfile_delete(src_bootpath);
+    bootfile_delete(dst_bootpath);
     ret = rmdir(tmpfs);
     if (ret != 0) {
         g_test_message("unable to rmdir: path (%s): %s",
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH V6 14/14] tests/qtest: background migration with suspend
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (12 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 13/14] tests/qtest: bootfile per vm Steve Sistare
@ 2023-11-30 21:37 ` Steve Sistare
  2023-12-04 21:14   ` Fabiano Rosas
  2023-12-05 18:52 ` [PATCH V6 00/14] fix migration of suspended runstate Steven Sistare
  14 siblings, 1 reply; 57+ messages in thread
From: Steve Sistare @ 2023-11-30 21:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Steve Sistare

Add a test case to verify that the suspended state is handled correctly by
a background migration.  The test suspends the src, migrates, then wakes
the dest.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 tests/qtest/migration-test.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index e16710f..30d4b32 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1954,6 +1954,26 @@ static void test_precopy_unix_suspend_notlive(void)
     test_precopy_common(&args);
 }
 
+static void *test_bg_suspend_start(QTestState *from, QTestState *to)
+{
+    migrate_set_capability(from, "background-snapshot", true);
+    return NULL;
+}
+
+static void test_bg_suspend(void)
+{
+    g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    MigrateCommon args = {
+        .listen_uri = uri,
+        .connect_uri = uri,
+        .live = true,       /* runs fast, the src suspends immediately. */
+        .start.suspend_me = true,
+        .start_hook = test_bg_suspend_start
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_unix_dirty_ring(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -3441,6 +3461,7 @@ int main(int argc, char **argv)
         if (is_x86) {
             qtest_add_func("/migration/postcopy/suspend",
                            test_postcopy_suspend);
+            qtest_add_func("/migration/bg/suspend", test_bg_suspend);
         }
     }
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 02/14] cpus: vm_was_suspended
  2023-11-30 21:37 ` [PATCH V6 02/14] cpus: vm_was_suspended Steve Sistare
@ 2023-11-30 22:03   ` Peter Xu
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-11-30 22:03 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:15PM -0800, Steve Sistare wrote:
> Add a state variable to remember if a vm previously transitioned into a
> suspended state.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

I'd even consider squashing this small patch into the next, the reasoning
to have it resides there, but not a huge deal:

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-11-30 21:37 ` [PATCH V6 03/14] cpus: stop vm in suspended runstate Steve Sistare
@ 2023-11-30 22:10   ` Peter Xu
  2023-12-01 17:11     ` Steven Sistare
  2023-12-22 12:20   ` Markus Armbruster
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-11-30 22:10 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Markus Armbruster, Eric Blake

On Thu, Nov 30, 2023 at 01:37:16PM -0800, Steve Sistare wrote:
> Currently, a vm in the suspended state is not completely stopped.  The VCPUs
> have been paused, but the cpu clock still runs, and runstate notifiers for
> the transition to stopped have not been called.  This causes problems for
> live migration.  Stale cpu timers_state is saved to the migration stream,
> causing time errors in the guest when it wakes from suspend, and state that
> would have been modified by runstate notifiers is wrong.
> 
> Modify vm_stop to completely stop the vm if the current state is suspended,
> transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
> Modify vm_start to restore the suspended state.
> 
> This affects all callers of vm_stop and vm_start, notably, the qapi stop and
> cont commands.  For example:
> 
>     (qemu) info status
>     VM status: paused (suspended)
> 
>     (qemu) stop
>     (qemu) info status
>     VM status: paused
> 
>     (qemu) cont
>     (qemu) info status
>     VM status: paused (suspended)
> 
>     (qemu) system_wakeup
>     (qemu) info status
>     VM status: running

So system_wakeup for a stopped (but used to be suspended) VM will fail
directly, not touching vm_was_suspended.  It's not mentioned here, but that
behavior makes sense to me.

> 
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

Since you touched qapi/, please copy maintainers too.  I've copied Markus
and Eric in this reply.

I also have some nitpicks which may not affect the R-b, please see below.

> ---
>  include/sysemu/runstate.h |  5 +++++
>  qapi/misc.json            | 10 ++++++++--
>  system/cpus.c             | 19 ++++++++++++++-----
>  system/runstate.c         |  3 +++
>  4 files changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
> index f6a337b..1d6828f 100644
> --- a/include/sysemu/runstate.h
> +++ b/include/sysemu/runstate.h
> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>  }
>  
> +static inline bool runstate_is_started(RunState state)

Would runstate_has_vm_running() sound better?  It is a bit awkward when
saying something like "start a runstate".

> +{
> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
> +}
> +
>  void vm_start(void);
>  
>  /**
> diff --git a/qapi/misc.json b/qapi/misc.json
> index cda2eff..efb8d44 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -134,7 +134,7 @@
>  ##
>  # @stop:
>  #
> -# Stop all guest VCPU execution.
> +# Stop all guest VCPU and VM execution.
>  #
>  # Since: 0.14
>  #
> @@ -143,6 +143,9 @@
>  #     the guest remains paused once migration finishes, as if the -S
>  #     option was passed on the command line.
>  #
> +#     In the "suspended" state, it will completely stop the VM and
> +#     cause a transition to the "paused" state. (Since 9.0)
> +#
>  # Example:
>  #
>  # -> { "execute": "stop" }
> @@ -153,7 +156,7 @@
>  ##
>  # @cont:
>  #
> -# Resume guest VCPU execution.
> +# Resume guest VCPU and VM execution.
>  #
>  # Since: 0.14
>  #
> @@ -165,6 +168,9 @@
>  #     guest starts once migration finishes, removing the effect of the
>  #     -S command line option if it was passed.
>  #
> +#     If the VM was previously suspended, and not been reset or woken,
> +#     this command will transition back to the "suspended" state. (Since 9.0)
> +#
>  # Example:
>  #
>  # -> { "execute": "cont" }
> diff --git a/system/cpus.c b/system/cpus.c
> index ef7a0d3..cbc6d6d 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -277,11 +277,15 @@ bool vm_get_suspended(void)
>  static int do_vm_stop(RunState state, bool send_stop)
>  {
>      int ret = 0;
> +    RunState oldstate = runstate_get();
>  
> -    if (runstate_is_running()) {
> +    if (runstate_is_started(oldstate)) {
> +        vm_was_suspended = (oldstate == RUN_STATE_SUSPENDED);
>          runstate_set(state);
>          cpu_disable_ticks();
> -        pause_all_vcpus();
> +        if (oldstate == RUN_STATE_RUNNING) {
> +            pause_all_vcpus();
> +        }
>          vm_state_notify(0, state);
>          if (send_stop) {
>              qapi_event_send_stop();
> @@ -736,8 +740,13 @@ int vm_prepare_start(bool step_pending, RunState state)
>  
>  void vm_start(void)
>  {
> -    if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
> -        resume_all_vcpus();
> +    RunState state = vm_was_suspended ? RUN_STATE_SUSPENDED : RUN_STATE_RUNNING;
> +
> +    if (!vm_prepare_start(false, state)) {
> +        if (state == RUN_STATE_RUNNING) {
> +            resume_all_vcpus();
> +        }
> +        vm_was_suspended = false;
>      }
>  }
>  
> @@ -745,7 +754,7 @@ void vm_start(void)
>     current state is forgotten forever */
>  int vm_stop_force_state(RunState state)
>  {
> -    if (runstate_is_running()) {
> +    if (runstate_is_started(runstate_get())) {
>          return vm_stop(state);
>      } else {
>          int ret;
> diff --git a/system/runstate.c b/system/runstate.c
> index ea9d6c2..e2fa204 100644
> --- a/system/runstate.c
> +++ b/system/runstate.c
> @@ -108,6 +108,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
>      { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_PAUSED, RUN_STATE_COLO},
> +    { RUN_STATE_PAUSED, RUN_STATE_SUSPENDED},
>  
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
> @@ -161,6 +162,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>      { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
> +    { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED},
>  
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> @@ -502,6 +504,7 @@ void qemu_system_reset(ShutdownCause reason)
>          qapi_event_send_reset(shutdown_caused_by_guest(reason), reason);
>      }
>      cpu_synchronize_all_post_reset();
> +    vm_set_suspended(false);
>  }
>  
>  /*
> -- 
> 1.8.3.1
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-11-30 21:37 ` [PATCH V6 05/14] migration: propagate suspended runstate Steve Sistare
@ 2023-11-30 23:06   ` Peter Xu
  2023-12-01 16:23     ` Steven Sistare
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-11-30 23:06 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:18PM -0800, Steve Sistare wrote:
> If the outgoing machine was previously suspended, propagate that to the
> incoming side via global_state, so a subsequent vm_start restores the
> suspended state.  To maintain backward and forward compatibility, define
> the new field in a zero'd hole in the GlobalState struct.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  migration/global_state.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/migration/global_state.c b/migration/global_state.c
> index 4e2a9d8..de2532c 100644
> --- a/migration/global_state.c
> +++ b/migration/global_state.c
> @@ -25,6 +25,7 @@ typedef struct {
>      uint8_t runstate[100];
>      RunState state;
>      bool received;
> +    bool vm_was_suspended;
>  } GlobalState;
>  
>  static GlobalState global_state;
> @@ -35,6 +36,7 @@ static void global_state_do_store(RunState state)
>      assert(strlen(state_str) < sizeof(global_state.runstate));
>      strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
>                state_str, '\0');
> +    global_state.vm_was_suspended = vm_get_suspended();
>  }
>  
>  void global_state_store(void)
> @@ -68,6 +70,12 @@ static bool global_state_needed(void *opaque)
>          return true;
>      }
>  
> +    /* If the suspended state must be remembered, it is needed */
> +
> +    if (vm_get_suspended()) {
> +        return true;
> +    }
> +
>      /* If state is running or paused, it is not needed */
>  
>      if (strcmp(runstate, "running") == 0 ||
> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>          return -EINVAL;
>      }
>      s->state = r;
> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);

IIUC current vm_was_suspended (based on my read of your patch) was not the
same as a boolean representing "whether VM is suspended", but only a
temporary field to remember that for a VM stop request.  To be explicit, I
didn't see this flag set in qemu_system_suspend() in your previous patch.

If so, we can already do:

  vm_set_suspended(s->vm_was_suspended);

Irrelevant of RUN_STATE_SUSPENDED?

>  
>      return 0;
>  }
> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>      .fields = (VMStateField[]) {
>          VMSTATE_UINT32(size, GlobalState),
>          VMSTATE_BUFFER(runstate, GlobalState),
> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>          VMSTATE_END_OF_LIST()
>      },
>  };

I think this will break migration between old/new, unfortunately.  And
since the global state exist mostly for every VM, all VM setup should be
affected, and over all archs.

We used to have the version_id field right above for adding fields, but I
_think_ that will still break backward migration fron new->old binary, so
not wanted.  Juan can keep me honest.

The best thing is still machine compat properties, afaict, to fix.  It's
slightly involved, but let me attach a sample diff for you (at the end,
possibly working with your current patch kind-of squashed, but not ever
tested), hopefully make it slightly easier.

I'm wondering how bad it is to just ignore it, it's not as bad as if we
don't fix stop-during-suspend, in this case the worst case of forgetting
this field over migration is: if VM stopped (and used to be suspended) then
after migration it'll keep being stopped, however after "cont" it'll forget
the suspended state.  Not that bad!  IIUC SPR should always migrate with
suspended (rather than any fully stopped state), right?  Then shouldn't be
affected.  If risk is low, maybe we can leave this one for later?

Thanks,

===8<===

diff --git a/migration/migration.h b/migration/migration.h
index cf2c9c88e0..c3fd1f8347 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -470,6 +470,8 @@ struct MigrationState {
     bool switchover_acked;
     /* Is this a rdma migration */
     bool rdma_migration;
+    /* Whether remember global vm_was_suspended field? */
+    bool store_vm_was_suspended;
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 0c17398141..365e01c1c9 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -37,6 +37,7 @@ GlobalProperty hw_compat_8_1[] = {
     { "ramfb", "x-migrate", "off" },
     { "vfio-pci-nohotplug", "x-ramfb-migrate", "off" },
     { "igb", "x-pcie-flr-init", "off" },
+    { "migration", "store-vm-was-suspended", false },
 };
 const size_t hw_compat_8_1_len = G_N_ELEMENTS(hw_compat_8_1);
 
diff --git a/migration/global_state.c b/migration/global_state.c
index 4e2a9d8ec0..ffa7bf82ca 100644
--- a/migration/global_state.c
+++ b/migration/global_state.c
@@ -25,6 +25,7 @@ typedef struct {
     uint8_t runstate[100];
     RunState state;
     bool received;
+    bool vm_was_suspended;
 } GlobalState;
 
 static GlobalState global_state;
@@ -124,6 +125,25 @@ static int global_state_pre_save(void *opaque)
     return 0;
 }
 
+static bool global_state_has_vm_was_suspended(void *opaque)
+{
+    return migrate_get_current()->store_vm_was_suspended;
+}
+
+static const VMStateDescription vmstate_vm_was_suspended = {
+    .name = "globalstate/vm_was_suspended",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    /* TODO: Fill these in */
+    .pre_save = NULL,
+    .post_load = NULL,
+    .needed = global_state_has_vm_was_suspended,
+    .fields = (VMStateField[]) {
+        VMSTATE_BOOL(vm_was_suspended, GlobalState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static const VMStateDescription vmstate_globalstate = {
     .name = "globalstate",
     .version_id = 1,
@@ -136,6 +156,10 @@ static const VMStateDescription vmstate_globalstate = {
         VMSTATE_BUFFER(runstate, GlobalState),
         VMSTATE_END_OF_LIST()
     },
+    .subsections = (const VMStateDescription*[]) {
+        &vmstate_vm_was_suspended,
+        NULL
+    },
 };
 
 void register_global_state(void)
diff --git a/migration/options.c b/migration/options.c
index 8d8ec73ad9..5f9998d3e8 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -88,6 +88,8 @@
 Property migration_properties[] = {
     DEFINE_PROP_BOOL("store-global-state", MigrationState,
                      store_global_state, true),
+    DEFINE_PROP_BOOL("store-vm-was-suspended", MigrationState,
+                     store_vm_was_suspended, true),
     DEFINE_PROP_BOOL("send-configuration", MigrationState,
                      send_configuration, true),
     DEFINE_PROP_BOOL("send-section-footer", MigrationState,

-- 
Peter Xu



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-11-30 23:06   ` Peter Xu
@ 2023-12-01 16:23     ` Steven Sistare
  2023-12-04 17:24       ` Peter Xu
  0 siblings, 1 reply; 57+ messages in thread
From: Steven Sistare @ 2023-12-01 16:23 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On 11/30/2023 6:06 PM, Peter Xu wrote:
> On Thu, Nov 30, 2023 at 01:37:18PM -0800, Steve Sistare wrote:
>> If the outgoing machine was previously suspended, propagate that to the
>> incoming side via global_state, so a subsequent vm_start restores the
>> suspended state.  To maintain backward and forward compatibility, define
>> the new field in a zero'd hole in the GlobalState struct.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  migration/global_state.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/migration/global_state.c b/migration/global_state.c
>> index 4e2a9d8..de2532c 100644
>> --- a/migration/global_state.c
>> +++ b/migration/global_state.c
>> @@ -25,6 +25,7 @@ typedef struct {
>>      uint8_t runstate[100];
>>      RunState state;
>>      bool received;
>> +    bool vm_was_suspended;
>>  } GlobalState;
>>  
>>  static GlobalState global_state;
>> @@ -35,6 +36,7 @@ static void global_state_do_store(RunState state)
>>      assert(strlen(state_str) < sizeof(global_state.runstate));
>>      strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
>>                state_str, '\0');
>> +    global_state.vm_was_suspended = vm_get_suspended();
>>  }
>>  
>>  void global_state_store(void)
>> @@ -68,6 +70,12 @@ static bool global_state_needed(void *opaque)
>>          return true;
>>      }
>>  
>> +    /* If the suspended state must be remembered, it is needed */
>> +
>> +    if (vm_get_suspended()) {
>> +        return true;
>> +    }
>> +
>>      /* If state is running or paused, it is not needed */
>>  
>>      if (strcmp(runstate, "running") == 0 ||
>> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>>          return -EINVAL;
>>      }
>>      s->state = r;
>> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
> 
> IIUC current vm_was_suspended (based on my read of your patch) was not the
> same as a boolean representing "whether VM is suspended", but only a
> temporary field to remember that for a VM stop request.  To be explicit, I
> didn't see this flag set in qemu_system_suspend() in your previous patch.
> 
> If so, we can already do:
> 
>   vm_set_suspended(s->vm_was_suspended);
> 
> Irrelevant of RUN_STATE_SUSPENDED?

We need both terms of the expression.

If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
We call global_state_store prior to vm_stop_force_state, so the incoming
side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
calling vm_start, we need to restore the suspended state.  Thus in 
global_state_post_load, we must set vm_was_suspended = true.

If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
then vm_was_suspended = true.  Migration from that state sets
vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
ends with runstate_set(RUN_STATE_PAUSED).

I will add a comment here in the code.
 
>>      return 0;
>>  }
>> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>>      .fields = (VMStateField[]) {
>>          VMSTATE_UINT32(size, GlobalState),
>>          VMSTATE_BUFFER(runstate, GlobalState),
>> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>>          VMSTATE_END_OF_LIST()
>>      },
>>  };
> 
> I think this will break migration between old/new, unfortunately.  And
> since the global state exist mostly for every VM, all VM setup should be
> affected, and over all archs.

Thanks, I keep forgetting that my binary tricks are no good here.  However,
I have one other trick up my sleeve, which is to store vm_was_running in
global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
compatible, since that byte is always 0 in older qemu.  It can be implemented
with a few lines of code change confined to global_state.c, versus many lines 
spread across files to do it the conventional way using a compat property and
a subsection.  Sound OK?  

> We used to have the version_id field right above for adding fields, but I
> _think_ that will still break backward migration fron new->old binary, so
> not wanted.  Juan can keep me honest.
> 
> The best thing is still machine compat properties, afaict, to fix.  It's
> slightly involved, but let me attach a sample diff for you (at the end,
> possibly working with your current patch kind-of squashed, but not ever
> tested), hopefully make it slightly easier.
> 
> I'm wondering how bad it is to just ignore it, it's not as bad as if we
> don't fix stop-during-suspend, in this case the worst case of forgetting
> this field over migration is: if VM stopped (and used to be suspended) then
> after migration it'll keep being stopped, however after "cont" it'll forget
> the suspended state.  

Cont changes state to running, but the VM is broken because wake was never called.

> Not that bad!  IIUC SPR should always migrate with
> suspended (rather than any fully stopped state), right?  Then shouldn't be
> affected.  If risk is low, maybe we can leave this one for later?
> 
> Thanks,

Thanks for the patch.  I am going to save this as a template for adding compatible
subsections in future work.

- Steve

> ===8<===
> 
> diff --git a/migration/migration.h b/migration/migration.h
> index cf2c9c88e0..c3fd1f8347 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -470,6 +470,8 @@ struct MigrationState {
>      bool switchover_acked;
>      /* Is this a rdma migration */
>      bool rdma_migration;
> +    /* Whether remember global vm_was_suspended field? */
> +    bool store_vm_was_suspended;
>  };
>  
>  void migrate_set_state(int *state, int old_state, int new_state);
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 0c17398141..365e01c1c9 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -37,6 +37,7 @@ GlobalProperty hw_compat_8_1[] = {
>      { "ramfb", "x-migrate", "off" },
>      { "vfio-pci-nohotplug", "x-ramfb-migrate", "off" },
>      { "igb", "x-pcie-flr-init", "off" },
> +    { "migration", "store-vm-was-suspended", false },
>  };
>  const size_t hw_compat_8_1_len = G_N_ELEMENTS(hw_compat_8_1);
>  
> diff --git a/migration/global_state.c b/migration/global_state.c
> index 4e2a9d8ec0..ffa7bf82ca 100644
> --- a/migration/global_state.c
> +++ b/migration/global_state.c
> @@ -25,6 +25,7 @@ typedef struct {
>      uint8_t runstate[100];
>      RunState state;
>      bool received;
> +    bool vm_was_suspended;
>  } GlobalState;
>  
>  static GlobalState global_state;
> @@ -124,6 +125,25 @@ static int global_state_pre_save(void *opaque)
>      return 0;
>  }
>  
> +static bool global_state_has_vm_was_suspended(void *opaque)
> +{
> +    return migrate_get_current()->store_vm_was_suspended;
> +}
> +
> +static const VMStateDescription vmstate_vm_was_suspended = {
> +    .name = "globalstate/vm_was_suspended",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    /* TODO: Fill these in */
> +    .pre_save = NULL,
> +    .post_load = NULL,
> +    .needed = global_state_has_vm_was_suspended,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static const VMStateDescription vmstate_globalstate = {
>      .name = "globalstate",
>      .version_id = 1,
> @@ -136,6 +156,10 @@ static const VMStateDescription vmstate_globalstate = {
>          VMSTATE_BUFFER(runstate, GlobalState),
>          VMSTATE_END_OF_LIST()
>      },
> +    .subsections = (const VMStateDescription*[]) {
> +        &vmstate_vm_was_suspended,
> +        NULL
> +    },
>  };
>  
>  void register_global_state(void)
> diff --git a/migration/options.c b/migration/options.c
> index 8d8ec73ad9..5f9998d3e8 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -88,6 +88,8 @@
>  Property migration_properties[] = {
>      DEFINE_PROP_BOOL("store-global-state", MigrationState,
>                       store_global_state, true),
> +    DEFINE_PROP_BOOL("store-vm-was-suspended", MigrationState,
> +                     store_vm_was_suspended, true),
>      DEFINE_PROP_BOOL("send-configuration", MigrationState,
>                       send_configuration, true),
>      DEFINE_PROP_BOOL("send-section-footer", MigrationState,
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-11-30 22:10   ` Peter Xu
@ 2023-12-01 17:11     ` Steven Sistare
  2023-12-04 16:35       ` Peter Xu
  0 siblings, 1 reply; 57+ messages in thread
From: Steven Sistare @ 2023-12-01 17:11 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Markus Armbruster, Eric Blake

On 11/30/2023 5:10 PM, Peter Xu wrote:
> On Thu, Nov 30, 2023 at 01:37:16PM -0800, Steve Sistare wrote:
>> Currently, a vm in the suspended state is not completely stopped.  The VCPUs
>> have been paused, but the cpu clock still runs, and runstate notifiers for
>> the transition to stopped have not been called.  This causes problems for
>> live migration.  Stale cpu timers_state is saved to the migration stream,
>> causing time errors in the guest when it wakes from suspend, and state that
>> would have been modified by runstate notifiers is wrong.
>>
>> Modify vm_stop to completely stop the vm if the current state is suspended,
>> transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
>> Modify vm_start to restore the suspended state.
>>
>> This affects all callers of vm_stop and vm_start, notably, the qapi stop and
>> cont commands.  For example:
>>
>>     (qemu) info status
>>     VM status: paused (suspended)
>>
>>     (qemu) stop
>>     (qemu) info status
>>     VM status: paused
>>
>>     (qemu) cont
>>     (qemu) info status
>>     VM status: paused (suspended)
>>
>>     (qemu) system_wakeup
>>     (qemu) info status
>>     VM status: running
> 
> So system_wakeup for a stopped (but used to be suspended) VM will fail
> directly, not touching vm_was_suspended.  It's not mentioned here, but that
> behavior makes sense to me.

Right.  I'll add that example above.

>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 
> Since you touched qapi/, please copy maintainers too.  I've copied Markus
> and Eric in this reply.

My bad, thanks for the cc.

> I also have some nitpicks which may not affect the R-b, please see below.
> 
>> ---
>>  include/sysemu/runstate.h |  5 +++++
>>  qapi/misc.json            | 10 ++++++++--
>>  system/cpus.c             | 19 ++++++++++++++-----
>>  system/runstate.c         |  3 +++
>>  4 files changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>> index f6a337b..1d6828f 100644
>> --- a/include/sysemu/runstate.h
>> +++ b/include/sysemu/runstate.h
>> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>>  }
>>  
>> +static inline bool runstate_is_started(RunState state)
> 
> Would runstate_has_vm_running() sound better?  It is a bit awkward when
> saying something like "start a runstate".

I have been searching for the perfect name for this accessor.
IMO using "running" in this accessor is confusing because it applies to both
the running and suspended state.  So, I invented a new aggregate state called
started.  vm_start transitions the machine to a started state.

How about runstate_was_started?  It works well at both start and stop call sites:

    void vm_resume(RunState state)
        if (runstate_was_started(state)) {
            vm_start();

    int vm_stop_force_state(RunState state)
        if (runstate_was_started(runstate_get())) {
            return vm_stop(state);

- Steve

>> +{
>> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
>> +}
>> +


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-12-01 17:11     ` Steven Sistare
@ 2023-12-04 16:35       ` Peter Xu
  2023-12-04 16:41         ` Steven Sistare
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-04 16:35 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Markus Armbruster, Eric Blake

On Fri, Dec 01, 2023 at 12:11:32PM -0500, Steven Sistare wrote:
> >> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
> >> index f6a337b..1d6828f 100644
> >> --- a/include/sysemu/runstate.h
> >> +++ b/include/sysemu/runstate.h
> >> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
> >>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
> >>  }
> >>  
> >> +static inline bool runstate_is_started(RunState state)
> > 
> > Would runstate_has_vm_running() sound better?  It is a bit awkward when
> > saying something like "start a runstate".
> 
> I have been searching for the perfect name for this accessor.
> IMO using "running" in this accessor is confusing because it applies to both
> the running and suspended state.  So, I invented a new aggregate state called
> started.  vm_start transitions the machine to a started state.
> 
> How about runstate_was_started?  It works well at both start and stop call sites:
> 
>     void vm_resume(RunState state)
>         if (runstate_was_started(state)) {

This one looks fine, but...

>             vm_start();
> 
>     int vm_stop_force_state(RunState state)
>         if (runstate_was_started(runstate_get())) {

.. this one makes the past tense not looking good.

>             return vm_stop(state);

How about runstate_is_alive()?  So far the best I can come up with. :)

Even if you prefer "started", I'd vote for not using past tense, hence
runstate_is_started().

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-12-04 16:35       ` Peter Xu
@ 2023-12-04 16:41         ` Steven Sistare
  0 siblings, 0 replies; 57+ messages in thread
From: Steven Sistare @ 2023-12-04 16:41 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Markus Armbruster, Eric Blake

On 12/4/2023 11:35 AM, Peter Xu wrote:
> On Fri, Dec 01, 2023 at 12:11:32PM -0500, Steven Sistare wrote:
>>>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>>>> index f6a337b..1d6828f 100644
>>>> --- a/include/sysemu/runstate.h
>>>> +++ b/include/sysemu/runstate.h
>>>> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>>>>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>>>>  }
>>>>  
>>>> +static inline bool runstate_is_started(RunState state)
>>>
>>> Would runstate_has_vm_running() sound better?  It is a bit awkward when
>>> saying something like "start a runstate".
>>
>> I have been searching for the perfect name for this accessor.
>> IMO using "running" in this accessor is confusing because it applies to both
>> the running and suspended state.  So, I invented a new aggregate state called
>> started.  vm_start transitions the machine to a started state.
>>
>> How about runstate_was_started?  It works well at both start and stop call sites:
>>
>>     void vm_resume(RunState state)
>>         if (runstate_was_started(state)) {
> 
> This one looks fine, but...
> 
>>             vm_start();
>>
>>     int vm_stop_force_state(RunState state)
>>         if (runstate_was_started(runstate_get())) {
> 
> .. this one makes the past tense not looking good.
> 
>>             return vm_stop(state);
> 
> How about runstate_is_alive()?  So far the best I can come up with. :)
> 
> Even if you prefer "started", I'd vote for not using past tense, hence
> runstate_is_started().

runstate_is_live also occurred to me.  I'll use that.

- Steve


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-01 16:23     ` Steven Sistare
@ 2023-12-04 17:24       ` Peter Xu
  2023-12-04 19:31         ` Fabiano Rosas
  2023-12-04 22:23         ` Steven Sistare
  0 siblings, 2 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-04 17:24 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
> >>          return -EINVAL;
> >>      }
> >>      s->state = r;
> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
> > 
> > IIUC current vm_was_suspended (based on my read of your patch) was not the
> > same as a boolean representing "whether VM is suspended", but only a
> > temporary field to remember that for a VM stop request.  To be explicit, I
> > didn't see this flag set in qemu_system_suspend() in your previous patch.
> > 
> > If so, we can already do:
> > 
> >   vm_set_suspended(s->vm_was_suspended);
> > 
> > Irrelevant of RUN_STATE_SUSPENDED?
> 
> We need both terms of the expression.
> 
> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
> We call global_state_store prior to vm_stop_force_state, so the incoming
> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.

Right.

> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
> calling vm_start, we need to restore the suspended state.  Thus in 
> global_state_post_load, we must set vm_was_suspended = true.

With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
already?  Then I think it should call vm_start(SUSPENDED) if to start.

Maybe you're talking about the special case where autostart==false?  We
used to have this (existing process_incoming_migration_bh()):

    if (!global_state_received() ||
        global_state_get_runstate() == RUN_STATE_RUNNING) {
        if (autostart) {
            vm_start();
        } else {
            runstate_set(RUN_STATE_PAUSED);
        }
    }

If so maybe I get you, because in the "else" path we do seem to lose the
SUSPENDED state again, but in that case IMHO we should logically set
vm_was_suspended only when we "lose" it - we didn't lose it during
migration, but only until we decided to switch to PAUSED (due to
autostart==false). IOW, change above to something like:

    state = global_state_get_runstate();
    if (!global_state_received() || runstate_is_alive(state)) {
        if (autostart) {
            vm_start(state);
        } else {
            if (runstate_is_suspended(state)) {
                /* Remember suspended state before setting system to STOPed */
                vm_was_suspended = true;
            }
            runstate_set(RUN_STATE_PAUSED);
        }
    }

It may or may not have a functional difference even if current patch,
though.  However maybe clearer to follow vm_was_suspended's strict
definition.

> 
> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
> then vm_was_suspended = true.  Migration from that state sets
> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
> ends with runstate_set(RUN_STATE_PAUSED).
> 
> I will add a comment here in the code.
>  
> >>      return 0;
> >>  }
> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
> >>      .fields = (VMStateField[]) {
> >>          VMSTATE_UINT32(size, GlobalState),
> >>          VMSTATE_BUFFER(runstate, GlobalState),
> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
> >>          VMSTATE_END_OF_LIST()
> >>      },
> >>  };
> > 
> > I think this will break migration between old/new, unfortunately.  And
> > since the global state exist mostly for every VM, all VM setup should be
> > affected, and over all archs.
> 
> Thanks, I keep forgetting that my binary tricks are no good here.  However,
> I have one other trick up my sleeve, which is to store vm_was_running in
> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
> compatible, since that byte is always 0 in older qemu.  It can be implemented
> with a few lines of code change confined to global_state.c, versus many lines 
> spread across files to do it the conventional way using a compat property and
> a subsection.  Sound OK?  

Tricky!  But sounds okay to me.  I think you're inventing some of your own
way of being compatible, not relying on machine type as a benefit.  If go
this route please document clearly on the layout and also what it looked
like in old binaries.

I think maybe it'll be good to keep using strings, so in the new binaries
we allow >1 strings, then we define properly on those strings (index 0:
runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
express, while 0x00 means old binary, etc.).

I hope this trick will need less code than the subsection solution,
otherwise I'd still consider going with that, which is the "common
solution".

Let's also see whether Juan/Fabiano/others has any opinions.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 17:24       ` Peter Xu
@ 2023-12-04 19:31         ` Fabiano Rosas
  2023-12-04 20:02           ` Peter Xu
  2023-12-04 22:23         ` Steven Sistare
  1 sibling, 1 reply; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-04 19:31 UTC (permalink / raw)
  To: Peter Xu, Steven Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

Peter Xu <peterx@redhat.com> writes:

> On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>> >>          return -EINVAL;
>> >>      }
>> >>      s->state = r;
>> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>> > 
>> > IIUC current vm_was_suspended (based on my read of your patch) was not the
>> > same as a boolean representing "whether VM is suspended", but only a
>> > temporary field to remember that for a VM stop request.  To be explicit, I
>> > didn't see this flag set in qemu_system_suspend() in your previous patch.
>> > 
>> > If so, we can already do:
>> > 
>> >   vm_set_suspended(s->vm_was_suspended);
>> > 
>> > Irrelevant of RUN_STATE_SUSPENDED?
>> 
>> We need both terms of the expression.
>> 
>> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> We call global_state_store prior to vm_stop_force_state, so the incoming
>> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
>
> Right.
>
>> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
>> calling vm_start, we need to restore the suspended state.  Thus in 
>> global_state_post_load, we must set vm_was_suspended = true.
>
> With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
> already?  Then I think it should call vm_start(SUSPENDED) if to start.
>
> Maybe you're talking about the special case where autostart==false?  We
> used to have this (existing process_incoming_migration_bh()):
>
>     if (!global_state_received() ||
>         global_state_get_runstate() == RUN_STATE_RUNNING) {
>         if (autostart) {
>             vm_start();
>         } else {
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
>
> If so maybe I get you, because in the "else" path we do seem to lose the
> SUSPENDED state again, but in that case IMHO we should logically set
> vm_was_suspended only when we "lose" it - we didn't lose it during
> migration, but only until we decided to switch to PAUSED (due to
> autostart==false). IOW, change above to something like:
>
>     state = global_state_get_runstate();
>     if (!global_state_received() || runstate_is_alive(state)) {
>         if (autostart) {
>             vm_start(state);
>         } else {
>             if (runstate_is_suspended(state)) {
>                 /* Remember suspended state before setting system to STOPed */
>                 vm_was_suspended = true;
>             }
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
>
> It may or may not have a functional difference even if current patch,
> though.  However maybe clearer to follow vm_was_suspended's strict
> definition.
>
>> 
>> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> then vm_was_suspended = true.  Migration from that state sets
>> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
>> ends with runstate_set(RUN_STATE_PAUSED).
>> 
>> I will add a comment here in the code.
>>  
>> >>      return 0;
>> >>  }
>> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>> >>      .fields = (VMStateField[]) {
>> >>          VMSTATE_UINT32(size, GlobalState),
>> >>          VMSTATE_BUFFER(runstate, GlobalState),
>> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>> >>          VMSTATE_END_OF_LIST()
>> >>      },
>> >>  };
>> > 
>> > I think this will break migration between old/new, unfortunately.  And
>> > since the global state exist mostly for every VM, all VM setup should be
>> > affected, and over all archs.
>> 
>> Thanks, I keep forgetting that my binary tricks are no good here.  However,
>> I have one other trick up my sleeve, which is to store vm_was_running in
>> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
>> compatible, since that byte is always 0 in older qemu.  It can be implemented
>> with a few lines of code change confined to global_state.c, versus many lines 
>> spread across files to do it the conventional way using a compat property and
>> a subsection.  Sound OK?  
>
> Tricky!  But sounds okay to me.  I think you're inventing some of your own
> way of being compatible, not relying on machine type as a benefit.  If go
> this route please document clearly on the layout and also what it looked
> like in old binaries.
>
> I think maybe it'll be good to keep using strings, so in the new binaries
> we allow >1 strings, then we define properly on those strings (index 0:
> runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
> express, while 0x00 means old binary, etc.).
>
> I hope this trick will need less code than the subsection solution,
> otherwise I'd still consider going with that, which is the "common
> solution".
>
> Let's also see whether Juan/Fabiano/others has any opinions.

Can't we pack the structure and just go ahead and slash 'runstate' in
half? That would claim some unused bytes for future backward
compatibility issues.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 19:31         ` Fabiano Rosas
@ 2023-12-04 20:02           ` Peter Xu
  2023-12-04 21:09             ` Fabiano Rosas
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-04 20:02 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On Mon, Dec 04, 2023 at 04:31:56PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
> >> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
> >> >>          return -EINVAL;
> >> >>      }
> >> >>      s->state = r;
> >> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
> >> > 
> >> > IIUC current vm_was_suspended (based on my read of your patch) was not the
> >> > same as a boolean representing "whether VM is suspended", but only a
> >> > temporary field to remember that for a VM stop request.  To be explicit, I
> >> > didn't see this flag set in qemu_system_suspend() in your previous patch.
> >> > 
> >> > If so, we can already do:
> >> > 
> >> >   vm_set_suspended(s->vm_was_suspended);
> >> > 
> >> > Irrelevant of RUN_STATE_SUSPENDED?
> >> 
> >> We need both terms of the expression.
> >> 
> >> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
> >> We call global_state_store prior to vm_stop_force_state, so the incoming
> >> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
> >
> > Right.
> >
> >> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
> >> calling vm_start, we need to restore the suspended state.  Thus in 
> >> global_state_post_load, we must set vm_was_suspended = true.
> >
> > With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
> > already?  Then I think it should call vm_start(SUSPENDED) if to start.
> >
> > Maybe you're talking about the special case where autostart==false?  We
> > used to have this (existing process_incoming_migration_bh()):
> >
> >     if (!global_state_received() ||
> >         global_state_get_runstate() == RUN_STATE_RUNNING) {
> >         if (autostart) {
> >             vm_start();
> >         } else {
> >             runstate_set(RUN_STATE_PAUSED);
> >         }
> >     }
> >
> > If so maybe I get you, because in the "else" path we do seem to lose the
> > SUSPENDED state again, but in that case IMHO we should logically set
> > vm_was_suspended only when we "lose" it - we didn't lose it during
> > migration, but only until we decided to switch to PAUSED (due to
> > autostart==false). IOW, change above to something like:
> >
> >     state = global_state_get_runstate();
> >     if (!global_state_received() || runstate_is_alive(state)) {
> >         if (autostart) {
> >             vm_start(state);
> >         } else {
> >             if (runstate_is_suspended(state)) {
> >                 /* Remember suspended state before setting system to STOPed */
> >                 vm_was_suspended = true;
> >             }
> >             runstate_set(RUN_STATE_PAUSED);
> >         }
> >     }
> >
> > It may or may not have a functional difference even if current patch,
> > though.  However maybe clearer to follow vm_was_suspended's strict
> > definition.
> >
> >> 
> >> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
> >> then vm_was_suspended = true.  Migration from that state sets
> >> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
> >> ends with runstate_set(RUN_STATE_PAUSED).
> >> 
> >> I will add a comment here in the code.
> >>  
> >> >>      return 0;
> >> >>  }
> >> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
> >> >>      .fields = (VMStateField[]) {
> >> >>          VMSTATE_UINT32(size, GlobalState),
> >> >>          VMSTATE_BUFFER(runstate, GlobalState),
> >> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
> >> >>          VMSTATE_END_OF_LIST()
> >> >>      },
> >> >>  };
> >> > 
> >> > I think this will break migration between old/new, unfortunately.  And
> >> > since the global state exist mostly for every VM, all VM setup should be
> >> > affected, and over all archs.
> >> 
> >> Thanks, I keep forgetting that my binary tricks are no good here.  However,
> >> I have one other trick up my sleeve, which is to store vm_was_running in
> >> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
> >> compatible, since that byte is always 0 in older qemu.  It can be implemented
> >> with a few lines of code change confined to global_state.c, versus many lines 
> >> spread across files to do it the conventional way using a compat property and
> >> a subsection.  Sound OK?  
> >
> > Tricky!  But sounds okay to me.  I think you're inventing some of your own
> > way of being compatible, not relying on machine type as a benefit.  If go
> > this route please document clearly on the layout and also what it looked
> > like in old binaries.
> >
> > I think maybe it'll be good to keep using strings, so in the new binaries
> > we allow >1 strings, then we define properly on those strings (index 0:
> > runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
> > express, while 0x00 means old binary, etc.).
> >
> > I hope this trick will need less code than the subsection solution,
> > otherwise I'd still consider going with that, which is the "common
> > solution".
> >
> > Let's also see whether Juan/Fabiano/others has any opinions.
> 
> Can't we pack the structure and just go ahead and slash 'runstate' in
> half? That would claim some unused bytes for future backward
> compatibility issues.

What I meant is something like:

  runstate[100] = {"str1", 0x00, "str2", 0x00, ...}

Where str1 is runstate, and str2 can be either "0"/"1" to reflect suspended
value.  We define all the strings separated by 0x00, then IIUC we save the
most chars for potential future extension of this string.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 11/14] tests/qtest: precopy migration with suspend
  2023-11-30 21:37 ` [PATCH V6 11/14] tests/qtest: precopy migration with suspend Steve Sistare
@ 2023-12-04 20:49   ` Peter Xu
  2023-12-05 16:14     ` Steven Sistare
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-04 20:49 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:24PM -0800, Steve Sistare wrote:
> Add a test case to verify that the suspended state is handled correctly
> during live migration precopy.  The test suspends the src, migrates, then
> wakes the dest.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  tests/qtest/migration-helpers.c |  3 ++
>  tests/qtest/migration-helpers.h |  2 ++
>  tests/qtest/migration-test.c    | 64 ++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 65 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
> index fd3b94e..37e8e81 100644
> --- a/tests/qtest/migration-helpers.c
> +++ b/tests/qtest/migration-helpers.c
> @@ -32,6 +32,9 @@ bool migrate_watch_for_events(QTestState *who, const char *name,
>      if (g_str_equal(name, "STOP")) {
>          state->stop_seen = true;
>          return true;
> +    } else if (g_str_equal(name, "SUSPEND")) {
> +        state->suspend_seen = true;
> +        return true;
>      } else if (g_str_equal(name, "RESUME")) {
>          state->resume_seen = true;
>          return true;
> diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
> index 3d32699..b478549 100644
> --- a/tests/qtest/migration-helpers.h
> +++ b/tests/qtest/migration-helpers.h
> @@ -18,6 +18,8 @@
>  typedef struct QTestMigrationState {
>      bool stop_seen;
>      bool resume_seen;
> +    bool suspend_seen;
> +    bool suspend_me;
>  } QTestMigrationState;
>  
>  bool migrate_watch_for_events(QTestState *who, const char *name,
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index e10d5a4..200f023 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -178,7 +178,7 @@ static void bootfile_delete(void)
>  /*
>   * Wait for some output in the serial output file,
>   * we get an 'A' followed by an endless string of 'B's
> - * but on the destination we won't have the A.
> + * but on the destination we won't have the A (unless we enabled suspend/resume)
>   */
>  static void wait_for_serial(const char *side)
>  {
> @@ -245,6 +245,13 @@ static void wait_for_resume(QTestState *who, QTestMigrationState *state)
>      }
>  }
>  
> +static void wait_for_suspend(QTestState *who, QTestMigrationState *state)
> +{
> +    if (!state->suspend_seen) {
> +        qtest_qmp_eventwait(who, "SUSPEND");
> +    }
> +}
> +
>  /*
>   * It's tricky to use qemu's migration event capability with qtest,
>   * events suddenly appearing confuse the qmp()/hmp() responses.
> @@ -299,7 +306,7 @@ static void wait_for_migration_pass(QTestState *who)
>  {
>      uint64_t pass, prev_pass = 0, changes = 0;
>  
> -    while (changes < 2 && !src_state.stop_seen) {
> +    while (changes < 2 && !src_state.stop_seen && !src_state.suspend_seen) {
>          usleep(1000);
>          pass = get_migration_pass(who);
>          changes += (pass != prev_pass);
> @@ -595,7 +602,8 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
>      watch_byte = qtest_readb(from, watch_address);
>      do {
>          usleep(1000 * 10);
> -    } while (qtest_readb(from, watch_address) == watch_byte);
> +    } while (qtest_readb(from, watch_address) == watch_byte &&
> +             !src_state.suspend_seen);

This is hackish to me.

AFAIU the guest code won't ever dirty anything after printing the initial
'B'.  IOW, migrate_wait_for_dirty_mem() should not be called for suspend
test at all, I guess, because we know it won't.

>  }
>  
>  
> @@ -771,6 +779,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
>      dst_state = (QTestMigrationState) { };
>      src_state = (QTestMigrationState) { };
>      bootfile_create(tmpfs, args->suspend_me);
> +    src_state.suspend_me = args->suspend_me;
>  
>      if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
>          memory_size = "150M";
> @@ -1730,6 +1739,9 @@ static void test_precopy_common(MigrateCommon *args)
>           * change anything.
>           */
>          if (args->result == MIG_TEST_SUCCEED) {
> +            if (src_state.suspend_me) {
> +                wait_for_suspend(from, &src_state);
> +            }
>              qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
>              wait_for_stop(from, &src_state);
>              migrate_ensure_converge(from);
> @@ -1777,6 +1789,9 @@ static void test_precopy_common(MigrateCommon *args)
>               */
>              wait_for_migration_complete(from);
>  
> +            if (src_state.suspend_me) {
> +                wait_for_suspend(from, &src_state);
> +            }

Here it's pretty much uneasy to follow too, waiting for SUSPEND to come,
even after migration has already completed!

I suspect it never waits, since suspend_seen is normally always already
set (with the above hack, migrate_wait_for_dirty_mem() plays the role of
waiting for SUSPENDED).

>              wait_for_stop(from, &src_state);
>  
>          } else {

IMHO it'll be cleaner to explicitly wait for SUSPENDED when we know
migration is not yet completed.  Then, we know we're migrating with
SUSPENDED.  In summary, perhaps something squashed like this?

====8<====
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 30d4b32a35..65e6692716 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -605,8 +605,7 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
     watch_byte = qtest_readb(from, watch_address);
     do {
         usleep(1000 * 10);
-    } while (qtest_readb(from, watch_address) == watch_byte &&
-             !src_state.suspend_seen);
+    } while (qtest_readb(from, watch_address) == watch_byte);
 }
 
 
@@ -1805,7 +1804,12 @@ static void test_precopy_common(MigrateCommon *args)
                 wait_for_migration_pass(from);
                 args->iterations--;
             }
-            migrate_wait_for_dirty_mem(from, to);
+
+            if (src_state.suspend_me) {
+                wait_for_suspend(from, &src_state);
+            } else {
+                migrate_wait_for_dirty_mem(from, to);
+            }
 
             migrate_ensure_converge(from);
 
@@ -1814,10 +1818,6 @@ static void test_precopy_common(MigrateCommon *args)
              * hanging forever if migration didn't converge
              */
             wait_for_migration_complete(from);
-
-            if (src_state.suspend_me) {
-                wait_for_suspend(from, &src_state);
-            }
             wait_for_stop(from, &src_state);
 
         } else {
====8<====

I didn't check the postcopy patch, but I'd expect something similar would
be nice.

Thanks,

-- 
Peter Xu



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 20:02           ` Peter Xu
@ 2023-12-04 21:09             ` Fabiano Rosas
  2023-12-04 22:04               ` Peter Xu
  0 siblings, 1 reply; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-04 21:09 UTC (permalink / raw)
  To: Peter Xu
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

Peter Xu <peterx@redhat.com> writes:

> On Mon, Dec 04, 2023 at 04:31:56PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>> >> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>> >> >>          return -EINVAL;
>> >> >>      }
>> >> >>      s->state = r;
>> >> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>> >> > 
>> >> > IIUC current vm_was_suspended (based on my read of your patch) was not the
>> >> > same as a boolean representing "whether VM is suspended", but only a
>> >> > temporary field to remember that for a VM stop request.  To be explicit, I
>> >> > didn't see this flag set in qemu_system_suspend() in your previous patch.
>> >> > 
>> >> > If so, we can already do:
>> >> > 
>> >> >   vm_set_suspended(s->vm_was_suspended);
>> >> > 
>> >> > Irrelevant of RUN_STATE_SUSPENDED?
>> >> 
>> >> We need both terms of the expression.
>> >> 
>> >> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> >> We call global_state_store prior to vm_stop_force_state, so the incoming
>> >> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
>> >
>> > Right.
>> >
>> >> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
>> >> calling vm_start, we need to restore the suspended state.  Thus in 
>> >> global_state_post_load, we must set vm_was_suspended = true.
>> >
>> > With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
>> > already?  Then I think it should call vm_start(SUSPENDED) if to start.
>> >
>> > Maybe you're talking about the special case where autostart==false?  We
>> > used to have this (existing process_incoming_migration_bh()):
>> >
>> >     if (!global_state_received() ||
>> >         global_state_get_runstate() == RUN_STATE_RUNNING) {
>> >         if (autostart) {
>> >             vm_start();
>> >         } else {
>> >             runstate_set(RUN_STATE_PAUSED);
>> >         }
>> >     }
>> >
>> > If so maybe I get you, because in the "else" path we do seem to lose the
>> > SUSPENDED state again, but in that case IMHO we should logically set
>> > vm_was_suspended only when we "lose" it - we didn't lose it during
>> > migration, but only until we decided to switch to PAUSED (due to
>> > autostart==false). IOW, change above to something like:
>> >
>> >     state = global_state_get_runstate();
>> >     if (!global_state_received() || runstate_is_alive(state)) {
>> >         if (autostart) {
>> >             vm_start(state);
>> >         } else {
>> >             if (runstate_is_suspended(state)) {
>> >                 /* Remember suspended state before setting system to STOPed */
>> >                 vm_was_suspended = true;
>> >             }
>> >             runstate_set(RUN_STATE_PAUSED);
>> >         }
>> >     }
>> >
>> > It may or may not have a functional difference even if current patch,
>> > though.  However maybe clearer to follow vm_was_suspended's strict
>> > definition.
>> >
>> >> 
>> >> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> >> then vm_was_suspended = true.  Migration from that state sets
>> >> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
>> >> ends with runstate_set(RUN_STATE_PAUSED).
>> >> 
>> >> I will add a comment here in the code.
>> >>  
>> >> >>      return 0;
>> >> >>  }
>> >> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>> >> >>      .fields = (VMStateField[]) {
>> >> >>          VMSTATE_UINT32(size, GlobalState),
>> >> >>          VMSTATE_BUFFER(runstate, GlobalState),
>> >> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>> >> >>          VMSTATE_END_OF_LIST()
>> >> >>      },
>> >> >>  };
>> >> > 
>> >> > I think this will break migration between old/new, unfortunately.  And
>> >> > since the global state exist mostly for every VM, all VM setup should be
>> >> > affected, and over all archs.
>> >> 
>> >> Thanks, I keep forgetting that my binary tricks are no good here.  However,
>> >> I have one other trick up my sleeve, which is to store vm_was_running in
>> >> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
>> >> compatible, since that byte is always 0 in older qemu.  It can be implemented
>> >> with a few lines of code change confined to global_state.c, versus many lines 
>> >> spread across files to do it the conventional way using a compat property and
>> >> a subsection.  Sound OK?  
>> >
>> > Tricky!  But sounds okay to me.  I think you're inventing some of your own
>> > way of being compatible, not relying on machine type as a benefit.  If go
>> > this route please document clearly on the layout and also what it looked
>> > like in old binaries.
>> >
>> > I think maybe it'll be good to keep using strings, so in the new binaries
>> > we allow >1 strings, then we define properly on those strings (index 0:
>> > runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
>> > express, while 0x00 means old binary, etc.).
>> >
>> > I hope this trick will need less code than the subsection solution,
>> > otherwise I'd still consider going with that, which is the "common
>> > solution".
>> >
>> > Let's also see whether Juan/Fabiano/others has any opinions.
>> 
>> Can't we pack the structure and just go ahead and slash 'runstate' in
>> half? That would claim some unused bytes for future backward
>> compatibility issues.
>
> What I meant is something like:
>
>   runstate[100] = {"str1", 0x00, "str2", 0x00, ...}
>
> Where str1 is runstate, and str2 can be either "0"/"1" to reflect suspended
> value.  We define all the strings separated by 0x00, then IIUC we save the
> most chars for potential future extension of this string.
>
> Thanks,

Right, I got your point. I just think we could avoid designing this new
string format by creating new fields with the extra space:

typedef struct QEMU_PACKED {
    uint32_t size;
    uint8_t runstate[50];
    uint8_t unused[50];
    RunState state;
    bool received;
} GlobalState;

In my mind this works seamlessly, or am I mistaken?

In any case, a oneshot hack might be better than both our suggestions
because we can just clean it up a couple of releases from now as if
nothing happened.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 13/14] tests/qtest: bootfile per vm
  2023-11-30 21:37 ` [PATCH V6 13/14] tests/qtest: bootfile per vm Steve Sistare
@ 2023-12-04 21:13   ` Fabiano Rosas
  2023-12-04 22:37     ` Peter Xu
  0 siblings, 1 reply; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-04 21:13 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Create a separate bootfile for the outgoing and incoming vm, so the block
> layer can lock the file during the background migration test.  Otherwise,
> the test fails with:
>   "Failed to get "write" lock.  Is another process using the image
>    [/tmp/migration-test-WAKPD2/bootsect]?"

Hm.. what is the background migration even trying to access on the boot
disk? @Peter?

This might be a good use for the -snapshot option. It should stop any
attempt to get the write lock. Not a lot of difference, but slightly
simpler.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 14/14] tests/qtest: background migration with suspend
  2023-11-30 21:37 ` [PATCH V6 14/14] tests/qtest: background migration with suspend Steve Sistare
@ 2023-12-04 21:14   ` Fabiano Rosas
  0 siblings, 0 replies; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-04 21:14 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Add a test case to verify that the suspended state is handled correctly by
> a background migration.  The test suspends the src, migrates, then wakes
> the dest.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 10/14] tests/qtest: option to suspend during migration
  2023-11-30 21:37 ` [PATCH V6 10/14] tests/qtest: option to suspend during migration Steve Sistare
@ 2023-12-04 21:14   ` Fabiano Rosas
  0 siblings, 0 replies; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-04 21:14 UTC (permalink / raw)
  To: Steve Sistare, qemu-devel
  Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras, Steve Sistare

Steve Sistare <steven.sistare@oracle.com> writes:

> Add an option to suspend the src in a-b-bootblock.S, which puts the guest
> in S3 state after one round of writing to memory.  The option is enabled by
> poking a 1 into the suspend_me word in the boot block prior to starting the
> src vm.  Generate symbol offsets in a-b-bootblock.h so that the suspend_me
> offset is known.  Generate the bootblock for each test, because suspend_me
> may differ for each.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Acked-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 21:09             ` Fabiano Rosas
@ 2023-12-04 22:04               ` Peter Xu
  2023-12-05 12:44                 ` Fabiano Rosas
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-04 22:04 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On Mon, Dec 04, 2023 at 06:09:16PM -0300, Fabiano Rosas wrote:
> Right, I got your point. I just think we could avoid designing this new
> string format by creating new fields with the extra space:
> 
> typedef struct QEMU_PACKED {
>     uint32_t size;
>     uint8_t runstate[50];
>     uint8_t unused[50];
>     RunState state;
>     bool received;
> } GlobalState;
> 
> In my mind this works seamlessly, or am I mistaken?

I think what you proposed should indeed work.

Currently it's:

    .fields = (VMStateField[]) {
        VMSTATE_UINT32(size, GlobalState),
        VMSTATE_BUFFER(runstate, GlobalState),
        VMSTATE_END_OF_LIST()
    },

I had a quick look at vmstate_info_buffer, it mostly only get()/put() those
buffers with its sizeof(), so looks all fine.  For sure in all cases we'd
better test it to verify.

One side note is since we so far use qapi_enum_parse() for the runstate, I
think the "size" is not ever used..

If we do want a split, IMHO we can consider making runstate[] even smaller
to just free up the rest spaces all in one shot:

  typedef struct QEMU_PACKED {
      uint32_t size;
      /*
       * Assuming 16 is good enough to fit all possible runstate strings..
       * This field must be a string ending with '\0'.
       */
      uint8_t runstate[16];
      /* 0x00 when QEMU doesn't support it, or "0"/"1" to reflect its state */
      uint8_t vm_was_suspended[1];
      /*
       * Still free of use space.  Note that we only have 99 bytes for use
       * because the last byte (the 100th byte) must be zero due to legacy
       * reasons, if not it may be set to zero after loaded on dest QEMU. 
       */
      uint8_t unused[82];
      RunState state;
      bool received;
  } GlobalState;

Pairs with something like:

    .fields = (VMStateField[]) {
        /* Used to be "size" but never used on dest, so always ignored */
        VMSTATE_UNUSED(4),
        VMSTATE_BUFFER(runstate, GlobalState),
        VMSTATE_BUFFER(vm_was_suspended, GlobalState),
        /*
         * This is actually all zeros, but just to differenciate from the
         * last byte..
         */
        VMSTATE_BUFFER(unused, GlobalState),
        /*
         * For historical reasons, the last byte must be 0x00 or it'll be
         * overwritten by old qemu otherwise.
         */
        VMSTATE_UNUSED(1),
        VMSTATE_END_OF_LIST()
    },

> 
> In any case, a oneshot hack might be better than both our suggestions
> because we can just clean it up a couple of releases from now as if
> nothing happened.

It can be forgotten forever, then we keep the code less readable.  If we
have a plan to do that and not so awkward, IMHO we should go directly with
that plan.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 17:24       ` Peter Xu
  2023-12-04 19:31         ` Fabiano Rosas
@ 2023-12-04 22:23         ` Steven Sistare
  2023-12-05 16:50           ` Peter Xu
  1 sibling, 1 reply; 57+ messages in thread
From: Steven Sistare @ 2023-12-04 22:23 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On 12/4/2023 12:24 PM, Peter Xu wrote:
> On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>>>> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>>>>          return -EINVAL;
>>>>      }
>>>>      s->state = r;
>>>> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>>>
>>> IIUC current vm_was_suspended (based on my read of your patch) was not the
>>> same as a boolean representing "whether VM is suspended", but only a
>>> temporary field to remember that for a VM stop request.  To be explicit, I
>>> didn't see this flag set in qemu_system_suspend() in your previous patch.
>>>
>>> If so, we can already do:
>>>
>>>   vm_set_suspended(s->vm_was_suspended);
>>>
>>> Irrelevant of RUN_STATE_SUSPENDED?
>>
>> We need both terms of the expression.
>>
>> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> We call global_state_store prior to vm_stop_force_state, so the incoming
>> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
> 
> Right.
> 
>> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
>> calling vm_start, we need to restore the suspended state.  Thus in 
>> global_state_post_load, we must set vm_was_suspended = true.
> 
> With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
> already?  Then I think it should call vm_start(SUSPENDED) if to start.

The V6 code does not pass a state to vm_start, and knowledge of vm_was_suspended
is confined to the global_state and cpus functions.  IMO this is a more modular
and robust solution, as multiple sites may call vm_start(), and the right thing
happens.  Look at patch 6.  The changes are minimal because vm_start "just works".

> Maybe you're talking about the special case where autostart==false?  We
> used to have this (existing process_incoming_migration_bh()):
> 
>     if (!global_state_received() ||
>         global_state_get_runstate() == RUN_STATE_RUNNING) {
>         if (autostart) {
>             vm_start();
>         } else {
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }
> 
> If so maybe I get you, because in the "else" path we do seem to lose the
> SUSPENDED state again, but in that case IMHO we should logically set
> vm_was_suspended only when we "lose" it - we didn't lose it during
> migration, but only until we decided to switch to PAUSED (due to
> autostart==false). IOW, change above to something like:
> 
>     state = global_state_get_runstate();
>     if (!global_state_received() || runstate_is_alive(state)) {
>         if (autostart) {
>             vm_start(state);
>         } else {
>             if (runstate_is_suspended(state)) {
>                 /* Remember suspended state before setting system to STOPed */
>                 vm_was_suspended = true;
>             }
>             runstate_set(RUN_STATE_PAUSED);
>         }
>     }

This is similar to V5 which tested suspended and fiddled with runstate at
multiple call sites in migration and snapshot.  I believe V6 is cleaner.

> It may or may not have a functional difference even if current patch,
> though.  However maybe clearer to follow vm_was_suspended's strict
> definition.
> 
>>
>> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> then vm_was_suspended = true.  Migration from that state sets
>> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and 
>> ends with runstate_set(RUN_STATE_PAUSED).
>>
>> I will add a comment here in the code.
>>  
>>>>      return 0;
>>>>  }
>>>> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>>>>      .fields = (VMStateField[]) {
>>>>          VMSTATE_UINT32(size, GlobalState),
>>>>          VMSTATE_BUFFER(runstate, GlobalState),
>>>> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>>>>          VMSTATE_END_OF_LIST()
>>>>      },
>>>>  };
>>>
>>> I think this will break migration between old/new, unfortunately.  And
>>> since the global state exist mostly for every VM, all VM setup should be
>>> affected, and over all archs.
>>
>> Thanks, I keep forgetting that my binary tricks are no good here.  However,
>> I have one other trick up my sleeve, which is to store vm_was_running in
>> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
>> compatible, since that byte is always 0 in older qemu.  It can be implemented
>> with a few lines of code change confined to global_state.c, versus many lines 
>> spread across files to do it the conventional way using a compat property and
>> a subsection.  Sound OK?  
> 
> Tricky!  But sounds okay to me.  I think you're inventing some of your own
> way of being compatible, not relying on machine type as a benefit.  If go
> this route please document clearly on the layout and also what it looked
> like in old binaries.
> 
> I think maybe it'll be good to keep using strings, so in the new binaries
> we allow >1 strings, then we define properly on those strings (index 0:
> runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
> express, while 0x00 means old binary, etc.).
> 
> I hope this trick will need less code than the subsection solution,
> otherwise I'd still consider going with that, which is the "common
> solution".
> 
> Let's also see whether Juan/Fabiano/others has any opinions.

The disadvantage of using strings '0' and '1' is the additional check for
the backwards compatible value 0x00.  No big deal, and I'll do that if you prefer, 
but it seems unnecessary. I had already written this for binary 0/1. Not yet tested, 
and still needs comments:

-----------
diff --git a/migration/global_state.c b/migration/global_state.c
index 4e2a9d8..8a59554 100644
--- a/migration/global_state.c
+++ b/migration/global_state.c
@@ -32,9 +32,10 @@ static GlobalState global_state;
 static void global_state_do_store(RunState state)
 {
     const char *state_str = RunState_str(state);
-    assert(strlen(state_str) < sizeof(global_state.runstate));
+    assert(strlen(state_str) < sizeof(global_state.runstate) - 2);
     strpadcpy((char *)global_state.runstate, sizeof(global_state.runstate),
               state_str, '\0');
+    global_state.runstate[strlen(state_str) + 1] = vm_get_suspended();
 }

 void global_state_store(void)
@@ -68,6 +69,12 @@ static bool global_state_needed(void *opaque)
         return true;
     }

+    /* If the suspended state must be remembered, it is needed */
+
+    if (vm_get_suspended()) {
+        return true;
+    }
+
     /* If state is running or paused, it is not needed */

     if (strcmp(runstate, "running") == 0 ||
@@ -109,6 +116,8 @@ static int global_state_post_load(void *opaque, int version_
         return -EINVAL;
     }
     s->state = r;
+    bool vm_was_suspended = runstate[strlen(runstate) + 1];
+    vm_set_suspended(vm_was_suspended || r == RUN_STATE_SUSPENDED);

     return 0;
 }
------------

- Steve


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 13/14] tests/qtest: bootfile per vm
  2023-12-04 21:13   ` Fabiano Rosas
@ 2023-12-04 22:37     ` Peter Xu
  2023-12-05 18:43       ` Steven Sistare
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-04 22:37 UTC (permalink / raw)
  To: Fabiano Rosas, Andrey Gruzdev
  Cc: Steve Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On Mon, Dec 04, 2023 at 06:13:36PM -0300, Fabiano Rosas wrote:
> Steve Sistare <steven.sistare@oracle.com> writes:
> 
> > Create a separate bootfile for the outgoing and incoming vm, so the block
> > layer can lock the file during the background migration test.  Otherwise,
> > the test fails with:
> >   "Failed to get "write" lock.  Is another process using the image
> >    [/tmp/migration-test-WAKPD2/bootsect]?"
> 
> Hm.. what is the background migration even trying to access on the boot
> disk? @Peter?

I didn't yet notice this patch until you asked, but background snapshot is
not designed to be used like this, afaict.

It should normally be used when someone would like to use "savevm", then
background snapshot makes that snapshot save happen with VM running (live)
and mostly as performant as "savevm" due to page write protections (IOW, it
is not dirty tracking, but wr-protect each page so not writtable at all
until unprotected).

Another difference (from "savevm") is, instead of storing that image onto
the block images, it stores that image also separately just like migrating
with "file:" as of now.

When the dest QEMU starts it'll try to grab the image lock already because
it should never run with src running, just like when "loadvm" QEMU doesn't
assume the QEMU that ran "savevm" will be running.

> 
> This might be a good use for the -snapshot option. It should stop any
> attempt to get the write lock. Not a lot of difference, but slightly
> simpler.

We don't yet have a background-snapshot test case.  If we ever need,
that'll need to be done in two steps: start src, save snapshot into file,
start dest, load from snapshot file.  We just shouldn't boot two together.

Now after two years when I re-read the snapshot code a bit, I didn't even
find where QEMU took the disk snapshots.. logically it should be done at
the start of live background snapshot when VM was dumping device states,
something like bdrv_all_can_snapshot() orshould be needed to make sure all
images support snapshot on its own or it should already fail, and take
snapshots to match the image.

IOW, I don't even think current raw disk would be able to support
background snapshot at all, otherwise if VM is live I don't see a way to
match the image (which is still lively updated by the running VM) to a live
snapshot taken.  Copy the author, Andrey, for this question.

Before that is confirmed, maybe the easiest way is we can go without a
background snapshot test case for suspend vm scenario.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 22:04               ` Peter Xu
@ 2023-12-05 12:44                 ` Fabiano Rosas
  2023-12-05 14:14                   ` Steven Sistare
  2023-12-05 16:18                   ` Peter Xu
  0 siblings, 2 replies; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-05 12:44 UTC (permalink / raw)
  To: Peter Xu
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

Peter Xu <peterx@redhat.com> writes:

> On Mon, Dec 04, 2023 at 06:09:16PM -0300, Fabiano Rosas wrote:
>> Right, I got your point. I just think we could avoid designing this new
>> string format by creating new fields with the extra space:
>> 
>> typedef struct QEMU_PACKED {
>>     uint32_t size;
>>     uint8_t runstate[50];
>>     uint8_t unused[50];
>>     RunState state;
>>     bool received;
>> } GlobalState;
>> 
>> In my mind this works seamlessly, or am I mistaken?
>
> I think what you proposed should indeed work.
>
> Currently it's:
>
>     .fields = (VMStateField[]) {
>         VMSTATE_UINT32(size, GlobalState),
>         VMSTATE_BUFFER(runstate, GlobalState),
>         VMSTATE_END_OF_LIST()
>     },
>
> I had a quick look at vmstate_info_buffer, it mostly only get()/put() those
> buffers with its sizeof(), so looks all fine.  For sure in all cases we'd
> better test it to verify.
>
> One side note is since we so far use qapi_enum_parse() for the runstate, I
> think the "size" is not ever used..
>
> If we do want a split, IMHO we can consider making runstate[] even smaller
> to just free up the rest spaces all in one shot:
>
>   typedef struct QEMU_PACKED {
>       uint32_t size;
>       /*
>        * Assuming 16 is good enough to fit all possible runstate strings..
>        * This field must be a string ending with '\0'.
>        */
>       uint8_t runstate[16];
>       /* 0x00 when QEMU doesn't support it, or "0"/"1" to reflect its state */
>       uint8_t vm_was_suspended[1];
>       /*
>        * Still free of use space.  Note that we only have 99 bytes for use
>        * because the last byte (the 100th byte) must be zero due to legacy
>        * reasons, if not it may be set to zero after loaded on dest QEMU. 
>        */

I'd add a 'uint8_t reserved;' to go along with this comment instead of
leaving a hole.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-05 12:44                 ` Fabiano Rosas
@ 2023-12-05 14:14                   ` Steven Sistare
  2023-12-05 16:18                   ` Peter Xu
  1 sibling, 0 replies; 57+ messages in thread
From: Steven Sistare @ 2023-12-05 14:14 UTC (permalink / raw)
  To: Fabiano Rosas, Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 12/5/2023 7:44 AM, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
>> On Mon, Dec 04, 2023 at 06:09:16PM -0300, Fabiano Rosas wrote:
>>> Right, I got your point. I just think we could avoid designing this new
>>> string format by creating new fields with the extra space:
>>>
>>> typedef struct QEMU_PACKED {
>>>     uint32_t size;
>>>     uint8_t runstate[50];
>>>     uint8_t unused[50];
>>>     RunState state;
>>>     bool received;
>>> } GlobalState;
>>>
>>> In my mind this works seamlessly, or am I mistaken?
>>
>> I think what you proposed should indeed work.
>>
>> Currently it's:
>>
>>     .fields = (VMStateField[]) {
>>         VMSTATE_UINT32(size, GlobalState),
>>         VMSTATE_BUFFER(runstate, GlobalState),
>>         VMSTATE_END_OF_LIST()
>>     },
>>
>> I had a quick look at vmstate_info_buffer, it mostly only get()/put() those
>> buffers with its sizeof(), so looks all fine.  For sure in all cases we'd
>> better test it to verify.
>>
>> One side note is since we so far use qapi_enum_parse() for the runstate, I
>> think the "size" is not ever used..
>>
>> If we do want a split, IMHO we can consider making runstate[] even smaller
>> to just free up the rest spaces all in one shot:
>>
>>   typedef struct QEMU_PACKED {
>>       uint32_t size;
>>       /*
>>        * Assuming 16 is good enough to fit all possible runstate strings..
>>        * This field must be a string ending with '\0'.
>>        */
>>       uint8_t runstate[16];
>>       /* 0x00 when QEMU doesn't support it, or "0"/"1" to reflect its state */
>>       uint8_t vm_was_suspended[1];
>>       /*
>>        * Still free of use space.  Note that we only have 99 bytes for use
>>        * because the last byte (the 100th byte) must be zero due to legacy
>>        * reasons, if not it may be set to zero after loaded on dest QEMU. 
>>        */
> 
> I'd add a 'uint8_t reserved;' to go along with this comment instead of
> leaving a hole.

I'll use this scheme, thanks.  It is a clearer than implicitly packing strings.

- Steve


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 11/14] tests/qtest: precopy migration with suspend
  2023-12-04 20:49   ` Peter Xu
@ 2023-12-05 16:14     ` Steven Sistare
  2023-12-05 21:07       ` Peter Xu
  0 siblings, 1 reply; 57+ messages in thread
From: Steven Sistare @ 2023-12-05 16:14 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On 12/4/2023 3:49 PM, Peter Xu wrote:
> On Thu, Nov 30, 2023 at 01:37:24PM -0800, Steve Sistare wrote:
>> Add a test case to verify that the suspended state is handled correctly
>> during live migration precopy.  The test suspends the src, migrates, then
>> wakes the dest.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  tests/qtest/migration-helpers.c |  3 ++
>>  tests/qtest/migration-helpers.h |  2 ++
>>  tests/qtest/migration-test.c    | 64 ++++++++++++++++++++++++++++++++++++++---
>>  3 files changed, 65 insertions(+), 4 deletions(-)
>>
>> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
>> index fd3b94e..37e8e81 100644
>> --- a/tests/qtest/migration-helpers.c
>> +++ b/tests/qtest/migration-helpers.c
>> @@ -32,6 +32,9 @@ bool migrate_watch_for_events(QTestState *who, const char *name,
>>      if (g_str_equal(name, "STOP")) {
>>          state->stop_seen = true;
>>          return true;
>> +    } else if (g_str_equal(name, "SUSPEND")) {
>> +        state->suspend_seen = true;
>> +        return true;
>>      } else if (g_str_equal(name, "RESUME")) {
>>          state->resume_seen = true;
>>          return true;
>> diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
>> index 3d32699..b478549 100644
>> --- a/tests/qtest/migration-helpers.h
>> +++ b/tests/qtest/migration-helpers.h
>> @@ -18,6 +18,8 @@
>>  typedef struct QTestMigrationState {
>>      bool stop_seen;
>>      bool resume_seen;
>> +    bool suspend_seen;
>> +    bool suspend_me;
>>  } QTestMigrationState;
>>  
>>  bool migrate_watch_for_events(QTestState *who, const char *name,
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index e10d5a4..200f023 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -178,7 +178,7 @@ static void bootfile_delete(void)
>>  /*
>>   * Wait for some output in the serial output file,
>>   * we get an 'A' followed by an endless string of 'B's
>> - * but on the destination we won't have the A.
>> + * but on the destination we won't have the A (unless we enabled suspend/resume)
>>   */
>>  static void wait_for_serial(const char *side)
>>  {
>> @@ -245,6 +245,13 @@ static void wait_for_resume(QTestState *who, QTestMigrationState *state)
>>      }
>>  }
>>  
>> +static void wait_for_suspend(QTestState *who, QTestMigrationState *state)
>> +{
>> +    if (!state->suspend_seen) {
>> +        qtest_qmp_eventwait(who, "SUSPEND");
>> +    }
>> +}
>> +
>>  /*
>>   * It's tricky to use qemu's migration event capability with qtest,
>>   * events suddenly appearing confuse the qmp()/hmp() responses.
>> @@ -299,7 +306,7 @@ static void wait_for_migration_pass(QTestState *who)
>>  {
>>      uint64_t pass, prev_pass = 0, changes = 0;
>>  
>> -    while (changes < 2 && !src_state.stop_seen) {
>> +    while (changes < 2 && !src_state.stop_seen && !src_state.suspend_seen) {
>>          usleep(1000);
>>          pass = get_migration_pass(who);
>>          changes += (pass != prev_pass);
>> @@ -595,7 +602,8 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
>>      watch_byte = qtest_readb(from, watch_address);
>>      do {
>>          usleep(1000 * 10);
>> -    } while (qtest_readb(from, watch_address) == watch_byte);
>> +    } while (qtest_readb(from, watch_address) == watch_byte &&
>> +             !src_state.suspend_seen);
> 
> This is hackish to me.
> 
> AFAIU the guest code won't ever dirty anything after printing the initial
> 'B'.  IOW, suspend not seen() should not be called for suspend
> test at all, I guess, because we know it won't.
Calling migrate_wait_for_dirty_mem proves that a source write is propagated
to the dest, even for the suspended case.  We fully expect that, but a good 
test verifies our expectations.  That is done in the first loop of 
migrate_wait_for_dirty_mem.  After that, we must check for the suspended 
state, because the second loop will not terminate.  Here is a more explicit
version:

static void migrate_wait_for_dirty_mem(QTestState *from,
                                       QTestState *to)
{
    uint64_t watch_address = start_address + MAGIC_OFFSET_BASE;
    uint64_t marker_address = start_address + MAGIC_OFFSET;
    uint8_t watch_byte;

    /*
     * Wait for the MAGIC_MARKER to get transferred, as an
     * indicator that a migration pass has made some known
     * amount of progress.
     */
    do {
        usleep(1000 * 10);
    } while (qtest_readq(to, marker_address) != MAGIC_MARKER);

+    /* If suspended, src only iterates once, and watch_byte may never change */
+    if (src_state.suspend_me) {
+        return;
+    }

>>  }
>>  
>>  
>> @@ -771,6 +779,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
>>      dst_state = (QTestMigrationState) { };
>>      src_state = (QTestMigrationState) { };
>>      bootfile_create(tmpfs, args->suspend_me);
>> +    src_state.suspend_me = args->suspend_me;
>>  
>>      if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
>>          memory_size = "150M";
>> @@ -1730,6 +1739,9 @@ static void test_precopy_common(MigrateCommon *args)
>>           * change anything.
>>           */
>>          if (args->result == MIG_TEST_SUCCEED) {
>> +            if (src_state.suspend_me) {
>> +                wait_for_suspend(from, &src_state);
>> +            }
>>              qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
>>              wait_for_stop(from, &src_state);
>>              migrate_ensure_converge(from);
>> @@ -1777,6 +1789,9 @@ static void test_precopy_common(MigrateCommon *args)
>>               */
>>              wait_for_migration_complete(from);
>>  
>> +            if (src_state.suspend_me) {
>> +                wait_for_suspend(from, &src_state);
>> +            }
> 
> Here it's pretty much uneasy to follow too, waiting for SUSPEND to come,
> even after migration has already completed!

Agreed.

> I suspect it never waits, since suspend_seen is normally always already
> set (with the above hack, migrate_wait_for_dirty_mem() plays the role of
> waiting for SUSPENDED).

Yes, it played that role.  I will delete all the existing calls to wait_for_suspended,
and add them after wait_for_serial("src_serial") in test_precopy_common and
migrate_postcopy_prepare.

- Steve

>>              wait_for_stop(from, &src_state);
>>  
>>          } else {
> 
> IMHO it'll be cleaner to explicitly wait for SUSPENDED when we know
> migration is not yet completed.  Then, we know we're migrating with
> SUSPENDED.  In summary, perhaps something squashed like this?
> 
> ====8<====
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 30d4b32a35..65e6692716 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -605,8 +605,7 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
>      watch_byte = qtest_readb(from, watch_address);
>      do {
>          usleep(1000 * 10);
> -    } while (qtest_readb(from, watch_address) == watch_byte &&
> -             !src_state.suspend_seen);
> +    } while (qtest_readb(from, watch_address) == watch_byte);
>  }
>  
>  
> @@ -1805,7 +1804,12 @@ static void test_precopy_common(MigrateCommon *args)
>                  wait_for_migration_pass(from);
>                  args->iterations--;
>              }
> -            migrate_wait_for_dirty_mem(from, to);
> +
> +            if (src_state.suspend_me) {
> +                wait_for_suspend(from, &src_state);
> +            } else {
> +                migrate_wait_for_dirty_mem(from, to);
> +            }
>  
>              migrate_ensure_converge(from);
>  
> @@ -1814,10 +1818,6 @@ static void test_precopy_common(MigrateCommon *args)
>               * hanging forever if migration didn't converge
>               */
>              wait_for_migration_complete(from);
> -
> -            if (src_state.suspend_me) {
> -                wait_for_suspend(from, &src_state);
> -            }
>              wait_for_stop(from, &src_state);
>  
>          } else {
> ====8<====
> 
> I didn't check the postcopy patch, but I'd expect something similar would
> be nice.
> 
> Thanks,
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-05 12:44                 ` Fabiano Rosas
  2023-12-05 14:14                   ` Steven Sistare
@ 2023-12-05 16:18                   ` Peter Xu
  2023-12-05 16:52                     ` Fabiano Rosas
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-05 16:18 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On Tue, Dec 05, 2023 at 09:44:12AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Dec 04, 2023 at 06:09:16PM -0300, Fabiano Rosas wrote:
> >> Right, I got your point. I just think we could avoid designing this new
> >> string format by creating new fields with the extra space:
> >> 
> >> typedef struct QEMU_PACKED {
> >>     uint32_t size;
> >>     uint8_t runstate[50];
> >>     uint8_t unused[50];
> >>     RunState state;
> >>     bool received;
> >> } GlobalState;
> >> 
> >> In my mind this works seamlessly, or am I mistaken?
> >
> > I think what you proposed should indeed work.
> >
> > Currently it's:
> >
> >     .fields = (VMStateField[]) {
> >         VMSTATE_UINT32(size, GlobalState),
> >         VMSTATE_BUFFER(runstate, GlobalState),
> >         VMSTATE_END_OF_LIST()
> >     },
> >
> > I had a quick look at vmstate_info_buffer, it mostly only get()/put() those
> > buffers with its sizeof(), so looks all fine.  For sure in all cases we'd
> > better test it to verify.
> >
> > One side note is since we so far use qapi_enum_parse() for the runstate, I
> > think the "size" is not ever used..
> >
> > If we do want a split, IMHO we can consider making runstate[] even smaller
> > to just free up the rest spaces all in one shot:
> >
> >   typedef struct QEMU_PACKED {

[1]

> >       uint32_t size;
> >       /*
> >        * Assuming 16 is good enough to fit all possible runstate strings..
> >        * This field must be a string ending with '\0'.
> >        */
> >       uint8_t runstate[16];
> >       /* 0x00 when QEMU doesn't support it, or "0"/"1" to reflect its state */
> >       uint8_t vm_was_suspended[1];
> >       /*
> >        * Still free of use space.  Note that we only have 99 bytes for use
> >        * because the last byte (the 100th byte) must be zero due to legacy
> >        * reasons, if not it may be set to zero after loaded on dest QEMU. 
> >        */
> 
> I'd add a 'uint8_t reserved;' to go along with this comment instead of
> leaving a hole.

Note that "struct GlobalState" is not a binary format but only some
internal storage, what really matters is vmstate_globalstate.  Here the
"uint8_reserved" will be a pure waste of 1 byte in QEMU binary, imho.

I think I just copied what you had previously and extended it, logically I
don't think we ever need QEMU_PACKED right above [1].  We can also drop
"size" directly here, but this can be done later.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-04 22:23         ` Steven Sistare
@ 2023-12-05 16:50           ` Peter Xu
  2023-12-05 17:48             ` Steven Sistare
  0 siblings, 1 reply; 57+ messages in thread
From: Peter Xu @ 2023-12-05 16:50 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Mon, Dec 04, 2023 at 05:23:57PM -0500, Steven Sistare wrote:
> The V6 code does not pass a state to vm_start, and knowledge of vm_was_suspended
> is confined to the global_state and cpus functions.  IMO this is a more modular
> and robust solution, as multiple sites may call vm_start(), and the right thing
> happens.  Look at patch 6.  The changes are minimal because vm_start "just works".

Oh I think I see what you meant.  Sounds good then.

Shall we hide that into vm_prepare_start()?  It seems three's still one
more call sites that always pass in RUNNING (gdb_continue_partial).

If with above, vm_prepare_start() will go into either RUNNING, SUSPENDED,
or an error.  It returns 0 only if RUNNING, -1 for all the rest.  Maybe we
can already also touch up the retval of vm_prepare_start() to be a boolean,
reflecting "whether vcpu needs to be started".

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-05 16:18                   ` Peter Xu
@ 2023-12-05 16:52                     ` Fabiano Rosas
  2023-12-05 17:04                       ` Steven Sistare
  0 siblings, 1 reply; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-05 16:52 UTC (permalink / raw)
  To: Peter Xu
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

Peter Xu <peterx@redhat.com> writes:

> On Tue, Dec 05, 2023 at 09:44:12AM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Mon, Dec 04, 2023 at 06:09:16PM -0300, Fabiano Rosas wrote:
>> >> Right, I got your point. I just think we could avoid designing this new
>> >> string format by creating new fields with the extra space:
>> >> 
>> >> typedef struct QEMU_PACKED {
>> >>     uint32_t size;
>> >>     uint8_t runstate[50];
>> >>     uint8_t unused[50];
>> >>     RunState state;
>> >>     bool received;
>> >> } GlobalState;
>> >> 
>> >> In my mind this works seamlessly, or am I mistaken?
>> >
>> > I think what you proposed should indeed work.
>> >
>> > Currently it's:
>> >
>> >     .fields = (VMStateField[]) {
>> >         VMSTATE_UINT32(size, GlobalState),
>> >         VMSTATE_BUFFER(runstate, GlobalState),
>> >         VMSTATE_END_OF_LIST()
>> >     },
>> >
>> > I had a quick look at vmstate_info_buffer, it mostly only get()/put() those
>> > buffers with its sizeof(), so looks all fine.  For sure in all cases we'd
>> > better test it to verify.
>> >
>> > One side note is since we so far use qapi_enum_parse() for the runstate, I
>> > think the "size" is not ever used..
>> >
>> > If we do want a split, IMHO we can consider making runstate[] even smaller
>> > to just free up the rest spaces all in one shot:
>> >
>> >   typedef struct QEMU_PACKED {
>
> [1]
>
>> >       uint32_t size;
>> >       /*
>> >        * Assuming 16 is good enough to fit all possible runstate strings..
>> >        * This field must be a string ending with '\0'.
>> >        */
>> >       uint8_t runstate[16];
>> >       /* 0x00 when QEMU doesn't support it, or "0"/"1" to reflect its state */
>> >       uint8_t vm_was_suspended[1];
>> >       /*
>> >        * Still free of use space.  Note that we only have 99 bytes for use
>> >        * because the last byte (the 100th byte) must be zero due to legacy
>> >        * reasons, if not it may be set to zero after loaded on dest QEMU. 
>> >        */
>> 
>> I'd add a 'uint8_t reserved;' to go along with this comment instead of
>> leaving a hole.
>
> Note that "struct GlobalState" is not a binary format but only some
> internal storage, what really matters is vmstate_globalstate.  Here the
> "uint8_reserved" will be a pure waste of 1 byte in QEMU binary, imho.
>

I prefer wasting the byte and make the code more obvious to people who
might not immediately understand what's going on. We could even
assert(!global_state.reserved) to sanity check the assumption. Anyway,
that's minor, I'm fine with it either way.

> I think I just copied what you had previously and extended it, logically I
> don't think we ever need QEMU_PACKED right above [1].  We can also drop
> "size" directly here, but this can be done later.

Ah right, I was initially thinking of letting the new qemu overrun
runstate[16] so we wouldn't have to change the code. But that's indeed
not necessary, your additions to the vmstate make it ok. Thanks.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-05 16:52                     ` Fabiano Rosas
@ 2023-12-05 17:04                       ` Steven Sistare
  0 siblings, 0 replies; 57+ messages in thread
From: Steven Sistare @ 2023-12-05 17:04 UTC (permalink / raw)
  To: Fabiano Rosas, Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 12/5/2023 11:52 AM, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
>> On Tue, Dec 05, 2023 at 09:44:12AM -0300, Fabiano Rosas wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>>
>>>> On Mon, Dec 04, 2023 at 06:09:16PM -0300, Fabiano Rosas wrote:
>>>>> Right, I got your point. I just think we could avoid designing this new
>>>>> string format by creating new fields with the extra space:
>>>>>
>>>>> typedef struct QEMU_PACKED {
>>>>>     uint32_t size;
>>>>>     uint8_t runstate[50];
>>>>>     uint8_t unused[50];
>>>>>     RunState state;
>>>>>     bool received;
>>>>> } GlobalState;
>>>>>
>>>>> In my mind this works seamlessly, or am I mistaken?
>>>>
>>>> I think what you proposed should indeed work.
>>>>
>>>> Currently it's:
>>>>
>>>>     .fields = (VMStateField[]) {
>>>>         VMSTATE_UINT32(size, GlobalState),
>>>>         VMSTATE_BUFFER(runstate, GlobalState),
>>>>         VMSTATE_END_OF_LIST()
>>>>     },
>>>>
>>>> I had a quick look at vmstate_info_buffer, it mostly only get()/put() those
>>>> buffers with its sizeof(), so looks all fine.  For sure in all cases we'd
>>>> better test it to verify.
>>>>
>>>> One side note is since we so far use qapi_enum_parse() for the runstate, I
>>>> think the "size" is not ever used..
>>>>
>>>> If we do want a split, IMHO we can consider making runstate[] even smaller
>>>> to just free up the rest spaces all in one shot:
>>>>
>>>>   typedef struct QEMU_PACKED {
>>
>> [1]
>>
>>>>       uint32_t size;
>>>>       /*
>>>>        * Assuming 16 is good enough to fit all possible runstate strings..
>>>>        * This field must be a string ending with '\0'.
>>>>        */
>>>>       uint8_t runstate[16];
>>>>       /* 0x00 when QEMU doesn't support it, or "0"/"1" to reflect its state */
>>>>       uint8_t vm_was_suspended[1];
>>>>       /*
>>>>        * Still free of use space.  Note that we only have 99 bytes for use
>>>>        * because the last byte (the 100th byte) must be zero due to legacy
>>>>        * reasons, if not it may be set to zero after loaded on dest QEMU. 
>>>>        */
>>>
>>> I'd add a 'uint8_t reserved;' to go along with this comment instead of
>>> leaving a hole.
>>
>> Note that "struct GlobalState" is not a binary format but only some
>> internal storage, what really matters is vmstate_globalstate.  Here the
>> "uint8_reserved" will be a pure waste of 1 byte in QEMU binary, imho.
>>
> 
> I prefer wasting the byte and make the code more obvious to people who
> might not immediately understand what's going on. We could even
> assert(!global_state.reserved) to sanity check the assumption. Anyway,
> that's minor, I'm fine with it either way.
> 
>> I think I just copied what you had previously and extended it, logically I
>> don't think we ever need QEMU_PACKED right above [1].  We can also drop
>> "size" directly here, but this can be done later.
> 
> Ah right, I was initially thinking of letting the new qemu overrun
> runstate[16] so we wouldn't have to change the code. But that's indeed
> not necessary, your additions to the vmstate make it ok. Thanks.

There is no need to reserve byte 100 in the new scheme.  The incoming side
sets s->runstate[sizeof(s->runstate) - 1] = 0 to protect itself, and now
sizeof(runstate) is smaller.

- Steve


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 05/14] migration: propagate suspended runstate
  2023-12-05 16:50           ` Peter Xu
@ 2023-12-05 17:48             ` Steven Sistare
  0 siblings, 0 replies; 57+ messages in thread
From: Steven Sistare @ 2023-12-05 17:48 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On 12/5/2023 11:50 AM, Peter Xu wrote:
> On Mon, Dec 04, 2023 at 05:23:57PM -0500, Steven Sistare wrote:
>> The V6 code does not pass a state to vm_start, and knowledge of vm_was_suspended
>> is confined to the global_state and cpus functions.  IMO this is a more modular
>> and robust solution, as multiple sites may call vm_start(), and the right thing
>> happens.  Look at patch 6.  The changes are minimal because vm_start "just works".
> 
> Oh I think I see what you meant.  Sounds good then.
> 
> Shall we hide that into vm_prepare_start()?  It seems three's still one
> more call sites that always pass in RUNNING (gdb_continue_partial).
> 
> If with above, vm_prepare_start() will go into either RUNNING, SUSPENDED,
> or an error.  It returns 0 only if RUNNING, -1 for all the rest.  Maybe we
> can already also touch up the retval of vm_prepare_start() to be a boolean,
> reflecting "whether vcpu needs to be started".

Yes, that is even nicer, thanks.

/**
 * Prepare for (re)starting the VM.
 * Returns 0 if the vCPUs should be restarted, -1 on an error condition,
 * and 1 otherwise.
 */
int vm_prepare_start(bool step_pending)
{
    int ret = vm_was_suspended ? 1 : 0;
    RunState state = vm_was_suspended ? RUN_STATE_SUSPENDED : RUN_STATE_RUNNING;
    ...
    vm_was_suspended = false;
    return ret;
}

void vm_start(void)
{
    if (!vm_prepare_start(false)) {
        resume_all_vcpus();
    }
}

- Steve


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 13/14] tests/qtest: bootfile per vm
  2023-12-04 22:37     ` Peter Xu
@ 2023-12-05 18:43       ` Steven Sistare
  0 siblings, 0 replies; 57+ messages in thread
From: Steven Sistare @ 2023-12-05 18:43 UTC (permalink / raw)
  To: Peter Xu, Fabiano Rosas, Andrey Gruzdev
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

On 12/4/2023 5:37 PM, Peter Xu wrote:
> On Mon, Dec 04, 2023 at 06:13:36PM -0300, Fabiano Rosas wrote:
>> Steve Sistare <steven.sistare@oracle.com> writes:
>>
>>> Create a separate bootfile for the outgoing and incoming vm, so the block
>>> layer can lock the file during the background migration test.  Otherwise,
>>> the test fails with:
>>>   "Failed to get "write" lock.  Is another process using the image
>>>    [/tmp/migration-test-WAKPD2/bootsect]?"
>>
>> Hm.. what is the background migration even trying to access on the boot
>> disk? @Peter?
> 
> I didn't yet notice this patch until you asked, but background snapshot is
> not designed to be used like this, afaict.
> 
> It should normally be used when someone would like to use "savevm", then
> background snapshot makes that snapshot save happen with VM running (live)
> and mostly as performant as "savevm" due to page write protections (IOW, it
> is not dirty tracking, but wr-protect each page so not writtable at all
> until unprotected).
> 
> Another difference (from "savevm") is, instead of storing that image onto
> the block images, it stores that image also separately just like migrating
> with "file:" as of now.
> 
> When the dest QEMU starts it'll try to grab the image lock already because
> it should never run with src running, just like when "loadvm" QEMU doesn't
> assume the QEMU that ran "savevm" will be running.
> 
>>
>> This might be a good use for the -snapshot option. It should stop any
>> attempt to get the write lock. Not a lot of difference, but slightly
>> simpler.
> 
> We don't yet have a background-snapshot test case.  If we ever need,
> that'll need to be done in two steps: start src, save snapshot into file,
> start dest, load from snapshot file.  We just shouldn't boot two together.
> 
> Now after two years when I re-read the snapshot code a bit, I didn't even
> find where QEMU took the disk snapshots.. logically it should be done at
> the start of live background snapshot when VM was dumping device states,
> something like bdrv_all_can_snapshot() orshould be needed to make sure all
> images support snapshot on its own or it should already fail, and take
> snapshots to match the image.
> 
> IOW, I don't even think current raw disk would be able to support
> background snapshot at all, otherwise if VM is live I don't see a way to
> match the image (which is still lively updated by the running VM) to a live
> snapshot taken.  Copy the author, Andrey, for this question.
> 
> Before that is confirmed, maybe the easiest way is we can go without a
> background snapshot test case for suspend vm scenario.

I am OK with dropping the test patches.  It's a lot of change for a trivial test.
  tests/qtest: background migration with suspend
  tests/qtest: bootfile per vm

- Steve


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 00/14] fix migration of suspended runstate
  2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
                   ` (13 preceding siblings ...)
  2023-11-30 21:37 ` [PATCH V6 14/14] tests/qtest: background migration with suspend Steve Sistare
@ 2023-12-05 18:52 ` Steven Sistare
  2023-12-05 19:19   ` Fabiano Rosas
  2023-12-05 21:37   ` Peter Xu
  14 siblings, 2 replies; 57+ messages in thread
From: Steven Sistare @ 2023-12-05 18:52 UTC (permalink / raw)
  To: qemu-devel, Peter Xu, Fabiano Rosas
  Cc: Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

Hi Peter and Fabiano,

Any comments on these patches, before I send V7?  They are not affected by
the other changes we have discussed, except for renaming runstate_is_started
to runstate_is_live.

  [PATCH V6 04/14] cpus: vm_resume
  [PATCH V6 06/14] migration: preserve suspended runstate
  [PATCH V6 07/14] migration: preserve suspended for snapshot
  [PATCH V6 08/14] migration: preserve suspended for bg_migration

- Steve

On 11/30/2023 4:37 PM, Steve Sistare wrote:
> Migration of a guest in the suspended runstate is broken.  The incoming
> migration code automatically tries to wake the guest, which is wrong;
> the guest should end migration in the same runstate it started.  Further,
> after saving a snapshot in the suspended state and loading it, the vm_start
> fails.  The runstate is RUNNING, but the guest is not.
> 
> See the commit messages for the details.
> 
> Changes in V2:
>   * simplify "start on wakeup request"
>   * fix postcopy, snapshot, and background migration
>   * refactor fixes for each type of migration
>   * explicitly handled suspended events and runstate in tests
>   * add test for postcopy and background migration
> 
> Changes in V3:
>   * rebase to tip
>   * fix hang in new function migrate_wait_for_dirty_mem
> 
> Changes in V4:
>   * rebase to tip
>   * add patch for vm_prepare_start (thanks Peter)
>   * add patch to preserve cpu ticks
> 
> Changes in V5:
>   * rebase to tip
>   * added patches to completely stop vm in suspended state:
>       cpus: refactor vm_stop
>       cpus: stop vm in suspended state
>   * added patch to partially resume vm in suspended state:
>       cpus: start vm in suspended state
>   * modified "preserve suspended ..." patches to use the above.
>   * deleted patch "preserve cpu ticks if suspended".  stop ticks in
>     vm_stop_force_state instead.
>   * deleted patch "add runstate function".  defined new helper function
>     migrate_new_runstate in "preserve suspended runstate"
>   * Added some RB's, but removed other RB's because the patches changed.
> 
> Changes in V6:
>   * all vm_stop calls completely stop the suspended state
>   * refactored and updated the "cpus" patches
>   * simplified the "preserve suspended" patches
>   * added patch "bootfile per vm"
> 
> Steve Sistare (14):
>   cpus: pass runstate to vm_prepare_start
>   cpus: vm_was_suspended
>   cpus: stop vm in suspended runstate
>   cpus: vm_resume
>   migration: propagate suspended runstate
>   migration: preserve suspended runstate
>   migration: preserve suspended for snapshot
>   migration: preserve suspended for bg_migration
>   tests/qtest: migration events
>   tests/qtest: option to suspend during migration
>   tests/qtest: precopy migration with suspend
>   tests/qtest: postcopy migration with suspend
>   tests/qtest: bootfile per vm
>   tests/qtest: background migration with suspend
> 
>  backends/tpm/tpm_emulator.c          |   2 +-
>  gdbstub/system.c                     |   2 +-
>  hw/usb/hcd-ehci.c                    |   2 +-
>  hw/usb/redirect.c                    |   2 +-
>  hw/xen/xen-hvm-common.c              |   2 +-
>  include/migration/snapshot.h         |   7 +
>  include/sysemu/runstate.h            |  19 ++-
>  migration/global_state.c             |  10 ++
>  migration/migration-hmp-cmds.c       |   8 +-
>  migration/migration.c                |  15 +--
>  migration/savevm.c                   |  23 ++--
>  qapi/misc.json                       |  10 +-
>  system/cpus.c                        |  49 +++++--
>  system/runstate.c                    |   9 ++
>  system/vl.c                          |   2 +
>  tests/migration/i386/Makefile        |   5 +-
>  tests/migration/i386/a-b-bootblock.S |  50 +++++++-
>  tests/migration/i386/a-b-bootblock.h |  26 ++--
>  tests/qtest/migration-helpers.c      |  27 ++--
>  tests/qtest/migration-helpers.h      |  11 +-
>  tests/qtest/migration-test.c         | 240 +++++++++++++++++++++++++----------
>  21 files changed, 382 insertions(+), 139 deletions(-)
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 00/14] fix migration of suspended runstate
  2023-12-05 18:52 ` [PATCH V6 00/14] fix migration of suspended runstate Steven Sistare
@ 2023-12-05 19:19   ` Fabiano Rosas
  2023-12-05 21:37   ` Peter Xu
  1 sibling, 0 replies; 57+ messages in thread
From: Fabiano Rosas @ 2023-12-05 19:19 UTC (permalink / raw)
  To: Steven Sistare, qemu-devel, Peter Xu
  Cc: Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Leonardo Bras

Steven Sistare <steven.sistare@oracle.com> writes:

> Hi Peter and Fabiano,
>
> Any comments on these patches, before I send V7?  They are not affected by
> the other changes we have discussed, except for renaming runstate_is_started
> to runstate_is_live.
>
>   [PATCH V6 04/14] cpus: vm_resume
>   [PATCH V6 06/14] migration: preserve suspended runstate
>   [PATCH V6 07/14] migration: preserve suspended for snapshot

Nothing catches my eye for now.

>   [PATCH V6 08/14] migration: preserve suspended for bg_migration

This has my r-b already.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 11/14] tests/qtest: precopy migration with suspend
  2023-12-05 16:14     ` Steven Sistare
@ 2023-12-05 21:07       ` Peter Xu
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-05 21:07 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Tue, Dec 05, 2023 at 11:14:43AM -0500, Steven Sistare wrote:
> Calling migrate_wait_for_dirty_mem proves that a source write is propagated
> to the dest, even for the suspended case.  We fully expect that, but a good 
> test verifies our expectations.  That is done in the first loop of 
> migrate_wait_for_dirty_mem.  After that, we must check for the suspended 
> state, because the second loop will not terminate.  Here is a more explicit
> version:
> 
> static void migrate_wait_for_dirty_mem(QTestState *from,
>                                        QTestState *to)
> {
>     uint64_t watch_address = start_address + MAGIC_OFFSET_BASE;
>     uint64_t marker_address = start_address + MAGIC_OFFSET;
>     uint8_t watch_byte;
> 
>     /*
>      * Wait for the MAGIC_MARKER to get transferred, as an
>      * indicator that a migration pass has made some known
>      * amount of progress.
>      */
>     do {
>         usleep(1000 * 10);
>     } while (qtest_readq(to, marker_address) != MAGIC_MARKER);
> 
> +    /* If suspended, src only iterates once, and watch_byte may never change */
> +    if (src_state.suspend_me) {
> +        return;
> +    }

Ok.

> Yes, it played that role.  I will delete all the existing calls to wait_for_suspended,
> and add them after wait_for_serial("src_serial") in test_precopy_common and
> migrate_postcopy_prepare.

Sounds good.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 06/14] migration: preserve suspended runstate
  2023-11-30 21:37 ` [PATCH V6 06/14] migration: preserve " Steve Sistare
@ 2023-12-05 21:34   ` Peter Xu
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-05 21:34 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:19PM -0800, Steve Sistare wrote:
> A guest that is migrated in the suspended state automaticaly wakes and
> continues execution.  This is wrong; the guest should end migration in
> the same state it started.  The root cause is that the outgoing migration
> code automatically wakes the guest, then saves the RUNNING runstate in
> global_state_store(), hence the incoming migration code thinks the guest is
> running and continues the guest if autostart is true.
> 
> On the outgoing side, delete the call to qemu_system_wakeup_request().
> Now that vm_stop completely stops a vm in the suspended state (from the
> preceding patches), the existing call to vm_stop_force_state is sufficient
> to correctly migrate all vmstate.
> 
> On the incoming side, call vm_start if the pre-migration state was running
> or suspended.  For the latter, vm_start correctly restores the suspended
> state, and a future system_wakeup monitor request will cause the vm to
> resume running.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 07/14] migration: preserve suspended for snapshot
  2023-11-30 21:37 ` [PATCH V6 07/14] migration: preserve suspended for snapshot Steve Sistare
@ 2023-12-05 21:35   ` Peter Xu
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-05 21:35 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:20PM -0800, Steve Sistare wrote:
> Restoring a snapshot can break a suspended guest.  Snapshots suffer from
> the same suspended-state issues that affect live migration, plus they must
> handle an additional problematic scenario, which is that a running vm must
> remain running if it loads a suspended snapshot.  Currently, after loading
> such a snapshot, the vm_start fails.  The runstate is RUNNING, but the guest
> is not.
> 
> To save, the vm_stop call now completely stops the suspended state, courtesy
> of a recent patch.  Finish with vm_resume to leave the vm in the state it had
> prior to the save, correctly restoring the suspended state.
> 
> To load, if the snapshot is not suspended, then vm_stop + vm_resume
> correctly handles all states, and leaves the vm in the state it had prior
> to the load.  However, if the snapshot is suspended, restoration is
> trickier.  First, call vm_resume to restore the state to suspended so the
> current state matches the saved state.  Then, if the pre-load state is
> running, call wakeup to resume running.
> 
> Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
> RUN_STATE_RESTORE_VM did not change runstate if the current state was
> paused, suspended, or prelaunch, but now it does, so allow these
> transitions.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

One nitpick below:

[...]

> +void load_snapshot_resume(RunState state)
> +{
> +    vm_resume(state);
> +    if (state == RUN_STATE_RUNNING && runstate_get() == RUN_STATE_SUSPENDED) {
> +        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);

Please avoid using NULL for Error**.  &error_abort seems apropriate.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 08/14] migration: preserve suspended for bg_migration
  2023-11-30 21:37 ` [PATCH V6 08/14] migration: preserve suspended for bg_migration Steve Sistare
@ 2023-12-05 21:35   ` Peter Xu
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-05 21:35 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:21PM -0800, Steve Sistare wrote:
> Do not wake a suspended guest during bg_migration, and restore the prior
> state at finish rather than unconditionally running.  Allow the additional
> state transitions that occur.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 04/14] cpus: vm_resume
  2023-11-30 21:37 ` [PATCH V6 04/14] cpus: vm_resume Steve Sistare
@ 2023-12-05 21:36   ` Peter Xu
  0 siblings, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-05 21:36 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras

On Thu, Nov 30, 2023 at 01:37:17PM -0800, Steve Sistare wrote:
> Define the vm_resume helper, for use in subsequent patches.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 00/14] fix migration of suspended runstate
  2023-12-05 18:52 ` [PATCH V6 00/14] fix migration of suspended runstate Steven Sistare
  2023-12-05 19:19   ` Fabiano Rosas
@ 2023-12-05 21:37   ` Peter Xu
  1 sibling, 0 replies; 57+ messages in thread
From: Peter Xu @ 2023-12-05 21:37 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Fabiano Rosas, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Leonardo Bras

On Tue, Dec 05, 2023 at 01:52:23PM -0500, Steven Sistare wrote:
> Hi Peter and Fabiano,
> 
> Any comments on these patches, before I send V7?  They are not affected by
> the other changes we have discussed, except for renaming runstate_is_started
> to runstate_is_live.
> 
>   [PATCH V6 04/14] cpus: vm_resume
>   [PATCH V6 06/14] migration: preserve suspended runstate
>   [PATCH V6 07/14] migration: preserve suspended for snapshot
>   [PATCH V6 08/14] migration: preserve suspended for bg_migration

One nitpick I put in the reply there, all look good to me in general.

Side note: I'll be away for the rest of the week.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-11-30 21:37 ` [PATCH V6 03/14] cpus: stop vm in suspended runstate Steve Sistare
  2023-11-30 22:10   ` Peter Xu
@ 2023-12-22 12:20   ` Markus Armbruster
  2023-12-22 15:53     ` Steven Sistare
  1 sibling, 1 reply; 57+ messages in thread
From: Markus Armbruster @ 2023-12-22 12:20 UTC (permalink / raw)
  To: Steve Sistare
  Cc: qemu-devel, Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

Steve Sistare <steven.sistare@oracle.com> writes:

> Currently, a vm in the suspended state is not completely stopped.  The VCPUs
> have been paused, but the cpu clock still runs, and runstate notifiers for
> the transition to stopped have not been called.  This causes problems for
> live migration.  Stale cpu timers_state is saved to the migration stream,
> causing time errors in the guest when it wakes from suspend, and state that
> would have been modified by runstate notifiers is wrong.
>
> Modify vm_stop to completely stop the vm if the current state is suspended,
> transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
> Modify vm_start to restore the suspended state.

Can you explain this to me in terms of the @current_run_state state
machine?  Like

    Before the patch, trigger X in state Y goes to state Z.
    Afterwards, it goes to ...

> This affects all callers of vm_stop and vm_start, notably, the qapi stop and
> cont commands.  For example:
>
>     (qemu) info status
>     VM status: paused (suspended)
>
>     (qemu) stop
>     (qemu) info status
>     VM status: paused
>
>     (qemu) cont
>     (qemu) info status
>     VM status: paused (suspended)
>
>     (qemu) system_wakeup
>     (qemu) info status
>     VM status: running
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  include/sysemu/runstate.h |  5 +++++
>  qapi/misc.json            | 10 ++++++++--
>  system/cpus.c             | 19 ++++++++++++++-----
>  system/runstate.c         |  3 +++
>  4 files changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
> index f6a337b..1d6828f 100644
> --- a/include/sysemu/runstate.h
> +++ b/include/sysemu/runstate.h
> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>  }
>  
> +static inline bool runstate_is_started(RunState state)
> +{
> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
> +}
> +
>  void vm_start(void);
>  
>  /**
> diff --git a/qapi/misc.json b/qapi/misc.json
> index cda2eff..efb8d44 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -134,7 +134,7 @@
>  ##
>  # @stop:
>  #
> -# Stop all guest VCPU execution.
> +# Stop all guest VCPU and VM execution.

Doesn't "stop all VM execution" imply "stop all guest vCPU execution"?

>  #
>  # Since: 0.14
>  #
> @@ -143,6 +143,9 @@
   # Notes: This function will succeed even if the guest is already in
   #     the stopped state.  In "inmigrate" state, it will ensure that
>  #     the guest remains paused once migration finishes, as if the -S
>  #     option was passed on the command line.
>  #
> +#     In the "suspended" state, it will completely stop the VM and
> +#     cause a transition to the "paused" state. (Since 9.0)
> +#

What user-observable (with query-status) state transitions are possible
here?

>  # Example:
>  #
>  # -> { "execute": "stop" }
> @@ -153,7 +156,7 @@
>  ##
>  # @cont:
>  #
> -# Resume guest VCPU execution.
> +# Resume guest VCPU and VM execution.
>  #
>  # Since: 0.14
>  #
> @@ -165,6 +168,9 @@
   # Returns: If successful, nothing
   #
   # Notes: This command will succeed if the guest is currently running.
   #     It will also succeed if the guest is in the "inmigrate" state;
   #     in this case, the effect of the command is to make sure the
>  #     guest starts once migration finishes, removing the effect of the
>  #     -S command line option if it was passed.
>  #
> +#     If the VM was previously suspended, and not been reset or woken,
> +#     this command will transition back to the "suspended" state. (Since 9.0)

Long line.

What user-observable state transitions are possible here?

> +#
>  # Example:
>  #
>  # -> { "execute": "cont" }

Should we update documentation of query-status, too?

   ##
   # @StatusInfo:
   #
   # Information about VCPU run state
   #
   # @running: true if all VCPUs are runnable, false if not runnable
   #
   # @singlestep: true if using TCG with one guest instruction per
   #     translation block
   #
   # @status: the virtual machine @RunState
   #
   # Features:
   #
   # @deprecated: Member 'singlestep' is deprecated (with no
   #     replacement).
   #
   # Since: 0.14
   #
   # Notes: @singlestep is enabled on the command line with '-accel
   #     tcg,one-insn-per-tb=on', or with the HMP 'one-insn-per-tb'
   #     command.
   ##
   { 'struct': 'StatusInfo',
     'data': {'running': 'bool',
              'singlestep': { 'type': 'bool', 'features': [ 'deprecated' ]},
              'status': 'RunState'} }

   ##
   # @query-status:
   #
   # Query the run status of all VCPUs
   #
   # Returns: @StatusInfo reflecting all VCPUs
   #
   # Since: 0.14
   #
   # Example:
   #
   # -> { "execute": "query-status" }
   # <- { "return": { "running": true,
   #                  "singlestep": false,
   #                  "status": "running" } }
   ##
   { 'command': 'query-status', 'returns': 'StatusInfo',
     'allow-preconfig': true }

> diff --git a/system/cpus.c b/system/cpus.c
> index ef7a0d3..cbc6d6d 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -277,11 +277,15 @@ bool vm_get_suspended(void)
>  static int do_vm_stop(RunState state, bool send_stop)
>  {
>      int ret = 0;
> +    RunState oldstate = runstate_get();
>  
> -    if (runstate_is_running()) {
> +    if (runstate_is_started(oldstate)) {
> +        vm_was_suspended = (oldstate == RUN_STATE_SUSPENDED);
>          runstate_set(state);
>          cpu_disable_ticks();
> -        pause_all_vcpus();
> +        if (oldstate == RUN_STATE_RUNNING) {
> +            pause_all_vcpus();
> +        }
>          vm_state_notify(0, state);
>          if (send_stop) {
>              qapi_event_send_stop();
> @@ -736,8 +740,13 @@ int vm_prepare_start(bool step_pending, RunState state)
>  
>  void vm_start(void)
>  {
> -    if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
> -        resume_all_vcpus();
> +    RunState state = vm_was_suspended ? RUN_STATE_SUSPENDED : RUN_STATE_RUNNING;
> +
> +    if (!vm_prepare_start(false, state)) {
> +        if (state == RUN_STATE_RUNNING) {
> +            resume_all_vcpus();
> +        }
> +        vm_was_suspended = false;
>      }
>  }
>  
> @@ -745,7 +754,7 @@ void vm_start(void)
>     current state is forgotten forever */
>  int vm_stop_force_state(RunState state)
>  {
> -    if (runstate_is_running()) {
> +    if (runstate_is_started(runstate_get())) {
>          return vm_stop(state);
>      } else {
>          int ret;
> diff --git a/system/runstate.c b/system/runstate.c
> index ea9d6c2..e2fa204 100644
> --- a/system/runstate.c
> +++ b/system/runstate.c
> @@ -108,6 +108,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
>      { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_PAUSED, RUN_STATE_COLO},
> +    { RUN_STATE_PAUSED, RUN_STATE_SUSPENDED},
>  
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
> @@ -161,6 +162,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>      { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
> +    { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED},
>  
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> @@ -502,6 +504,7 @@ void qemu_system_reset(ShutdownCause reason)
>          qapi_event_send_reset(shutdown_caused_by_guest(reason), reason);
>      }
>      cpu_synchronize_all_post_reset();
> +    vm_set_suspended(false);
>  }
>  
>  /*



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-12-22 12:20   ` Markus Armbruster
@ 2023-12-22 15:53     ` Steven Sistare
  2023-12-23  5:41       ` Markus Armbruster
  0 siblings, 1 reply; 57+ messages in thread
From: Steven Sistare @ 2023-12-22 15:53 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

On 12/22/2023 7:20 AM, Markus Armbruster wrote:
> Steve Sistare <steven.sistare@oracle.com> writes:
> 
>> Currently, a vm in the suspended state is not completely stopped.  The VCPUs
>> have been paused, but the cpu clock still runs, and runstate notifiers for
>> the transition to stopped have not been called.  This causes problems for
>> live migration.  Stale cpu timers_state is saved to the migration stream,
>> causing time errors in the guest when it wakes from suspend, and state that
>> would have been modified by runstate notifiers is wrong.
>>
>> Modify vm_stop to completely stop the vm if the current state is suspended,
>> transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
>> Modify vm_start to restore the suspended state.
> 
> Can you explain this to me in terms of the @current_run_state state
> machine?  Like
> 
>     Before the patch, trigger X in state Y goes to state Z.
>     Afterwards, it goes to ...

Old behavior:
  RUN_STATE_SUSPENDED --> stop --> RUN_STATE_SUSPENDED

New behavior:
  RUN_STATE_SUSPENDED --> stop --> RUN_STATE_PAUSED
  RUN_STATE_PAUSED    --> cont --> RUN_STATE_SUSPENDED

>> This affects all callers of vm_stop and vm_start, notably, the qapi stop and
>> cont commands.  For example:
>>
>>     (qemu) info status
>>     VM status: paused (suspended)
>>
>>     (qemu) stop
>>     (qemu) info status
>>     VM status: paused
>>
>>     (qemu) cont
>>     (qemu) info status
>>     VM status: paused (suspended)
>>
>>     (qemu) system_wakeup
>>     (qemu) info status
>>     VM status: running
>>
>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  include/sysemu/runstate.h |  5 +++++
>>  qapi/misc.json            | 10 ++++++++--
>>  system/cpus.c             | 19 ++++++++++++++-----
>>  system/runstate.c         |  3 +++
>>  4 files changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>> index f6a337b..1d6828f 100644
>> --- a/include/sysemu/runstate.h
>> +++ b/include/sysemu/runstate.h
>> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>>  }
>>  
>> +static inline bool runstate_is_started(RunState state)
>> +{
>> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
>> +}
>> +
>>  void vm_start(void);
>>  
>>  /**
>> diff --git a/qapi/misc.json b/qapi/misc.json
>> index cda2eff..efb8d44 100644
>> --- a/qapi/misc.json
>> +++ b/qapi/misc.json
>> @@ -134,7 +134,7 @@
>>  ##
>>  # @stop:
>>  #
>> -# Stop all guest VCPU execution.
>> +# Stop all guest VCPU and VM execution.
> 
> Doesn't "stop all VM execution" imply "stop all guest vCPU execution"?

Agreed, so we simply have:

# @stop:
# Stop guest VM execution.

# @cont:
# Resume guest VM execution.

>>  #
>>  # Since: 0.14
>>  #
>> @@ -143,6 +143,9 @@
>    # Notes: This function will succeed even if the guest is already in
>    #     the stopped state.  In "inmigrate" state, it will ensure that
>>  #     the guest remains paused once migration finishes, as if the -S
>>  #     option was passed on the command line.
>>  #
>> +#     In the "suspended" state, it will completely stop the VM and
>> +#     cause a transition to the "paused" state. (Since 9.0)
>> +#
> 
> What user-observable (with query-status) state transitions are possible
> here?

{"status": "suspended", "singlestep": false, "running": false}
--> stop -->
{"status": "paused", "singlestep": false, "running": false}

>>  # Example:
>>  #
>>  # -> { "execute": "stop" }
>> @@ -153,7 +156,7 @@
>>  ##
>>  # @cont:
>>  #
>> -# Resume guest VCPU execution.
>> +# Resume guest VCPU and VM execution.
>>  #
>>  # Since: 0.14
>>  #
>> @@ -165,6 +168,9 @@
>    # Returns: If successful, nothing
>    #
>    # Notes: This command will succeed if the guest is currently running.
>    #     It will also succeed if the guest is in the "inmigrate" state;
>    #     in this case, the effect of the command is to make sure the
>>  #     guest starts once migration finishes, removing the effect of the
>>  #     -S command line option if it was passed.
>>  #
>> +#     If the VM was previously suspended, and not been reset or woken,
>> +#     this command will transition back to the "suspended" state. (Since 9.0)
> 
> Long line.

It fits in 80 columns, but perhaps this looks nicer:

#     If the VM was previously suspended, and not been reset or woken,
#     this command will transition back to the "suspended" state.
#     (Since 9.0)

> What user-observable state transitions are possible here?

{"status": "paused", "singlestep": false, "running": false}
--> cont -->
{"status": "suspended", "singlestep": false, "running": false}

>> +#
>>  # Example:
>>  #
>>  # -> { "execute": "cont" }
> 
> Should we update documentation of query-status, too?

IMO no. The new behavior changes the status/RunState field only, and the
domain of values does not change, only the transitions caused by the commands
described here.

- Steve

>    ##
>    # @StatusInfo:
>    #
>    # Information about VCPU run state
>    #
>    # @running: true if all VCPUs are runnable, false if not runnable
>    #
>    # @singlestep: true if using TCG with one guest instruction per
>    #     translation block
>    #
>    # @status: the virtual machine @RunState
>    #
>    # Features:
>    #
>    # @deprecated: Member 'singlestep' is deprecated (with no
>    #     replacement).
>    #
>    # Since: 0.14
>    #
>    # Notes: @singlestep is enabled on the command line with '-accel
>    #     tcg,one-insn-per-tb=on', or with the HMP 'one-insn-per-tb'
>    #     command.
>    ##
>    { 'struct': 'StatusInfo',
>      'data': {'running': 'bool',
>               'singlestep': { 'type': 'bool', 'features': [ 'deprecated' ]},
>               'status': 'RunState'} }
> 
>    ##
>    # @query-status:
>    #
>    # Query the run status of all VCPUs
>    #
>    # Returns: @StatusInfo reflecting all VCPUs
>    #
>    # Since: 0.14
>    #
>    # Example:
>    #
>    # -> { "execute": "query-status" }
>    # <- { "return": { "running": true,
>    #                  "singlestep": false,
>    #                  "status": "running" } }
>    ##
>    { 'command': 'query-status', 'returns': 'StatusInfo',
>      'allow-preconfig': true }
> 
> [...]


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-12-22 15:53     ` Steven Sistare
@ 2023-12-23  5:41       ` Markus Armbruster
  2024-01-03 13:09         ` Peter Xu
  2024-01-03 14:47         ` Steven Sistare
  0 siblings, 2 replies; 57+ messages in thread
From: Markus Armbruster @ 2023-12-23  5:41 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

Steven Sistare <steven.sistare@oracle.com> writes:

> On 12/22/2023 7:20 AM, Markus Armbruster wrote:
>> Steve Sistare <steven.sistare@oracle.com> writes:
>> 
>>> Currently, a vm in the suspended state is not completely stopped.  The VCPUs
>>> have been paused, but the cpu clock still runs, and runstate notifiers for
>>> the transition to stopped have not been called.  This causes problems for
>>> live migration.  Stale cpu timers_state is saved to the migration stream,
>>> causing time errors in the guest when it wakes from suspend, and state that
>>> would have been modified by runstate notifiers is wrong.
>>>
>>> Modify vm_stop to completely stop the vm if the current state is suspended,
>>> transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
>>> Modify vm_start to restore the suspended state.
>> 
>> Can you explain this to me in terms of the @current_run_state state
>> machine?  Like
>> 
>>     Before the patch, trigger X in state Y goes to state Z.
>>     Afterwards, it goes to ...
>
> Old behavior:
>   RUN_STATE_SUSPENDED --> stop --> RUN_STATE_SUSPENDED
>
> New behavior:
>   RUN_STATE_SUSPENDED --> stop --> RUN_STATE_PAUSED
>   RUN_STATE_PAUSED    --> cont --> RUN_STATE_SUSPENDED

This clarifies things quite a bit for me.  Maybe work it into the commit
message?

>>> This affects all callers of vm_stop and vm_start, notably, the qapi stop and
>>> cont commands.  For example:
>>>
>>>     (qemu) info status
>>>     VM status: paused (suspended)
>>>
>>>     (qemu) stop
>>>     (qemu) info status
>>>     VM status: paused
>>>
>>>     (qemu) cont
>>>     (qemu) info status
>>>     VM status: paused (suspended)
>>>
>>>     (qemu) system_wakeup
>>>     (qemu) info status
>>>     VM status: running
>>>
>>> Suggested-by: Peter Xu <peterx@redhat.com>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>>  include/sysemu/runstate.h |  5 +++++
>>>  qapi/misc.json            | 10 ++++++++--
>>>  system/cpus.c             | 19 ++++++++++++++-----
>>>  system/runstate.c         |  3 +++
>>>  4 files changed, 30 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>>> index f6a337b..1d6828f 100644
>>> --- a/include/sysemu/runstate.h
>>> +++ b/include/sysemu/runstate.h
>>> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>>>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>>>  }
>>>  
>>> +static inline bool runstate_is_started(RunState state)
>>> +{
>>> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
>>> +}
>>> +
>>>  void vm_start(void);
>>>  
>>>  /**
>>> diff --git a/qapi/misc.json b/qapi/misc.json
>>> index cda2eff..efb8d44 100644
>>> --- a/qapi/misc.json
>>> +++ b/qapi/misc.json
>>> @@ -134,7 +134,7 @@
>>>  ##
>>>  # @stop:
>>>  #
>>> -# Stop all guest VCPU execution.
>>> +# Stop all guest VCPU and VM execution.
>> 
>> Doesn't "stop all VM execution" imply "stop all guest vCPU execution"?
>
> Agreed, so we simply have:
>
> # @stop:
> # Stop guest VM execution.
>
> # @cont:
> # Resume guest VM execution.

Yes, please.

>>>  #
>>>  # Since: 0.14
>>>  #
>>> @@ -143,6 +143,9 @@
>>    # Notes: This function will succeed even if the guest is already in
>>    #     the stopped state.  In "inmigrate" state, it will ensure that
>>>  #     the guest remains paused once migration finishes, as if the -S
>>>  #     option was passed on the command line.
>>>  #
>>> +#     In the "suspended" state, it will completely stop the VM and
>>> +#     cause a transition to the "paused" state. (Since 9.0)
>>> +#
>> 
>> What user-observable (with query-status) state transitions are possible
>> here?
>
> {"status": "suspended", "singlestep": false, "running": false}
> --> stop -->
> {"status": "paused", "singlestep": false, "running": false}
>
>>>  # Example:
>>>  #
>>>  # -> { "execute": "stop" }
>>> @@ -153,7 +156,7 @@
>>>  ##
>>>  # @cont:
>>>  #
>>> -# Resume guest VCPU execution.
>>> +# Resume guest VCPU and VM execution.
>>>  #
>>>  # Since: 0.14
>>>  #
>>> @@ -165,6 +168,9 @@
>>    # Returns: If successful, nothing
>>    #
>>    # Notes: This command will succeed if the guest is currently running.
>>    #     It will also succeed if the guest is in the "inmigrate" state;
>>    #     in this case, the effect of the command is to make sure the
>>>  #     guest starts once migration finishes, removing the effect of the
>>>  #     -S command line option if it was passed.
>>>  #
>>> +#     If the VM was previously suspended, and not been reset or woken,
>>> +#     this command will transition back to the "suspended" state. (Since 9.0)
>> 
>> Long line.
>
> It fits in 80 columns, but perhaps this looks nicer:
>
> #     If the VM was previously suspended, and not been reset or woken,
> #     this command will transition back to the "suspended" state.
> #     (Since 9.0)

It does :)

docs/devel/qapi-code-gen.rst section "Documentation markup":

    For legibility, wrap text paragraphs so every line is at most 70
    characters long.

>> What user-observable state transitions are possible here?
>
> {"status": "paused", "singlestep": false, "running": false}
> --> cont -->
> {"status": "suspended", "singlestep": false, "running": false}
>
>>> +#
>>>  # Example:
>>>  #
>>>  # -> { "execute": "cont" }
>> 
>> Should we update documentation of query-status, too?
>
> IMO no. The new behavior changes the status/RunState field only, and the
> domain of values does not change, only the transitions caused by the commands
> described here.

I see.

But if we change the stop's and cont's description from "guest VCPU
execution" to "guest VM execution", maybe we want to change
query-status's from "Information about VCPU run state" to "Information
about VM run state.

> - Steve
>
>>    ##
>>    # @StatusInfo:
>>    #
>>    # Information about VCPU run state
>>    #
>>    # @running: true if all VCPUs are runnable, false if not runnable
>>    #
>>    # @singlestep: true if using TCG with one guest instruction per
>>    #     translation block
>>    #
>>    # @status: the virtual machine @RunState
>>    #
>>    # Features:
>>    #
>>    # @deprecated: Member 'singlestep' is deprecated (with no
>>    #     replacement).
>>    #
>>    # Since: 0.14
>>    #
>>    # Notes: @singlestep is enabled on the command line with '-accel
>>    #     tcg,one-insn-per-tb=on', or with the HMP 'one-insn-per-tb'
>>    #     command.
>>    ##
>>    { 'struct': 'StatusInfo',
>>      'data': {'running': 'bool',
>>               'singlestep': { 'type': 'bool', 'features': [ 'deprecated' ]},
>>               'status': 'RunState'} }
>> 
>>    ##
>>    # @query-status:
>>    #
>>    # Query the run status of all VCPUs
>>    #
>>    # Returns: @StatusInfo reflecting all VCPUs
>>    #
>>    # Since: 0.14
>>    #
>>    # Example:
>>    #
>>    # -> { "execute": "query-status" }
>>    # <- { "return": { "running": true,
>>    #                  "singlestep": false,
>>    #                  "status": "running" } }
>>    ##
>>    { 'command': 'query-status', 'returns': 'StatusInfo',
>>      'allow-preconfig': true }
>> 
>> [...]



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-12-23  5:41       ` Markus Armbruster
@ 2024-01-03 13:09         ` Peter Xu
  2024-01-03 13:32           ` Steven Sistare
  2024-01-03 14:47         ` Steven Sistare
  1 sibling, 1 reply; 57+ messages in thread
From: Peter Xu @ 2024-01-03 13:09 UTC (permalink / raw)
  To: Markus Armbruster, Steven Sistare
  Cc: Steven Sistare, qemu-devel, Juan Quintela, Paolo Bonzini,
	Thomas Huth, Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

Steven,

The discussions seem to still apply to the latest.  Do you plan to post a
new version, with everything covered?

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2024-01-03 13:09         ` Peter Xu
@ 2024-01-03 13:32           ` Steven Sistare
  0 siblings, 0 replies; 57+ messages in thread
From: Steven Sistare @ 2024-01-03 13:32 UTC (permalink / raw)
  To: Peter Xu, Markus Armbruster
  Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

On 1/3/2024 8:09 AM, Peter Xu wrote:
> Steven,
> 
> The discussions seem to still apply to the latest.  Do you plan to post a
> new version, with everything covered?

Yes, today, thanks - steve



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2023-12-23  5:41       ` Markus Armbruster
  2024-01-03 13:09         ` Peter Xu
@ 2024-01-03 14:47         ` Steven Sistare
  2024-01-08  7:43           ` Markus Armbruster
  1 sibling, 1 reply; 57+ messages in thread
From: Steven Sistare @ 2024-01-03 14:47 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

On 12/23/2023 12:41 AM, Markus Armbruster wrote:
> Steven Sistare <steven.sistare@oracle.com> writes:
> 
>> On 12/22/2023 7:20 AM, Markus Armbruster wrote:
>>> Steve Sistare <steven.sistare@oracle.com> writes:
>>>
>>>> Currently, a vm in the suspended state is not completely stopped.  The VCPUs
>>>> have been paused, but the cpu clock still runs, and runstate notifiers for
>>>> the transition to stopped have not been called.  This causes problems for
>>>> live migration.  Stale cpu timers_state is saved to the migration stream,
>>>> causing time errors in the guest when it wakes from suspend, and state that
>>>> would have been modified by runstate notifiers is wrong.
>>>>
>>>> Modify vm_stop to completely stop the vm if the current state is suspended,
>>>> transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
>>>> Modify vm_start to restore the suspended state.
>>>
>>> Can you explain this to me in terms of the @current_run_state state
>>> machine?  Like
>>>
>>>     Before the patch, trigger X in state Y goes to state Z.
>>>     Afterwards, it goes to ...
>>
>> Old behavior:
>>   RUN_STATE_SUSPENDED --> stop --> RUN_STATE_SUSPENDED
>>
>> New behavior:
>>   RUN_STATE_SUSPENDED --> stop --> RUN_STATE_PAUSED
>>   RUN_STATE_PAUSED    --> cont --> RUN_STATE_SUSPENDED
> 
> This clarifies things quite a bit for me.  Maybe work it into the commit
> message?

Will do.

>>>> This affects all callers of vm_stop and vm_start, notably, the qapi stop and
>>>> cont commands.  For example:
>>>>
>>>>     (qemu) info status
>>>>     VM status: paused (suspended)
>>>>
>>>>     (qemu) stop
>>>>     (qemu) info status
>>>>     VM status: paused
>>>>
>>>>     (qemu) cont
>>>>     (qemu) info status
>>>>     VM status: paused (suspended)
>>>>
>>>>     (qemu) system_wakeup
>>>>     (qemu) info status
>>>>     VM status: running
>>>>
>>>> Suggested-by: Peter Xu <peterx@redhat.com>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>>  include/sysemu/runstate.h |  5 +++++
>>>>  qapi/misc.json            | 10 ++++++++--
>>>>  system/cpus.c             | 19 ++++++++++++++-----
>>>>  system/runstate.c         |  3 +++
>>>>  4 files changed, 30 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>>>> index f6a337b..1d6828f 100644
>>>> --- a/include/sysemu/runstate.h
>>>> +++ b/include/sysemu/runstate.h
>>>> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>>>>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>>>>  }
>>>>  
>>>> +static inline bool runstate_is_started(RunState state)
>>>> +{
>>>> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
>>>> +}
>>>> +
>>>>  void vm_start(void);
>>>>  
>>>>  /**
>>>> diff --git a/qapi/misc.json b/qapi/misc.json
>>>> index cda2eff..efb8d44 100644
>>>> --- a/qapi/misc.json
>>>> +++ b/qapi/misc.json
>>>> @@ -134,7 +134,7 @@
>>>>  ##
>>>>  # @stop:
>>>>  #
>>>> -# Stop all guest VCPU execution.
>>>> +# Stop all guest VCPU and VM execution.
>>>
>>> Doesn't "stop all VM execution" imply "stop all guest vCPU execution"?
>>
>> Agreed, so we simply have:
>>
>> # @stop:
>> # Stop guest VM execution.
>>
>> # @cont:
>> # Resume guest VM execution.
> 
> Yes, please.

Will do.

>>>>  #
>>>>  # Since: 0.14
>>>>  #
>>>> @@ -143,6 +143,9 @@
>>>    # Notes: This function will succeed even if the guest is already in
>>>    #     the stopped state.  In "inmigrate" state, it will ensure that
>>>>  #     the guest remains paused once migration finishes, as if the -S
>>>>  #     option was passed on the command line.
>>>>  #
>>>> +#     In the "suspended" state, it will completely stop the VM and
>>>> +#     cause a transition to the "paused" state. (Since 9.0)
>>>> +#
>>>
>>> What user-observable (with query-status) state transitions are possible
>>> here?
>>
>> {"status": "suspended", "singlestep": false, "running": false}
>> --> stop -->
>> {"status": "paused", "singlestep": false, "running": false}
>>
>>>>  # Example:
>>>>  #
>>>>  # -> { "execute": "stop" }
>>>> @@ -153,7 +156,7 @@
>>>>  ##
>>>>  # @cont:
>>>>  #
>>>> -# Resume guest VCPU execution.
>>>> +# Resume guest VCPU and VM execution.
>>>>  #
>>>>  # Since: 0.14
>>>>  #
>>>> @@ -165,6 +168,9 @@
>>>    # Returns: If successful, nothing
>>>    #
>>>    # Notes: This command will succeed if the guest is currently running.
>>>    #     It will also succeed if the guest is in the "inmigrate" state;
>>>    #     in this case, the effect of the command is to make sure the
>>>>  #     guest starts once migration finishes, removing the effect of the
>>>>  #     -S command line option if it was passed.
>>>>  #
>>>> +#     If the VM was previously suspended, and not been reset or woken,
>>>> +#     this command will transition back to the "suspended" state. (Since 9.0)
>>>
>>> Long line.
>>
>> It fits in 80 columns, but perhaps this looks nicer:
>>
>> #     If the VM was previously suspended, and not been reset or woken,
>> #     this command will transition back to the "suspended" state.
>> #     (Since 9.0)
> 
> It does :)
> 
> docs/devel/qapi-code-gen.rst section "Documentation markup":
> 
>     For legibility, wrap text paragraphs so every line is at most 70
>     characters long.

Will do, thanks for the reference.

>>> What user-observable state transitions are possible here?
>>
>> {"status": "paused", "singlestep": false, "running": false}
>> --> cont -->
>> {"status": "suspended", "singlestep": false, "running": false}
>>
>>>> +#
>>>>  # Example:
>>>>  #
>>>>  # -> { "execute": "cont" }
>>>
>>> Should we update documentation of query-status, too?
>>
>> IMO no. The new behavior changes the status/RunState field only, and the
>> domain of values does not change, only the transitions caused by the commands
>> described here.
> 
> I see.
> 
> But if we change the stop's and cont's description from "guest VCPU
> execution" to "guest VM execution", maybe we want to change
> query-status's from "Information about VCPU run state" to "Information
> about VM run state.

Makes sense:

 # @StatusInfo:
 #
-# Information about VCPU run state
+# Information about VM run state

 # @query-status:
 #
-# Query the run status of all VCPUs
+# Query the run status of the VM
 #
-# Returns: @StatusInfo reflecting all VCPUs
+# Returns: @StatusInfo reflecting the VM

With these changes, can I add your Acked-by to the commit?

- Steve

>>>    ##
>>>    # @StatusInfo:
>>>    #
>>>    # Information about VCPU run state
>>>    #
>>>    # @running: true if all VCPUs are runnable, false if not runnable
>>>    #
>>>    # @singlestep: true if using TCG with one guest instruction per
>>>    #     translation block
>>>    #
>>>    # @status: the virtual machine @RunState
>>>    #
>>>    # Features:
>>>    #
>>>    # @deprecated: Member 'singlestep' is deprecated (with no
>>>    #     replacement).
>>>    #
>>>    # Since: 0.14
>>>    #
>>>    # Notes: @singlestep is enabled on the command line with '-accel
>>>    #     tcg,one-insn-per-tb=on', or with the HMP 'one-insn-per-tb'
>>>    #     command.
>>>    ##
>>>    { 'struct': 'StatusInfo',
>>>      'data': {'running': 'bool',
>>>               'singlestep': { 'type': 'bool', 'features': [ 'deprecated' ]},
>>>               'status': 'RunState'} }
>>>
>>>    ##
>>>    # @query-status:
>>>    #
>>>    # Query the run status of all VCPUs
>>>    #
>>>    # Returns: @StatusInfo reflecting all VCPUs
>>>    #
>>>    # Since: 0.14
>>>    #
>>>    # Example:
>>>    #
>>>    # -> { "execute": "query-status" }
>>>    # <- { "return": { "running": true,
>>>    #                  "singlestep": false,
>>>    #                  "status": "running" } }
>>>    ##
>>>    { 'command': 'query-status', 'returns': 'StatusInfo',
>>>      'allow-preconfig': true }
>>>
>>> [...]
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
  2024-01-03 14:47         ` Steven Sistare
@ 2024-01-08  7:43           ` Markus Armbruster
  0 siblings, 0 replies; 57+ messages in thread
From: Markus Armbruster @ 2024-01-08  7:43 UTC (permalink / raw)
  To: Steven Sistare
  Cc: qemu-devel, Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
	Daniel P. Berrangé,
	Fabiano Rosas, Leonardo Bras, Eric Blake

Steven Sistare <steven.sistare@oracle.com> writes:

[...]

> With these changes, can I add your Acked-by to the commit?

I'm afraid I lost context over the break.  Suggest you post v7, and I
provide my Acked-by there.  Likely easier for me.

Happy new year!

[...]



^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2024-01-08  7:44 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-30 21:37 [PATCH V6 00/14] fix migration of suspended runstate Steve Sistare
2023-11-30 21:37 ` [PATCH V6 01/14] cpus: pass runstate to vm_prepare_start Steve Sistare
2023-11-30 21:37 ` [PATCH V6 02/14] cpus: vm_was_suspended Steve Sistare
2023-11-30 22:03   ` Peter Xu
2023-11-30 21:37 ` [PATCH V6 03/14] cpus: stop vm in suspended runstate Steve Sistare
2023-11-30 22:10   ` Peter Xu
2023-12-01 17:11     ` Steven Sistare
2023-12-04 16:35       ` Peter Xu
2023-12-04 16:41         ` Steven Sistare
2023-12-22 12:20   ` Markus Armbruster
2023-12-22 15:53     ` Steven Sistare
2023-12-23  5:41       ` Markus Armbruster
2024-01-03 13:09         ` Peter Xu
2024-01-03 13:32           ` Steven Sistare
2024-01-03 14:47         ` Steven Sistare
2024-01-08  7:43           ` Markus Armbruster
2023-11-30 21:37 ` [PATCH V6 04/14] cpus: vm_resume Steve Sistare
2023-12-05 21:36   ` Peter Xu
2023-11-30 21:37 ` [PATCH V6 05/14] migration: propagate suspended runstate Steve Sistare
2023-11-30 23:06   ` Peter Xu
2023-12-01 16:23     ` Steven Sistare
2023-12-04 17:24       ` Peter Xu
2023-12-04 19:31         ` Fabiano Rosas
2023-12-04 20:02           ` Peter Xu
2023-12-04 21:09             ` Fabiano Rosas
2023-12-04 22:04               ` Peter Xu
2023-12-05 12:44                 ` Fabiano Rosas
2023-12-05 14:14                   ` Steven Sistare
2023-12-05 16:18                   ` Peter Xu
2023-12-05 16:52                     ` Fabiano Rosas
2023-12-05 17:04                       ` Steven Sistare
2023-12-04 22:23         ` Steven Sistare
2023-12-05 16:50           ` Peter Xu
2023-12-05 17:48             ` Steven Sistare
2023-11-30 21:37 ` [PATCH V6 06/14] migration: preserve " Steve Sistare
2023-12-05 21:34   ` Peter Xu
2023-11-30 21:37 ` [PATCH V6 07/14] migration: preserve suspended for snapshot Steve Sistare
2023-12-05 21:35   ` Peter Xu
2023-11-30 21:37 ` [PATCH V6 08/14] migration: preserve suspended for bg_migration Steve Sistare
2023-12-05 21:35   ` Peter Xu
2023-11-30 21:37 ` [PATCH V6 09/14] tests/qtest: migration events Steve Sistare
2023-11-30 21:37 ` [PATCH V6 10/14] tests/qtest: option to suspend during migration Steve Sistare
2023-12-04 21:14   ` Fabiano Rosas
2023-11-30 21:37 ` [PATCH V6 11/14] tests/qtest: precopy migration with suspend Steve Sistare
2023-12-04 20:49   ` Peter Xu
2023-12-05 16:14     ` Steven Sistare
2023-12-05 21:07       ` Peter Xu
2023-11-30 21:37 ` [PATCH V6 12/14] tests/qtest: postcopy " Steve Sistare
2023-11-30 21:37 ` [PATCH V6 13/14] tests/qtest: bootfile per vm Steve Sistare
2023-12-04 21:13   ` Fabiano Rosas
2023-12-04 22:37     ` Peter Xu
2023-12-05 18:43       ` Steven Sistare
2023-11-30 21:37 ` [PATCH V6 14/14] tests/qtest: background migration with suspend Steve Sistare
2023-12-04 21:14   ` Fabiano Rosas
2023-12-05 18:52 ` [PATCH V6 00/14] fix migration of suspended runstate Steven Sistare
2023-12-05 19:19   ` Fabiano Rosas
2023-12-05 21:37   ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.