* [PATCH 0/2] Auto exit source QEMU process after a successful migration
@ 2021-07-05 12:36 Kunkun Jiang
  2021-07-05 12:36 ` [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed' Kunkun Jiang
  2021-07-05 12:36 ` [PATCH 2/2] qapi/migration: Add a new migration capability 'auto-quit' Kunkun Jiang
  0 siblings, 2 replies; 8+ messages in thread
From: Kunkun Jiang @ 2021-07-05 12:36 UTC (permalink / raw)
  To: Juan Quintela, Dr . David Alan Gilbert, Eric Blake,
	Markus Armbruster, Paolo Bonzini, open list:All patches CC here
  Cc: wanghaibin.wang, jiangkunkun

Hi all,

This series includes the following patches:
Patch 1:
- add a new shutdown cause 'migration-completed', which is used to
  automatically exit the source QEMU process after a successful migration

Patch 2:
- add a new migration capability 'auto-quit' to control whether the
  source QEMU process exits automatically after a successful migration

Kunkun Jiang (2):
  qapi/run-state: Add a new shutdown cause 'migration-completed'
  qapi/migration: Add a new migration capability 'auto-quit'

 migration/migration.c | 13 +++++++++++++
 migration/migration.h |  1 +
 qapi/migration.json   |  6 +++++-
 qapi/run-state.json   |  4 +++-
 4 files changed, 22 insertions(+), 2 deletions(-)

-- 
2.23.0




* [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
  2021-07-05 12:36 [PATCH 0/2] Auto exit source QEMU process after a successful migration Kunkun Jiang
@ 2021-07-05 12:36 ` Kunkun Jiang
  2021-07-05 12:48   ` Daniel P. Berrangé
  2021-07-05 12:36 ` [PATCH 2/2] qapi/migration: Add a new migration capability 'auto-quit' Kunkun Jiang
  1 sibling, 1 reply; 8+ messages in thread
From: Kunkun Jiang @ 2021-07-05 12:36 UTC (permalink / raw)
  To: Juan Quintela, Dr . David Alan Gilbert, Eric Blake,
	Markus Armbruster, Paolo Bonzini, open list:All patches CC here
  Cc: wanghaibin.wang, jiangkunkun

In the current version, the source QEMU process does not automatically
exit after a successful migration. Additional action is required,
such as sending { "execute": "quit" } or ctrl+c. For simplicity, add
a new shutdown cause 'migration-completed' to exit the source QEMU
process after a successful migration.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 migration/migration.c | 1 +
 qapi/run-state.json   | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4228635d18..16782c93c2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_COMPLETED:
         migration_calculate_complete(s);
         runstate_set(RUN_STATE_POSTMIGRATE);
+        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
         break;
 
     case MIGRATION_STATUS_ACTIVE:
diff --git a/qapi/run-state.json b/qapi/run-state.json
index 43d66d700f..66aaef4e2b 100644
--- a/qapi/run-state.json
+++ b/qapi/run-state.json
@@ -86,12 +86,14 @@
 #                   ignores --no-reboot. This is useful for sanitizing
 #                   hypercalls on s390 that are used during kexec/kdump/boot
 #
+# @migration-completed: Reaction to the successful migration
+#
 ##
 { 'enum': 'ShutdownCause',
   # Beware, shutdown_caused_by_guest() depends on enumeration order
   'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
             'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
-            'guest-panic', 'subsystem-reset'] }
+            'guest-panic', 'subsystem-reset', 'migration-completed'] }
 
 ##
 # @StatusInfo:
-- 
2.23.0
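
The "Beware" comment in the ShutdownCause hunk above says that
shutdown_caused_by_guest() depends on enumeration order. The sketch below
assumes that helper is a plain ordinal comparison against 'guest-shutdown';
the thread does not show its body, so treat that as an assumption. The enum
and function names here are illustrative stand-ins, not QEMU's generated
identifiers. Under that assumption, a value appended at the end of the list
falls into the "guest" range even though the migration code on the host
requests the shutdown.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the order of the 'ShutdownCause' values after patch 1.
 * Names are illustrative, not the identifiers QAPI generates. */
enum shutdown_cause {
    CAUSE_NONE,
    CAUSE_HOST_ERROR,
    CAUSE_HOST_QMP_QUIT,
    CAUSE_HOST_QMP_SYSTEM_RESET,
    CAUSE_HOST_SIGNAL,
    CAUSE_HOST_UI,
    CAUSE_GUEST_SHUTDOWN,
    CAUSE_GUEST_RESET,
    CAUSE_GUEST_PANIC,
    CAUSE_SUBSYSTEM_RESET,
    CAUSE_MIGRATION_COMPLETED,   /* appended by patch 1 */
    CAUSE__MAX
};

/* Assumed behaviour: everything at or after 'guest-shutdown' counts as
 * guest-initiated. This is a guess based on the "Beware" comment. */
bool caused_by_guest(enum shutdown_cause cause)
{
    assert(cause < CAUSE__MAX);
    return cause >= CAUSE_GUEST_SHUTDOWN;
}
```

If the real helper works this way, caused_by_guest(CAUSE_MIGRATION_COMPLETED)
is true, so the position of the new value in the list would matter.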




* [PATCH 2/2] qapi/migration: Add a new migration capability 'auto-quit'
  2021-07-05 12:36 [PATCH 0/2] Auto exit source QEMU process after a successful migration Kunkun Jiang
  2021-07-05 12:36 ` [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed' Kunkun Jiang
@ 2021-07-05 12:36 ` Kunkun Jiang
  1 sibling, 0 replies; 8+ messages in thread
From: Kunkun Jiang @ 2021-07-05 12:36 UTC (permalink / raw)
  To: Juan Quintela, Dr . David Alan Gilbert, Eric Blake,
	Markus Armbruster, Paolo Bonzini, open list:All patches CC here
  Cc: wanghaibin.wang, jiangkunkun

For compatibility, a new migration capability 'auto-quit' is added
to control the exit of source QEMU after a successful migration.

Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
---
 migration/migration.c | 14 +++++++++++++-
 migration/migration.h |  1 +
 qapi/migration.json   |  6 +++++-
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 16782c93c2..82ad6d35b2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2567,6 +2567,15 @@ bool migrate_background_snapshot(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT];
 }
 
+bool migrate_auto_quit(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_AUTO_QUIT];
+}
+
 /* migration thread support */
 /*
  * Something bad happened to the RP stream, mark an error
@@ -3539,7 +3548,10 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_COMPLETED:
         migration_calculate_complete(s);
         runstate_set(RUN_STATE_POSTMIGRATE);
-        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
+
+        if (migrate_auto_quit()) {
+            qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
+        }
         break;
 
     case MIGRATION_STATUS_ACTIVE:
diff --git a/migration/migration.h b/migration/migration.h
index 2ebb740dfa..a72b178d35 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -349,6 +349,7 @@ int migrate_decompress_threads(void);
 bool migrate_use_events(void);
 bool migrate_postcopy_blocktime(void);
 bool migrate_background_snapshot(void);
+bool migrate_auto_quit(void);
 
 /* Sending on the return path - generic and then for each message type */
 void migrate_send_rp_shut(MigrationIncomingState *mis,
diff --git a/qapi/migration.json b/qapi/migration.json
index 1124a2dda8..26c1a52c56 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -452,6 +452,9 @@
 #                       procedure starts. The VM RAM is saved with running VM.
 #                       (since 6.0)
 #
+# @auto-quit: If enabled, QEMU process will exit after a successful migration.
+#             (since 6.1)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
@@ -459,7 +462,8 @@
            'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
            'block', 'return-path', 'pause-before-switchover', 'multifd',
            'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
-           'x-ignore-shared', 'validate-uuid', 'background-snapshot'] }
+           'x-ignore-shared', 'validate-uuid', 'background-snapshot',
+           'auto-quit'] }
 
 ##
 # @MigrationCapabilityStatus:
-- 
2.23.0




* Re: [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
  2021-07-05 12:36 ` [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed' Kunkun Jiang
@ 2021-07-05 12:48   ` Daniel P. Berrangé
  2021-07-05 13:22     ` Kunkun Jiang
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel P. Berrangé @ 2021-07-05 12:48 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Juan Quintela, open list:All patches CC here, Markus Armbruster,
	wanghaibin.wang, Paolo Bonzini, Eric Blake,
	Dr . David Alan Gilbert

On Mon, Jul 05, 2021 at 08:36:52PM +0800, Kunkun Jiang wrote:
> In the current version, the source QEMU process does not automatically
> exit after a successful migration. Additional action is required,
> such as sending { "execute": "quit" } or ctrl+c. For simplicity, add
> a new shutdown cause 'migration-completed' to exit the source QEMU
> process after a successful migration.

IIUC, 'STATUS_COMPLETED' state is entered on the source host
once it has finished sending all VM state, and thus does not
guarantee that the target host has successfully received and
loaded all VM state.

Typically a mgmt app will need to directly confirm that the
target host QEMU has successfully started running, before it
will tell the source QEMU to quit.

So, AFAICT, this automatic exit after STATUS_COMPLETED is
not safe and could lead to total loss of the running VM in
error scenarios.
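
As a minimal sketch of the ordering described above: the source may only be
told to quit once the destination has been confirmed running, and
STATUS_COMPLETED on the source alone is not enough. All type and function
names below are hypothetical and exist only for illustration; they are not
QEMU or libvirt APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical source-side migration status as seen by a mgmt app. */
enum src_migration_status {
    SRC_MIG_ACTIVE,
    SRC_MIG_COMPLETED,   /* source has finished sending all VM state */
    SRC_MIG_FAILED
};

/* Hypothetical record a mgmt app might keep for one migration. */
struct migration_view {
    enum src_migration_status src_status;
    bool dst_confirmed_running;   /* destination guest observed running */
};

/* The source may be told to quit only when it reports COMPLETED *and*
 * the destination has been directly confirmed as running. */
bool source_may_quit(const struct migration_view *v)
{
    assert(v != NULL);
    return v->src_status == SRC_MIG_COMPLETED && v->dst_confirmed_running;
}
```

With this check, a source in SRC_MIG_COMPLETED whose destination has not yet
been confirmed is kept alive, which is the case an unconditional exit on
STATUS_COMPLETED would get wrong.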



> 
> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> ---
>  migration/migration.c | 1 +
>  qapi/run-state.json   | 4 +++-
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 4228635d18..16782c93c2 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
>      case MIGRATION_STATUS_COMPLETED:
>          migration_calculate_complete(s);
>          runstate_set(RUN_STATE_POSTMIGRATE);
> +        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
>          break;
>  
>      case MIGRATION_STATUS_ACTIVE:
> diff --git a/qapi/run-state.json b/qapi/run-state.json
> index 43d66d700f..66aaef4e2b 100644
> --- a/qapi/run-state.json
> +++ b/qapi/run-state.json
> @@ -86,12 +86,14 @@
>  #                   ignores --no-reboot. This is useful for sanitizing
>  #                   hypercalls on s390 that are used during kexec/kdump/boot
>  #
> +# @migration-completed: Reaction to the successful migration
> +#
>  ##
>  { 'enum': 'ShutdownCause',
>    # Beware, shutdown_caused_by_guest() depends on enumeration order
>    'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
>              'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
> -            'guest-panic', 'subsystem-reset'] }
> +            'guest-panic', 'subsystem-reset', 'migration-completed'] }
>  
>  ##
>  # @StatusInfo:
> -- 
> 2.23.0
> 
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
  2021-07-05 12:48   ` Daniel P. Berrangé
@ 2021-07-05 13:22     ` Kunkun Jiang
  2021-07-06 10:27       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 8+ messages in thread
From: Kunkun Jiang @ 2021-07-05 13:22 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Juan Quintela, open list:All patches CC here, Markus Armbruster,
	wanghaibin.wang, Paolo Bonzini, Eric Blake,
	Dr . David Alan Gilbert

Hi Daniel,

On 2021/7/5 20:48, Daniel P. Berrangé wrote:
> On Mon, Jul 05, 2021 at 08:36:52PM +0800, Kunkun Jiang wrote:
>> In the current version, the source QEMU process does not automatically
>> exit after a successful migration. Additional action is required,
>> such as sending { "execute": "quit" } or ctrl+c. For simplicity, add
>> a new shutdown cause 'migration-completed' to exit the source QEMU
>> process after a successful migration.
> IIUC, 'STATUS_COMPLETED' state is entered on the source host
> once it has finished sending all VM state, and thus does not
> guarantee that the target host has successfully received and
> loaded all VM state.
Thanks for your reply.

If the target host doesn't successfully receive and load all VM state,
we can send { "execute": "cont" } to resume the source in time to
ensure that the VM is not lost?
> Typically a mgmt app will need to directly confirm that the
> target host QEMU has successfully started running, before it
> will tell the source QEMU to quit.
'a mgmt app', such as libvirt?

Thanks,
Kunkun Jiang
> So, AFAICT, this automatic exit after STATUS_COMPLETED is
> not safe and could lead to total loss of the running VM in
> error scenarios.
>
>
>
>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>> ---
>>   migration/migration.c | 1 +
>>   qapi/run-state.json   | 4 +++-
>>   2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 4228635d18..16782c93c2 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
>>       case MIGRATION_STATUS_COMPLETED:
>>           migration_calculate_complete(s);
>>           runstate_set(RUN_STATE_POSTMIGRATE);
>> +        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
>>           break;
>>   
>>       case MIGRATION_STATUS_ACTIVE:
>> diff --git a/qapi/run-state.json b/qapi/run-state.json
>> index 43d66d700f..66aaef4e2b 100644
>> --- a/qapi/run-state.json
>> +++ b/qapi/run-state.json
>> @@ -86,12 +86,14 @@
>>   #                   ignores --no-reboot. This is useful for sanitizing
>>   #                   hypercalls on s390 that are used during kexec/kdump/boot
>>   #
>> +# @migration-completed: Reaction to the successful migration
>> +#
>>   ##
>>   { 'enum': 'ShutdownCause',
>>     # Beware, shutdown_caused_by_guest() depends on enumeration order
>>     'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
>>               'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
>> -            'guest-panic', 'subsystem-reset'] }
>> +            'guest-panic', 'subsystem-reset', 'migration-completed'] }
>>   
>>   ##
>>   # @StatusInfo:
>> -- 
>> 2.23.0
>>
>>
> Regards,
> Daniel





* Re: [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
  2021-07-05 13:22     ` Kunkun Jiang
@ 2021-07-06 10:27       ` Dr. David Alan Gilbert
  2021-07-07  2:34         ` Kunkun Jiang
  0 siblings, 1 reply; 8+ messages in thread
From: Dr. David Alan Gilbert @ 2021-07-06 10:27 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Daniel P. Berrangé,
	Juan Quintela, open list:All patches CC here, Markus Armbruster,
	wanghaibin.wang, Paolo Bonzini, Eric Blake

* Kunkun Jiang (jiangkunkun@huawei.com) wrote:
> Hi Daniel,
> 
> On 2021/7/5 20:48, Daniel P. Berrangé wrote:
> > On Mon, Jul 05, 2021 at 08:36:52PM +0800, Kunkun Jiang wrote:
> > > In the current version, the source QEMU process does not automatically
> > > exit after a successful migration. Additional action is required,
> > > such as sending { "execute": "quit" } or ctrl+c. For simplicity, add
> > > a new shutdown cause 'migration-completed' to exit the source QEMU
> > > process after a successful migration.
> > IIUC, 'STATUS_COMPLETED' state is entered on the source host
> > once it has finished sending all VM state, and thus does not
> > guarantee that the target host has successfully received and
> > loaded all VM state.
> Thanks for your reply.
> 
> If the target host doesn't successfully receive and load all VM state,
> we can send { "execute": "cont" } to resume the source in time to
> ensure that the VM is not lost?

Yes, that's pretty common at the moment;  the failed migration can
happen at lots of different points:
  a) The last part of the actual migration stream/loading the devices
    - that's pretty easy, since the destination hasn't actually got
    the full migration stream.

  b) If the migration itself completes, but the management system
    then tries to reconfigure the networking/storage on the destination
    and something goes wrong there, it can roll that back and
    cont on the source.

So, it's a pretty common type of failure/recovery  - the management
application has to be a bit careful not to do anything destructive
until as late as possible, so it knows it can switch back.
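
The two failure points above can be sketched as a small decision function.
This is an illustration under assumed names (the enums and
choose_recovery() are hypothetical, not taken from QEMU or libvirt); the
only thing it encodes is that the source is resumed with "cont" in both
failure cases and is told to quit only when nothing went wrong.

```c
#include <assert.h>

/* Hypothetical classification of where a migration went wrong. */
enum failure_point {
    FAIL_NONE,            /* destination is up and fully configured */
    FAIL_STREAM_LOAD,     /* case a): stream/device load failed */
    FAIL_DEST_RECONFIG    /* case b): network/storage setup failed */
};

/* Hypothetical actions a management application could take. */
enum recovery_action {
    ACT_QUIT_SOURCE,                    /* send "quit" to the source */
    ACT_CONT_SOURCE,                    /* send "cont" to the source */
    ACT_ROLLBACK_DEST_THEN_CONT_SOURCE  /* undo dest changes, then "cont" */
};

enum recovery_action choose_recovery(enum failure_point where)
{
    assert(where == FAIL_NONE || where == FAIL_STREAM_LOAD ||
           where == FAIL_DEST_RECONFIG);

    switch (where) {
    case FAIL_STREAM_LOAD:
        return ACT_CONT_SOURCE;
    case FAIL_DEST_RECONFIG:
        return ACT_ROLLBACK_DEST_THEN_CONT_SOURCE;
    case FAIL_NONE:
        return ACT_QUIT_SOURCE;
    }
    /* Not reached for valid values; keeping the source is the safe default. */
    return ACT_CONT_SOURCE;
}
```

Both recovery paths need a source QEMU that still exists, which is why the
destructive step (quitting the source) comes last.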

> > Typically a mgmt app will need to directly confirm that the
> > target host QEMU has successfully started running, before it
> > will tell the source QEMU to quit.
> 'a mgmt app', such as libvirt?

Yes, it's currently libvirt that does that; but any of the control
things could (it's just libvirt has been going long enough so it knows
about lots and lots of nasty cases of migration failure, and recovering
properly).

Can you explain why you wanted the source to quit automatically?
In a real setup, where does it help?

Dave


> Thanks,
> Kunkun Jiang
> > So, AFAICT, this automatic exit after STATUS_COMPLETED is
> > not safe and could lead to total loss of the running VM in
> > error scenarios.
> > 
> > 
> > 
> > > Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > > ---
> > >   migration/migration.c | 1 +
> > >   qapi/run-state.json   | 4 +++-
> > >   2 files changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 4228635d18..16782c93c2 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
> > >       case MIGRATION_STATUS_COMPLETED:
> > >           migration_calculate_complete(s);
> > >           runstate_set(RUN_STATE_POSTMIGRATE);
> > > +        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
> > >           break;
> > >       case MIGRATION_STATUS_ACTIVE:
> > > diff --git a/qapi/run-state.json b/qapi/run-state.json
> > > index 43d66d700f..66aaef4e2b 100644
> > > --- a/qapi/run-state.json
> > > +++ b/qapi/run-state.json
> > > @@ -86,12 +86,14 @@
> > >   #                   ignores --no-reboot. This is useful for sanitizing
> > >   #                   hypercalls on s390 that are used during kexec/kdump/boot
> > >   #
> > > +# @migration-completed: Reaction to the successful migration
> > > +#
> > >   ##
> > >   { 'enum': 'ShutdownCause',
> > >     # Beware, shutdown_caused_by_guest() depends on enumeration order
> > >     'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
> > >               'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
> > > -            'guest-panic', 'subsystem-reset'] }
> > > +            'guest-panic', 'subsystem-reset', 'migration-completed'] }
> > >   ##
> > >   # @StatusInfo:
> > > -- 
> > > 2.23.0
> > > 
> > > 
> > Regards,
> > Daniel
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




* Re: [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
  2021-07-06 10:27       ` Dr. David Alan Gilbert
@ 2021-07-07  2:34         ` Kunkun Jiang
  2021-07-14 18:39           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 8+ messages in thread
From: Kunkun Jiang @ 2021-07-07  2:34 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrangé,
	Juan Quintela, open list:All patches CC here, Markus Armbruster,
	wanghaibin.wang, Paolo Bonzini, Eric Blake

On 2021/7/6 18:27, Dr. David Alan Gilbert wrote:
> * Kunkun Jiang (jiangkunkun@huawei.com) wrote:
>> Hi Daniel,
>>
>> On 2021/7/5 20:48, Daniel P. Berrangé wrote:
>>> On Mon, Jul 05, 2021 at 08:36:52PM +0800, Kunkun Jiang wrote:
>>>> In the current version, the source QEMU process does not automatically
>>>> exit after a successful migration. Additional action is required,
>>>> such as sending { "execute": "quit" } or ctrl+c. For simplicity, add
>>>> a new shutdown cause 'migration-completed' to exit the source QEMU
>>>> process after a successful migration.
>>> IIUC, 'STATUS_COMPLETED' state is entered on the source host
>>> once it has finished sending all VM state, and thus does not
>>> guarantee that the target host has successfully received and
>>> loaded all VM state.
>> Thanks for your reply.
>>
>> If the target host doesn't successfully receive and load all VM state,
>> we can send { "execute": "cont" } to resume the source in time to
>> ensure that the VM is not lost?
> Yes, that's pretty common at the moment;  the failed migration can
> happen at lots of different points:
>    a) The last part of the actual migration stream/loading the devices
>      - that's pretty easy, since the destination hasn't actually got
>      the full migration stream.
>
>    b) If the migration itself completes, but the management system
>      then tries to reconfigure the networking/storage on the destination
>      and something goes wrong there, it can roll that back and
>      cont on the source.
>
> So, it's a pretty common type of failure/recovery  - the management
> application has to be a bit careful not to do anything destructive
> until as late as possible, so it knows it can switch back.
Okay, I see.
>>> Typically a mgmt app will need to directly confirm that the
>>> target host QEMU has successfully started running, before it
>>> will tell the source QEMU to quit.
>> 'a mgmt app', such as libvirt?
> Yes, it's currently libvirt that does that; but any of the control
> things could (it's just libvirt has been going long enough so it knows
> about lots and lots of nasty cases of migration failure, and recovering
> properly).
>
> Can you explain why you wanted the source to quit automatically?
> In a real setup, where does it help?
Sorry, my thoughts on live migration scenarios are not comprehensive enough.

Thanks,
Kunkun Jiang
> Dave
>
>
>> Thanks,
>> Kunkun Jiang
>>> So, AFAICT, this automatic exit after STATUS_COMPLETED is
>>> not safe and could lead to total loss of the running VM in
>>> error scenarios.
>>>
>>>
>>>
>>>> Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
>>>> ---
>>>>    migration/migration.c | 1 +
>>>>    qapi/run-state.json   | 4 +++-
>>>>    2 files changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>> index 4228635d18..16782c93c2 100644
>>>> --- a/migration/migration.c
>>>> +++ b/migration/migration.c
>>>> @@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
>>>>        case MIGRATION_STATUS_COMPLETED:
>>>>            migration_calculate_complete(s);
>>>>            runstate_set(RUN_STATE_POSTMIGRATE);
>>>> +        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
>>>>            break;
>>>>        case MIGRATION_STATUS_ACTIVE:
>>>> diff --git a/qapi/run-state.json b/qapi/run-state.json
>>>> index 43d66d700f..66aaef4e2b 100644
>>>> --- a/qapi/run-state.json
>>>> +++ b/qapi/run-state.json
>>>> @@ -86,12 +86,14 @@
>>>>    #                   ignores --no-reboot. This is useful for sanitizing
>>>>    #                   hypercalls on s390 that are used during kexec/kdump/boot
>>>>    #
>>>> +# @migration-completed: Reaction to the successful migration
>>>> +#
>>>>    ##
>>>>    { 'enum': 'ShutdownCause',
>>>>      # Beware, shutdown_caused_by_guest() depends on enumeration order
>>>>      'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
>>>>                'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
>>>> -            'guest-panic', 'subsystem-reset'] }
>>>> +            'guest-panic', 'subsystem-reset', 'migration-completed'] }
>>>>    ##
>>>>    # @StatusInfo:
>>>> -- 
>>>> 2.23.0
>>>>
>>>>
>>> Regards,
>>> Daniel
>>




* Re: [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
  2021-07-07  2:34         ` Kunkun Jiang
@ 2021-07-14 18:39           ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 8+ messages in thread
From: Dr. David Alan Gilbert @ 2021-07-14 18:39 UTC (permalink / raw)
  To: Kunkun Jiang
  Cc: Daniel P. Berrangé,
	Juan Quintela, open list:All patches CC here, Markus Armbruster,
	wanghaibin.wang, Paolo Bonzini, Eric Blake

* Kunkun Jiang (jiangkunkun@huawei.com) wrote:
> On 2021/7/6 18:27, Dr. David Alan Gilbert wrote:
> > * Kunkun Jiang (jiangkunkun@huawei.com) wrote:
> > > Hi Daniel,
> > > 
> > > On 2021/7/5 20:48, Daniel P. Berrangé wrote:
> > > > On Mon, Jul 05, 2021 at 08:36:52PM +0800, Kunkun Jiang wrote:
> > > > > In the current version, the source QEMU process does not automatically
> > > > > exit after a successful migration. Additional action is required,
> > > > > such as sending { "execute": "quit" } or ctrl+c. For simplicity, add
> > > > > a new shutdown cause 'migration-completed' to exit the source QEMU
> > > > > process after a successful migration.
> > > > IIUC, 'STATUS_COMPLETED' state is entered on the source host
> > > > once it has finished sending all VM state, and thus does not
> > > > guarantee that the target host has successfully received and
> > > > loaded all VM state.
> > > Thanks for your reply.
> > > 
> > > If the target host doesn't successfully receive and load all VM state,
> > > we can send { "execute": "cont" } to resume the source in time to
> > > ensure that the VM is not lost?
> > Yes, that's pretty common at the moment;  the failed migration can
> > happen at lots of different points:
> >    a) The last part of the actual migration stream/loading the devices
> >      - that's pretty easy, since the destination hasn't actually got
> >      the full migration stream.
> > 
> >    b) If the migration itself completes, but the management system
> >      then tries to reconfigure the networking/storage on the destination
> >      and something goes wrong there, it can roll that back and
> >      cont on the source.
> > 
> > So, it's a pretty common type of failure/recovery  - the management
> > application has to be a bit careful not to do anything destructive
> > until as late as possible, so it knows it can switch back.
> Okay, I see.
> > > > Typically a mgmt app will need to directly confirm that the
> > > > target host QEMU has successfully started running, before it
> > > > will tell the source QEMU to quit.
> > > 'a mgmt app', such as libvirt?
> > Yes, it's currently libvirt that does that; but any of the control
> > things could (it's just libvirt has been going long enough so it knows
> > about lots and lots of nasty cases of migration failure, and recovering
> > properly).
> > 
> > Can you explain why you wanted the source to quit automatically?
> > In a real setup, where does it help?
> Sorry, my thoughts on live migration scenarios are not comprehensive enough.

That's OK; it takes a little while to understand all of the recovery and
error cases;  people *really* want to recover from a failed migration;
so we try and be very careful about not throwing away the source.

Dave

> Thanks,
> Kunkun Jiang
> > Dave
> > 
> > 
> > > Thanks,
> > > Kunkun Jiang
> > > > So, AFAICT, this automatic exit after STATUS_COMPLETED is
> > > > not safe and could lead to total loss of the running VM in
> > > > error scenarios.
> > > > 
> > > > 
> > > > 
> > > > > Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > > > > ---
> > > > >    migration/migration.c | 1 +
> > > > >    qapi/run-state.json   | 4 +++-
> > > > >    2 files changed, 4 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > > index 4228635d18..16782c93c2 100644
> > > > > --- a/migration/migration.c
> > > > > +++ b/migration/migration.c
> > > > > @@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
> > > > >        case MIGRATION_STATUS_COMPLETED:
> > > > >            migration_calculate_complete(s);
> > > > >            runstate_set(RUN_STATE_POSTMIGRATE);
> > > > > +        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
> > > > >            break;
> > > > >        case MIGRATION_STATUS_ACTIVE:
> > > > > diff --git a/qapi/run-state.json b/qapi/run-state.json
> > > > > index 43d66d700f..66aaef4e2b 100644
> > > > > --- a/qapi/run-state.json
> > > > > +++ b/qapi/run-state.json
> > > > > @@ -86,12 +86,14 @@
> > > > >    #                   ignores --no-reboot. This is useful for sanitizing
> > > > >    #                   hypercalls on s390 that are used during kexec/kdump/boot
> > > > >    #
> > > > > +# @migration-completed: Reaction to the successful migration
> > > > > +#
> > > > >    ##
> > > > >    { 'enum': 'ShutdownCause',
> > > > >      # Beware, shutdown_caused_by_guest() depends on enumeration order
> > > > >      'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
> > > > >                'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
> > > > > -            'guest-panic', 'subsystem-reset'] }
> > > > > +            'guest-panic', 'subsystem-reset', 'migration-completed'] }
> > > > >    ##
> > > > >    # @StatusInfo:
> > > > > -- 
> > > > > 2.23.0
> > > > > 
> > > > > 
> > > > Regards,
> > > > Daniel
> > > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



