From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com,
	yunhong.jiang@intel.com, eddie.dong@intel.com,
	qemu-devel@nongnu.org, peter.huangpeng@huawei.com,
	arei.gonglei@huawei.com, stefanha@redhat.com,
	amit.shah@redhat.com, dgilbert@redhat.com,
	hongyang.yang@easystack.cn
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v12 11/38] COLO: Add a new RunState RUN_STATE_COLO
Date: Tue, 12 Jan 2016 20:54:14 +0800
Message-ID: <5694F776.8050802@huawei.com>
In-Reply-To: <878u3wz27f.fsf@blackfin.pond.sub.org>

On 2016/1/11 21:16, Markus Armbruster wrote:
> Hailiang Zhang <zhang.zhanghailiang@huawei.com> writes:
>
>> On 2015/12/19 17:27, Markus Armbruster wrote:
>>> zhanghailiang <zhang.zhanghailiang@huawei.com> writes:
>>>
>>>> Guest will enter this state when paused to save/restore VM state
>>>> under colo checkpoint.
>>>>
>>>> Cc: Eric Blake <eblake@redhat.com>
>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>>> ---
>>>>    qapi-schema.json | 5 ++++-
>>>>    vl.c             | 8 ++++++++
>>>>    2 files changed, 12 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>>> index 85f7800..0423b47 100644
>>>> --- a/qapi-schema.json
>>>> +++ b/qapi-schema.json
>>>> @@ -154,12 +154,15 @@
>>>>    # @watchdog: the watchdog action is configured to pause and has been triggered
>>>>    #
>>>>    # @guest-panicked: guest has been panicked as a result of guest OS panic
>>>> +#
>>>> +# @colo: guest is paused to save/restore VM state under colo checkpoint (since
>>>> +# 2.6)
>>>>    ##
>>>>    { 'enum': 'RunState',
>>>>      'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>>>>                'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>>>>                'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
>>>> -            'guest-panicked' ] }
>>>> +            'guest-panicked', 'colo' ] }
>>>>
>>>>    ##
>>>>    # @StatusInfo:
>>>> diff --git a/vl.c b/vl.c
>>>> index f84fde8..fca630b 100644
>>>> --- a/vl.c
>>>> +++ b/vl.c
>>>> @@ -594,6 +594,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>>>>        { RUN_STATE_INMIGRATE, RUN_STATE_WATCHDOG },
>>>>        { RUN_STATE_INMIGRATE, RUN_STATE_GUEST_PANICKED },
>>>>        { RUN_STATE_INMIGRATE, RUN_STATE_FINISH_MIGRATE },
>>>> +    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },
>>>>
>>>>        { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
>>>>        { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
>>>> @@ -603,6 +604,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>>>>
>>>>        { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
>>>>        { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
>>>> +    { RUN_STATE_PAUSED, RUN_STATE_COLO},
>>>>
>>>>        { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
>>>>        { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
>>>> @@ -613,9 +615,12 @@ static const RunStateTransition runstate_transitions_def[] = {
>>>>
>>>>        { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
>>>>        { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
>>>> +    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
>>>>
>>>>        { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>>>>
>>>> +    { RUN_STATE_COLO, RUN_STATE_RUNNING },
>>>> +
>>>>        { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
>>>>        { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
>>>>        { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
>>>> @@ -626,6 +631,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>>>>        { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
>>>>        { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
>>>>        { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
>>>> +    { RUN_STATE_RUNNING, RUN_STATE_COLO},
>>>>
>>>>        { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
>>>>
>>>> @@ -636,9 +642,11 @@ static const RunStateTransition runstate_transitions_def[] = {
>>>>        { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>>>>        { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>>>>        { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>>>> +    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>>>>
>>>>        { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>>>>        { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>>>> +    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},
>>>>
>>>>        { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>>>>        { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
>>>
>>> Pardon my ignorance, but could you explain the new run state in a bit
>>> more detail for me?
>>>
>>
>> OK. Normally, we only need to switch between the COLO and RUNNING states,
>> but we can't forbid users from issuing other commands while the VM is in
>> COLO state.
>>
>> At every checkpoint we have to pause the VM to send its state to the SVM.
>> Before we pause it, users may issue the 'stop' command, which changes the
>> state to 'RUN_STATE_PAUSED'. We don't want to abort the VM because of this
>> command. (Actually, we will support stopping the VM while it is in COLO
>> state.) So we need the transition 'RUN_STATE_PAUSED -> RUN_STATE_COLO'.
>
> What's the next state then?
>

We may then switch to RUN_STATE_RUNNING. Here, RUN_STATE_COLO is only used to
indicate that the VM is stopped during the COLO process.

>> We enter COLO state just after a full migration process, whose last state
>> is 'RUN_STATE_FINISH_MIGRATE' or 'RUN_STATE_INMIGRATE'. Before we enter the
>> COLO loop, we may get 'x-colo-lost-heartbeat' and go into the
>> 'RUN_STATE_COLO' pause, so we need the transitions
>> 'RUN_STATE_FINISH_MIGRATE -> RUN_STATE_COLO' and
>> 'RUN_STATE_INMIGRATE -> RUN_STATE_COLO'.
>> The reason we need 'RUN_STATE_SUSPENDED -> RUN_STATE_COLO' is that the guest
>> or users may issue a standby command.
>> We need to ensure the VM does not crash.
>>
>> Actually, we may need more states that can go to the 'colo' state; maybe
>> just follow the cases of the 'MIGRATE' state.
>
> I believe we should fully work out the state transitions added by COLO.
> I like to write that down in this form:
>
>      (state, trigger) -> (action, state')
>

I'm a little confused. In runstate_transitions_def, a state transition is
just a simple pair: (state1, state2). We only switch to RUN_STATE_COLO when
we need to do something while the VM is paused.

> Example:
>
>      (running, checkpoint) -> (begin-checkpointing, colo)
>

Do you want me to add these new states to runstate_transitions_def?
Is the VM actually running or stopped in the 'checkpoint' and 'colo' states here?

> with a suitable explanation of 'checkpoint' and 'begin-checkpointing'.
>
> For brevity, multiple
>
>      (state1, trigger) -> (action, state')
>      (state2, trigger) -> (action, state')
>      ...
>      (stateN, trigger) -> (action, state')
>
> can be abbreviated to
>
>      ({state1, state2, stateN}, trigger) -> (action, state')
>
> Example:
>
>      ({running, paused, ...}, checkpoint) -> (begin-checkpointing, colo)
>
> For clarity, chains of state transitions should be described in the
> order they happen.
>
> Pictures showing the states connected with transition arrows labelled
> with the trigger can help.
>
> Two properties to check:
>
> 1. Correctness: every state transition thus written down does the right
>     thing.
>
> 2. Completeness: for every pair (state, trigger), we got a state
>     transition, or an explanation why it cannot happen.
>
>> Thanks,
>> zhanghailiang
>>
>>> Your additions to runstate_transitions_def[] show we can go *from* state
>>> 'colo' only to state 'running', but we can go *to* state 'colo' from
>>> various other states.  This may well be sane, but it's not *obviously*
>>> sane :)

