From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39888) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZvgJS-0003Ax-Ky for qemu-devel@nongnu.org; Mon, 09 Nov 2015 01:52:24 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZvgJP-00054y-Bl for qemu-devel@nongnu.org; Mon, 09 Nov 2015 01:52:22 -0500 Received: from szxga02-in.huawei.com ([119.145.14.65]:37580) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZvgJN-00054X-R6 for qemu-devel@nongnu.org; Mon, 09 Nov 2015 01:52:19 -0500 References: <1446551816-15768-1-git-send-email-zhang.zhanghailiang@huawei.com> <1446551816-15768-10-git-send-email-zhang.zhanghailiang@huawei.com> <20151106182632.GH4267@work-vm> From: zhanghailiang Message-ID: <56404270.90606@huawei.com> Date: Mon, 9 Nov 2015 14:51:28 +0800 MIME-Version: 1.0 In-Reply-To: <20151106182632.GH4267@work-vm> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Dr. David Alan Gilbert" Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com, yunhong.jiang@intel.com, eddie.dong@intel.com, peter.huangpeng@huawei.com, qemu-devel@nongnu.org, arei.gonglei@huawei.com, stefanha@redhat.com, amit.shah@redhat.com On 2015/11/7 2:26, Dr. David Alan Gilbert wrote: > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote: >> We need communications protocol of user-defined to control the checkpoint >> process. >> >> The new checkpoint request is started by Primary VM, and the interactive process >> like below: >> Checkpoint synchronizing points, >> >> Primary Secondary >> 'checkpoint-request' @ -----------------------------> >> Suspend (In hybrid mode) >> 'checkpoint-reply' <------------------------------ @ >> Suspend&Save state > > Why is this initial pair necessary? Can't you just start with the vmstate-send > and save the extra request pair/round trip? On the 2nd checkpoint we know > the SVM already received the previous checkpoint because we got it's first > vmstate-load. > Yes, we can certainly drop this handshaking in simple checkpoint mode. But we still need to do some initial work (preparing work) in simple checkpoint mode. And i'm not sure if this initial work is time-wasting or not. We choose to do this preparing work before send the 'checkpoint-reply' to reducing VM's STOP time as possible as we can. > I guess in full-COLO (rather than simple checkpoint) you can get the secondary to > do some of it's stopping/cleanup after it sends the checkpoint-reply Actually, we do it before it sends 'checkpoint-reply' :) > but before vmstate-send, so you can hide some of the time. > > (Perhaps add a comment to explain) > OK, I will add more comment about this. >> 'vmstate-send' @ -----------------------------> >> Send state Receive state >> 'vmstate-received' <------------------------------ @ >> Release packets Load state >> 'vmstate-load' <------------------------------ @ >> Resume Resume (In hybrid mode) >> >> Start Comparing (In hybrid mode) >> NOTE: >> 1) '@' who sends the message >> 2) Every sync-point is synchronized by two sides with only >> one handshake(single direction) for low-latency. >> If more strict synchronization is required, a opposite direction >> sync-point should be added. >> 3) Since sync-points are single direction, the remote side may >> go forward a lot when this side just receives the sync-point. >> 4) For now, we only support 'periodic' checkpoint, for which >> the Secondary VM is not running, later we will support 'hybrid' mode. >> >> Signed-off-by: zhanghailiang >> Signed-off-by: Li Zhijian >> Signed-off-by: Gonglei >> Cc: Eric Blake >> --- >> v10: >> - Rename enum COLOCmd to COLOCommand (Eric's suggestion). >> - Remove unused 'ram-steal' >> --- >> migration/colo.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- >> qapi-schema.json | 27 ++++++++ >> trace-events | 2 + >> 3 files changed, 219 insertions(+), 2 deletions(-) >> >> diff --git a/migration/colo.c b/migration/colo.c >> index 4fdf3a9..2510762 100644 >> --- a/migration/colo.c >> +++ b/migration/colo.c >> @@ -10,10 +10,12 @@ >> * later. See the COPYING file in the top-level directory. >> */ >> >> +#include >> #include "sysemu/sysemu.h" >> #include "migration/colo.h" >> #include "trace.h" >> #include "qemu/error-report.h" >> +#include "qemu/sockets.h" >> >> bool colo_supported(void) >> { >> @@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void) >> return mis && (mis->state == MIGRATION_STATUS_COLO); >> } >> >> +/* colo checkpoint control helper */ >> +static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value) >> +{ >> + int ret = 0; >> + >> + qemu_put_be32(f, cmd); >> + qemu_put_be64(f, value); >> + qemu_fflush(f); >> + >> + ret = qemu_file_get_error(f); >> + trace_colo_ctl_put(COLOCommand_lookup[cmd], value); >> + >> + return ret; >> +} >> + >> +static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd) >> +{ >> + int ret = 0; >> + >> + *cmd = qemu_get_be32(f); >> + ret = qemu_file_get_error(f); >> + if (ret < 0) { >> + return ret; >> + } >> + if (*cmd >= COLO_COMMAND_MAX) { >> + error_report("Invalid colo command, get cmd:%d", *cmd); >> + return -EINVAL; >> + } >> + trace_colo_ctl_get(COLOCommand_lookup[*cmd]); >> + >> + return 0; >> +} >> + >> +static int colo_ctl_get(QEMUFile *f, uint32_t require) >> +{ >> + int ret; >> + uint32_t cmd; >> + uint64_t value; >> + >> + ret = colo_ctl_get_cmd(f, &cmd); >> + if (ret < 0) { >> + return ret; >> + } >> + if (cmd != require) { >> + error_report("Unexpect colo command, expect:%d, but get cmd:%d", >> + require, cmd); >> + return -EINVAL; >> + } >> + >> + value = qemu_get_be64(f); >> + ret = qemu_file_get_error(f); >> + if (ret < 0) { >> + return ret; >> + } >> + >> + return value; >> +} > > Should the return type be uint64_t since you're returning value? > But then you're also using it to return an error code; so perhaps > it might be better to have a uint64_t *value parameter to > return the value separately; or define the range that the value > can actually take. > Good catch. Use parameter to return 'value' is a good idea, Will fix it in next version. Thanks, zhanghailiang > (Also very minor typo: 'got' not 'get' in a few errors) > >> +static int colo_do_checkpoint_transaction(MigrationState *s) >> +{ >> + int ret; >> + >> + ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_REPLY); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + /* TODO: suspend and save vm state to colo buffer */ >> + >> + ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + /* TODO: send vmstate to Secondary */ >> + >> + ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED); >> + if (ret < 0) { >> + goto out; >> + }> + >> + ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_LOADED); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + /* TODO: resume Primary */ >> + >> +out: >> + return ret; >> +} >> + >> static void colo_process_checkpoint(MigrationState *s) >> { >> int fd, ret = 0; >> @@ -51,12 +150,27 @@ static void colo_process_checkpoint(MigrationState *s) >> goto out; >> } >> >> + /* >> + * Wait for Secondary finish loading vm states and enter COLO >> + * restore. >> + */ >> + ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_READY); >> + if (ret < 0) { >> + goto out; >> + } >> + >> qemu_mutex_lock_iothread(); >> vm_start(); >> qemu_mutex_unlock_iothread(); >> trace_colo_vm_state_change("stop", "run"); >> >> - /*TODO: COLO checkpoint savevm loop*/ >> + while (s->state == MIGRATION_STATUS_COLO) { >> + /* start a colo checkpoint */ >> + ret = colo_do_checkpoint_transaction(s); >> + if (ret < 0) { >> + goto out; >> + } >> + } >> >> out: >> if (ret < 0) { >> @@ -79,6 +193,39 @@ void migrate_start_colo_process(MigrationState *s) >> qemu_mutex_lock_iothread(); >> } >> >> +/* >> + * return: >> + * 0: start a checkpoint >> + * -1: some error happened, exit colo restore >> + */ >> +static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request) >> +{ >> + int ret; >> + uint32_t cmd; >> + uint64_t value; >> + >> + ret = colo_ctl_get_cmd(f, &cmd); >> + if (ret < 0) { >> + /* do failover ? */ >> + return ret; >> + } >> + /* Fix me: this value should be 0, which is not so good, >> + * should be used for checking ? >> + */ >> + value = qemu_get_be64(f); >> + if (value != 0) { > > should output error message as well? > >> + return -EINVAL; >> + } >> + >> + switch (cmd) { >> + case COLO_COMMAND_CHECKPOINT_REQUEST: >> + *checkpoint_request = 1; >> + return 0; >> + default: >> + return -EINVAL; >> + } >> +} >> + >> void *colo_process_incoming_thread(void *opaque) >> { >> MigrationIncomingState *mis = opaque; >> @@ -98,7 +245,48 @@ void *colo_process_incoming_thread(void *opaque) >> error_report("colo incoming thread: Open QEMUFile to_src_file failed"); >> goto out; >> } >> - /* TODO: COLO checkpoint restore loop */ >> + >> + ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + while (mis->state == MIGRATION_STATUS_COLO) { >> + int request = 0; >> + int ret = colo_wait_handle_cmd(mis->from_src_file, &request); >> + >> + if (ret < 0) { >> + break; >> + } else { >> + if (!request) { >> + continue; >> + } >> + } >> + >> + ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + ret = colo_ctl_get(mis->from_src_file, COLO_COMMAND_VMSTATE_SEND); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + /* TODO: read migration data into colo buffer */ >> + >> + ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + /* TODO: load vm state */ >> + >> + ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0); >> + if (ret < 0) { >> + goto out; >> + } >> + } >> >> out: >> if (ret < 0) { >> diff --git a/qapi-schema.json b/qapi-schema.json >> index 22251ec..5c4fe6d 100644 >> --- a/qapi-schema.json >> +++ b/qapi-schema.json >> @@ -702,6 +702,33 @@ >> '*tls-port': 'int', '*cert-subject': 'str' } } >> >> ## >> +# @COLOCommand >> +# >> +# The colo command >> +# >> +# @invalid: unknown command >> +# >> +# @checkpoint-ready: SVM is ready for checkpointing >> +# >> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing >> +# >> +# @checkpoint-reply: SVM gets PVM's checkpoint request >> +# >> +# @vmstate-send: VM's state will be sent by PVM. >> +# >> +# @vmstate-size: The total size of VMstate. >> +# >> +# @vmstate-received: VM's state has been received by SVM >> +# >> +# @vmstate-loaded: VM's state has been loaded by SVM >> +# >> +# Since: 2.5 >> +## >> +{ 'enum': 'COLOCommand', >> + 'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request', >> + 'checkpoint-reply', 'vmstate-send', 'vmstate-size', >> + 'vmstate-received', 'vmstate-loaded' ] } >> + >> # @MouseInfo: >> # >> # Information about a mouse device. >> diff --git a/trace-events b/trace-events >> index 9cd6391..ee4679c 100644 >> --- a/trace-events >> +++ b/trace-events >> @@ -1499,6 +1499,8 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) "" >> >> # migration/colo.c >> colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'" >> +colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64"" >> +colo_ctl_get(const char *msg) "Receive '%s' cmd" >> >> # kvm-all.c >> kvm_ioctl(int type, void *arg) "type 0x%x, arg %p" >> -- >> 1.8.3.1 >> >> > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK > > . >