From: Wen Congyang <wency@cn.fujitsu.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: Lars Kurth <lars.kurth@citrix.com>,
Changlong Xie <xiecl.fnst@cn.fujitsu.com>,
Ian Campbell <ian.campbell@citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Jiang Yunhong <yunhong.jiang@intel.com>,
Ian Jackson <ian.jackson@eu.citrix.com>,
xen devel <xen-devel@lists.xen.org>,
Dong Eddie <eddie.dong@intel.com>,
Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
Shriram Rajagopalan <rshriram@cs.ubc.ca>,
Yang Hongyang <hongyang.yang@easystack.cn>
Subject: Re: [PATCH v10 16/31] secondary vm suspend/resume/checkpoint code
Date: Tue, 1 Mar 2016 18:06:22 +0800 [thread overview]
Message-ID: <56D5699E.2060002@cn.fujitsu.com> (raw)
In-Reply-To: <20160225155649.GH4235@citrix.com>
On 02/25/2016 11:56 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 10:52:20AM +0800, Wen Congyang wrote:
>> Secondary vm is running in colo mode. So we will do
>> the following things again and again:
>> 1. Resume secondary vm
>> a. Send CHECKPOINT_SVM_READY to master.
>> b. If it is not the first resume, call libxl__checkpoint_devices_preresume().
>> c. If it is the first resume(resume right after live migration),
>> - call libxl__xc_domain_restore_done() to build the secondary vm.
>> - enable secondary vm's logdirty.
>> - call libxl__domain_resume() to resume secondary vm.
>> - call libxl__checkpoint_devices_setup() to setup checkpoint devices.
>> d. Send CHECKPOINT_SVM_RESUMED to master.
>> 2. Wait a new checkpoint
>> a. Call libxl__checkpoint_devices_commit().
>> b. Read CHECKPOINT_NEW from master.
>> 3. Suspend secondary vm
>> a. Suspend secondary vm.
>> b. Call libxl__checkpoint_devices_postsuspend().
>> c. Send CHECKPOINT_SVM_SUSPENDED to master.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
>> ---
>> tools/libxc/xc_sr_common.h | 2 +
>> tools/libxc/xc_sr_save.c | 3 +-
>> tools/libxl/Makefile | 1 +
>> tools/libxl/libxl_colo.h | 24 +
>> tools/libxl/libxl_colo_restore.c | 1038 ++++++++++++++++++++++++++++++++++++++
>> tools/libxl/libxl_create.c | 37 ++
>> tools/libxl/libxl_internal.h | 19 +
>> tools/libxl/libxl_save_callout.c | 7 +-
>> tools/libxl/libxl_stream_read.c | 12 +
>> tools/libxl/libxl_types.idl | 1 +
>
> There is a bunch of TODOs in libxl_colo.c but I don't think you're in a
> better position to judge whether they should be blocker or not.
>
>> 10 files changed, 1142 insertions(+), 2 deletions(-)
>> create mode 100644 tools/libxl/libxl_colo.h
>> create mode 100644 tools/libxl/libxl_colo_restore.c
>>
>> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
>> index 5d9f497..2bfed64 100644
>> --- a/tools/libxc/xc_sr_common.h
>> +++ b/tools/libxc/xc_sr_common.h
>> @@ -184,10 +184,12 @@ struct xc_sr_context
>> * migration stream
>> * 0: Plain VM
>> * 1: Remus
>> + * 2: COLO
>> */
>> enum {
>> MIG_STREAM_NONE, /* plain stream */
>> MIG_STREAM_REMUS,
>> + MIG_STREAM_COLO,
>> } migration_stream;
>>
>> union /* Common save or restore data. */
>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>> index fe210cc..7393355 100644
>> --- a/tools/libxc/xc_sr_save.c
>> +++ b/tools/libxc/xc_sr_save.c
>> @@ -846,7 +846,8 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
>>
>> /* If altering migration_stream update this assert too. */
>> assert(checkpointed_stream == MIG_STREAM_NONE ||
>> - checkpointed_stream == MIG_STREAM_REMUS);
>> + checkpointed_stream == MIG_STREAM_REMUS ||
>> + checkpointed_stream == MIG_STREAM_COLO);
>>
>> /*
>> * TODO: Find some time to better tweak the live migration algorithm.
>
> [...]
>
>> +
>> +#include "libxl_osdeps.h" /* must come before any other headers */
>> +
>> +#include "libxl_internal.h"
>> +#include "libxl_colo.h"
>> +#include "libxl_sr_stream_format.h"
>> +
>> +enum {
>> + LIBXL_COLO_SETUPED,
>> + LIBXL_COLO_SUSPENDED,
>> + LIBXL_COLO_RESUMED,
>> +};
>> +
>> +typedef struct libxl__colo_restore_checkpoint_state libxl__colo_restore_checkpoint_state;
>> +struct libxl__colo_restore_checkpoint_state {
>> + libxl__domain_suspend_state dsps;
>> + libxl__logdirty_switch lds;
>> + libxl__colo_restore_state *crs;
>> + libxl__stream_write_state sws;
>> + int status;
>> + bool preresume;
>> + /* used for teardown */
>> + int teardown_devices;
>> + int saved_rc;
>> + char *state_file;
>> +
>> + void (*callback)(libxl__egc *,
>> + libxl__colo_restore_checkpoint_state *,
>> + int);
>> +};
>> +
>
> Shouldn't the enum and struct belong to libxl_colo.h ?
If we inlucde libxl_colo.h in libxl_internal.h, we cannot move this into colo.h, because
this structure needs libxl__domain_suspend_state, libxl__logdirty_switch, ...
We cannot just declare it, because this structure needs know there size.
>
>> +
>> +static void libxl__colo_restore_domain_resume_callback(void *data);
>> +static void libxl__colo_restore_domain_checkpoint_callback(void *data);
>> +static void libxl__colo_restore_domain_wait_checkpoint_callback(void *data);
>> +static void libxl__colo_restore_domain_suspend_callback(void *data);
>> +
>> +static const libxl__checkpoint_device_instance_ops *colo_restore_ops[] = {
>> + NULL,
>> +};
>> +
>
> It would be helpful to list the callbacks at the beginning of the time
> in the order they are supposed to occur.
>
> See libxl_create.c for example. Search for "Event callbacks, in this
> order".
>
> I've tried to map the algorithm you described in commit message to all
> the callbacks, but without some references it is just too time consuming
> from my end.
>
> I think what I'm going to do is to make sure the normal path that
> doesn't use COLO is not broken and leave the internal to you and Ian (if
> he fancies to dig into details).
>
> [...]
>> +
>> +void libxl__colo_restore_setup(libxl__egc *egc,
>> + libxl__colo_restore_state *crs)
>> +{
>> + libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
>> + libxl__colo_restore_checkpoint_state *crcs;
>> + int rc = ERROR_FAIL;
>> +
>> + /* Convenience aliases */
>> + libxl__srm_restore_autogen_callbacks *const callbacks =
>> + &dcs->srs.shs.callbacks.restore.a;
>> + const int domid = crs->domid;
>> +
>> + STATE_AO_GC(crs->ao);
>> +
>> + GCNEW(crcs);
>> + crs->crcs = crcs;
>> + crcs->crs = crs;
>> +
>> + /* setup dsps */
>> + crcs->dsps.ao = ao;
>> + crcs->dsps.domid = domid;
>> + if (init_dsps(&crcs->dsps))
>> + goto err;
>> +
>> + callbacks->suspend = libxl__colo_restore_domain_suspend_callback;
>> + callbacks->postcopy = libxl__colo_restore_domain_resume_callback;
>> + callbacks->checkpoint = libxl__colo_restore_domain_checkpoint_callback;
>> + callbacks->wait_checkpoint = libxl__colo_restore_domain_wait_checkpoint_callback;
>> +
>> + /*
>> + * Secondary vm is running in colo mode, so we need to call
>> + * libxl__xc_domain_restore_done() to create secondary vm.
>> + * But we will exit in domain_create_cb(). So replace the
>> + * callback here.
>> + */
>> + crs->saved_cb = dcs->callback;
>> + dcs->callback = libxl__colo_domain_create_cb;
>> + crcs->state_file = GCSPRINTF(LIBXL_DEVICE_MODEL_RESTORE_FILE".%d", domid);
>
> Can you use a different name space from the normal one?
>
> For example, you can put
>
> #define LIBXL_COLO_DEVICE_MODEL_RESTORE_FILE XXXX
>
> in libxl_colo.h and use it in all COLO code.
>
>
>> + crcs->status = LIBXL_COLO_SETUPED;
>> +
>> + libxl__logdirty_init(&crcs->lds);
>> + crcs->lds.ao = ao;
>> +
>> + crcs->sws.fd = crs->send_back_fd;
>> + crcs->sws.ao = ao;
>> + crcs->sws.back_channel = true;
>> +
>> + dcs->cds.concrete_data = crs;
>> +
>> + libxl__stream_write_start(egc, &crcs->sws);
>> +
>> + rc = 0;
>> +
>> +out:
>> + crs->callback(egc, crs, rc);
>> + return;
>> +
>> +err:
>> + goto out;
>> +}
>> +
>> +static void libxl__colo_domain_create_cb(libxl__egc *egc,
>> + libxl__domain_create_state *dcs,
>> + int rc, uint32_t domid)
>> +{
>> + libxl__colo_restore_checkpoint_state *crcs = dcs->crs.crcs;
>> +
>> + crcs->callback(egc, crcs, rc);
>> +}
>> +
>> +
> [...]
>> +
>> +static void colo_disable_logdirty_done(libxl__egc *egc,
>> + libxl__logdirty_switch *lds,
>> + int rc)
>> +{
>> + libxl__colo_restore_checkpoint_state *crcs = CONTAINER_OF(lds, *crcs, lds);
>> +
>> + EGC_GC;
>> +
>> + if (rc)
>> + LOG(WARN, "cannot disable logdirty");
>> +
>> + if (crcs->status == LIBXL_COLO_SUSPENDED) {
>> + /*
>> + * failover when reading state from master, so no need to
>> + * call libxl__domain_restore().
>
> You need to update this comment to the right function name.
>
>> + */
>> + colo_resume_vm(egc, crcs, 0);
>> + return;
>> + }
>> +
>> + /* If we cannot disable logdirty, we still can do failover */
>> + crcs->callback(egc, crcs, 0);
>> +}
>> +
> [...]
>>
>> +/* colo related structure */
>> +typedef struct libxl__colo_restore_state libxl__colo_restore_state;
>> +typedef void libxl__colo_callback(libxl__egc *,
>> + libxl__colo_restore_state *, int rc);
>> +struct libxl__colo_restore_state {
>> + /* must set by caller of libxl__colo_(setup|teardown) */
>> + libxl__ao *ao;
>> + uint32_t domid;
>> + int send_back_fd;
>> + int recv_fd;
>> + int hvm;
>> + libxl__colo_callback *callback;
>> +
>> + /* private, colo restore checkpoint state */
>> + libxl__domain_create_cb *saved_cb;
>> + void *crcs;
>> +};
>>
>
> And this should go to libxl_colo.h, too? And libxl_internal.h includes
> libxl_colo.h?
If we do so, we should declare libxl__ao, libxl__domain_create_cb in libxl_colo.h
Thanks
Wen Congyang
>
> I just don't want to colo structures and functions scatter in
> different places.
>
>> struct libxl__domain_create_state {
>> /* filled in by user */
>> @@ -3486,6 +3503,8 @@ struct libxl__domain_create_state {
>> /* private to domain_create */
>> int guest_domid;
>> libxl__domain_build_state build_state;
>> + libxl__colo_restore_state crs;
>> + libxl__checkpoint_devices_state cds;
>> libxl__bootloader_state bl;
>> libxl__stub_dm_spawn_state dmss;
>> /* If we're not doing stubdom, we use only dmss.dm,
>> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
>> index 0d6949a..b1810b2 100644
>> --- a/tools/libxl/libxl_save_callout.c
>> +++ b/tools/libxl/libxl_save_callout.c
>> @@ -15,6 +15,7 @@
>> #include "libxl_osdeps.h"
>>
>> #include "libxl_internal.h"
>> +#include "libxl_colo.h"
>>
>> /* stream_fd is as from the caller (eventually, the application).
>> * It may be 0, 1 or 2, in which case we need to dup it elsewhere.
>> @@ -68,7 +69,11 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
>> shs->ao = ao;
>> shs->domid = domid;
>> shs->recv_callback = libxl__srm_callout_received_restore;
>> - shs->completion_callback = libxl__xc_domain_restore_done;
>> + if (dcs->restore_params.checkpointed_stream ==
>> + LIBXL_CHECKPOINTED_STREAM_COLO)
>
> This is very strange line wrap.
>
>> + shs->completion_callback = libxl__colo_restore_teardown;
>> + else
>> + shs->completion_callback = libxl__xc_domain_restore_done;
>> shs->caller_state = dcs;
>> shs->need_results = 1;
>>
>> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
>> index 5d980d9..d6bd2fe 100644
>> --- a/tools/libxl/libxl_stream_read.c
>> +++ b/tools/libxl/libxl_stream_read.c
>> @@ -846,6 +846,18 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
>> */
>> if (libxl__stream_read_inuse(stream)) {
>> switch (checkpointed_stream) {
>> + case LIBXL_CHECKPOINTED_STREAM_COLO:
>> + if (stream->completion_callback) {
>> + /*
>> + * restore, just build the secondary vm, don't close
>> + * the stream
>> + */
>> + stream->completion_callback(egc, stream, 0);
>> + } else {
>> + /* failover, just close the stream */
>> + stream_complete(egc, stream, 0);
>> + }
>> + break;
>> case LIBXL_CHECKPOINTED_STREAM_REMUS:
>> /*
>> * Failover from primary. Domain state is currently at a
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 632c009..33f4a90 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -232,6 +232,7 @@ libxl_hdtype = Enumeration("hdtype", [
>> libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
>> (0, "NONE"),
>> (1, "REMUS"),
>> + (2, "COLO"),
>> ])
>>
>> #
>> --
>> 2.5.0
>>
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>
>
> .
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
next prev parent reply other threads:[~2016-03-01 10:06 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-22 2:52 [PATCH v10 00/31] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Wen Congyang
2016-02-22 2:52 ` [PATCH v10 01/31] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Wen Congyang
2016-02-25 15:53 ` Wei Liu
2016-02-26 1:55 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 02/31] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Wen Congyang
2016-02-22 2:52 ` [PATCH v10 03/31] tools/libxl: Add back channel to allow migration target send data back Wen Congyang
2016-02-22 2:52 ` [PATCH v10 04/31] tools/libxl: Introduce new helper function dup_fd_helper() Wen Congyang
2016-02-25 15:53 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 05/31] tools/libx{l, c}: add back channel to libxc Wen Congyang
2016-02-25 15:54 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 06/31] docs: add colo readme Wen Congyang
2016-02-25 15:54 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 07/31] docs/libxl: Introduce CHECKPOINT_CONTEXT to support migration v2 colo streams Wen Congyang
2016-02-25 15:54 ` Wei Liu
2016-02-26 1:59 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 08/31] libxc/migration: Specification update for DIRTY_PFN_LIST records Wen Congyang
2016-02-22 2:52 ` [PATCH v10 09/31] libxc/migration: export read_record for common use Wen Congyang
2016-02-22 2:52 ` [PATCH v10 10/31] tools/libxl: add back channel support to write stream Wen Congyang
2016-02-25 15:54 ` Wei Liu
2016-02-26 2:11 ` Wen Congyang
2016-03-02 15:02 ` Wei Liu
2016-03-03 1:25 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 11/31] tools/libxl: write checkpoint_state records into the stream Wen Congyang
2016-02-22 2:52 ` [PATCH v10 12/31] tools/libxl: add back channel support to read stream Wen Congyang
2016-02-25 15:54 ` Wei Liu
2016-02-26 2:16 ` Wen Congyang
2016-03-02 15:03 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 13/31] tools/libxl: handle checkpoint_state records in a libxl migration v2 " Wen Congyang
2016-02-22 2:52 ` [PATCH v10 14/31] tools/libx{l, c}: introduce wait_checkpoint callback Wen Congyang
2016-02-22 2:52 ` [PATCH v10 15/31] tools/libx{l, c}: add postcopy/suspend callback to restore side Wen Congyang
2016-02-22 2:52 ` [PATCH v10 16/31] secondary vm suspend/resume/checkpoint code Wen Congyang
2016-02-25 15:56 ` Wei Liu
2016-02-26 2:30 ` Wen Congyang
2016-03-01 10:06 ` Wen Congyang [this message]
2016-02-22 2:52 ` [PATCH v10 17/31] primary " Wen Congyang
2016-02-25 15:57 ` Wei Liu
2016-02-26 2:32 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 18/31] libxc/restore: support COLO restore Wen Congyang
2016-02-25 15:57 ` Wei Liu
2016-02-26 2:33 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 19/31] libxc/restore: send dirty pfn list to primary when checkpoint under colo Wen Congyang
2016-02-22 2:52 ` [PATCH v10 20/31] send store gfn and console gfn to xl before resuming secondary vm Wen Congyang
2016-02-22 2:52 ` [PATCH v10 21/31] libxc/save: support COLO save Wen Congyang
2016-02-25 15:58 ` Wei Liu
2016-02-26 2:35 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 22/31] implement the cmdline for COLO Wen Congyang
2016-03-02 15:03 ` Wei Liu
2016-03-03 1:30 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 23/31] COLO: introduce new API to prepare/start/do/get_error/stop replication Wen Congyang
2016-03-02 15:03 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 24/31] Support colo mode for qemu disk Wen Congyang
2016-03-02 15:04 ` Wei Liu
2016-03-03 1:40 ` Wen Congyang
2016-02-22 2:52 ` [PATCH v10 25/31] COLO: use qemu block replication Wen Congyang
2016-03-02 15:03 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 26/31] COLO proxy: implement setup/teardown of COLO proxy module Wen Congyang
2016-03-02 15:04 ` Wei Liu
2016-03-11 22:25 ` Konrad Rzeszutek Wilk
2016-03-14 9:13 ` Wen Congyang
2016-03-22 3:40 ` Changlong Xie
2016-02-22 2:52 ` [PATCH v10 27/31] COLO proxy: preresume, postresume and checkpoint Wen Congyang
2016-03-02 15:04 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 28/31] COLO nic: implement COLO nic subkind Wen Congyang
2016-03-02 15:04 ` Wei Liu
2016-02-22 2:52 ` [PATCH v10 29/31] setup and control colo proxy on primary side Wen Congyang
2016-02-22 2:52 ` [PATCH v10 30/31] setup and control colo proxy on secondary side Wen Congyang
2016-02-22 2:52 ` [PATCH v10 31/31] cmdline switches and config vars to control colo-proxy Wen Congyang
2016-03-02 15:05 ` Wei Liu
2016-03-03 1:41 ` Wen Congyang
2016-02-25 16:05 ` [PATCH v10 00/31] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Wei Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56D5699E.2060002@cn.fujitsu.com \
--to=wency@cn.fujitsu.com \
--cc=andrew.cooper3@citrix.com \
--cc=eddie.dong@intel.com \
--cc=guijianfeng@cn.fujitsu.com \
--cc=hongyang.yang@easystack.cn \
--cc=ian.campbell@citrix.com \
--cc=ian.jackson@eu.citrix.com \
--cc=lars.kurth@citrix.com \
--cc=rshriram@cs.ubc.ca \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xen.org \
--cc=xiecl.fnst@cn.fujitsu.com \
--cc=yunhong.jiang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).