* [PATCH v7 COLO 01/18] docs: add colo readme
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
@ 2015-06-25 6:30 ` Yang Hongyang
2015-07-14 15:15 ` Ian Campbell
2015-06-25 6:30 ` [PATCH v7 COLO 02/18] tools/libxl: handle colo_context records in a libxl migration v2 stream Yang Hongyang
` (17 subsequent siblings)
18 siblings, 1 reply; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:30 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
add colo readme, refer to
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
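Note for reviewers: the README added below describes the core COLO decision, which is to compare the response packets of the PVM and the SVM and either release them or trigger a checkpoint. A minimal sketch of that decision follows. All names in it (colo_verdict, colo_compare) are hypothetical and are not part of libxl or of this series; the real comparison is done by a separate component and is likely more involved than a byte-wise compare.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not part of libxl or this series. */
enum colo_verdict {
    COLO_RELEASE_PACKET,     /* responses match: release the packet */
    COLO_REQUEST_CHECKPOINT  /* responses differ: checkpoint on demand */
};

/*
 * Compare one response packet from the primary VM with the matching
 * packet from the secondary VM. Identical packets are released
 * immediately; any difference asks for a VM checkpoint.
 */
static enum colo_verdict colo_compare(const void *pvm_pkt, size_t pvm_len,
                                      const void *svm_pkt, size_t svm_len)
{
    assert(pvm_pkt && svm_pkt);

    if (pvm_len == svm_len && memcmp(pvm_pkt, svm_pkt, pvm_len) == 0)
        return COLO_RELEASE_PACKET;

    return COLO_REQUEST_CHECKPOINT;
}
```

A caller would feed each pair of outgoing packets to colo_compare() and start a checkpoint as soon as it returns COLO_REQUEST_CHECKPOINT.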
docs/README.colo | 9 +++++++++
1 file changed, 9 insertions(+)
create mode 100644 docs/README.colo
diff --git a/docs/README.colo b/docs/README.colo
new file mode 100644
index 0000000..466eb72
--- /dev/null
+++ b/docs/README.colo
@@ -0,0 +1,9 @@
+COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
+project is a high availability solution. Both primary VM (PVM) and secondary VM
+(SVM) run in parallel. They receive the same request from client, and generate
+response in parallel too. If the response packets from PVM and SVM are
+identical, they are released immediately. Otherwise, a VM checkpoint (on demand)
+is conducted.
+
+See the website at http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
+for details.
--
1.9.1
* Re: [PATCH v7 COLO 01/18] docs: add colo readme
2015-06-25 6:30 ` [PATCH v7 COLO 01/18] docs: add colo readme Yang Hongyang
@ 2015-07-14 15:15 ` Ian Campbell
0 siblings, 0 replies; 24+ messages in thread
From: Ian Campbell @ 2015-07-14 15:15 UTC (permalink / raw)
To: Yang Hongyang
Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
xen-devel, guijianfeng, rshriram, ian.jackson
On Thu, 2015-06-25 at 14:30 +0800, Yang Hongyang wrote:
> add colo readme, refer to
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
* [PATCH v7 COLO 02/18] tools/libxl: handle colo_context records in a libxl migration v2 stream
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
2015-06-25 6:30 ` [PATCH v7 COLO 01/18] docs: add colo readme Yang Hongyang
@ 2015-06-25 6:30 ` Yang Hongyang
2015-07-14 15:19 ` Ian Campbell
2015-06-25 6:30 ` [PATCH v7 COLO 03/18] tools/libxl: write colo_context records into the stream Yang Hongyang
` (16 subsequent siblings)
18 siblings, 1 reply; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:30 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
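Note on the new stream state: in_colo_context marks the window in which the reader has been asked for a colo_context record. A COLO_CONTEXT record arriving outside that window is an error. Inside it, the record's id is handed to the records callback and the flag is cleared. The sketch below models only that gating. The sketch_* names and the error value are assumptions for illustration, not libxl definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ERROR_FAIL (-3) /* hypothetical negative error code */

/* Reduced model of the reader state; field names mirror the patch. */
struct sketch_reader {
    bool running;
    bool in_checkpoint;
    bool in_colo_context;
    int last_result; /* value the records callback would receive */
};

/* Models libxl__stream_read_colo_context(): open the window. */
static void sketch_expect_colo_context(struct sketch_reader *r)
{
    assert(r->running);
    assert(!r->in_checkpoint);
    assert(!r->in_colo_context);
    r->in_colo_context = true;
}

/* Models the REC_TYPE_COLO_CONTEXT case in process_record(). */
static int sketch_on_colo_context_record(struct sketch_reader *r, uint32_t id)
{
    if (!r->in_colo_context)
        return SKETCH_ERROR_FAIL; /* unexpected record in the stream */

    /* Models colo_context_done(): close the window, report the id. */
    r->in_colo_context = false;
    r->last_result = (int)id;
    return 0;
}
```

In the patch itself the id is passed as the rc argument of read_records_callback, which is why stream_failed() asserts rc < 0 before taking the same path on error.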
tools/libxl/libxl_internal.h | 3 +++
tools/libxl/libxl_stream_read.c | 51 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 840734d..7f591ee 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3249,6 +3249,7 @@ struct libxl__stream_read_state {
int joined_rc;
bool running;
bool in_checkpoint;
+ bool in_colo_context;
libxl__datacopier_state dc;
size_t expected_len;
libxl_sr_hdr hdr;
@@ -3263,6 +3264,8 @@ _hidden void libxl__stream_read_continue(libxl__egc *egc,
libxl__stream_read_state *stream);
_hidden void libxl__stream_read_start_checkpoint(
libxl__egc *egc, libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_colo_context(
+ libxl__egc *egc, libxl__stream_read_state *stream);
_hidden void libxl__stream_read_abort(libxl__egc *egc,
libxl__stream_read_state *stream, int rc);
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 72a9972..d1c2d20 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -93,6 +93,13 @@ static void emulator_padding_done(libxl__egc *egc,
static void checkpoint_done(libxl__egc *egc,
libxl__stream_read_state *stream, int rc);
+static void handle_colo_context(libxl__egc *egc,
+ libxl__stream_read_state *stream);
+
+/* Error handling for colo context mini-loop */
+static void colo_context_done(libxl__egc *egc,
+ libxl__stream_read_state *stream, int rc);
+
void libxl__stream_read_start(libxl__egc *egc,
libxl__stream_read_state *stream)
{
@@ -190,6 +197,7 @@ void libxl__stream_read_start_checkpoint(libxl__egc *egc,
assert(stream->running);
assert(!stream->in_checkpoint);
assert(!stream->back_channel);
+ assert(!stream->in_colo_context);
stream->in_checkpoint = true;
/* Read a record header. */
@@ -211,6 +219,17 @@ void libxl__stream_read_start_checkpoint(libxl__egc *egc,
stream_failed(egc, stream, ret);
}
+void libxl__stream_read_colo_context(libxl__egc *egc,
+ libxl__stream_read_state *stream)
+{
+ assert(stream->running);
+ assert(!stream->in_checkpoint);
+ assert(!stream->in_colo_context);
+ stream->in_colo_context = true;
+
+ libxl__stream_read_continue(egc, stream);
+}
+
void libxl__stream_read_abort(libxl__egc *egc,
libxl__stream_read_state *stream, int rc)
{
@@ -240,6 +259,12 @@ static void stream_failed(libxl__egc *egc,
return;
}
+ if (stream->in_colo_context) {
+ assert(rc < 0);
+ colo_context_done(egc, stream, rc);
+ return;
+ }
+
if (stream->back_channel) {
stream->completion_callback(egc, stream, rc);
return;
@@ -259,6 +284,7 @@ static void stream_done(libxl__egc *egc,
assert(!stream->running);
assert(!stream->in_checkpoint);
assert(!stream->back_channel);
+ assert(!stream->in_colo_context);
if (stream->v2_carefd)
libxl__carefd_close(stream->v2_carefd);
@@ -531,6 +557,15 @@ static void process_record(libxl__egc *egc,
checkpoint_done(egc, stream, 0);
break;
+ case REC_TYPE_COLO_CONTEXT:
+ if (!stream->in_colo_context) {
+ LOG(ERROR, "Unexpected COLO_CONTEXT record in stream");
+ ret = ERROR_FAIL;
+ goto err;
+ }
+ handle_colo_context(egc, stream);
+ break;
+
default:
LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
ret = ERROR_FAIL;
@@ -671,6 +706,14 @@ static void emulator_padding_done(libxl__egc *egc,
stream_failed(egc, stream, ret);
}
+static void handle_colo_context(libxl__egc *egc,
+ libxl__stream_read_state *stream)
+{
+ libxl_sr_colo_context *colo_context = stream->rec_body;
+
+ colo_context_done(egc, stream, colo_context->id);
+}
+
static void checkpoint_done(libxl__egc *egc,
libxl__stream_read_state *stream, int rc)
{
@@ -679,6 +722,14 @@ static void checkpoint_done(libxl__egc *egc,
stream->read_records_callback(egc, stream, rc);
}
+static void colo_context_done(libxl__egc *egc,
+ libxl__stream_read_state *stream, int rc)
+{
+ assert(stream->in_colo_context);
+ stream->in_colo_context = false;
+ stream->read_records_callback(egc, stream, rc);
+}
+
/*
* Local variables:
* mode: C
--
1.9.1
* Re: [PATCH v7 COLO 02/18] tools/libxl: handle colo_context records in a libxl migration v2 stream
2015-06-25 6:30 ` [PATCH v7 COLO 02/18] tools/libxl: handle colo_context records in a libxl migration v2 stream Yang Hongyang
@ 2015-07-14 15:19 ` Ian Campbell
2015-07-15 0:34 ` Yang Hongyang
0 siblings, 1 reply; 24+ messages in thread
From: Ian Campbell @ 2015-07-14 15:19 UTC (permalink / raw)
To: Yang Hongyang
Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
xen-devel, guijianfeng, rshriram, ian.jackson
On Thu, 2015-06-25 at 14:30 +0800, Yang Hongyang wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
> tools/libxl/libxl_internal.h | 3 +++
> tools/libxl/libxl_stream_read.c | 51 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 54 insertions(+)
This patch is certainly not so trivial that it can get away with a
subject only and no commit body.
The same goes for patch #3.
In particular both of them add some new stream state which is not
described anywhere.
Ian.
* Re: [PATCH v7 COLO 02/18] tools/libxl: handle colo_context records in a libxl migration v2 stream
2015-07-14 15:19 ` Ian Campbell
@ 2015-07-15 0:34 ` Yang Hongyang
0 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-07-15 0:34 UTC (permalink / raw)
To: Ian Campbell
Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
xen-devel, guijianfeng, rshriram, ian.jackson
Hi Ian,
Thanks for the review.
On 07/14/2015 11:19 PM, Ian Campbell wrote:
> On Thu, 2015-06-25 at 14:30 +0800, Yang Hongyang wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>> tools/libxl/libxl_internal.h | 3 +++
>> tools/libxl/libxl_stream_read.c | 51 +++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 54 insertions(+)
>
> This patch is certainly not so trivial that it can get away with a
> subject only and no commit body.
>
> The same goes for patch #3.
>
> In particular both of them add some new stream state which is not
> described anywhere.
This patch has changed a lot due to the migration v2 changes. I've added a
commit message in v8, and I will add more description as you suggested.
>
> Ian.
>
> .
>
--
Thanks,
Yang.
* [PATCH v7 COLO 03/18] tools/libxl: write colo_context records into the stream
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
2015-06-25 6:30 ` [PATCH v7 COLO 01/18] docs: add colo readme Yang Hongyang
2015-06-25 6:30 ` [PATCH v7 COLO 02/18] tools/libxl: handle colo_context records in a libxl migration v2 stream Yang Hongyang
@ 2015-06-25 6:30 ` Yang Hongyang
2015-06-25 6:30 ` [PATCH v7 COLO 04/18] secondary vm suspend/resume/checkpoint code Yang Hongyang
` (15 subsequent siblings)
18 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:30 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
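Note on the record layout: write_colo_context() emits a record header, then the colo_context body, then zero padding up to the record alignment. The sketch below frames such a record into a flat buffer. It assumes an 8-byte alignment (order 3), a header of two 32-bit fields (type, length) and a body holding a single 32-bit id. These layouts and all sketch_* names are assumptions for illustration, not the definitions from libxl_sr_stream_format.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ALIGN_ORDER 3u /* assumption: records are 8-byte aligned */

struct sketch_rec_hdr { uint32_t type; uint32_t length; }; /* assumed */
struct sketch_colo_context { uint32_t id; };               /* assumed */

/* Round val up to the next multiple of (1 << order). */
static size_t sketch_roundup(size_t val, unsigned order)
{
    size_t mask = ((size_t)1 << order) - 1;
    return (val + mask) & ~mask;
}

/*
 * Write header + body + zero padding into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t sketch_frame_record(uint32_t type,
                                  const void *body, uint32_t body_len,
                                  uint8_t *buf, size_t buf_len)
{
    struct sketch_rec_hdr hdr = { type, body_len };
    size_t padded = sketch_roundup(body_len, SKETCH_ALIGN_ORDER);
    size_t total = sizeof(hdr) + padded;

    assert(buf);
    assert(body || body_len == 0);

    if (total > buf_len)
        return 0;

    memcpy(buf, &hdr, sizeof(hdr));
    if (body_len)
        memcpy(buf + sizeof(hdr), body, body_len);
    /* The header length field excludes this padding. */
    memset(buf + sizeof(hdr) + body_len, 0, padded - body_len);

    return total;
}
```

With a 4-byte body the frame is 16 bytes under these assumptions: 8 for the header, 4 for the id and 4 of padding.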
tools/libxl/libxl_internal.h | 4 ++
tools/libxl/libxl_stream_write.c | 92 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 96 insertions(+)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7f591ee..1ef6fc8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2905,6 +2905,7 @@ struct libxl__stream_write_state {
size_t padding;
bool running;
bool in_checkpoint;
+ bool in_colo_context;
libxl__datacopier_state dc;
};
@@ -2913,6 +2914,9 @@ _hidden void libxl__stream_write_start(libxl__egc *egc,
_hidden void libxl__stream_write_start_checkpoint(
libxl__egc *egc, libxl__stream_write_state *stream);
+_hidden void libxl__stream_write_colo_context(
+ libxl__egc *egc, libxl__stream_write_state *stream,
+ libxl_sr_colo_context *colo_context);
_hidden void libxl__stream_write_abort(libxl__egc *egc,
libxl__stream_write_state *stream,
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 3f981f0..a445189 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -107,6 +107,15 @@ static void checkpoint_end_record_done(libxl__egc *egc,
libxl__datacopier_state *dc,
int onwrite, int errnoval);
+static void write_colo_context(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ libxl_sr_colo_context *colo_context);
+static void write_colo_context_done(libxl__egc *egc,
+ libxl__datacopier_state *dc,
+ int onwrite, int errnoval);
+static void colo_context_done(libxl__egc *egc,
+ libxl__stream_write_state *stream, int rc);
+
void libxl__stream_write_start(libxl__egc *egc,
libxl__stream_write_state *stream)
{
@@ -154,11 +163,24 @@ void libxl__stream_write_start_checkpoint(libxl__egc *egc,
assert(stream->running);
assert(!stream->in_checkpoint);
assert(!stream->back_channel);
+ assert(!stream->in_colo_context);
stream->in_checkpoint = true;
write_toolstack_record(egc, stream);
}
+void libxl__stream_write_colo_context(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ libxl_sr_colo_context *colo_context)
+{
+ assert(stream->running);
+ assert(!stream->in_checkpoint);
+ assert(!stream->in_colo_context);
+ stream->in_colo_context = true;
+
+ write_colo_context(egc, stream, colo_context);
+}
+
void libxl__stream_write_abort(libxl__egc *egc,
libxl__stream_write_state *stream, int rc)
{
@@ -171,6 +193,7 @@ static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
stream->running = false;
assert(!stream->in_checkpoint);
+ assert(!stream->in_colo_context);
stream_done(egc, stream);
}
@@ -189,6 +212,11 @@ static void stream_failed(libxl__egc *egc,
return;
}
+ if (stream->in_colo_context) {
+ colo_context_done(egc, stream, rc);
+ return;
+ }
+
if (stream->back_channel) {
stream->completion_callback(egc, stream, rc);
return;
@@ -207,6 +235,7 @@ static void stream_done(libxl__egc *egc,
assert(!stream->running);
assert(!stream->in_checkpoint);
+ assert(!stream->in_colo_context);
check_stream_finished(egc, dss, stream->rc, "stream");
}
@@ -544,6 +573,61 @@ static void emulator_padding_done(libxl__egc *egc,
stream_failed(egc, stream, ret);
}
+static void write_colo_context(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ libxl_sr_colo_context *colo_context)
+{
+ libxl__datacopier_state *dc = &stream->dc;
+ STATE_AO_GC(stream->ao);
+ struct libxl_sr_rec_hdr rec = { REC_TYPE_COLO_CONTEXT, 0 };
+ int ret = 0;
+ uint32_t padding_len;
+
+ dc->copywhat = "colo context record";
+ dc->writewhat = "save/migration stream";
+ dc->callback = write_colo_context_done;
+
+ ret = libxl__datacopier_start(dc);
+ if (ret)
+ goto err;
+
+ rec.length = sizeof(*colo_context);
+
+ libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+ libxl__datacopier_prefixdata(egc, dc, colo_context, rec.length);
+
+ padding_len = ROUNDUP(rec.length, REC_ALIGN_ORDER) - rec.length;
+ if (padding_len)
+ libxl__datacopier_prefixdata(egc, dc, zero_padding, padding_len);
+
+ return;
+
+ err:
+ assert(ret);
+ stream_failed(egc, stream, ret);
+}
+
+static void write_colo_context_done(libxl__egc *egc,
+ libxl__datacopier_state *dc,
+ int onwrite, int errnoval)
+{
+ libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+ STATE_AO_GC(stream->ao);
+ int ret = 0;
+
+ if (onwrite || errnoval) {
+ ret = ERROR_FAIL;
+ goto err;
+ }
+
+ colo_context_done(egc, stream, 0);
+ return;
+
+ err:
+ assert(ret);
+ stream_failed(egc, stream, ret);
+}
+
static void write_end_record(libxl__egc *egc,
libxl__stream_write_state *stream)
{
@@ -645,6 +729,14 @@ static void checkpoint_end_record_done(libxl__egc *egc,
stream_failed(egc, stream, ret);
}
+static void colo_context_done(libxl__egc *egc,
+ libxl__stream_write_state *stream, int rc)
+{
+ assert(stream->in_colo_context);
+ stream->in_colo_context = false;
+ stream->write_records_callback(egc, stream, rc);
+}
+
/*
* Local variables:
* mode: C
--
1.9.1
* [PATCH v7 COLO 04/18] secondary vm suspend/resume/checkpoint code
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
` (2 preceding siblings ...)
2015-06-25 6:30 ` [PATCH v7 COLO 03/18] tools/libxl: write colo_context records into the stream Yang Hongyang
@ 2015-06-25 6:30 ` Yang Hongyang
2015-06-25 6:30 ` [PATCH v7 COLO 05/18] primary " Yang Hongyang
` (14 subsequent siblings)
18 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:30 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
The secondary vm runs in colo mode, so we repeat the following
steps for every checkpoint:
1. Resume secondary vm
a. Send LIBXL_COLO_SVM_READY to master.
b. If it is not the first resume, call libxl__checkpoint_devices_preresume().
c. If it is the first resume(resume right after live migration),
- call libxl__xc_domain_restore_done() to build the secondary vm.
- enable secondary vm's logdirty.
- call libxl__domain_resume() to resume secondary vm.
- call libxl__checkpoint_devices_setup() to setup checkpoint devices.
d. Send LIBXL_COLO_SVM_RESUMED to master.
2. Wait for a new checkpoint
a. Call libxl__checkpoint_devices_commit().
b. Read LIBXL_COLO_NEW_CHECKPOINT from master.
3. Suspend secondary vm
a. Suspend secondary vm.
b. Call libxl__checkpoint_devices_postsuspend().
c. Send LIBXL_COLO_SVM_SUSPENDED to master.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
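Note on the loop in the commit message: the three steps form a small cycle driven by messages exchanged with the master. The sketch below models only the message ordering. The message values match the enum added in libxl_colo.h; the phase names, sketch_step() and the log structure are hypothetical, and the real code is asynchronous and callback-driven rather than a single step function.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as the enum in libxl_colo.h, renamed for the sketch. */
enum sketch_msg {
    SKETCH_NEW_CHECKPOINT = 1,
    SKETCH_SVM_SUSPENDED,
    SKETCH_SVM_READY,
    SKETCH_SVM_RESUMED
};

/* Hypothetical phases matching steps 1 to 3 of the commit message. */
enum sketch_phase {
    SKETCH_RESUME,
    SKETCH_WAIT,
    SKETCH_SUSPEND,
    SKETCH_FAILED
};

/* Records what the secondary would send to the master. */
struct sketch_log {
    int sent[8];
    size_t n;
};

static void sketch_send(struct sketch_log *log, int msg)
{
    assert(log->n < sizeof(log->sent) / sizeof(log->sent[0]));
    log->sent[log->n++] = msg;
}

/* Run one phase and return the next one. */
static enum sketch_phase sketch_step(enum sketch_phase phase,
                                     int msg_from_master,
                                     struct sketch_log *log)
{
    switch (phase) {
    case SKETCH_RESUME:
        sketch_send(log, SKETCH_SVM_READY);
        /* device preresume, or first-time build and resume, goes here */
        sketch_send(log, SKETCH_SVM_RESUMED);
        return SKETCH_WAIT;
    case SKETCH_WAIT:
        /* device commit goes here, then wait for the master */
        return msg_from_master == SKETCH_NEW_CHECKPOINT
               ? SKETCH_SUSPEND : SKETCH_FAILED;
    case SKETCH_SUSPEND:
        /* suspend the vm and run device postsuspend here */
        sketch_send(log, SKETCH_SVM_SUSPENDED);
        return SKETCH_RESUME;
    default:
        return SKETCH_FAILED;
    }
}
```

One full cycle sends SVM_READY, SVM_RESUMED and SVM_SUSPENDED in that order, with NEW_CHECKPOINT from the master as the only trigger for leaving the wait phase.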
tools/libxl/Makefile | 1 +
tools/libxl/libxl_colo.h | 38 ++
tools/libxl/libxl_colo_restore.c | 989 +++++++++++++++++++++++++++++++++++++++
tools/libxl/libxl_create.c | 111 ++++-
tools/libxl/libxl_internal.h | 19 +
tools/libxl/libxl_save_callout.c | 7 +-
6 files changed, 1163 insertions(+), 2 deletions(-)
create mode 100644 tools/libxl/libxl_colo.h
create mode 100644 tools/libxl/libxl_colo_restore.c
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 2f4efd4..66ae63d 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -57,6 +57,7 @@ LIBXL_OBJS-y += libxl_nonetbuffer.o
endif
LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_colo_restore.o
LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
new file mode 100644
index 0000000..91df275
--- /dev/null
+++ b/tools/libxl/libxl_colo.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#ifndef LIBXL_COLO_H
+#define LIBXL_COLO_H
+
+/*
+ * values to control suspend/resume primary vm and secondary vm
+ * at the same time
+ */
+enum {
+ LIBXL_COLO_NEW_CHECKPOINT = 1,
+ LIBXL_COLO_SVM_SUSPENDED,
+ LIBXL_COLO_SVM_READY,
+ LIBXL_COLO_SVM_RESUMED,
+};
+
+extern void libxl__colo_restore_done(libxl__egc *egc, void *dcs_void,
+ int ret, int retval, int errnoval);
+extern void libxl__colo_restore_setup(libxl__egc *egc,
+ libxl__colo_restore_state *crs);
+extern void libxl__colo_restore_teardown(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc);
+
+#endif
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
new file mode 100644
index 0000000..40fd170
--- /dev/null
+++ b/tools/libxl/libxl_colo_restore.c
@@ -0,0 +1,989 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang <wency@cn.fujitsu.com>
+ * Yang Hongyang <yanghy@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+#include "libxl_colo.h"
+#include "libxl_sr_stream_format.h"
+
+enum {
+ LIBXL_COLO_SETUPED,
+ LIBXL_COLO_SUSPENDED,
+ LIBXL_COLO_RESUMED,
+};
+
+typedef struct libxl__colo_restore_checkpoint_state libxl__colo_restore_checkpoint_state;
+struct libxl__colo_restore_checkpoint_state {
+ libxl__domain_suspend_state dsps;
+ libxl__logdirty_switch lds;
+ libxl__colo_restore_state *crs;
+ libxl__stream_write_state sws;
+ int status;
+ bool preresume;
+ /* used for teardown */
+ int teardown_devices;
+ int saved_rc;
+
+ void (*callback)(libxl__egc *,
+ libxl__colo_restore_checkpoint_state *,
+ int);
+};
+
+
+static void libxl__colo_restore_domain_resume_callback(void *data);
+static void libxl__colo_restore_domain_checkpoint_callback(void *data);
+static void libxl__colo_restore_domain_should_checkpoint_callback(void *data);
+static void libxl__colo_restore_domain_suspend_callback(void *data);
+
+static const libxl__checkpoint_device_instance_ops *colo_restore_ops[] = {
+ NULL,
+};
+
+/* ===================== colo: common functions ===================== */
+static void colo_enable_logdirty(libxl__colo_restore_state *crs, libxl__egc *egc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ /* Convenience aliases */
+ const uint32_t domid = crs->domid;
+ libxl__logdirty_switch *const lds = &crcs->lds;
+
+ STATE_AO_GC(crs->ao);
+
+ /* we need to know which pages are dirty to restore the guest */
+ if (xc_shadow_control(CTX->xch, domid,
+ XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY,
+ NULL, 0, NULL, 0, NULL) < 0) {
+ LOG(ERROR, "cannot enable secondary vm's logdirty");
+ lds->callback(egc, lds, ERROR_FAIL);
+ return;
+ }
+
+ if (crs->hvm) {
+ libxl__domain_common_switch_qemu_logdirty(egc, domid, 1, lds);
+ return;
+ }
+
+ lds->callback(egc, lds, 0);
+}
+
+static void colo_disable_logdirty(libxl__colo_restore_state *crs,
+ libxl__egc *egc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ /* Convenience aliases */
+ const uint32_t domid = crs->domid;
+ libxl__logdirty_switch *const lds = &crcs->lds;
+
+ STATE_AO_GC(crs->ao);
+
+ /* dirty page tracking is no longer needed, so turn logdirty off */
+ if (xc_shadow_control(CTX->xch, domid, XEN_DOMCTL_SHADOW_OP_OFF,
+ NULL, 0, NULL, 0, NULL) < 0)
+ LOG(WARN, "cannot disable secondary vm's logdirty");
+
+ if (crs->hvm) {
+ libxl__domain_common_switch_qemu_logdirty(egc, domid, 0, lds);
+ return;
+ }
+
+ lds->callback(egc, lds, 0);
+}
+
+static void colo_resume_vm(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int restore_device_model)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+ int rc;
+
+ /* Convenience aliases */
+ libxl__colo_restore_state *const crs = crcs->crs;
+
+ STATE_AO_GC(crs->ao);
+
+ if (!crs->saved_cb) {
+ /* TODO: sync mmu for hvm? */
+ if (restore_device_model) {
+ rc = libxl__domain_restore(gc, crs->domid);
+ if (rc) {
+ LOG(ERROR, "cannot restore device model for secondary vm");
+ crcs->callback(egc, crcs, rc);
+ return;
+ }
+ }
+ rc = libxl__domain_resume(gc, crs->domid, 0);
+ if (rc)
+ LOG(ERROR, "cannot resume secondary vm");
+
+ crcs->callback(egc, crcs, rc);
+ return;
+ }
+
+ /*
+ * TODO: get store mfn and console mfn
+ * We should call the callback restore_results in
+ * xc_domain_restore() before resuming the guest.
+ */
+ libxl__xc_domain_restore_done(egc, dcs, 0, 0, 0);
+
+ return;
+}
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+ /* init device subkind-specific state in the libxl ctx */
+ int rc;
+ STATE_AO_GC(cds->ao);
+
+ rc = 0;
+ return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+ /* cleanup device subkind-specific state in the libxl ctx */
+ STATE_AO_GC(cds->ao);
+}
+
+
+/* ================ colo: setup restore environment ================ */
+static void libxl__colo_domain_create_cb(libxl__egc *egc,
+ libxl__domain_create_state *dcs,
+ int rc, uint32_t domid);
+
+static int init_dsps(libxl__domain_suspend_state *dsps)
+{
+ int rc = ERROR_FAIL;
+ libxl_domain_type type;
+
+ STATE_AO_GC(dsps->ao);
+
+ type = libxl__domain_type(gc, dsps->domid);
+ if (type == LIBXL_DOMAIN_TYPE_INVALID)
+ goto out;
+
+ libxl__xswait_init(&dsps->pvcontrol);
+ libxl__ev_evtchn_init(&dsps->guest_evtchn);
+ libxl__ev_xswatch_init(&dsps->guest_watch);
+ libxl__ev_time_init(&dsps->guest_timeout);
+
+ if (type == LIBXL_DOMAIN_TYPE_HVM)
+ dsps->hvm = 1;
+ else
+ dsps->hvm = 0;
+
+ dsps->guest_evtchn.port = -1;
+ dsps->guest_evtchn_lockfd = -1;
+ dsps->guest_responded = 0;
+ dsps->dm_savefile = libxl__device_model_savefile(gc, dsps->domid);
+
+ /* Secondary vm is not created, so we cannot get evtchn port */
+
+ rc = 0;
+
+out:
+ return rc;
+}
+
+void libxl__colo_restore_setup(libxl__egc *egc,
+ libxl__colo_restore_state *crs)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs;
+ int rc = ERROR_FAIL;
+
+ /* Convenience aliases */
+ libxl__srm_restore_autogen_callbacks *const callbacks =
+ &dcs->shs.callbacks.restore.a;
+ const int domid = crs->domid;
+
+ STATE_AO_GC(crs->ao);
+
+ GCNEW(crcs);
+ crs->crcs = crcs;
+ crcs->crs = crs;
+
+ /* setup dsps */
+ crcs->dsps.ao = ao;
+ crcs->dsps.domid = domid;
+ if (init_dsps(&crcs->dsps))
+ goto err;
+
+ callbacks->suspend = libxl__colo_restore_domain_suspend_callback;
+ callbacks->postcopy = libxl__colo_restore_domain_resume_callback;
+ callbacks->checkpoint = libxl__colo_restore_domain_checkpoint_callback;
+ callbacks->should_checkpoint = libxl__colo_restore_domain_should_checkpoint_callback;
+
+ /*
+ * Secondary vm is running in colo mode, so we need to call
+ * libxl__xc_domain_restore_done() to create secondary vm.
+ * But we will exit in domain_create_cb(). So replace the
+ * callback here.
+ */
+ crs->saved_cb = dcs->callback;
+ dcs->callback = libxl__colo_domain_create_cb;
+ crcs->status = LIBXL_COLO_SETUPED;
+
+ libxl__logdirty_init(&crcs->lds);
+ crcs->lds.ao = ao;
+
+ crcs->sws.fd = crs->send_fd;
+ crcs->sws.ao = ao;
+ crcs->sws.back_channel = true;
+
+ libxl__stream_write_start(egc, &crcs->sws);
+
+ rc = 0;
+
+out:
+ crs->callback(egc, crs, rc);
+ return;
+
+err:
+ goto out;
+}
+
+static void libxl__colo_domain_create_cb(libxl__egc *egc,
+ libxl__domain_create_state *dcs,
+ int rc, uint32_t domid)
+{
+ libxl__colo_restore_checkpoint_state *crcs = dcs->crs.crcs;
+
+ crcs->callback(egc, crcs, rc);
+}
+
+
+/* ================ colo: teardown restore environment ================ */
+static void colo_restore_teardown_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void do_failover_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state* crcs,
+ int rc);
+static void colo_disable_logdirty_done(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc);
+
+static void do_failover(libxl__egc *egc, libxl__colo_restore_state *crs)
+{
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ /* Convenience aliases */
+ const int status = crcs->status;
+ libxl__logdirty_switch *const lds = &crcs->lds;
+
+ STATE_AO_GC(crs->ao);
+
+ switch(status) {
+ case LIBXL_COLO_SETUPED:
+ /* We don't enable logdirty now */
+ colo_resume_vm(egc, crcs, 0);
+ return;
+ case LIBXL_COLO_SUSPENDED:
+ case LIBXL_COLO_RESUMED:
+ /* disable logdirty first */
+ lds->callback = colo_disable_logdirty_done;
+ colo_disable_logdirty(crs, egc);
+ return;
+ default:
+ LOG(ERROR, "invalid status: %d", status);
+ crcs->callback(egc, crcs, ERROR_FAIL);
+ }
+}
+
+void libxl__colo_restore_teardown(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc)
+{
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ EGC_GC;
+
+ /* TODO: abort the stream if it is in use. */
+
+ crcs->saved_rc = rc;
+ if (!crcs->teardown_devices) {
+ colo_restore_teardown_done(egc, &crs->cds, 0);
+ return;
+ }
+
+ crs->cds.callback = colo_restore_teardown_done;
+ libxl__checkpoint_devices_teardown(egc, &crs->cds);
+}
+
+static void colo_restore_teardown_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+
+ EGC_GC;
+
+ if (rc)
+ LOG(ERROR, "COLO: failed to teardown device for guest with domid %u,"
+ " rc %d", cds->domid, rc);
+
+ if (crcs->teardown_devices)
+ cleanup_device_subkind(cds);
+
+ rc = crcs->saved_rc;
+ if (!rc) {
+ crcs->callback = do_failover_done;
+ do_failover(egc, crs);
+ return;
+ }
+
+ if (crs->saved_cb) {
+ dcs->callback = crs->saved_cb;
+ crs->saved_cb = NULL;
+ }
+ crs->callback(egc, crs, rc);
+}
+
+static void do_failover_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state* crcs,
+ int rc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+
+ /* Convenience aliases */
+ libxl__colo_restore_state *const crs = crcs->crs;
+
+ STATE_AO_GC(crs->ao);
+
+ if (rc)
+ LOG(ERROR, "cannot do failover");
+
+ if (crs->saved_cb) {
+ dcs->callback = crs->saved_cb;
+ crs->saved_cb = NULL;
+ }
+
+ crs->callback(egc, crs, rc);
+}
+
+static void colo_disable_logdirty_done(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc)
+{
+ libxl__colo_restore_checkpoint_state *crcs = CONTAINER_OF(lds, *crcs, lds);
+
+ STATE_AO_GC(lds->ao);
+
+ if (rc)
+ LOG(WARN, "cannot disable logdirty");
+
+ if (crcs->status == LIBXL_COLO_SUSPENDED) {
+ /*
+ * failover when reading state from master, so no need to
+ * call libxl__domain_restore().
+ */
+ colo_resume_vm(egc, crcs, 0);
+ return;
+ }
+
+ /* If we cannot disable logdirty, we still can do failover */
+ crcs->callback(egc, crcs, 0);
+}
+
+/*
+ * checkpoint callbacks are called in the following order:
+ * 1. resume
+ * 2. should_checkpoint
+ * 3. suspend
+ * 4. checkpoint
+ */
+static void colo_common_write_stream_done(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ int rc);
+static void colo_common_read_stream_done(libxl__egc *egc,
+ libxl__stream_read_state *stream,
+ int rc);
+/* ===================== colo: resume secondary vm ===================== */
+/*
+ * Do the following things when resuming secondary vm the first time:
+ * 1. resume secondary vm
+ * 2. enable log dirty
+ * 3. setup checkpoint devices
+ * 4. write LIBXL_COLO_SVM_READY
+ * 5. unpause secondary vm
+ * 6. write LIBXL_COLO_SVM_RESUMED
+ *
+ * Do the following things when resuming secondary vm:
+ * 1. write LIBXL_COLO_SVM_READY
+ * 2. resume secondary vm
+ * 3. write LIBXL_COLO_SVM_RESUMED
+ */
+static void colo_send_svm_ready(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs);
+static void colo_send_svm_ready_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int rc);
+static void colo_restore_preresume_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_restore_resume_vm(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs);
+static void colo_resume_vm_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int rc);
+static void colo_write_svm_resumed(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs);
+static void colo_enable_logdirty_done(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int retval);
+static void colo_reenable_logdirty(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc);
+static void colo_reenable_logdirty_done(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc);
+static void colo_setup_checkpoint_devices(libxl__egc *egc,
+ libxl__colo_restore_state *crs);
+static void colo_restore_setup_cds_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_unpause_svm(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs);
+
+static void libxl__colo_restore_domain_resume_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+ libxl__colo_restore_checkpoint_state *crcs = dcs->crs.crcs;
+
+ if (crcs->teardown_devices)
+ colo_send_svm_ready(shs->egc, crcs);
+ else
+ colo_restore_resume_vm(shs->egc, crcs);
+}
+
+static void colo_send_svm_ready(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs)
+{
+ libxl_sr_colo_context colo_context = { .id = COLO_SVM_READY };
+
+ crcs->callback = colo_send_svm_ready_done;
+ crcs->sws.write_records_callback = colo_common_write_stream_done;
+ libxl__stream_write_colo_context(egc, &crcs->sws, &colo_context);
+}
+
+static void colo_send_svm_ready_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int rc)
+{
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *cds = &crcs->crs->cds;
+
+ if (!crcs->preresume) {
+ crcs->preresume = true;
+ colo_unpause_svm(egc, crcs);
+ return;
+ }
+
+ cds->callback = colo_restore_preresume_cb;
+ libxl__checkpoint_devices_preresume(egc, cds);
+}
+
+static void colo_restore_preresume_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ /* Convenience aliases */
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(crs->ao);
+
+ if (rc) {
+ LOG(ERROR, "preresume fails");
+ goto out;
+ }
+
+ colo_restore_resume_vm(egc, crcs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+static void colo_restore_resume_vm(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs)
+{
+
+ crcs->callback = colo_resume_vm_done;
+ colo_resume_vm(egc, crcs, 1);
+}
+
+static void colo_resume_vm_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int rc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+
+ /* Convenience aliases */
+ libxl__colo_restore_state *const crs = crcs->crs;
+ libxl__logdirty_switch *const lds = &crcs->lds;
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(crs->ao);
+
+ if (rc) {
+ LOG(ERROR, "cannot resume secondary vm");
+ goto out;
+ }
+
+ crcs->status = LIBXL_COLO_RESUMED;
+
+ /* avoid calling libxl__xc_domain_restore_done() more than once */
+ if (crs->saved_cb) {
+ dcs->callback = crs->saved_cb;
+ crs->saved_cb = NULL;
+
+ lds->callback = colo_enable_logdirty_done;
+ colo_enable_logdirty(crs, egc);
+ return;
+ }
+
+ colo_write_svm_resumed(egc, crcs);
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+static void colo_write_svm_resumed(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs)
+{
+ libxl_sr_colo_context colo_context = { .id = COLO_SVM_RESUMED };
+
+ crcs->callback = NULL;
+ crcs->sws.write_records_callback = colo_common_write_stream_done;
+ libxl__stream_write_colo_context(egc, &crcs->sws, &colo_context);
+}
+
+static void colo_enable_logdirty_done(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc)
+{
+ libxl__colo_restore_checkpoint_state *crcs = CONTAINER_OF(lds, *crcs, lds);
+
+ /* Convenience aliases */
+ libxl__colo_restore_state *const crs = crcs->crs;
+
+ STATE_AO_GC(crs->ao);
+
+ if (rc) {
+ /*
+ * log-dirty already enabled? There's no test op,
+ * so attempt to disable then reenable it
+ */
+ lds->callback = colo_reenable_logdirty;
+ colo_disable_logdirty(crs, egc);
+ return;
+ }
+
+ colo_setup_checkpoint_devices(egc, crs);
+}
+
+static void colo_reenable_logdirty(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc)
+{
+ libxl__colo_restore_checkpoint_state *crcs = CONTAINER_OF(lds, *crcs, lds);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+
+ /* Convenience aliases */
+ libxl__colo_restore_state *const crs = crcs->crs;
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(crs->ao);
+
+ if (rc) {
+ LOG(ERROR, "cannot enable logdirty");
+ goto out;
+ }
+
+ lds->callback = colo_reenable_logdirty_done;
+ colo_enable_logdirty(crs, egc);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+static void colo_reenable_logdirty_done(libxl__egc *egc,
+ libxl__logdirty_switch *lds,
+ int rc)
+{
+ libxl__colo_restore_checkpoint_state *crcs = CONTAINER_OF(lds, *crcs, lds);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+
+ /* Convenience aliases */
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(crcs->crs->ao);
+
+ if (rc) {
+ LOG(ERROR, "cannot enable logdirty");
+ goto out;
+ }
+
+ colo_setup_checkpoint_devices(egc, crcs->crs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+/*
+ * We cannot set up checkpoint devices in libxl__colo_restore_setup(),
+ * because the guest is not ready.
+ */
+static void colo_setup_checkpoint_devices(libxl__egc *egc,
+ libxl__colo_restore_state *crs)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *cds = &crs->cds;
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(crs->ao);
+
+ /* TODO: disk/nic support */
+ cds->device_kind_flags = 0;
+ cds->callback = colo_restore_setup_cds_done;
+ cds->ao = ao;
+ cds->domid = crs->domid;
+ cds->ops = colo_restore_ops;
+
+ if (init_device_subkind(cds))
+ goto out;
+
+ crcs->teardown_devices = 1;
+
+ libxl__checkpoint_devices_setup(egc, cds);
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+static void colo_restore_setup_cds_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ /* Convenience aliases */
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(cds->ao);
+
+ if (rc) {
+ LOG(ERROR, "COLO: failed to setup device for guest with domid %u",
+ cds->domid);
+ goto out;
+ }
+
+ colo_send_svm_ready(egc, crcs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+static void colo_unpause_svm(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+ int rc;
+
+ /* Convenience aliases */
+ const uint32_t domid = crcs->crs->domid;
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ STATE_AO_GC(crcs->crs->ao);
+
+ /* We have enabled secondary vm's logdirty, so we can unpause it now */
+ rc = libxl_domain_unpause(CTX, domid);
+ if (rc) {
+ LOG(ERROR, "cannot unpause secondary vm");
+ goto out;
+ }
+
+ colo_write_svm_resumed(egc, crcs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+}
+
+
+/* ===================== colo: wait new checkpoint ===================== */
+static void colo_restore_commit_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_stream_read_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int id);
+
+static void libxl__colo_restore_domain_should_checkpoint_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *cds = &dcs->crs.cds;
+
+ cds->callback = colo_restore_commit_cb;
+ libxl__checkpoint_devices_commit(shs->egc, cds);
+}
+
+static void colo_restore_commit_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+
+ STATE_AO_GC(cds->ao);
+
+ if (rc) {
+ LOG(ERROR, "commit fails");
+ goto out;
+ }
+
+ crcs->callback = colo_stream_read_done;
+ dcs->srs.read_records_callback = colo_common_read_stream_done;
+ libxl__stream_read_colo_context(egc, &dcs->srs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, 0);
+}
+
+static void colo_stream_read_done(libxl__egc *egc,
+ libxl__colo_restore_checkpoint_state *crcs,
+ int id)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+ int ok = 0;
+
+ STATE_AO_GC(dcs->ao);
+
+ if (id != COLO_NEW_CHECKPOINT) {
+ LOG(ERROR, "invalid section: %d", id);
+ goto out;
+ }
+
+ ok = 1;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, ok);
+}
+
+
+/* ===================== colo: suspend secondary vm ===================== */
+/*
+ * Do the following things when suspending secondary vm:
+ * 1. suspend secondary vm
+ * 2. send LIBXL_COLO_SVM_SUSPENDED
+ */
+static void colo_suspend_vm_done(libxl__egc *egc,
+ libxl__domain_suspend_state *dsps,
+ int ok);
+static void colo_restore_postsuspend_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+
+static void libxl__colo_restore_domain_suspend_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+ libxl__colo_restore_checkpoint_state *crcs = dcs->crs.crcs;
+
+ STATE_AO_GC(dcs->ao);
+
+ /* Convenience aliases */
+ libxl__domain_suspend_state *const dsps = &crcs->dsps;
+
+ /* suspend secondary vm */
+ dsps->callback_common_done = colo_suspend_vm_done;
+
+ libxl__domain_suspend(shs->egc, dsps);
+}
+
+static void colo_suspend_vm_done(libxl__egc *egc,
+ libxl__domain_suspend_state *dsps,
+ int ok)
+{
+ libxl__colo_restore_checkpoint_state *crcs = CONTAINER_OF(dsps, *crcs, dsps);
+ libxl__colo_restore_state *crs = crcs->crs;
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *cds = &crs->cds;
+
+ STATE_AO_GC(crs->ao);
+
+ if (!ok) {
+ LOG(ERROR, "cannot suspend secondary vm");
+ goto out;
+ }
+
+ crcs->status = LIBXL_COLO_SUSPENDED;
+
+ cds->callback = colo_restore_postsuspend_cb;
+ libxl__checkpoint_devices_postsuspend(egc, cds);
+
+ return;
+
+out:
+ ok = 0;
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, ok);
+}
+
+static void colo_restore_postsuspend_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ libxl__colo_restore_checkpoint_state *crcs = crs->crcs;
+ libxl_sr_colo_context colo_context = { .id = COLO_SVM_SUSPENDED };
+ int ok = 0;
+
+ STATE_AO_GC(crs->ao);
+
+ if (rc) {
+ LOG(ERROR, "postsuspend fails");
+ goto out;
+ }
+
+ crcs->callback = NULL;
+ crcs->sws.write_records_callback = colo_common_write_stream_done;
+ libxl__stream_write_colo_context(egc, &crcs->sws, &colo_context);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, ok);
+}
+
+
+/* ======================== colo: checkpoint ======================= */
+/*
+ * Do the following things when checkpointing secondary vm:
+ * 1. read toolstack context
+ * 2. read emulator context
+ */
+static void libxl__colo_restore_domain_checkpoint_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+ libxl__colo_restore_checkpoint_state *crcs = dcs->crs.crcs;
+
+ crcs->callback = NULL;
+ dcs->srs.read_records_callback = colo_common_read_stream_done;
+ libxl__stream_read_colo_context(shs->egc, &dcs->srs);
+}
+
+/* ===================== colo: common callback ===================== */
+static void colo_common_write_stream_done(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ int rc)
+{
+ libxl__colo_restore_checkpoint_state *crcs =
+ CONTAINER_OF(stream, *crcs, sws);
+ libxl__domain_create_state *dcs = CONTAINER_OF(crcs->crs, *dcs, crs);
+ int ok;
+
+ STATE_AO_GC(stream->ao);
+
+ if (rc < 0) {
+ /* TODO: it may be an internal error, but we don't know */
+ LOG(ERROR, "sending data fails");
+ ok = 2;
+ goto out;
+ }
+
+ if (!crcs->callback) {
+ /* Everything is OK */
+ ok = 1;
+ goto out;
+ }
+
+ crcs->callback(egc, crcs, 0);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, ok);
+}
+
+static void colo_common_read_stream_done(libxl__egc *egc,
+ libxl__stream_read_state *stream,
+ int rc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
+ libxl__colo_restore_checkpoint_state *crcs = dcs->crs.crcs;
+ int ok;
+
+ STATE_AO_GC(stream->ao);
+
+ if (rc < 0) {
+ /* TODO: it may be an internal error, but we don't know */
+ LOG(ERROR, "reading data fails");
+ ok = 2;
+ goto out;
+ }
+
+ if (!crcs->callback) {
+ /* Everything is OK */
+ ok = 1;
+ goto out;
+ }
+
+ /* rc contains the id */
+ crcs->callback(egc, crcs, rc);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, ok);
+}
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 342aa01..2380368 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -19,6 +19,7 @@
#include "libxl_internal.h"
#include "libxl_arch.h"
+#include "libxl_colo.h"
#include <xc_dom.h>
#include <xenguest.h>
@@ -996,6 +997,93 @@ static void domcreate_console_available(libxl__egc *egc,
dcs->aop_console_how.for_event));
}
+static void libxl__colo_restore_teardown_done(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ STATE_AO_GC(crs->ao);
+
+ /* convenience aliases */
+ libxl__save_helper_state *const shs = &dcs->shs;
+ const int domid = crs->domid;
+ const libxl_ctx *const ctx = libxl__gc_owner(gc);
+ xc_interface *const xch = ctx->xch;
+
+ if (!rc)
+ /* failover, no need to destroy the secondary vm */
+ goto out;
+
+ if (shs->retval)
+ /*
+ * shs->retval stores the return value of xc_domain_restore().
+ * If it is not 0, we have destroyed the secondary vm in
+ * xc_domain_restore();
+ */
+ goto out;
+
+ xc_domain_destroy(xch, domid);
+
+out:
+ dcs->callback(egc, dcs, rc, crs->domid);
+}
+
+void libxl__colo_restore_done(libxl__egc *egc, void *dcs_void,
+ int ret, int retval, int errnoval)
+{
+ libxl__domain_create_state *dcs = dcs_void;
+ int rc = 1;
+
+ /* convenience aliases */
+ libxl__colo_restore_state *const crs = &dcs->crs;
+ STATE_AO_GC(crs->ao);
+
+ /* teardown and failover */
+ crs->callback = libxl__colo_restore_teardown_done;
+
+ if (ret == 0 && retval == 0)
+ rc = 0;
+
+ LOG(INFO, "%s", rc ? "colo fails" : "failover");
+ libxl__colo_restore_teardown(egc, crs, rc);
+}
+
+static void libxl__colo_restore_cp_done(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+ int ok = 0;
+
+ /* convenience aliases */
+ libxl__save_helper_state *const shs = &dcs->shs;
+
+ if (!rc)
+ ok = 1;
+
+ libxl__xc_domain_saverestore_async_callback_done(shs->egc, shs, ok);
+}
+
+static void libxl__colo_restore_setup_done(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc)
+{
+ libxl__domain_create_state *dcs = CONTAINER_OF(crs, *dcs, crs);
+
+ /* convenience aliases */
+ STATE_AO_GC(crs->ao);
+
+ if (rc) {
+ LOG(ERROR, "colo restore setup fails: %d", rc);
+ libxl__xc_domain_restore_done(egc, dcs, rc, 0, 0);
+ return;
+ }
+
+ crs->callback = libxl__colo_restore_cp_done;
+ /*TODO COLO*/
+ libxl__stream_read_start(egc, &dcs->srs);
+}
+
static void domcreate_bootloader_done(libxl__egc *egc,
libxl__bootloader_state *bl,
int rc)
@@ -1010,6 +1098,9 @@ static void domcreate_bootloader_done(libxl__egc *egc,
libxl__domain_build_state *const state = &dcs->build_state;
libxl__srm_restore_autogen_callbacks *const callbacks =
&dcs->shs.callbacks.restore.a;
+ const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
+ libxl__colo_restore_state *const crs = &dcs->crs;
+ libxl_domain_build_info *const info = &d_config->b_info;
if (rc) {
domcreate_rebuild_done(egc, dcs, rc);
@@ -1039,6 +1130,13 @@ static void domcreate_bootloader_done(libxl__egc *egc,
/* Restore */
callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
+ /* COLO only supports HVM now */
+ if (info->type != LIBXL_DOMAIN_TYPE_HVM &&
+ checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
rc = libxl__build_pre(gc, domid, d_config, state);
if (rc)
goto out;
@@ -1049,7 +1147,18 @@ static void domcreate_bootloader_done(libxl__egc *egc,
dcs->srs.back_channel = false;
dcs->srs.completion_callback = domcreate_stream_done;
- libxl__stream_read_start(egc, &dcs->srs);
+ /* colo restore setup */
+ if (checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
+ crs->ao = ao;
+ crs->domid = domid;
+ crs->send_fd = dcs->send_fd;
+ crs->recv_fd = restore_fd;
+ crs->hvm = (info->type == LIBXL_DOMAIN_TYPE_HVM);
+ crs->callback = libxl__colo_restore_setup_done;
+ libxl__colo_restore_setup(egc, crs);
+ } else
+ libxl__stream_read_start(egc, &dcs->srs);
+
return;
out:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1ef6fc8..0aafd59 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3280,6 +3280,24 @@ static inline bool libxl__stream_read_inuse(
return stream->running;
}
+/* colo related structure */
+typedef struct libxl__colo_restore_state libxl__colo_restore_state;
+typedef void libxl__colo_callback(libxl__egc *,
+ libxl__colo_restore_state *, int rc);
+struct libxl__colo_restore_state {
+ /* must be set by caller of libxl__colo_(setup|teardown) */
+ libxl__ao *ao;
+ uint32_t domid;
+ int send_fd;
+ int recv_fd;
+ int hvm;
+ libxl__colo_callback *callback;
+
+ /* private, colo restore checkpoint state */
+ libxl__domain_create_cb *saved_cb;
+ void *crcs;
+ libxl__checkpoint_devices_state cds;
+};
struct libxl__domain_create_state {
/* filled in by user */
@@ -3294,6 +3312,7 @@ struct libxl__domain_create_state {
/* private to domain_create */
int guest_domid;
libxl__domain_build_state build_state;
+ libxl__colo_restore_state crs;
libxl__bootloader_state bl;
libxl__stub_dm_spawn_state dmss;
/* If we're not doing stubdom, we use only dmss.dm,
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 25817d6..5f500b2 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -15,6 +15,7 @@
#include "libxl_osdeps.h"
#include "libxl_internal.h"
+#include "libxl_colo.h"
/* stream_fd is as from the caller (eventually, the application).
* It may be 0, 1 or 2, in which case we need to dup it elsewhere.
@@ -65,7 +66,11 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
dcs->shs.ao = ao;
dcs->shs.domid = domid;
dcs->shs.recv_callback = libxl__srm_callout_received_restore;
- dcs->shs.completion_callback = libxl__xc_domain_restore_done;
+ if (dcs->restore_params.checkpointed_stream ==
+ LIBXL_CHECKPOINTED_STREAM_COLO)
+ dcs->shs.completion_callback = libxl__colo_restore_done;
+ else
+ dcs->shs.completion_callback = libxl__xc_domain_restore_done;
dcs->shs.caller_state = dcs;
dcs->shs.need_results = 1;
dcs->shs.toolstack_data_file = 0;
--
1.9.1
* [PATCH v7 COLO 05/18] primary vm suspend/resume/checkpoint code
From: Yang Hongyang @ 2015-06-25 6:30 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
We repeat the following steps for every checkpoint:
1. Suspend primary vm
a. Suspend primary vm
b. Do postsuspend
c. Read LIBXL_COLO_SVM_SUSPENDED sent by secondary
2. Resume primary vm
a. Read LIBXL_COLO_SVM_READY from secondary
b. Do preresume
c. Resume primary vm
d. Read LIBXL_COLO_SVM_RESUMED from secondary
3. Wait for a new checkpoint
a. Wait for a new checkpoint (not implemented)
b. Send LIBXL_COLO_NEW_CHECKPOINT to secondary
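For illustration only, the ordering of control ids in one steady-state round
can be sketched as below. This is not the libxl implementation: svm_channel,
svm_read() and primary_round() are hypothetical names made up for the sketch,
the back channel is modelled as an in-memory array, and the suspend, resume
and device steps are elided. The enum values follow the enum that this patch
removes from libxl_colo.h. Note that in the patch itself the primary skips
step 1c until the secondary is known to be running (css->svm_running).

```c
#include <assert.h>
#include <stddef.h>

/* Control ids; values follow the enum removed from libxl_colo.h. */
enum colo_id {
    COLO_NEW_CHECKPOINT = 1,
    COLO_SVM_SUSPENDED,
    COLO_SVM_READY,
    COLO_SVM_RESUMED,
};

/* Hypothetical stand-in for the back channel from the secondary. */
struct svm_channel {
    const int *ids;   /* ids the secondary will send, in order */
    size_t len;
    size_t pos;
};

/* Return the next id from the secondary, or -1 if nothing is left. */
static int svm_read(struct svm_channel *ch)
{
    assert(ch != NULL);
    return ch->pos < ch->len ? ch->ids[ch->pos++] : -1;
}

/*
 * One steady-state round on the primary side.
 * Returns COLO_NEW_CHECKPOINT (the id the primary would send next) if the
 * secondary answered in the expected order, or 0 on a protocol violation.
 */
static int primary_round(struct svm_channel *ch)
{
    /* 1. suspend primary vm and do postsuspend (elided), then: */
    if (svm_read(ch) != COLO_SVM_SUSPENDED)
        return 0;

    /* 2. wait until the secondary is ready to be resumed */
    if (svm_read(ch) != COLO_SVM_READY)
        return 0;

    /* do preresume and resume primary vm (elided), then: */
    if (svm_read(ch) != COLO_SVM_RESUMED)
        return 0;

    /* 3. wait for a new checkpoint (elided) and announce it */
    return COLO_NEW_CHECKPOINT;
}
```

Any id that arrives out of order ends the round with an error, which matches
the "invalid section" checks in the read callbacks of this patch.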
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
tools/libxl/Makefile | 2 +-
tools/libxl/libxl.c | 6 +-
tools/libxl/libxl_colo.h | 21 +-
tools/libxl/libxl_colo_save.c | 565 ++++++++++++++++++++++++++++++++++++++++++
tools/libxl/libxl_dom_save.c | 13 +-
tools/libxl/libxl_internal.h | 121 +++++----
tools/libxl/libxl_types.idl | 1 +
7 files changed, 662 insertions(+), 67 deletions(-)
create mode 100644 tools/libxl/libxl_colo_save.c
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 66ae63d..252c4e9 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -57,7 +57,7 @@ LIBXL_OBJS-y += libxl_nonetbuffer.o
endif
LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
-LIBXL_OBJS-y += libxl_colo_restore.o
+LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f851957..8b866f4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -17,6 +17,7 @@
#include "libxl_osdeps.h"
#include "libxl_internal.h"
+#include "libxl_colo.h"
#define PAGE_TO_MEMKB(pages) ((pages) * 4)
#define BACKEND_STRING_SIZE 5
@@ -842,7 +843,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
assert(info);
/* Point of no return */
- libxl__remus_setup(egc, &dss->rs);
+ if (libxl_defbool_val(info->colo))
+ libxl__colo_save_setup(egc, &dss->css);
+ else
+ libxl__remus_setup(egc, &dss->rs);
return AO_INPROGRESS;
out:
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 91df275..49a430b 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -16,17 +16,6 @@
#ifndef LIBXL_COLO_H
#define LIBXL_COLO_H
-/*
- * values to control suspend/resume primary vm and secondary vm
- * at the same time
- */
-enum {
- LIBXL_COLO_NEW_CHECKPOINT = 1,
- LIBXL_COLO_SVM_SUSPENDED,
- LIBXL_COLO_SVM_READY,
- LIBXL_COLO_SVM_RESUMED,
-};
-
extern void libxl__colo_restore_done(libxl__egc *egc, void *dcs_void,
int ret, int retval, int errnoval);
extern void libxl__colo_restore_setup(libxl__egc *egc,
@@ -35,4 +24,14 @@ extern void libxl__colo_restore_teardown(libxl__egc *egc,
libxl__colo_restore_state *crs,
int rc);
+extern void libxl__colo_save_domain_suspend_callback(void *data);
+extern void libxl__colo_save_domain_checkpoint_callback(void *data);
+extern void libxl__colo_save_domain_resume_callback(void *data);
+extern void libxl__colo_save_domain_should_checkpoint_callback(void *data);
+extern void libxl__colo_save_setup(libxl__egc *egc,
+ libxl__colo_save_state *css);
+extern void libxl__colo_save_teardown(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int rc);
+
#endif
diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
new file mode 100644
index 0000000..4e059cc
--- /dev/null
+++ b/tools/libxl/libxl_colo_save.c
@@ -0,0 +1,565 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang <wency@cn.fujitsu.com>
+ * Yang Hongyang <yanghy@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+#include "libxl_colo.h"
+
+static const libxl__checkpoint_device_instance_ops *colo_ops[] = {
+ NULL,
+};
+
+/* ================= helper functions ================= */
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+ /* init device subkind-specific state in the libxl ctx */
+ int rc;
+ STATE_AO_GC(cds->ao);
+
+ rc = 0;
+ return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+ /* cleanup device subkind-specific state in the libxl ctx */
+ STATE_AO_GC(cds->ao);
+}
+
+/* ================= colo: setup save environment ================= */
+static void colo_save_setup_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_save_setup_failed(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+
+void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
+{
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *const cds = &css->cds;
+
+ STATE_AO_GC(dss->ao);
+
+ if (dss->type != LIBXL_DOMAIN_TYPE_HVM) {
+ LOG(ERROR, "COLO only supports hvm now");
+ goto out;
+ }
+
+ css->send_fd = dss->fd;
+ css->recv_fd = dss->recv_fd;
+ css->svm_running = false;
+
+ /* TODO: disk/nic support */
+ cds->device_kind_flags = 0;
+ cds->ops = colo_ops;
+ cds->callback = colo_save_setup_done;
+ cds->ao = ao;
+ cds->domid = dss->domid;
+
+ css->srs.ao = ao;
+ css->srs.fd = css->recv_fd;
+ css->srs.back_channel = true;
+ libxl__stream_read_start(egc, &css->srs);
+
+ if (init_device_subkind(cds))
+ goto out;
+
+ libxl__checkpoint_devices_setup(egc, &css->cds);
+
+ return;
+
+out:
+ libxl__ao_complete(egc, ao, ERROR_FAIL);
+}
+
+static void colo_save_setup_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+ STATE_AO_GC(cds->ao);
+
+ if (!rc) {
+ libxl__domain_save(egc, dss);
+ return;
+ }
+
+ LOG(ERROR, "COLO: failed to setup device for guest with domid %u",
+ dss->domid);
+ css->cds.callback = colo_save_setup_failed;
+ libxl__checkpoint_devices_teardown(egc, &css->cds);
+}
+
+static void colo_save_setup_failed(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ STATE_AO_GC(cds->ao);
+
+ if (rc)
+ LOG(ERROR, "COLO: failed to teardown device after setup failed"
+ " for guest with domid %u, rc %d", cds->domid, rc);
+
+ cleanup_device_subkind(cds);
+ libxl__ao_complete(egc, ao, rc);
+}
+
+
+/* ================= colo: teardown save environment ================= */
+static void colo_teardown_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+
+void libxl__colo_save_teardown(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int rc)
+{
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(css->cds.ao);
+
+ LOG(WARN, "COLO: Domain suspend terminated with rc %d,"
+ " teardown COLO devices...", rc);
+ dss->css.cds.callback = colo_teardown_done;
+ libxl__checkpoint_devices_teardown(egc, &dss->css.cds);
+ return;
+}
+
+static void colo_teardown_done(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ cleanup_device_subkind(cds);
+ dss->callback(egc, dss, rc);
+}
+
+/*
+ * checkpoint callbacks are called in the following order:
+ * 1. suspend
+ * 2. resume
+ * 3. checkpoint
+ */
+static void colo_common_write_stream_done(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ int rc);
+static void colo_common_read_stream_done(libxl__egc *egc,
+ libxl__stream_read_state *stream,
+ int rc);
+/* ===================== colo: suspend primary vm ===================== */
+
+static void colo_read_svm_suspended_done(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int id);
+/*
+ * Do the following things when suspending primary vm:
+ * 1. suspend primary vm
+ * 2. do postsuspend
+ * 3. read LIBXL_COLO_SVM_SUSPENDED
+ * 4. read secondary vm's dirty pages
+ */
+static void colo_suspend_primary_vm_done(libxl__egc *egc,
+ libxl__domain_suspend_state *dsps,
+ int ok);
+static void colo_postsuspend_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+
+void libxl__colo_save_domain_suspend_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__egc *egc = shs->egc;
+ libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
+
+ /* Convenience aliases */
+ libxl__domain_suspend_state *dsps = &dss->dsps;
+
+ dsps->callback_common_done = colo_suspend_primary_vm_done;
+ libxl__domain_suspend(egc, dsps);
+}
+
+static void colo_suspend_primary_vm_done(libxl__egc *egc,
+ libxl__domain_suspend_state *dsps,
+ int ok)
+{
+ libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
+
+ STATE_AO_GC(dsps->ao);
+
+ if (!ok) {
+ LOG(ERROR, "cannot suspend primary vm");
+ goto out;
+ }
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *const cds = &dss->css.cds;
+
+ cds->callback = colo_postsuspend_cb;
+ libxl__checkpoint_devices_postsuspend(egc, cds);
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+}
+
+static void colo_postsuspend_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ int ok = 0;
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(cds->ao);
+
+ if (rc) {
+ LOG(ERROR, "postsuspend fails");
+ goto out;
+ }
+
+ if (!css->svm_running) {
+ ok = 1;
+ goto out;
+ }
+
+ /*
+ * read COLO_SVM_SUSPENDED
+ */
+ css->callback = colo_read_svm_suspended_done;
+ css->srs.read_records_callback = colo_common_read_stream_done;
+ libxl__stream_read_colo_context(egc, &css->srs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+}
+
+static void colo_read_svm_suspended_done(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int id)
+{
+ int ok = 0;
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(css->cds.ao);
+
+ if (id != COLO_SVM_SUSPENDED) {
+ LOG(ERROR, "invalid section: %d, expected: %d", id, COLO_SVM_SUSPENDED);
+ goto out;
+ }
+
+ ok = 1;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+}
+
+
+/* ===================== colo: send tailbuf ========================== */
+void libxl__colo_save_domain_checkpoint_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
+
+ /* Convenience aliases */
+ libxl__colo_save_state *const css = &dss->css;
+
+ /* write toolstack and emulator context, checkpoint end */
+ css->callback = NULL;
+ dss->sws.write_records_callback = colo_common_write_stream_done;
+ libxl__stream_write_start_checkpoint(shs->egc, &dss->sws);
+}
+
+/* ===================== colo: resume primary vm ===================== */
+/*
+ * Do the following things when resuming primary vm:
+ * 1. read LIBXL_COLO_SVM_READY
+ * 2. do preresume
+ * 3. resume primary vm
+ * 4. read LIBXL_COLO_SVM_RESUMED
+ */
+static void colo_preresume_dm_saved(libxl__egc *egc,
+ libxl__domain_save_state *dss, int rc);
+static void colo_read_svm_ready_done(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int id);
+static void colo_preresume_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_read_svm_resumed_done(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int id);
+
+void libxl__colo_save_domain_resume_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__egc *egc = shs->egc;
+ libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
+
+ /* This would go into tailbuf. */
+ if (dss->hvm) {
+ libxl__domain_save_device_model(egc, dss, colo_preresume_dm_saved);
+ } else {
+ colo_preresume_dm_saved(egc, dss, 0);
+ }
+
+ return;
+}
+
+static void colo_preresume_dm_saved(libxl__egc *egc,
+ libxl__domain_save_state *dss, int rc)
+{
+ /* Convenience aliases */
+ libxl__colo_save_state *const css = &dss->css;
+
+ STATE_AO_GC(css->cds.ao);
+
+ if (rc) {
+ LOG(ERROR, "Failed to save device model. Terminating COLO..");
+ goto out;
+ }
+
+ /* read COLO_SVM_READY */
+ css->callback = colo_read_svm_ready_done;
+ css->srs.read_records_callback = colo_common_read_stream_done;
+ libxl__stream_read_colo_context(egc, &css->srs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
+}
+
+static void colo_read_svm_ready_done(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int id)
+{
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(css->cds.ao);
+
+ if (id != COLO_SVM_READY) {
+ LOG(ERROR, "invalid section: %d, expected: %d", id, COLO_SVM_READY);
+ goto out;
+ }
+
+ css->svm_running = true;
+ css->cds.callback = colo_preresume_cb;
+ libxl__checkpoint_devices_preresume(egc, &css->cds);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
+}
+
+static void colo_preresume_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(cds->ao);
+
+ if (rc) {
+ LOG(ERROR, "preresume fails");
+ goto out;
+ }
+
+ /* Resumes the domain and the device model */
+ if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1)) {
+ LOG(ERROR, "cannot resume primary vm");
+ goto out;
+ }
+
+ /* read COLO_SVM_RESUMED */
+ css->callback = colo_read_svm_resumed_done;
+ css->srs.read_records_callback = colo_common_read_stream_done;
+ libxl__stream_read_colo_context(egc, &css->srs);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
+}
+
+static void colo_read_svm_resumed_done(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ int id)
+{
+ int ok = 0;
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(css->cds.ao);
+
+ if (id != COLO_SVM_RESUMED) {
+ LOG(ERROR, "invalid section: %d, expected: %d", id, COLO_SVM_RESUMED);
+ goto out;
+ }
+
+ ok = 1;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+}
+
+
+/* ===================== colo: wait new checkpoint ===================== */
+/*
+ * Do the following things:
+ * 1. do commit
+ * 2. wait for a new checkpoint
+ * 3. write COLO_NEW_CHECKPOINT
+ */
+static void colo_device_commit_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_start_new_checkpoint(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+
+void libxl__colo_save_domain_should_checkpoint_callback(void *data)
+{
+ libxl__save_helper_state *shs = data;
+ libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
+ libxl__egc *egc = dss->shs.egc;
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *const cds = &dss->css.cds;
+
+ cds->callback = colo_device_commit_cb;
+ libxl__checkpoint_devices_commit(egc, cds);
+}
+
+static void colo_device_commit_cb(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+
+ STATE_AO_GC(cds->ao);
+
+ if (rc) {
+ LOG(ERROR, "commit fails");
+ goto out;
+ }
+
+ /* TODO: wait for a new checkpoint */
+ colo_start_new_checkpoint(egc, cds, 0);
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
+}
+
+static void colo_start_new_checkpoint(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+ libxl_sr_colo_context colo_context = { .id = COLO_NEW_CHECKPOINT };
+
+ if (rc)
+ goto out;
+
+ /* write COLO_NEW_CHECKPOINT */
+ css->callback = NULL;
+ dss->sws.write_records_callback = colo_common_write_stream_done;
+ libxl__stream_write_colo_context(egc, &dss->sws, &colo_context);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
+}
+
+
+/* ===================== colo: common callback ===================== */
+static void colo_common_write_stream_done(libxl__egc *egc,
+ libxl__stream_write_state *stream,
+ int rc)
+{
+ libxl__domain_save_state *dss = CONTAINER_OF(stream, *dss, sws);
+ int ok;
+
+ /* Convenience aliases */
+ libxl__colo_save_state *const css = &dss->css;
+
+ STATE_AO_GC(stream->ao);
+
+ if (rc < 0) {
+ /* TODO: it may be an internal error, but we don't know */
+ LOG(ERROR, "sending data fails");
+ ok = 2;
+ goto out;
+ }
+
+ if (!css->callback) {
+ /* Everything is OK */
+ ok = 1;
+ goto out;
+ }
+
+ css->callback(egc, css, 0);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+}
+
+static void colo_common_read_stream_done(libxl__egc *egc,
+ libxl__stream_read_state *stream,
+ int rc)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(stream, *css, srs);
+ libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
+ int ok;
+
+ STATE_AO_GC(stream->ao);
+
+ if (rc < 0) {
+ /* TODO: it may be an internal error, but we don't know */
+ LOG(ERROR, "reading data fails");
+ ok = 2;
+ goto out;
+ }
+
+ if (!css->callback) {
+ /* Everything is OK */
+ ok = 1;
+ goto out;
+ }
+
+ /* rc contains the id */
+ css->callback(egc, css, rc);
+
+ return;
+
+out:
+ libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+}
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 9a3d009..26839cb 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -16,6 +16,7 @@
#include "libxl_osdeps.h" /* must come before any other headers */
#include "libxl_internal.h"
+#include "libxl_colo.h"
struct libxl__physmap_info {
uint64_t phys_offset;
@@ -437,6 +438,11 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
callbacks->suspend = libxl__remus_domain_suspend_callback;
callbacks->postcopy = libxl__remus_domain_resume_callback;
callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+ } else if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
+ callbacks->suspend = libxl__colo_save_domain_suspend_callback;
+ callbacks->postcopy = libxl__colo_save_domain_resume_callback;
+ callbacks->checkpoint = libxl__colo_save_domain_checkpoint_callback;
+ callbacks->should_checkpoint = libxl__colo_save_domain_should_checkpoint_callback;
} else
callbacks->suspend = libxl__domain_suspend_callback;
@@ -575,12 +581,15 @@ static void domain_save_done(libxl__egc *egc,
}
/*
- * With Remus, if we reach this point, it means either
+ * With Remus/COLO, if we reach this point, it means either
* backup died or some network error occurred preventing us
* from sending checkpoints. Teardown the network buffers and
* release netlink resources. This is an async op.
*/
- libxl__remus_teardown(egc, &dss->rs, rc);
+ if (libxl_defbool_val(dss->remus->colo))
+ libxl__colo_save_teardown(egc, &dss->css, rc);
+ else
+ libxl__remus_teardown(egc, &dss->rs, rc);
}
/*========================= Domain restore ============================*/
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0aafd59..bb5e298 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2655,7 +2655,7 @@ typedef struct libxl__save_helper_state {
/*
* The abstract checkpoint device layer exposes a common
* set of API to [external] libxl for manipulating devices attached to
- * a guest protected by Remus. The device layer also exposes a set of
+ * a guest protected by Remus/COLO. The device layer also exposes a set of
* [internal] interfaces that every device type must implement.
*
* The following API are exposed to libxl:
@@ -2673,7 +2673,7 @@ typedef struct libxl__save_helper_state {
* +libxl__checkpoint_devices_commit
*
* Each device type needs to implement the interfaces specified in
- * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
+ * the libxl__checkpoint_device_instance_ops if it wishes to support Remus/COLO.
*
* The high-level control flow through the checkpoint device layer is shown
* below:
@@ -2693,7 +2693,7 @@ typedef struct libxl__checkpoint_device_instance_ops libxl__checkpoint_device_in
/*
* Interfaces to be implemented by every device subkind that wishes to
- * support Remus. Functions must be implemented unless otherwise
+ * support Remus/COLO. Functions must be implemented unless otherwise
* stated. Many of these functions are asynchronous. They call
* dev->aodev.callback when done. The actual implementations may be
* synchronous and call dev->aodev.callback directly (as the last
@@ -2873,6 +2873,66 @@ static inline bool libxl__convert_legacy_stream_inuse(
return libxl__ev_child_inuse(&chs->child);
}
+/* State for manipulating a libxl migration v2 stream */
+typedef struct libxl__stream_read_state libxl__stream_read_state;
+
+struct libxl__stream_read_state {
+ /* filled by the user */
+ libxl__ao *ao;
+ int fd;
+ bool legacy;
+ bool back_channel;
+ void (*completion_callback)(libxl__egc *egc,
+ libxl__stream_read_state *stream,
+ int rc);
+ void (*read_records_callback)(libxl__egc *egc,
+ libxl__stream_read_state *stream,
+ int rc);
+ /* Private */
+ libxl__carefd *v2_carefd;
+ int rc;
+ int joined_rc;
+ bool running;
+ bool in_checkpoint;
+ bool in_colo_context;
+ libxl__datacopier_state dc;
+ size_t expected_len;
+ libxl_sr_hdr hdr;
+ libxl_sr_rec_hdr rec_hdr;
+ void *rec_body;
+};
+
+_hidden void libxl__stream_read_start(libxl__egc *egc,
+ libxl__stream_read_state *stream);
+
+_hidden void libxl__stream_read_continue(libxl__egc *egc,
+ libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_start_checkpoint(
+ libxl__egc *egc, libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_colo_context(
+ libxl__egc *egc, libxl__stream_read_state *stream);
+
+_hidden void libxl__stream_read_abort(libxl__egc *egc,
+ libxl__stream_read_state *stream, int rc);
+
+static inline bool libxl__stream_read_inuse(
+ const libxl__stream_read_state *stream)
+{
+ return stream->running;
+}
+
+/*----- colo related state structure -----*/
+typedef struct libxl__colo_save_state libxl__colo_save_state;
+struct libxl__colo_save_state {
+ libxl__checkpoint_devices_state cds;
+ int send_fd;
+ int recv_fd;
+
+ /* private */
+ libxl__stream_read_state srs;
+ void (*callback)(libxl__egc *, libxl__colo_save_state *, int);
+ bool svm_running;
+};
/*----- Domain suspend (save) state structure -----*/
@@ -2978,7 +3038,12 @@ struct libxl__domain_save_state {
libxl__domain_suspend_state dsps;
int hvm;
int xcflags;
- libxl__remus_state rs;
+ union {
+ /* for Remus */
+ libxl__remus_state rs;
+ /* for COLO */
+ libxl__colo_save_state css;
+ };
libxl__save_helper_state shs;
libxl__logdirty_switch logdirty;
/* private for libxl__domain_save_device_model */
@@ -3232,54 +3297,6 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
libxl__domain_create_state*,
int rc, uint32_t domid);
-/* State for manipulating a libxl migration v2 stream */
-typedef struct libxl__stream_read_state libxl__stream_read_state;
-
-struct libxl__stream_read_state {
- /* filled by the user */
- libxl__ao *ao;
- int fd;
- bool legacy;
- bool back_channel;
- void (*completion_callback)(libxl__egc *egc,
- libxl__stream_read_state *stream,
- int rc);
- void (*read_records_callback)(libxl__egc *egc,
- libxl__stream_read_state *stream,
- int rc);
- /* Private */
- libxl__carefd *v2_carefd;
- int rc;
- int joined_rc;
- bool running;
- bool in_checkpoint;
- bool in_colo_context;
- libxl__datacopier_state dc;
- size_t expected_len;
- libxl_sr_hdr hdr;
- libxl_sr_rec_hdr rec_hdr;
- void *rec_body;
-};
-
-_hidden void libxl__stream_read_start(libxl__egc *egc,
- libxl__stream_read_state *stream);
-
-_hidden void libxl__stream_read_continue(libxl__egc *egc,
- libxl__stream_read_state *stream);
-_hidden void libxl__stream_read_start_checkpoint(
- libxl__egc *egc, libxl__stream_read_state *stream);
-_hidden void libxl__stream_read_colo_context(
- libxl__egc *egc, libxl__stream_read_state *stream);
-
-_hidden void libxl__stream_read_abort(libxl__egc *egc,
- libxl__stream_read_state *stream, int rc);
-
-static inline bool libxl__stream_read_inuse(
- const libxl__stream_read_state *stream)
-{
- return stream->running;
-}
-
/* colo related structure */
typedef struct libxl__colo_restore_state libxl__colo_restore_state;
typedef void libxl__colo_callback(libxl__egc *,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index e05d12b..cf1eeb2 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -696,6 +696,7 @@ libxl_domain_remus_info = Struct("domain_remus_info",[
("netbuf", libxl_defbool),
("netbufscript", string),
("diskbuf", libxl_defbool),
+ ("colo", libxl_defbool)
])
libxl_event_type = Enumeration("event_type", [
--
1.9.1
* [PATCH v7 COLO 06/18] libxc/restore: support COLO restore
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
Call the resume/checkpoint/suspend callbacks once the secondary VM's
state is consistent with the primary's.
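For illustration only, the return-value convention that handle_checkpoint()
applies to these callbacks can be sketched as a small helper. The function
name map_callback_ret() is hypothetical and not part of this patch; the
values 1 (success) and 2 (broken channel) and the BROKEN_CHANNEL define are
the ones used in the diff below.

```c
#include <assert.h>

/* Mirrors the define this patch adds to xc_sr_common.h. */
#define BROKEN_CHANNEL 2

/*
 * Hypothetical helper (not in the patch): translate the value returned by
 * a restore callback into the rc convention used by handle_checkpoint().
 *   1     -> 0              (success)
 *   2     -> BROKEN_CHANNEL (the primary may be dead, fail over)
 *   other -> -1             (unspecified error)
 */
static int map_callback_ret(int ret)
{
    if ( ret == 1 )
        return 0;
    if ( ret == 2 )
        return BROKEN_CHANNEL;
    return -1;
}
```

The patch itself uses a local macro rather than a function so that it can
jump to the err label directly; the mapping is the same.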
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
tools/libxc/xc_sr_common.h | 19 ++++++++++++--
tools/libxc/xc_sr_restore.c | 63 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 80 insertions(+), 2 deletions(-)
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 88ef135..229ba0a 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -132,8 +132,11 @@ struct xc_sr_restore_ops
*
* @return 0 for success, -1 for failure, or the sentinel value
* RECORD_NOT_PROCESSED.
+ * BROKEN_CHANNEL: under Remus/COLO, this means that the primary may be
+ * dead, so we will fail over.
*/
#define RECORD_NOT_PROCESSED 1
+#define BROKEN_CHANNEL 2
int (*process_record)(struct xc_sr_context *ctx, struct xc_sr_record *rec);
/**
@@ -164,6 +167,18 @@ struct xc_sr_context
xc_dominfo_t dominfo;
+ /*
+ * migration stream
+ * 0: Plain VM
+ * 1: Remus
+ * 2: COLO
+ */
+ enum {
+ MIG_STREAM_PLAIN,
+ MIG_STREAM_REMUS,
+ MIG_STREAM_COLO,
+ } migration_stream;
+
union /* Common save or restore data. */
{
struct /* Save data. */
@@ -206,13 +221,13 @@ struct xc_sr_context
uint32_t guest_page_size;
/* Plain VM, or checkpoints over time. */
- bool checkpointed;
+ int checkpointed;
/* Currently buffering records between a checkpoint */
bool buffer_all_records;
/*
- * With Remus, we buffer the records sent by the primary at checkpoint,
+ * With Remus/COLO, we buffer the records sent by the primary at checkpoint,
* in case the primary will fail, we can recover from the last
* checkpoint state.
* This should be enough for most of the cases because primary only send
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index e6f00db..2ce207c 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,4 +1,5 @@
#include <arpa/inet.h>
+#include <assert.h>
#include <assert.h>
@@ -446,6 +447,49 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
else
ctx->restore.buffer_all_records = true;
+ if ( ctx->restore.checkpointed == MIG_STREAM_COLO )
+ {
+#define HANDLE_CALLBACK_RETURN_VALUE(ret) \
+ do { \
+ if ( ret == 1 ) \
+ rc = 0; /* Success */ \
+ else \
+ { \
+ if ( ret == 2 ) \
+ rc = BROKEN_CHANNEL; \
+ else \
+ rc = -1; /* Some unspecified error */ \
+ goto err; \
+ } \
+ } while (0)
+
+ /* COLO */
+
+ /* We need to resume guest */
+ rc = ctx->restore.ops.stream_complete(ctx);
+ if ( rc )
+ goto err;
+
+ /* TODO: call restore_results */
+
+ /* Resume secondary vm */
+ ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
+ HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+ /* Wait for a new checkpoint */
+ ret = ctx->restore.callbacks->should_checkpoint(
+ ctx->restore.callbacks->data);
+ HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+ /* suspend secondary vm */
+ ret = ctx->restore.callbacks->suspend(ctx->restore.callbacks->data);
+ HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+#undef HANDLE_CALLBACK_RETURN_VALUE
+
+ /* TODO: send dirty bitmap to primary */
+ }
+
err:
return rc;
}
@@ -608,6 +652,8 @@ static int restore(struct xc_sr_context *ctx)
goto err;
}
}
+ else if ( rc == BROKEN_CHANNEL )
+ goto remus_failover;
else if ( rc )
goto err;
}
@@ -615,6 +661,15 @@ static int restore(struct xc_sr_context *ctx)
} while ( rec.type != REC_TYPE_END );
remus_failover:
+
+ if ( ctx->restore.checkpointed == MIG_STREAM_COLO )
+ {
+ /* With COLO, we have already called stream_complete */
+ rc = 0;
+ IPRINTF("COLO Failover");
+ goto done;
+ }
+
/*
* With Remus, if we reach here, there must be some error on primary,
* failover from the last checkpoint state.
@@ -669,6 +724,14 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
if (checkpointed_stream)
assert(callbacks->checkpoint);
+ if ( ctx.restore.checkpointed == MIG_STREAM_COLO )
+ {
+ /* this is COLO restore */
+ assert(callbacks->suspend &&
+ callbacks->postcopy &&
+ callbacks->should_checkpoint);
+ }
+
IPRINTF("In experimental %s", __func__);
DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
", checkpointed_stream %d", io_fd, dom, hvm, pae,
--
1.9.1
* [PATCH v7 COLO 07/18] libxc/restore: send dirty bitmap to primary when checkpoint under colo
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
Send the secondary's dirty bitmap to the primary at each checkpoint under COLO.
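As a rough sketch of the idea (not the code in this patch), the bitmap is
walked twice: once to count the dirty pfns and once to fill a list that is
then sent as the record body. The names sketch_test_bit() and
dirty_bitmap_to_pfns() are made up for this example; libxc has its own
test_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Local stand-in for libxc's test_bit(); an assumption of this sketch. */
static int sketch_test_bit(uint64_t nr, const unsigned long *bitmap)
{
    return (bitmap[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/*
 * Build the list of dirty pfns from a bitmap covering nr_pfns frames.
 * On success returns 0, stores a malloc()ed array in *pfns_out (the caller
 * frees it) and its length in *count_out.  Returns -1 if allocation fails.
 */
static int dirty_bitmap_to_pfns(const unsigned long *bitmap, uint64_t nr_pfns,
                                uint64_t **pfns_out, size_t *count_out)
{
    uint64_t i, *pfns;
    size_t count = 0, written = 0;

    /* First pass: count the dirty frames. */
    for ( i = 0; i < nr_pfns; i++ )
        if ( sketch_test_bit(i, bitmap) )
            count++;

    /* Allocate at least one element so that count == 0 is not an error. */
    pfns = malloc((count ? count : 1) * sizeof(*pfns));
    if ( !pfns )
        return -1;

    /* Second pass: record each dirty pfn. */
    for ( i = 0; i < nr_pfns; i++ )
        if ( sketch_test_bit(i, bitmap) )
            pfns[written++] = i;

    assert(written == count);
    *pfns_out = pfns;
    *count_out = count;
    return 0;
}
```

The record length on the wire is then count * sizeof(uint64_t), which is
what the receiving side checks before interpreting the body.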
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
tools/libxc/xc_sr_common.h | 4 ++
tools/libxc/xc_sr_restore.c | 120 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 123 insertions(+), 1 deletion(-)
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 229ba0a..01ee2e7 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -213,6 +213,10 @@ struct xc_sr_context
struct xc_sr_restore_ops ops;
struct restore_callbacks *callbacks;
+ int send_fd;
+ unsigned long p2m_size;
+ xc_hypercall_buffer_t dirty_bitmap_hbuf;
+
/* From Image Header. */
uint32_t format_version;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 2ce207c..5f98927 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -410,6 +410,92 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
return rc;
}
+/*
+ * Send dirty_bitmap to primary.
+ */
+static int send_dirty_bitmap(struct xc_sr_context *ctx)
+{
+ xc_interface *xch = ctx->xch;
+ int rc = -1;
+ unsigned count, written;
+ uint64_t i, *pfns = NULL;
+ struct iovec *iov = NULL;
+ xc_shadow_op_stats_t stats = { 0, ctx->restore.p2m_size };
+ struct xc_sr_record rec =
+ {
+ .type = REC_TYPE_DIRTY_BITMAP,
+ };
+ DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+ &ctx->restore.dirty_bitmap_hbuf);
+
+ if ( xc_shadow_control(
+ xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
+ HYPERCALL_BUFFER(dirty_bitmap), ctx->restore.p2m_size,
+ NULL, 0, &stats) != ctx->restore.p2m_size )
+ {
+ PERROR("Failed to retrieve logdirty bitmap");
+ goto err;
+ }
+
+ for ( i = 0, count = 0; i < ctx->restore.p2m_size; i++ )
+ {
+ if ( test_bit(i, dirty_bitmap) )
+ count++;
+ }
+
+
+ pfns = malloc(count * sizeof(*pfns));
+ if ( !pfns )
+ {
+ ERROR("Unable to allocate %zu bytes of memory for dirty pfn list",
+ count * sizeof(*pfns));
+ goto err;
+ }
+
+ for ( i = 0, written = 0; i < ctx->restore.p2m_size; ++i )
+ {
+ if ( !test_bit(i, dirty_bitmap) )
+ continue;
+
+ if ( written >= count )
+ {
+ ERROR("Dirty pfn list exceeds the expected count");
+ goto err;
+ }
+
+ pfns[written++] = i;
+ }
+
+ /* iovec[] for writev(). */
+ iov = malloc(3 * sizeof(*iov));
+ if ( !iov )
+ {
+ ERROR("Unable to allocate memory for sending dirty bitmap");
+ goto err;
+ }
+
+ rec.length = count * sizeof(*pfns);
+
+ iov[0].iov_base = &rec.type;
+ iov[0].iov_len = sizeof(rec.type);
+
+ iov[1].iov_base = &rec.length;
+ iov[1].iov_len = sizeof(rec.length);
+
+ iov[2].iov_base = pfns;
+ iov[2].iov_len = count * sizeof(*pfns);
+
+ if ( writev_exact(ctx->restore.send_fd, iov, 3) )
+ {
+ PERROR("Failed to write dirty bitmap to stream");
+ goto err;
+ }
+
+ rc = 0;
+ err:
+ return rc;
+}
+
static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
static int handle_checkpoint(struct xc_sr_context *ctx)
{
@@ -487,7 +573,9 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
#undef HANDLE_CALLBACK_RETURN_VALUE
- /* TODO: send dirty bitmap to primary */
+ rc = send_dirty_bitmap(ctx);
+ if ( rc )
+ goto err;
}
err:
@@ -559,6 +647,21 @@ static int setup(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
int rc;
+ DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+ &ctx->restore.dirty_bitmap_hbuf);
+
+ if ( ctx->restore.checkpointed == MIG_STREAM_COLO )
+ {
+ dirty_bitmap = xc_hypercall_buffer_alloc_pages(xch, dirty_bitmap,
+ NRPAGES(bitmap_size(ctx->restore.p2m_size)));
+
+ if ( !dirty_bitmap )
+ {
+ ERROR("Unable to allocate memory for dirty bitmap");
+ rc = -1;
+ goto err;
+ }
+ }
rc = ctx->restore.ops.setup(ctx);
if ( rc )
@@ -592,10 +695,15 @@ static void cleanup(struct xc_sr_context *ctx)
{
xc_interface *xch = ctx->xch;
unsigned i;
+ DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+ &ctx->restore.dirty_bitmap_hbuf);
for ( i = 0; i < ctx->restore.buffered_rec_num; i++ )
free(ctx->restore.buffered_records[i].data);
+ if ( ctx->restore.checkpointed == MIG_STREAM_COLO )
+ xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
+ NRPAGES(bitmap_size(ctx->restore.p2m_size)));
free(ctx->restore.buffered_records);
free(ctx->restore.populated_pfns);
if ( ctx->restore.ops.cleanup(ctx) )
@@ -706,6 +814,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
int checkpointed_stream,
struct restore_callbacks *callbacks, int back_fd)
{
+ xen_pfn_t nr_pfns;
struct xc_sr_context ctx =
{
.xch = xch,
@@ -719,6 +828,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
ctx.restore.xenstore_domid = store_domid;
ctx.restore.checkpointed = checkpointed_stream;
ctx.restore.callbacks = callbacks;
+ ctx.restore.send_fd = back_fd;
/* Sanity checks for callbacks. */
if (checkpointed_stream)
@@ -754,6 +864,14 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
if ( read_headers(&ctx) )
return -1;
+ if ( xc_domain_nr_gpfns(xch, dom, &nr_pfns) < 0 )
+ {
+ PERROR("Unable to obtain the guest p2m size");
+ return -1;
+ }
+
+ ctx.restore.p2m_size = nr_pfns;
+
if ( ctx.dominfo.hvm )
{
ctx.restore.ops = restore_ops_x86_hvm;
--
1.9.1
* [PATCH v7 COLO 08/18] send store mfn and console mfn to xl before resuming secondary vm
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
We will call libxl__xc_domain_restore_done() to rebuild the secondary VM, but
rebuilding it requires the store mfn and console mfn. So make restore_results
a function pointer in the callback struct, struct {save,restore}_callbacks,
and use this callback to send the store mfn and console mfn to xl.
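A minimal sketch of the pattern, assuming a cut-down callback struct: every
name prefixed with sketch_ is hypothetical, and only the restore_results
signature is taken from the xenguest.h hunk below.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for struct restore_callbacks. */
struct sketch_restore_callbacks {
    void (*restore_results)(unsigned long store_mfn,
                            unsigned long console_mfn, void *data);
    void *data; /* passed as the last argument to each callback */
};

/* Hypothetical state kept by the toolstack side. */
struct sketch_results {
    unsigned long store_mfn;
    unsigned long console_mfn;
    int seen;
};

/* Toolstack-side callback: remember the two frame numbers. */
static void sketch_record_results(unsigned long store_mfn,
                                  unsigned long console_mfn, void *data)
{
    struct sketch_results *res = data;

    res->store_mfn = store_mfn;
    res->console_mfn = console_mfn;
    res->seen = 1;
}

/* Restore side: report the results before the secondary VM is resumed. */
static void sketch_report_before_resume(
    const struct sketch_restore_callbacks *cb,
    unsigned long store_mfn, unsigned long console_mfn)
{
    assert(cb->restore_results != NULL);
    cb->restore_results(store_mfn, console_mfn, cb->data);
}
```

Because the values arrive through a callback rather than as return values of
the restore call, they are available while the restore loop is still running.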
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
tools/libxc/include/xenguest.h | 8 ++++++++
tools/libxc/xc_sr_restore.c | 7 +++++--
tools/libxl/libxl_colo_restore.c | 5 -----
tools/libxl/libxl_create.c | 2 ++
tools/libxl/libxl_save_msgs_gen.pl | 2 +-
5 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index dcc441a..b2a9818 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -136,6 +136,14 @@ struct restore_callbacks {
*/
int (*should_checkpoint)(void* data);
+ /*
+ * callback to send store mfn and console mfn to xl
+ * if we want to resume vm before xc_domain_restore()
+ * exits.
+ */
+ void (*restore_results)(unsigned long store_mfn, unsigned long console_mfn,
+ void *data);
+
/* to be provided as the last argument to each callback function */
void* data;
};
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 5f98927..0247e84 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -556,7 +556,9 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
if ( rc )
goto err;
- /* TODO: call restore_results */
+ ctx->restore.callbacks->restore_results(ctx->restore.xenstore_gfn,
+ ctx->restore.console_gfn,
+ ctx->restore.callbacks->data);
/* Resume secondary vm */
ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
@@ -839,7 +841,8 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
/* this is COLO restore */
assert(callbacks->suspend &&
callbacks->postcopy &&
- callbacks->should_checkpoint);
+ callbacks->should_checkpoint &&
+ callbacks->restore_results);
}
IPRINTF("In experimental %s", __func__);
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index 40fd170..ada9a35 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -137,11 +137,6 @@ static void colo_resume_vm(libxl__egc *egc,
return;
}
- /*
- * TODO: get store mfn and console mfn
- * We should call the callback restore_results in
- * xc_domain_restore() before resuming the guest.
- */
libxl__xc_domain_restore_done(egc, dcs, 0, 0, 0);
return;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 2380368..aaa14e3 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1148,6 +1148,8 @@ static void domcreate_bootloader_done(libxl__egc *egc,
dcs->srs.completion_callback = domcreate_stream_done;
/* colo restore setup */
+ callbacks->restore_results = libxl__srm_callout_callback_restore_results;
+
if (checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
crs->ao = ao;
crs->domid = domid;
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 86cd395..e96673e 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -29,7 +29,7 @@ our @msgs = (
[ 6, 'srcxA', "should_checkpoint", [] ],
[ 7, 'scxA', "switch_qemu_logdirty", [qw(int domid
unsigned enable)] ],
- [ 8, 'r', "restore_results", ['unsigned long', 'store_mfn',
+ [ 8, 'rcx', "restore_results", ['unsigned long', 'store_mfn',
'unsigned long', 'console_mfn'] ],
[ 9, 'srW', "complete", [qw(int retval
int errnoval)] ],
--
1.9.1
* [PATCH v7 COLO 09/18] libxc/save: support COLO save
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
After suspending the primary VM, get the dirty bitmap from the secondary VM,
and send the pages that are dirty on either the primary or the secondary to
the secondary.
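The merge step can be sketched as follows, under the assumption that the
secondary reports its dirty frames as a list of 64-bit pfns. The names
sketch_set_bit() and merge_pfns_into_bitmap() are invented for this example
and are not the functions added by the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Local stand-in for libxc's set_bit(); an assumption of this sketch. */
static void sketch_set_bit(uint64_t nr, unsigned long *bitmap)
{
    bitmap[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/*
 * OR the pfns reported by the secondary into the primary's dirty bitmap.
 * Valid pfns are 0 .. p2m_size - 1.  Returns 0 on success, -1 if any pfn
 * is out of range, in which case the bitmap is left untouched.
 */
static int merge_pfns_into_bitmap(unsigned long *bitmap, uint64_t p2m_size,
                                  const uint64_t *pfns, size_t count)
{
    size_t i;

    /* Validate first so that a bad record does not half-update the bitmap. */
    for ( i = 0; i < count; i++ )
        if ( pfns[i] >= p2m_size )
            return -1;

    for ( i = 0; i < count; i++ )
        sketch_set_bit(pfns[i], bitmap);

    return 0;
}
```

After the merge, the bitmap holds the union of both sides' dirty pages, so a
single pass over it is enough to send everything the secondary needs.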
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
tools/libxc/xc_sr_common.h | 2 +
tools/libxc/xc_sr_save.c | 104 +++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 102 insertions(+), 4 deletions(-)
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 01ee2e7..92d8da0 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -183,6 +183,8 @@ struct xc_sr_context
{
struct /* Save data. */
{
+ int recv_fd;
+
struct xc_sr_save_ops ops;
struct save_callbacks *callbacks;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d12e5b1..6f13706 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -515,6 +515,58 @@ static int send_memory_live(struct xc_sr_context *ctx)
return rc;
}
+static int merge_secondary_dirty_bitmap(struct xc_sr_context *ctx)
+{
+ xc_interface *xch = ctx->xch;
+ struct xc_sr_record rec = { .data = NULL };
+ uint64_t *pfns = NULL;
+ uint64_t pfn;
+ unsigned count, i;
+ int rc;
+ DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+ &ctx->save.dirty_bitmap_hbuf);
+
+ rc = read_record(ctx, ctx->save.recv_fd, &rec);
+ if ( rc )
+ goto err;
+
+ if ( rec.type != REC_TYPE_DIRTY_BITMAP )
+ {
+ ERROR("Expected dirty bitmap record, but received %u", rec.type);
+ rc = -1;
+ goto err;
+ }
+
+ if ( rec.length % sizeof(*pfns) )
+ {
+ ERROR("Invalid dirty bitmap record length %u", rec.length);
+ rc = -1;
+ goto err;
+ }
+
+ count = rec.length / sizeof(*pfns);
+ pfns = rec.data;
+
+ for ( i = 0; i < count; i++ )
+ {
+ pfn = pfns[i];
+ if ( pfn >= ctx->save.p2m_size )
+ {
+ ERROR("Invalid pfn %#"PRIx64, pfn);
+ rc = -1;
+ goto err;
+ }
+
+ set_bit(pfn, dirty_bitmap);
+ }
+
+ rc = 0;
+
+ err:
+ free(rec.data);
+ return rc;
+}
+
/*
* Suspend the domain and send dirty memory.
* This is the last iteration of the live migration and the
@@ -555,6 +607,16 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
+ if ( !ctx->save.live && ctx->save.checkpointed == MIG_STREAM_COLO )
+ {
+ rc = merge_secondary_dirty_bitmap(ctx);
+ if ( rc )
+ {
+ PERROR("Failed to get secondary vm's dirty pages");
+ goto out;
+ }
+ }
+
rc = send_dirty_pages(ctx, stats.dirty_count + ctx->save.nr_deferred_pages);
if ( rc )
goto out;
@@ -784,11 +846,42 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
if ( rc )
goto err;
- ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+ if ( ctx->save.checkpointed == MIG_STREAM_COLO )
+ {
+ rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
+ if ( !rc )
+ {
+ rc = -1;
+ goto err;
+ }
+ }
- rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
- if ( rc <= 0 )
- ctx->save.checkpointed = false;
+ rc = ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+ if ( !rc )
+ {
+ rc = -1;
+ goto err;
+ }
+
+ if ( ctx->save.checkpointed == MIG_STREAM_COLO )
+ {
+ rc = ctx->save.callbacks->should_checkpoint(
+ ctx->save.callbacks->data);
+ if ( rc <= 0 )
+ ctx->save.checkpointed = false;
+ }
+ else if ( ctx->save.checkpointed == MIG_STREAM_REMUS )
+ {
+ rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
+ if ( rc <= 0 )
+ ctx->save.checkpointed = false;
+ }
+ else
+ {
+ ERROR("Unknown checkpointed stream");
+ rc = -1;
+ goto err;
+ }
}
} while ( ctx->save.checkpointed );
@@ -835,6 +928,7 @@ int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
ctx.save.live = !!(flags & XCFLAGS_LIVE);
ctx.save.debug = !!(flags & XCFLAGS_DEBUG);
ctx.save.checkpointed = checkpointed_stream;
+ ctx.save.recv_fd = back_fd;
/*
* TODO: Find some time to better tweak the live migration algorithm.
@@ -850,6 +944,8 @@ int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
assert(callbacks->switch_qemu_logdirty);
if ( ctx.save.checkpointed )
assert(callbacks->checkpoint && callbacks->postcopy);
+ if ( ctx.save.checkpointed == MIG_STREAM_COLO )
+ assert(callbacks->should_checkpoint);
IPRINTF("In experimental %s", __func__);
DPRINTF("fd %d, dom %u, max_iters %u, max_factor %u, flags %u, hvm %d",
--
1.9.1
* [PATCH v7 COLO 10/18] implement the cmdline for COLO
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
Add a new option, -c, to the 'xl remus' command. Use -c to enable
COLO HA instead of Remus HA.
Update the man page to document the new option.
Also add a -c option to the internal command 'xl migrate-receive'.
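
The option rules that this patch enforces (-c conflicts with -i and -b,
and memory checkpoint compression must be off in COLO mode) can be
sketched in isolation as below. This is only an illustration under
stated assumptions: struct ha_opts, enum tristate and ha_opts_resolve()
are hypothetical names that stand in for libxl_domain_remus_info,
libxl_defbool and the checks split between xl and libxl; they are not
part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state standing in for libxl_defbool. */
enum tristate { TS_DEFAULT = 0, TS_FALSE, TS_TRUE };

/* Hypothetical option set; the field names are illustrative only. */
struct ha_opts {
    bool colo;                 /* -c given */
    int interval_ms;           /* -i value, 0 if not given */
    bool blackhole;            /* -b given */
    enum tristate compression; /* TS_DEFAULT unless set explicitly */
};

/* Returns 0 and fills in defaults, or -1 if the options conflict. */
static int ha_opts_resolve(struct ha_opts *o)
{
    assert(o);

    if (o->colo) {
        if (o->interval_ms || o->blackhole)
            return -1;             /* -c conflicts with -i and -b */
        if (o->compression == TS_TRUE)
            return -1;             /* COLO cannot compress checkpoints */
        o->compression = TS_FALSE; /* default for COLO: compression off */
        return 0;
    }

    if (!o->interval_ms)
        o->interval_ms = 200;      /* Remus default interval */
    if (o->compression == TS_DEFAULT)
        o->compression = TS_TRUE;  /* Remus default: compression on */
    return 0;
}
```

In this sketch the defaults are resolved only after all options have
been parsed, which is also why the patch moves the "Defaults" block in
main_remus() below the option loop.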
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
docs/man/xl.pod.1 | 12 ++++++++--
tools/libxl/libxl.c | 23 ++++++++++++++++--
tools/libxl/xl_cmdimpl.c | 61 ++++++++++++++++++++++++++++++++++++-----------
tools/libxl/xl_cmdtable.c | 4 +++-
4 files changed, 81 insertions(+), 19 deletions(-)
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 4eb929d..4260c60 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -447,12 +447,15 @@ Print huge (!) amount of debug during the migration process.
=item B<remus> [I<OPTIONS>] I<domain-id> I<host>
-Enable Remus HA for domain. By default B<xl> relies on ssh as a transport
-mechanism between the two hosts.
+Enable Remus HA or COLO HA for domain. By default B<xl> relies on ssh as a
+transport mechanism between the two hosts.
N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
Disk replication support is limited to DRBD disks.
+ COLO support in xl is still in experimental (proof-of-concept) phase.
+ There is no support for network or disk at the moment.
+
B<OPTIONS>
=over 4
@@ -498,6 +501,11 @@ Disable network output buffering. Requires enabling unsafe mode.
Disable disk replication. Requires enabling unsafe mode.
+=item B<-c>
+
+Enable COLO HA. This conflicts with B<-i> and B<-b>, and memory
+checkpoint compression must be disabled.
+
=back
=item B<pause> I<domain-id>
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 8b866f4..08ae7a7 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -811,12 +811,28 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
goto out;
}
+ /* The caller must set this defbool */
+ if (libxl_defbool_is_default(info->colo)) {
+ LOG(ERROR, "colo mode must be explicitly enabled or disabled");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
libxl_defbool_setdefault(&info->allow_unsafe, false);
libxl_defbool_setdefault(&info->blackhole, false);
- libxl_defbool_setdefault(&info->compression, true);
+ libxl_defbool_setdefault(&info->compression,
+ !libxl_defbool_val(info->colo));
libxl_defbool_setdefault(&info->netbuf, true);
libxl_defbool_setdefault(&info->diskbuf, true);
+ if (libxl_defbool_val(info->colo)) {
+ if (libxl_defbool_val(info->compression)) {
+ LOG(ERROR, "cannot use memory checkpoint compression in COLO mode");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+ }
+
if (!libxl_defbool_val(info->allow_unsafe) &&
(libxl_defbool_val(info->blackhole) ||
!libxl_defbool_val(info->netbuf) ||
@@ -838,7 +854,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
dss->live = 1;
dss->debug = 0;
dss->remus = info;
- dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
+ if (libxl_defbool_val(info->colo))
+ dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_COLO;
+ else
+ dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
assert(info);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 83164bc..eb1b45f 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4259,6 +4259,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
char rc_buf;
char *migration_domname;
struct domain_create dom_info;
+ const char *ha = checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO ?
+ "COLO" : "Remus";
signal(SIGPIPE, SIG_IGN);
/* if we get SIGPIPE we'd rather just have it as an error */
@@ -4279,6 +4281,9 @@ static void migrate_receive(int debug, int daemonize, int monitor,
dom_info.send_fd = send_fd;
dom_info.migration_domname_r = &migration_domname;
dom_info.checkpointed_stream = checkpointed;
+ if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
+ /* COLO uses stdout to send control messages to the primary */
+ dom_info.quiet = 1;
rc = create_domain(&dom_info);
if (rc < 0) {
@@ -4293,8 +4298,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
/* If we are here, it means that the sender (primary) has crashed.
* TODO: Split-Brain Check.
*/
- fprintf(stderr, "migration target: Remus Failover for domain %u\n",
- domid);
+ fprintf(stderr, "migration target: %s Failover for domain %u\n",
+ ha, domid);
/*
* If domain renaming fails, lets just continue (as we need the domain
@@ -4310,16 +4315,20 @@ static void migrate_receive(int debug, int daemonize, int monitor,
rc = libxl_domain_rename(ctx, domid, migration_domname,
common_domname);
if (rc)
- fprintf(stderr, "migration target (Remus): "
+ fprintf(stderr, "migration target (%s): "
"Failed to rename domain from %s to %s:%d\n",
- migration_domname, common_domname, rc);
+ ha, migration_domname, common_domname, rc);
}
+ if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
+ /* The guest is running after failover in COLO mode */
+ exit(rc ? -ERROR_FAIL: 0);
+
rc = libxl_domain_unpause(ctx, domid);
if (rc)
- fprintf(stderr, "migration target (Remus): "
+ fprintf(stderr, "migration target (%s): "
"Failed to unpause domain %s (id: %u):%d\n",
- common_domname, domid, rc);
+ ha, common_domname, domid, rc);
exit(rc ? -ERROR_FAIL: 0);
}
@@ -4465,7 +4474,7 @@ int main_migrate_receive(int argc, char **argv)
int checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
int opt;
- SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
+ SWITCH_FOREACH_OPT(opt, "Fedrc", NULL, "migrate-receive", 0) {
case 'F':
daemonize = 0;
break;
@@ -4479,6 +4488,9 @@ int main_migrate_receive(int argc, char **argv)
case 'r':
checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
break;
+ case 'c':
+ checkpointed = LIBXL_CHECKPOINTED_STREAM_COLO;
+ break;
}
if (argc-optind != 0) {
@@ -7967,11 +7979,8 @@ int main_remus(int argc, char **argv)
int config_len;
memset(&r_info, 0, sizeof(libxl_domain_remus_info));
- /* Defaults */
- r_info.interval = 200;
- libxl_defbool_setdefault(&r_info.blackhole, false);
- SWITCH_FOREACH_OPT(opt, "Fbundi:s:N:e", NULL, "remus", 2) {
+ SWITCH_FOREACH_OPT(opt, "Fbundi:s:N:ec", NULL, "remus", 2) {
case 'i':
r_info.interval = atoi(optarg);
break;
@@ -7999,11 +8008,32 @@ int main_remus(int argc, char **argv)
case 'e':
daemonize = 0;
break;
+ case 'c':
+ libxl_defbool_set(&r_info.colo, true);
}
domid = find_domain(argv[optind]);
host = argv[optind + 1];
+ /* Defaults */
+ libxl_defbool_setdefault(&r_info.blackhole, false);
+ libxl_defbool_setdefault(&r_info.colo, false);
+ if (!libxl_defbool_val(r_info.colo) && !r_info.interval)
+ r_info.interval = 200;
+
+ if (libxl_defbool_val(r_info.colo)) {
+ if (r_info.interval || libxl_defbool_val(r_info.blackhole)) {
+ fprintf(stderr, "Option -c conflicts with -i or -b\n");
+ exit(-1);
+ }
+
+ if (libxl_defbool_is_default(r_info.compression)) {
+ fprintf(stderr, "COLO can't be used with memory compression. "
+ "Disabling memory checkpoint compression...\n");
+ libxl_defbool_set(&r_info.compression, false);
+ }
+ }
+
if (!r_info.netbufscript)
r_info.netbufscript = default_remus_netbufscript;
@@ -8018,8 +8048,9 @@ int main_remus(int argc, char **argv)
if (!ssh_command[0]) {
rune = host;
} else {
- if (asprintf(&rune, "exec %s %s xl migrate-receive -r %s",
+ if (asprintf(&rune, "exec %s %s xl migrate-receive %s %s",
ssh_command, host,
+ libxl_defbool_val(r_info.colo) ? "-c" : "-r",
daemonize ? "" : " -e") < 0)
return 1;
}
@@ -8048,7 +8079,8 @@ int main_remus(int argc, char **argv)
* domain to force failover
*/
if (libxl_domain_info(ctx, 0, domid)) {
- fprintf(stderr, "Remus: Primary domain has been destroyed.\n");
+ fprintf(stderr, "%s: Primary domain has been destroyed.\n",
+ libxl_defbool_val(r_info.colo) ? "COLO" : "Remus");
close(send_fd);
return 0;
}
@@ -8060,7 +8092,8 @@ int main_remus(int argc, char **argv)
if (rc == ERROR_GUEST_TIMEDOUT)
fprintf(stderr, "Failed to suspend domain at primary.\n");
else {
- fprintf(stderr, "Remus: Backup failed? resuming domain at primary.\n");
+ fprintf(stderr, "%s: Backup failed? resuming domain at primary.\n",
+ libxl_defbool_val(r_info.colo) ? "COLO" : "Remus");
libxl_domain_resume(ctx, domid, 1, 0);
}
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 7f4759b..611accf 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -515,7 +515,9 @@ struct cmd_spec cmd_table[] = {
"-b Replicate memory checkpoints to /dev/null (blackhole).\n"
" Works only in unsafe mode.\n"
"-n Disable network output buffering. Works only in unsafe mode.\n"
- "-d Disable disk replication. Works only in unsafe mode."
+ "-d Disable disk replication. Works only in unsafe mode.\n"
+ "-c Enable COLO HA. Conflicts with -i and -b, and memory\n"
+ " checkpoint compression must be disabled."
},
#endif
{ "devd",
--
1.9.1
* [PATCH v7 COLO 11/18] Support colo mode for qemu disk
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
Usage: disk = ['...,colo,colo-params=xxx,active-disk=xxx,hidden-disk=xxx...']
The format of colo-params: host:port:exportname=xx
For QEMU block replication details:
http://wiki.qemu.org/Features/BlockReplication
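
As an illustration of the colo-params format above, here is a minimal
standalone sketch of how such a string can be split into its three
fields. It is not the parser added by this patch (that one allocates
through the libxl gc); struct colo_params, copy_field() and
colo_params_parse() are hypothetical names, and the fixed buffer sizes
are an assumption made to keep the example self-contained. Like the
format itself, the sketch splits on the first two ':' characters, so a
bare IPv6 address cannot be used as the host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed fields; not a libxl type. */
struct colo_params {
    char host[64];
    char port[16];
    char exportname[64];
};

/* Copy [start, end) into dst; fail on empty or oversized fields. */
static int copy_field(char *dst, size_t dstlen,
                      const char *start, const char *end)
{
    size_t len = (size_t)(end - start);

    if (len == 0 || len >= dstlen)
        return -1;
    memcpy(dst, start, len);
    dst[len] = '\0';
    return 0;
}

/* Returns 0 on success, -1 unless s is host:port:exportname=<name>. */
static int colo_params_parse(const char *s, struct colo_params *out)
{
    static const char key[] = "exportname=";
    const char *colon;

    assert(s && out);

    /* host: everything up to the first ':' */
    colon = strchr(s, ':');
    if (!colon || copy_field(out->host, sizeof(out->host), s, colon))
        return -1;
    s = colon + 1;

    /* port: everything up to the second ':' */
    colon = strchr(s, ':');
    if (!colon || copy_field(out->port, sizeof(out->port), s, colon))
        return -1;
    s = colon + 1;

    /* the remainder must be exportname=<non-empty name> */
    if (strncmp(s, key, sizeof(key) - 1))
        return -1;
    s += sizeof(key) - 1;

    return copy_field(out->exportname, sizeof(out->exportname),
                      s, s + strlen(s));
}
```

For example, a value such as "192.168.0.2:8889:exportname=qdisk1"
(address, port and name invented for illustration) would yield the
host, the port and the export name as separate strings, while a string
with a missing or empty field is rejected.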
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
docs/man/xl.pod.1 | 2 +-
docs/misc/xl-disk-configuration.txt | 38 ++++++
tools/libxl/libxl.c | 42 +++++-
tools/libxl/libxl_create.c | 25 +++-
tools/libxl/libxl_device.c | 38 ++++++
tools/libxl/libxl_dm.c | 262 ++++++++++++++++++++++++++++++++++--
tools/libxl/libxl_types.idl | 5 +
tools/libxl/libxlu_disk_l.l | 5 +
8 files changed, 406 insertions(+), 11 deletions(-)
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 4260c60..600facb 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -454,7 +454,7 @@ N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
Disk replication support is limited to DRBD disks.
COLO support in xl is still in experimental (proof-of-concept) phase.
- There is no support for network or disk at the moment.
+ There is no support for network at the moment.
B<OPTIONS>
diff --git a/docs/misc/xl-disk-configuration.txt b/docs/misc/xl-disk-configuration.txt
index 6a2118d..e366e8d 100644
--- a/docs/misc/xl-disk-configuration.txt
+++ b/docs/misc/xl-disk-configuration.txt
@@ -234,6 +234,44 @@ were intentionally created non-sparse to avoid fragmentation of the
file.
+===============
+COLO PARAMETERS
+===============
+
+
+colo
+----
+
+Enable COLO HA for the disk. For details of block replication in
+QEMU, see:
+http://wiki.qemu.org/Features/BlockReplication
+
+
+colo-params=host:port:exportname=<name>
+---------------------------------------
+
+Description: Address and port of the secondary host.
+ An NBD server runs on the secondary host;
+ exportname is the NBD server's disk export name.
+Mandatory: Yes when COLO is enabled
+
+
+active-disk
+-----------
+
+Description: Used by the secondary. The secondary guest's writes
+ are buffered in this disk.
+Mandatory: Yes when COLO is enabled
+
+
+hidden-disk
+-----------
+
+Description: Used by the secondary. It buffers the original
+ content of blocks that the primary VM modifies.
+Mandatory: Yes when COLO is enabled
+
+
============================================
DEPRECATED PARAMETERS, PREFIXES AND SYNTAXES
============================================
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 08ae7a7..db774e4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2241,6 +2241,8 @@ int libxl__device_disk_setdefault(libxl__gc *gc, libxl_device_disk *disk)
int rc;
libxl_defbool_setdefault(&disk->discard_enable, !!disk->readwrite);
+ libxl_defbool_setdefault(&disk->colo_enable, false);
+ libxl_defbool_setdefault(&disk->colo_restore_enable, false);
rc = libxl__resolve_domid(gc, disk->backend_domname, &disk->backend_domid);
if (rc < 0) return rc;
@@ -2441,6 +2443,14 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
flexarray_append(back, "params");
flexarray_append(back, libxl__sprintf(gc, "%s:%s",
libxl__device_disk_string_of_format(disk->format), disk->pdev_path));
+ if (libxl_defbool_val(disk->colo_enable)) {
+ flexarray_append(back, "colo-params");
+ flexarray_append(back, libxl__sprintf(gc, "%s", disk->colo_params));
+ flexarray_append(back, "active-disk");
+ flexarray_append(back, libxl__sprintf(gc, "%s", disk->active_disk));
+ flexarray_append(back, "hidden-disk");
+ flexarray_append(back, libxl__sprintf(gc, "%s", disk->hidden_disk));
+ }
assert(device->backend_kind == LIBXL__DEVICE_KIND_QDISK);
break;
default:
@@ -2555,7 +2565,10 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
goto cleanup;
}
- /* "params" may not be present; but everything else must be. */
+ /*
+ * "params" and "colo-params" may not be present; but everything
+ * else must be.
+ */
tmp = xs_read(ctx->xsh, XBT_NULL,
libxl__sprintf(gc, "%s/params", be_path), &len);
if (tmp && strchr(tmp, ':')) {
@@ -2565,6 +2578,33 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
disk->pdev_path = tmp;
}
+ tmp = xs_read(ctx->xsh, XBT_NULL,
+ libxl__sprintf(gc, "%s/colo-params", be_path), &len);
+ if (tmp) {
+ libxl_defbool_set(&disk->colo_enable, true);
+ disk->colo_params = tmp;
+ } else {
+ libxl_defbool_set(&disk->colo_enable, false);
+ }
+
+ if (libxl_defbool_val(disk->colo_enable)) {
+ tmp = xs_read(ctx->xsh, XBT_NULL,
+ libxl__sprintf(gc, "%s/active-disk", be_path), &len);
+ if (!tmp) {
+ LOG(ERROR, "Missing xenstore node %s/active-disk", be_path);
+ goto cleanup;
+ }
+ disk->active_disk = tmp;
+
+ tmp = xs_read(ctx->xsh, XBT_NULL,
+ libxl__sprintf(gc, "%s/hidden-disk", be_path), &len);
+ if (!tmp) {
+ LOG(ERROR, "Missing xenstore node %s/hidden-disk", be_path);
+ goto cleanup;
+ }
+ disk->hidden_disk = tmp;
+ }
+
tmp = libxl__xs_read(gc, XBT_NULL,
libxl__sprintf(gc, "%s/type", be_path));
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index aaa14e3..f7bf629 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1711,12 +1711,29 @@ static void domain_create_cb(libxl__egc *egc,
libxl__ao_complete(egc, ao, rc);
}
-
+
+static void set_disk_colo_restore(libxl_domain_config *d_config)
+{
+ int i;
+
+ for (i = 0; i < d_config->num_disks; i++)
+ libxl_defbool_set(&d_config->disks[i].colo_restore_enable, true);
+}
+
+static void unset_disk_colo_restore(libxl_domain_config *d_config)
+{
+ int i;
+
+ for (i = 0; i < d_config->num_disks; i++)
+ libxl_defbool_set(&d_config->disks[i].colo_restore_enable, false);
+}
+
int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
uint32_t *domid,
const libxl_asyncop_how *ao_how,
const libxl_asyncprogress_how *aop_console_how)
{
+ unset_disk_colo_restore(d_config);
return do_domain_create(ctx, d_config, domid, -1, -1, NULL,
ao_how, aop_console_how);
}
@@ -1727,6 +1744,12 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
const libxl_asyncop_how *ao_how,
const libxl_asyncprogress_how *aop_console_how)
{
+ if (params->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
+ set_disk_colo_restore(d_config);
+ } else {
+ unset_disk_colo_restore(d_config);
+ }
+
return do_domain_create(ctx, d_config, domid, restore_fd, send_fd, params,
ao_how, aop_console_how);
}
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 93bb41e..df29bc3 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -196,6 +196,10 @@ static int disk_try_backend(disk_try_backend_args *a,
goto bad_format;
}
+ if (libxl_defbool_val(a->disk->colo_enable) ||
+ a->disk->active_disk || a->disk->hidden_disk)
+ goto bad_colo;
+
if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) {
LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, "
"skipping physical device check", a->disk->vdev);
@@ -218,6 +222,10 @@ static int disk_try_backend(disk_try_backend_args *a,
case LIBXL_DISK_BACKEND_TAP:
if (a->disk->script) goto bad_script;
+ if (libxl_defbool_val(a->disk->colo_enable) ||
+ a->disk->active_disk || a->disk->hidden_disk)
+ goto bad_colo;
+
if (a->disk->is_cdrom) {
LOG(DEBUG, "Disk vdev=%s, backend tap unsuitable for cdroms",
a->disk->vdev);
@@ -236,6 +244,16 @@ static int disk_try_backend(disk_try_backend_args *a,
case LIBXL_DISK_BACKEND_QDISK:
if (a->disk->script) goto bad_script;
+ if (libxl_defbool_val(a->disk->colo_enable)) {
+ if (!a->disk->colo_params)
+ goto bad_colo_params;
+
+ if (!a->disk->active_disk)
+ goto bad_active_disk;
+
+ if (!a->disk->hidden_disk)
+ goto bad_hidden_disk;
+ }
return backend;
default:
@@ -256,6 +274,26 @@ static int disk_try_backend(disk_try_backend_args *a,
LOG(DEBUG, "Disk vdev=%s, backend %s not compatible with script=...",
a->disk->vdev, libxl_disk_backend_to_string(backend));
return 0;
+
+ bad_colo:
+ LOG(DEBUG, "Disk vdev=%s, backend %s not compatible with colo",
+ a->disk->vdev, libxl_disk_backend_to_string(backend));
+ return 0;
+
+ bad_colo_params:
+ LOG(DEBUG, "Disk vdev=%s, backend %s needs colo-params=... for colo",
+ a->disk->vdev, libxl_disk_backend_to_string(backend));
+ return 0;
+
+ bad_active_disk:
+ LOG(DEBUG, "Disk vdev=%s, backend %s needs active-disk=... for colo",
+ a->disk->vdev, libxl_disk_backend_to_string(backend));
+ return 0;
+
+ bad_hidden_disk:
+ LOG(DEBUG, "Disk vdev=%s, backend %s needs hidden-disk=... for colo",
+ a->disk->vdev, libxl_disk_backend_to_string(backend));
+ return 0;
}
int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 33f9ce6..ac97baa 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -427,6 +427,211 @@ static char *dm_spice_options(libxl__gc *gc,
return opt;
}
+/* colo mode */
+enum {
+ LIBXL__COLO_NONE = 0,
+ LIBXL__COLO_PRIMARY,
+ LIBXL__COLO_SECONDARY,
+};
+
+/* The format of colo-params: host:port:exportname=xx */
+static int parse_colo_params(libxl__gc *gc, const char *colo_params,
+ const char **host, const char **port,
+ const char **exportname)
+{
+ const char *delim;
+
+ delim = strstr(colo_params, ":");
+ if (!delim)
+ return 1;
+ if (delim == colo_params)
+ return 1;
+ *host = libxl__strndup(gc, colo_params, delim - colo_params);
+ colo_params = delim + 1;
+
+ delim = strstr(colo_params, ":");
+ if (!delim)
+ return 1;
+ if (delim == colo_params)
+ return 1;
+ *port = libxl__strndup(gc, colo_params, delim - colo_params);
+ colo_params = delim + 1;
+
+ if (strncmp(colo_params, "exportname=", strlen("exportname=")))
+ return 1;
+ *exportname = colo_params + strlen("exportname=");
+ if ((*exportname)[0] == 0)
+ return 1;
+
+ return 0;
+}
+
+static char *qemu_disk_scsi_drive_string(libxl__gc *gc, const char *pdev_path,
+ int unit, const char *format,
+ const libxl_device_disk *disk,
+ const char *nbd_target,
+ int colo_mode)
+{
+ char *drive = NULL;
+ const char *host = NULL, *port = NULL, *exportname = NULL;
+ libxl_ctx *ctx = libxl__gc_owner(gc);
+ const char *colo_params = disk->colo_params;
+ const char *active_disk = disk->active_disk;
+ const char *hidden_disk = disk->hidden_disk;
+
+ switch (colo_mode) {
+ case LIBXL__COLO_NONE:
+ drive = libxl__sprintf
+ (gc, "file=%s,if=scsi,bus=0,unit=%d,format=%s,cache=writeback",
+ pdev_path, unit, format);
+ break;
+ case LIBXL__COLO_PRIMARY:
+ /*
+ * primary:
+ * -drive if=scsi,bus=0,unit=x,cache=writeback,driver=quorum,\
+ * children.0.file.filename=pdev_path,\
+ * children.0.driver=format,\
+ * children.1.file.host=host,\
+ * children.1.file.port=port,\
+ * children.1.file.export=exportname,\
+ * children.1.file.driver=nbd+colo,\
+ * children.1.driver=raw,\
+ * children.1.ignore-errors=on,\
+ * read-pattern=fifo
+ */
+
+ if (parse_colo_params(gc, colo_params, &host, &port, &exportname))
+ break;
+
+ drive = libxl__sprintf
+ (gc, "if=scsi,bus=0,unit=%d,cache=writeback,driver=quorum,"
+ "children.0.file.filename=%s,"
+ "children.0.driver=%s,"
+ "children.1.file.host=%s,"
+ "children.1.file.port=%s,"
+ "children.1.file.export=%s,"
+ "children.1.file.driver=nbd+colo,"
+ "children.1.driver=raw,"
+ "children.1.ignore-errors=on,"
+ "read-pattern=fifo",
+ unit, pdev_path, format, host, port, exportname);
+ break;
+ case LIBXL__COLO_SECONDARY:
+ /*
+ * secondary:
+ * -drive if=scsi,bus=0,unit=x,cache=writeback,driver=qcow2+colo,\
+ * file=active_disk,\
+ * backing_reference.drive_id=nbd_target,\
+ * backing_reference.hidden-disk.file.filename=hidden_disk,\
+ * backing_reference.hidden-disk.allow-write-backing-file=on,\
+ * export=exportname,
+ */
+
+ if (parse_colo_params(gc, colo_params, &host, &port, &exportname))
+ break;
+
+ drive = libxl__sprintf
+ (gc, "if=scsi,bus=0,unit=%d,cache=writeback,driver=qcow2+colo,"
+ "file=%s,"
+ "backing_reference.drive_id=%s,"
+ "backing_reference.hidden-disk.file.filename=%s,"
+ "backing_reference.hidden-disk.allow-write-backing-file=on,"
+ "export=%s",
+ unit, active_disk, nbd_target, hidden_disk, exportname);
+ break;
+ default:
+ abort();
+ }
+
+ if (!drive)
+ LIBXL__LOG(ctx, LIBXL__LOG_WARNING,
+ "colo-params is invalid for %s", pdev_path);
+ return drive;
+}
+
+static char *qemu_disk_ide_drive_string(libxl__gc *gc, const char *pdev_path,
+ int unit, const char *format,
+ const libxl_device_disk *disk,
+ const char *nbd_target,
+ int colo_mode)
+{
+ char *drive = NULL;
+ const char *host = NULL, *port = NULL, *exportname = NULL;
+ libxl_ctx *ctx = libxl__gc_owner(gc);
+ const char *colo_params = disk->colo_params;
+ const char *active_disk = disk->active_disk;
+ const char *hidden_disk = disk->hidden_disk;
+
+ switch (colo_mode) {
+ case LIBXL__COLO_NONE:
+ drive = libxl__sprintf
+ (gc, "file=%s,if=ide,index=%d,media=disk,format=%s,cache=writeback",
+ pdev_path, unit, format);
+ break;
+ case LIBXL__COLO_PRIMARY:
+ /*
+ * primary:
+ * -drive if=ide,index=x,media=disk,cache=writeback,driver=quorum,\
+ * children.0.file.filename=pdev_path,\
+ * children.0.driver=format,\
+ * children.1.file.host=host,\
+ * children.1.file.port=port,\
+ * children.1.file.export=exportname,\
+ * children.1.file.driver=nbd+colo,\
+ * children.1.driver=raw,\
+ * children.1.ignore-errors=on,\
+ * read-pattern=fifo
+ */
+
+ if (parse_colo_params(gc, colo_params, &host, &port, &exportname))
+ break;
+
+ drive = libxl__sprintf
+ (gc, "if=ide,index=%d,media=disk,cache=writeback,driver=quorum,"
+ "children.0.file.filename=%s,"
+ "children.0.driver=%s,"
+ "children.1.file.host=%s,"
+ "children.1.file.port=%s,"
+ "children.1.file.export=%s,"
+ "children.1.file.driver=nbd+colo,"
+ "children.1.driver=raw,"
+ "children.1.ignore-errors=on,"
+ "read-pattern=fifo",
+ unit, pdev_path, format, host, port, exportname);
+ break;
+ case LIBXL__COLO_SECONDARY:
+ /*
+ * secondary:
+ * -drive if=ide,index=x,media=disk,cache=writeback,driver=qcow2+colo,\
+ * file=active_disk,\
+ * backing_reference.drive_id=nbd_target,\
+ * backing_reference.hidden-disk.file.filename=hidden_disk,\
+ * backing_reference.hidden-disk.allow-write-backing-file=on,\
+ * export=exportname,
+ */
+
+ if (parse_colo_params(gc, colo_params, &host, &port, &exportname))
+ break;
+
+ drive = libxl__sprintf
+ (gc, "if=ide,index=%d,media=disk,cache=writeback,driver=qcow2+colo,"
+ "file=%s,"
+ "backing_reference.drive_id=%s,"
+ "backing_reference.hidden-disk.file.filename=%s,"
+ "backing_reference.hidden-disk.allow-write-backing-file=on,"
+ "export=%s",
+ unit, active_disk, nbd_target, hidden_disk, exportname);
+ break;
+ default:
+ abort();
+ }
+
+ if (!drive)
+ LIBXL__LOG(ctx, LIBXL__LOG_WARNING,
+ "colo-params is invalid for %s", pdev_path);
+ return drive;
+}
+
static int libxl__build_device_model_args_new(libxl__gc *gc,
const char *dm, int guest_domid,
const libxl_domain_config *guest_config,
@@ -825,6 +1030,8 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
const char *format = qemu_disk_format_string(disks[i].format);
char *drive;
const char *pdev_path;
+ int colo_mode;
+ char *drive_id;
if (dev_number == -1) {
LIBXL__LOG(ctx, LIBXL__LOG_WARNING, "unable to determine"
@@ -868,16 +1075,55 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
* For other disks we translate devices 0..3 into
* hd[a-d] and ignore the rest.
*/
- if (strncmp(disks[i].vdev, "sd", 2) == 0)
- drive = libxl__sprintf
- (gc, "file=%s,if=scsi,bus=0,unit=%d,format=%s,cache=writeback",
- pdev_path, disk, format);
- else if (disk < 4)
+ if (libxl_defbool_val(disks[i].colo_enable)) {
+ if (libxl_defbool_val(disks[i].colo_restore_enable))
+ colo_mode = LIBXL__COLO_SECONDARY;
+ else
+ colo_mode = LIBXL__COLO_PRIMARY;
+ } else {
+ colo_mode = LIBXL__COLO_NONE;
+ }
+
+ if (colo_mode == LIBXL__COLO_SECONDARY) {
+ /*
+ * -drive if=none,driver=format,file=pdev_path,\
+ * id=nbd_targetx
+ */
+ if (strncmp(disks[i].vdev, "sd", 2) == 0) {
+ drive_id = libxl__sprintf(gc, "nbd_target%d", disk + 4);
+ } else if (disk < 4) {
+ drive_id = libxl__sprintf(gc, "nbd_target%d", disk);
+ } else {
+ continue; /* Do not emulate this disk */
+ }
drive = libxl__sprintf
- (gc, "file=%s,if=ide,index=%d,media=disk,format=%s,cache=writeback",
- pdev_path, disk, format);
- else
+ (gc, "if=none,driver=%s,file=%s,id=%s",
+ format, pdev_path, drive_id);
+
+ flexarray_append(dm_args, "-drive");
+ flexarray_append(dm_args, drive);
+ } else {
+ drive_id = NULL;
+ }
+
+ if (strncmp(disks[i].vdev, "sd", 2) == 0) {
+ drive = qemu_disk_scsi_drive_string(gc, pdev_path, disk,
+ format,
+ &disks[i],
+ drive_id,
+ colo_mode);
+ } else if (disk < 4) {
+ drive = qemu_disk_ide_drive_string(gc, pdev_path, disk,
+ format,
+ &disks[i],
+ drive_id,
+ colo_mode);
+ } else {
continue; /* Do not emulate this disk */
+ }
+
+ if (!drive)
+ continue;
}
flexarray_append(dm_args, "-drive");
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index cf1eeb2..9adc3ce 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -517,6 +517,11 @@ libxl_device_disk = Struct("device_disk", [
("is_cdrom", integer),
("direct_io_safe", bool),
("discard_enable", libxl_defbool),
+ ("colo_enable", libxl_defbool),
+ ("colo_restore_enable", libxl_defbool),
+ ("colo_params", string),
+ ("active_disk", string),
+ ("hidden_disk", string)
])
libxl_device_nic = Struct("device_nic", [
diff --git a/tools/libxl/libxlu_disk_l.l b/tools/libxl/libxlu_disk_l.l
index 1a5deb5..566aa1e 100644
--- a/tools/libxl/libxlu_disk_l.l
+++ b/tools/libxl/libxlu_disk_l.l
@@ -176,6 +176,11 @@ script=[^,]*,? { STRIP(','); SAVESTRING("script", script, FROMEQUALS); }
direct-io-safe,? { DPC->disk->direct_io_safe = 1; }
discard,? { libxl_defbool_set(&DPC->disk->discard_enable, true); }
no-discard,? { libxl_defbool_set(&DPC->disk->discard_enable, false); }
+colo,? { libxl_defbool_set(&DPC->disk->colo_enable, true); }
+no-colo,? { libxl_defbool_set(&DPC->disk->colo_enable, false); }
+colo-params=[^,]*,? { STRIP(','); SAVESTRING("colo-params", colo_params, FROMEQUALS); }
+active-disk=[^,]*,? { STRIP(','); SAVESTRING("active-disk", active_disk, FROMEQUALS); }
+hidden-disk=[^,]*,? { STRIP(','); SAVESTRING("hidden-disk", hidden_disk, FROMEQUALS); }
/* the target magic parameter, eats the rest of the string */
--
1.9.1
* [PATCH v7 COLO 12/18] COLO: use qemu block replication
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
From: Wen Congyang <wency@cn.fujitsu.com>
Use qemu block replication as our block replication solution.
Note that the guest must be paused before COLO starts; otherwise
the disk will not be consistent between primary and secondary.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
for commit message,
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
tools/libxl/Makefile | 1 +
tools/libxl/libxl_colo_qdisk.c | 209 +++++++++++++++++++++++++++++++++++++++
tools/libxl/libxl_colo_restore.c | 20 +++-
tools/libxl/libxl_colo_save.c | 36 ++++++-
tools/libxl/libxl_internal.h | 18 ++++
tools/libxl/libxl_qmp.c | 31 ++++++
6 files changed, 311 insertions(+), 4 deletions(-)
create mode 100644 tools/libxl/libxl_colo_qdisk.c
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 252c4e9..2e62b88 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -58,6 +58,7 @@ endif
LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
+LIBXL_OBJS-y += libxl_colo_qdisk.o
LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo_qdisk.c b/tools/libxl/libxl_colo_qdisk.c
new file mode 100644
index 0000000..d73572e
--- /dev/null
+++ b/tools/libxl/libxl_colo_qdisk.c
@@ -0,0 +1,209 @@
+/*
+ * Copyright (C) 2015 FUJITSU LIMITED
+ * Author: Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+typedef struct libxl__colo_qdisk {
+ libxl__checkpoint_device *dev;
+} libxl__colo_qdisk;
+
+/* ========== init() and cleanup() ========== */
+int init_subkind_qdisk(libxl__checkpoint_devices_state *cds)
+{
+ /*
+ * We don't know if we use qemu block replication, so
+ * we cannot start block replication here.
+ */
+ return 0;
+}
+
+void cleanup_subkind_qdisk(libxl__checkpoint_devices_state *cds)
+{
+}
+
+/* ========== setup() and teardown() ========== */
+static void colo_qdisk_setup(libxl__egc *egc, libxl__checkpoint_device *dev,
+ bool primary)
+{
+ const libxl_device_disk *disk = dev->backend_dev;
+ const char *addr = NULL;
+ const char *export_name;
+ int ret, rc = 0;
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *const cds = dev->cds;
+ const char *colo_params = disk->colo_params;
+ const int domid = cds->domid;
+
+ EGC_GC;
+
+ if (disk->backend != LIBXL_DISK_BACKEND_QDISK ||
+ !libxl_defbool_val(disk->colo_enable)) {
+ rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
+ goto out;
+ }
+
+ export_name = strstr(colo_params, ":exportname=");
+ if (!export_name) {
+ rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
+ goto out;
+ }
+ export_name += strlen(":exportname=");
+ if (export_name[0] == 0) {
+ rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
+ goto out;
+ }
+
+ dev->matched = 1;
+
+ if (primary) {
+ /* NBD server is not ready, so we cannot start block replication now */
+ goto out;
+ } else {
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+ int len;
+
+ if (crs->qdisk_setuped)
+ goto out;
+
+ crs->qdisk_setuped = true;
+
+ len = export_name - strlen(":exportname=") - colo_params;
+ addr = libxl__strndup(gc, colo_params, len);
+ }
+
+ ret = libxl__qmp_block_start_replication(gc, domid, primary, addr);
+ if (ret)
+ rc = ERROR_FAIL;
+
+out:
+ dev->aodev.rc = rc;
+ dev->aodev.callback(egc, &dev->aodev);
+}
+
+static void colo_qdisk_teardown(libxl__egc *egc, libxl__checkpoint_device *dev,
+ bool primary)
+{
+ int ret, rc = 0;
+
+ /* Convenience aliases */
+ libxl__checkpoint_devices_state *const cds = dev->cds;
+ const int domid = cds->domid;
+
+ EGC_GC;
+
+ if (primary) {
+ libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+
+ if (!css->qdisk_setuped)
+ goto out;
+
+ css->qdisk_setuped = false;
+ } else {
+ libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+
+ if (!crs->qdisk_setuped)
+ goto out;
+
+ crs->qdisk_setuped = false;
+ }
+
+ ret = libxl__qmp_block_stop_replication(gc, domid, primary);
+ if (ret)
+ rc = ERROR_FAIL;
+
+out:
+ dev->aodev.rc = rc;
+ dev->aodev.callback(egc, &dev->aodev);
+}
+
+/* ========== checkpointing APIs ========== */
+/* should be called after libxl__checkpoint_device_instance_ops.preresume */
+int colo_qdisk_preresume(libxl_ctx *ctx, domid_t domid)
+{
+ GC_INIT(ctx);
+ int ret;
+
+ ret = libxl__qmp_block_do_checkpoint(gc, domid);
+
+ GC_FREE;
+ return ret;
+}
+
+static void colo_qdisk_save_preresume(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(dev->cds, *css, cds);
+ int ret, rc = 0;
+
+ /* Convenience aliases */
+ const int domid = dev->cds->domid;
+
+ EGC_GC;
+
+ if (css->qdisk_setuped)
+ goto out;
+
+ css->qdisk_setuped = true;
+
+ ret = libxl__qmp_block_start_replication(gc, domid, true, NULL);
+ if (ret)
+ rc = ERROR_FAIL;
+
+out:
+ dev->aodev.rc = rc;
+ dev->aodev.callback(egc, &dev->aodev);
+}
+
+/* ======== primary ======== */
+static void colo_qdisk_save_setup(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ colo_qdisk_setup(egc, dev, true);
+}
+
+static void colo_qdisk_save_teardown(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ colo_qdisk_teardown(egc, dev, true);
+}
+
+const libxl__checkpoint_device_instance_ops colo_save_device_qdisk = {
+ .kind = LIBXL__DEVICE_KIND_VBD,
+ .setup = colo_qdisk_save_setup,
+ .teardown = colo_qdisk_save_teardown,
+ .preresume = colo_qdisk_save_preresume,
+};
+
+/* ======== secondary ======== */
+static void colo_qdisk_restore_setup(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ colo_qdisk_setup(egc, dev, false);
+}
+
+static void colo_qdisk_restore_teardown(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ colo_qdisk_teardown(egc, dev, false);
+}
+
+const libxl__checkpoint_device_instance_ops colo_restore_device_qdisk = {
+ .kind = LIBXL__DEVICE_KIND_VBD,
+ .setup = colo_qdisk_restore_setup,
+ .teardown = colo_qdisk_restore_teardown,
+};
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index ada9a35..0a58b86 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -49,7 +49,10 @@ static void libxl__colo_restore_domain_checkpoint_callback(void *data);
static void libxl__colo_restore_domain_should_checkpoint_callback(void *data);
static void libxl__colo_restore_domain_suspend_callback(void *data);
+extern const libxl__checkpoint_device_instance_ops colo_restore_device_qdisk;
+
static const libxl__checkpoint_device_instance_ops *colo_restore_ops[] = {
+ &colo_restore_device_qdisk,
NULL,
};
@@ -148,7 +151,11 @@ static int init_device_subkind(libxl__checkpoint_devices_state *cds)
int rc;
STATE_AO_GC(cds->ao);
+ rc = init_subkind_qdisk(cds);
+ if (rc) goto out;
+
rc = 0;
+out:
return rc;
}
@@ -156,6 +163,8 @@ static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
{
/* cleanup device subkind-specific state in the libxl ctx */
STATE_AO_GC(cds->ao);
+
+ cleanup_subkind_qdisk(cds);
}
@@ -215,6 +224,7 @@ void libxl__colo_restore_setup(libxl__egc *egc,
GCNEW(crcs);
crs->crcs = crcs;
crcs->crs = crs;
+ crs->qdisk_setuped = false;
/* setup dsps */
crcs->dsps.ao = ao;
@@ -518,6 +528,12 @@ static void colo_restore_preresume_cb(libxl__egc *egc,
goto out;
}
+ rc = colo_qdisk_preresume(CTX, crs->domid);
+ if (rc) {
+ LOG(ERROR, "colo_qdisk_preresume() fails");
+ goto out;
+ }
+
colo_restore_resume_vm(egc, crcs);
return;
@@ -673,8 +689,8 @@ static void colo_setup_checkpoint_devices(libxl__egc *egc,
STATE_AO_GC(crs->ao);
- /* TODO: disk/nic support */
- cds->device_kind_flags = 0;
+ /* TODO: nic support */
+ cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
cds->callback = colo_restore_setup_cds_done;
cds->ao = ao;
cds->domid = crs->domid;
diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
index 4e059cc..633887b 100644
--- a/tools/libxl/libxl_colo_save.c
+++ b/tools/libxl/libxl_colo_save.c
@@ -19,7 +19,10 @@
#include "libxl_internal.h"
#include "libxl_colo.h"
+extern const libxl__checkpoint_device_instance_ops colo_save_device_qdisk;
+
static const libxl__checkpoint_device_instance_ops *colo_ops[] = {
+ &colo_save_device_qdisk,
NULL,
};
@@ -30,7 +33,11 @@ static int init_device_subkind(libxl__checkpoint_devices_state *cds)
int rc;
STATE_AO_GC(cds->ao);
+ rc = init_subkind_qdisk(cds);
+ if (rc) goto out;
+
rc = 0;
+out:
return rc;
}
@@ -38,6 +45,8 @@ static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
{
/* cleanup device subkind-specific state in the libxl ctx */
STATE_AO_GC(cds->ao);
+
+ cleanup_subkind_qdisk(cds);
}
/* ================= colo: setup save environment ================= */
@@ -65,9 +74,11 @@ void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
css->send_fd = dss->fd;
css->recv_fd = dss->recv_fd;
css->svm_running = false;
+ css->paused = true;
+ css->qdisk_setuped = false;
- /* TODO: disk/nic support */
- cds->device_kind_flags = 0;
+ /* TODO: nic support */
+ cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
cds->ops = colo_ops;
cds->callback = colo_save_setup_done;
cds->ao = ao;
@@ -388,12 +399,33 @@ static void colo_preresume_cb(libxl__egc *egc,
goto out;
}
+ if (!css->paused) {
+ rc = colo_qdisk_preresume(CTX, dss->domid);
+ if (rc) {
+ LOG(ERROR, "colo_qdisk_preresume() fails");
+ goto out;
+ }
+ }
+
/* Resumes the domain and the device model */
if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1)) {
LOG(ERROR, "cannot resume primary vm");
goto out;
}
+ /*
+ * The guest should be paused before doing colo because there is
+ * no disk migration.
+ */
+ if (css->paused) {
+ rc = libxl_domain_unpause(CTX, dss->domid);
+ if (rc) {
+ LOG(ERROR, "cannot unpause primary vm");
+ goto out;
+ }
+ css->paused = false;
+ }
+
/* read COLO_SVM_RESUMED */
css->callback = colo_read_svm_resumed_done;
css->srs.read_records_callback = colo_common_read_stream_done;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bb5e298..18adf66 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1662,6 +1662,14 @@ _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enabl
_hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const libxl_device_disk *disk);
/* Add a virtual CPU */
_hidden int libxl__qmp_cpu_add(libxl__gc *gc, int domid, int index);
+/* Start block replication */
+_hidden int libxl__qmp_block_start_replication(libxl__gc *gc, int domid,
+ bool primary, const char *addr);
+/* Do block checkpoint */
+_hidden int libxl__qmp_block_do_checkpoint(libxl__gc *gc, int domid);
+/* Stop block replication */
+_hidden int libxl__qmp_block_stop_replication(libxl__gc *gc, int domid,
+ bool primary);
/* close and free the QMP handler */
_hidden void libxl__qmp_close(libxl__qmp_handler *qmp);
/* remove the socket file, if the file has already been removed,
@@ -2735,6 +2743,9 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds);
void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds);
int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
+int init_subkind_qdisk(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_qdisk(libxl__checkpoint_devices_state *cds);
+int colo_qdisk_preresume(libxl_ctx *ctx, domid_t domid);
typedef void libxl__checkpoint_callback(libxl__egc *,
libxl__checkpoint_devices_state *,
@@ -2932,6 +2943,10 @@ struct libxl__colo_save_state {
libxl__stream_read_state srs;
void (*callback)(libxl__egc *, libxl__colo_save_state *, int);
bool svm_running;
+ bool paused;
+
+ /* private, used by qdisk block replication */
+ bool qdisk_setuped;
};
/*----- Domain suspend (save) state structure -----*/
@@ -3314,6 +3329,9 @@ struct libxl__colo_restore_state {
libxl__domain_create_cb *saved_cb;
void *crcs;
libxl__checkpoint_devices_state cds;
+
+ /* private, used by qdisk block replication */
+ bool qdisk_setuped;
};
struct libxl__domain_create_state {
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index a6f1a21..9714bdf 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -965,6 +965,37 @@ int libxl__qmp_cpu_add(libxl__gc *gc, int domid, int idx)
return qmp_run_command(gc, domid, "cpu-add", args, NULL, NULL);
}
+int libxl__qmp_block_start_replication(libxl__gc *gc, int domid,
+ bool primary, const char *addr)
+{
+ libxl__json_object *args = NULL;
+
+ qmp_parameters_add_bool(gc, &args, "enable", true);
+ qmp_parameters_add_bool(gc, &args, "primary", primary);
+ if (!primary)
+ qmp_parameters_add_string(gc, &args, "addr", addr);
+
+ return qmp_run_command(gc, domid, "xen-set-block-replication", args,
+ NULL, NULL);
+}
+
+int libxl__qmp_block_do_checkpoint(libxl__gc *gc, int domid)
+{
+ return qmp_run_command(gc, domid, "xen-do-block-checkpoint", NULL,
+ NULL, NULL);
+}
+
+int libxl__qmp_block_stop_replication(libxl__gc *gc, int domid, bool primary)
+{
+ libxl__json_object *args = NULL;
+
+ qmp_parameters_add_bool(gc, &args, "enable", false);
+ qmp_parameters_add_bool(gc, &args, "primary", primary);
+
+ return qmp_run_command(gc, domid, "xen-set-block-replication", args,
+ NULL, NULL);
+}
+
int libxl__qmp_initializations(libxl__gc *gc, uint32_t domid,
const libxl_domain_config *guest_config)
{
--
1.9.1
* [PATCH v7 COLO 13/18] COLO proxy: implement setup/teardown of COLO proxy module
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
` (11 preceding siblings ...)
2015-06-25 6:31 ` [PATCH v7 COLO 12/18] COLO: use qemu block replication Yang Hongyang
@ 2015-06-25 6:31 ` Yang Hongyang
2015-06-25 6:31 ` [PATCH v7 COLO 14/18] COLO proxy: preresume, postresume and checkpoint Yang Hongyang
` (5 subsequent siblings)
18 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
Implement setup/teardown of the COLO proxy module.
We use netlink to communicate with the proxy module.
About colo-proxy module:
https://lkml.org/lkml/2015/6/18/32
How to use:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
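The requests sent to the proxy carry no payload, so only the netlink header
matters. The following is a rough sketch of how such a header could be
filled in. The enum values are the ones this patch defines; the helper name
colo_fill_header() is an assumption for illustration, and unlike the patch
the sketch zeroes the whole header first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Message types as defined by this patch. */
enum colo_netlink_op {
    COLO_QUERY_CHECKPOINT = (NLMSG_MIN_TYPE + 1),
    COLO_CHECKPOINT,
    COLO_FAILOVER,
    COLO_PROXY_INIT,
    COLO_PROXY_RESET,
};

/*
 * Fill a payload-less netlink header sent from port `index`.
 * Only COLO_PROXY_INIT asks the kernel for an acknowledgement.
 */
void colo_fill_header(struct nlmsghdr *h, uint32_t index, int type)
{
    assert(h);

    memset(h, 0, sizeof(*h));   /* no stale stack bytes reach the kernel */
    h->nlmsg_len = NLMSG_SPACE(0);
    h->nlmsg_type = (uint16_t)type;
    h->nlmsg_flags = NLM_F_REQUEST;
    if (type == COLO_PROXY_INIT)
        h->nlmsg_flags |= NLM_F_ACK;
    h->nlmsg_pid = index;
}
```

The filled header would then be handed to sendmsg() through a single iovec,
with a zeroed sockaddr_nl (nl_pid = 0) as the destination, i.e. the kernel.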
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
tools/libxl/Makefile | 1 +
tools/libxl/libxl_colo.h | 2 +
tools/libxl/libxl_colo_proxy.c | 210 +++++++++++++++++++++++++++++++++++++++++
tools/libxl/libxl_internal.h | 12 +++
4 files changed, 225 insertions(+)
create mode 100644 tools/libxl/libxl_colo_proxy.c
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 2e62b88..1beef6c 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -59,6 +59,7 @@ endif
LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
LIBXL_OBJS-y += libxl_colo_qdisk.o
+LIBXL_OBJS-y += libxl_colo_proxy.o
LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 49a430b..46ca4cf 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -34,4 +34,6 @@ extern void libxl__colo_save_teardown(libxl__egc *egc,
libxl__colo_save_state *css,
int rc);
+extern int colo_proxy_setup(libxl__colo_proxy_state *cps);
+extern void colo_proxy_teardown(libxl__colo_proxy_state *cps);
#endif
diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
new file mode 100644
index 0000000..9f1243e
--- /dev/null
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -0,0 +1,210 @@
+/*
+ * Copyright (C) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang <yanghy@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+#include "libxl_colo.h"
+#include <linux/netlink.h>
+
+#define NETLINK_COLO 28
+
+enum colo_netlink_op {
+ COLO_QUERY_CHECKPOINT = (NLMSG_MIN_TYPE + 1),
+ COLO_CHECKPOINT,
+ COLO_FAILOVER,
+ COLO_PROXY_INIT,
+ COLO_PROXY_RESET, /* UNUSED, will be used for continuous FT */
+};
+
+/* ========= colo-proxy: helper functions ========== */
+
+static int colo_proxy_send(libxl__colo_proxy_state *cps, uint8_t *buff, uint64_t size, int type)
+{
+ struct sockaddr_nl sa;
+ struct nlmsghdr msg;
+ struct iovec iov;
+ struct msghdr mh;
+ int ret;
+
+ STATE_AO_GC(cps->ao);
+
+ memset(&sa, 0, sizeof(sa));
+ sa.nl_family = AF_NETLINK;
+ sa.nl_pid = 0;
+ sa.nl_groups = 0;
+
+ msg.nlmsg_len = NLMSG_SPACE(0);
+ msg.nlmsg_flags = NLM_F_REQUEST;
+ if (type == COLO_PROXY_INIT) {
+ msg.nlmsg_flags |= NLM_F_ACK;
+ }
+ msg.nlmsg_seq = 0;
+ /* This is untrusted */
+ msg.nlmsg_pid = cps->index;
+ msg.nlmsg_type = type;
+
+ iov.iov_base = &msg;
+ iov.iov_len = msg.nlmsg_len;
+
+ mh.msg_name = &sa;
+ mh.msg_namelen = sizeof(sa);
+ mh.msg_iov = &iov;
+ mh.msg_iovlen = 1;
+ mh.msg_control = NULL;
+ mh.msg_controllen = 0;
+ mh.msg_flags = 0;
+
+ ret = sendmsg(cps->sock_fd, &mh, 0);
+ if (ret <= 0) {
+ LOG(ERROR, "can't send msg to kernel by netlink: %s",
+ strerror(errno));
+ }
+
+ return ret;
+}
+
+/* returns bytes received (> 0); on error or EOF returns <= 0, *buff = NULL */
+static int64_t colo_proxy_recv(libxl__colo_proxy_state *cps, uint8_t **buff, int flags)
+{
+ struct sockaddr_nl sa;
+ struct iovec iov;
+ struct msghdr mh = {
+ .msg_name = &sa,
+ .msg_namelen = sizeof(sa),
+ .msg_iov = &iov,
+ .msg_iovlen = 1,
+ };
+ uint32_t size = 16384;
+ int64_t len = 0;
+ int ret;
+
+ STATE_AO_GC(cps->ao);
+ uint8_t *tmp = libxl__malloc(NOGC, size);
+
+ iov.iov_base = tmp;
+ iov.iov_len = size;
+next:
+ ret = recvmsg(cps->sock_fd, &mh, flags);
+ if (ret <= 0) {
+ goto out;
+ }
+
+ len += ret;
+ if (mh.msg_flags & MSG_TRUNC) {
+ size += 16384;
+ tmp = libxl__realloc(NOGC, tmp, size);
+ iov.iov_base = tmp + len;
+ iov.iov_len = size - len;
+ goto next;
+ }
+
+ *buff = tmp;
+ return len;
+
+out:
+ free(tmp);
+ *buff = NULL;
+ return ret;
+}
+
+/* ========= colo-proxy: setup and teardown ========== */
+
+int colo_proxy_setup(libxl__colo_proxy_state *cps)
+{
+ int skfd = 0;
+ struct sockaddr_nl sa;
+ struct nlmsghdr *h;
+ struct timeval tv = {0, 500000}; /* timeout for recvmsg from kernel */
+ int i = 1;
+ int ret = ERROR_FAIL;
+ uint8_t *buff = NULL;
+ int64_t size;
+
+ STATE_AO_GC(cps->ao);
+
+ skfd = socket(PF_NETLINK, SOCK_RAW, NETLINK_COLO);
+ if (skfd < 0) {
+ LOG(ERROR, "can not create a netlink socket: %s", strerror(errno));
+ goto out;
+ }
+ cps->sock_fd = skfd;
+ memset(&sa, 0, sizeof(sa));
+ sa.nl_family = AF_NETLINK;
+ sa.nl_groups = 0;
+retry:
+ sa.nl_pid = i++;
+
+ if (i > 10) {
+ LOG(ERROR, "netlink bind error");
+ goto out;
+ }
+
+ ret = bind(skfd, (struct sockaddr *)&sa, sizeof(sa));
+ if (ret < 0 && errno == EADDRINUSE) {
+ LOG(ERROR, "colo index %d is already in use", sa.nl_pid);
+ goto retry;
+ }
+
+ cps->index = sa.nl_pid;
+ ret = colo_proxy_send(cps, NULL, 0, COLO_PROXY_INIT);
+ if (ret < 0) {
+ goto out;
+ }
+ setsockopt(cps->sock_fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
+ ret = -1;
+ size = colo_proxy_recv(cps, &buff, 0);
+ /* disable SO_RCVTIMEO */
+ tv.tv_usec = 0;
+ setsockopt(cps->sock_fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
+ if (size < 0) {
+ LOG(ERROR, "Can't recv msg from kernel by netlink: %s",
+ strerror(errno));
+ goto out;
+ }
+
+ if (size) {
+ h = (struct nlmsghdr *)buff;
+
+ if (h->nlmsg_type == NLMSG_ERROR) {
+ struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(h);
+ if (size - sizeof(*h) < sizeof(*err)) {
+ goto out;
+ }
+ ret = -err->error;
+ if (ret) {
+ goto out;
+ }
+ }
+ }
+
+ ret = 0;
+
+out:
+ free(buff);
+ if (ret) {
+ close(cps->sock_fd);
+ cps->sock_fd = -1;
+ }
+ return ret;
+}
+
+void colo_proxy_teardown(libxl__colo_proxy_state *cps)
+{
+ if (cps->sock_fd >= 0) {
+ close(cps->sock_fd);
+ cps->sock_fd = -1;
+ }
+}
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 18adf66..2e8b3d4 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2933,6 +2933,15 @@ static inline bool libxl__stream_read_inuse(
}
/*----- colo related state structure -----*/
+typedef struct libxl__colo_proxy_state libxl__colo_proxy_state;
+struct libxl__colo_proxy_state {
+ /* set by caller of colo_proxy_setup */
+ libxl__ao *ao;
+
+ int sock_fd;
+ int index;
+};
+
typedef struct libxl__colo_save_state libxl__colo_save_state;
struct libxl__colo_save_state {
libxl__checkpoint_devices_state cds;
@@ -2947,6 +2956,9 @@ struct libxl__colo_save_state {
/* private, used by qdisk block replication */
bool qdisk_setuped;
+
+ /* private, used by colo-proxy */
+ libxl__colo_proxy_state cps;
};
/*----- Domain suspend (save) state structure -----*/
--
1.9.1
* [PATCH v7 COLO 14/18] COLO proxy: preresume, postresume and checkpoint
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
` (12 preceding siblings ...)
2015-06-25 6:31 ` [PATCH v7 COLO 13/18] COLO proxy: implement setup/teardown of COLO proxy module Yang Hongyang
@ 2015-06-25 6:31 ` Yang Hongyang
2015-06-25 6:31 ` [PATCH v7 COLO 15/18] COLO nic: implement COLO nic subkind Yang Hongyang
` (4 subsequent siblings)
18 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
Implement preresume, postresume and checkpoint for the COLO proxy.
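The checkpoint query boils down to decoding one netlink reply. The
following is a minimal sketch of that decoding step on an in-memory buffer.
The struct layout follows this patch; the helper name colo_decode_reply()
and the extra check of nlmsg_len against the number of bytes received are
assumptions added for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

struct colo_msg {
    bool is_checkpoint;
};

/*
 * Decode one reply of `size` bytes.
 * Returns 1 for "do checkpoint", 0 for "do not checkpoint", -1 on error.
 */
int colo_decode_reply(const void *buf, size_t size)
{
    struct nlmsghdr h;
    unsigned char flag;

    if (!buf || size < sizeof(h))
        return -1;
    memcpy(&h, buf, sizeof(h));

    if (h.nlmsg_type == NLMSG_ERROR)
        return -1;
    /* The header must announce a full colo_msg and fit in the buffer. */
    if (h.nlmsg_len < NLMSG_LENGTH(sizeof(struct colo_msg)) ||
        h.nlmsg_len > size)
        return -1;

    /* is_checkpoint is the first byte of the payload. */
    memcpy(&flag, (const unsigned char *)buf + NLMSG_HDRLEN, sizeof(flag));
    return flag ? 1 : 0;
}
```

In the patch itself an empty or failed non-blocking receive is reported as
"do not checkpoint" before any decoding takes place.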
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
tools/libxl/libxl_colo.h | 3 +++
tools/libxl/libxl_colo_proxy.c | 57 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 60 insertions(+)
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 46ca4cf..4e5f02a 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -36,4 +36,7 @@ extern void libxl__colo_save_teardown(libxl__egc *egc,
extern int colo_proxy_setup(libxl__colo_proxy_state *cps);
extern void colo_proxy_teardown(libxl__colo_proxy_state *cps);
+extern void colo_proxy_preresume(libxl__colo_proxy_state *cps);
+extern void colo_proxy_postresume(libxl__colo_proxy_state *cps);
+extern int colo_proxy_checkpoint(libxl__colo_proxy_state *cps);
#endif
diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
index 9f1243e..c8ff722 100644
--- a/tools/libxl/libxl_colo_proxy.c
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -208,3 +208,60 @@ void colo_proxy_teardown(libxl__colo_proxy_state *cps)
cps->sock_fd = -1;
}
}
+
+/* ========= colo-proxy: preresume, postresume and checkpoint ========== */
+
+void colo_proxy_preresume(libxl__colo_proxy_state *cps)
+{
+ colo_proxy_send(cps, NULL, 0, COLO_CHECKPOINT);
+ /* TODO: need to handle if the call fails... */
+}
+
+void colo_proxy_postresume(libxl__colo_proxy_state *cps)
+{
+ /* nothing to do... */
+}
+
+
+typedef struct colo_msg {
+ bool is_checkpoint;
+} colo_msg;
+
+/*
+do checkpoint: return 1
+error: return -1
+do not checkpoint: return 0
+*/
+int colo_proxy_checkpoint(libxl__colo_proxy_state *cps)
+{
+ uint8_t *buff;
+ int64_t size;
+ struct nlmsghdr *h;
+ struct colo_msg *m;
+ int ret = -1;
+
+ size = colo_proxy_recv(cps, &buff, MSG_DONTWAIT);
+
+ /* nothing received (timeout or error): report no checkpoint */
+ if (size <= 0) {
+ return 0;
+ }
+
+ h = (struct nlmsghdr *) buff;
+
+ if (h->nlmsg_type == NLMSG_ERROR) {
+ goto out;
+ }
+
+ if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*m))) {
+ goto out;
+ }
+
+ m = NLMSG_DATA(h);
+
+ ret = m->is_checkpoint ? 1 : 0;
+
+out:
+ free(buff);
+ return ret;
+}
--
1.9.1
* [PATCH v7 COLO 15/18] COLO nic: implement COLO nic subkind
2015-06-25 6:30 [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Yang Hongyang
` (13 preceding siblings ...)
2015-06-25 6:31 ` [PATCH v7 COLO 14/18] COLO proxy: preresume, postresume and checkpoint Yang Hongyang
@ 2015-06-25 6:31 ` Yang Hongyang
2015-06-25 6:31 ` [PATCH v7 COLO 16/18] setup and control colo proxy on primary side Yang Hongyang
` (3 subsequent siblings)
18 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
Implement the COLO nic subkind.
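The colo-proxy-setup script receives its parameters (vifname, forwarddev,
mode, index, bridge) as a flat, NULL-terminated array of alternating keys
and values. The following is a small sketch of how such an array could be
assembled outside libxl. The helper name colo_build_env(), its signature
and the use of calloc() instead of the libxl gc are assumptions for
illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define COLO_ENV_SLOTS 11 /* five key/value pairs plus the NULL terminator */

/*
 * Build { key, value, ..., NULL } for the hotplug script.
 * index_buf must outlive the returned array, which the caller free()s.
 */
const char **colo_build_env(const char *vifname, const char *forwarddev,
                            int is_primary, int index, const char *bridge,
                            char *index_buf, size_t index_buf_len)
{
    const char **env = calloc(COLO_ENV_SLOTS, sizeof(*env));
    int nr = 0;

    if (!env)
        return NULL;

    /* The index travels as decimal text, like every other value. */
    snprintf(index_buf, index_buf_len, "%d", index);

    env[nr++] = "vifname";
    env[nr++] = vifname;
    env[nr++] = "forwarddev";
    env[nr++] = forwarddev;
    env[nr++] = "mode";
    env[nr++] = is_primary ? "primary" : "secondary";
    env[nr++] = "index";
    env[nr++] = index_buf;
    env[nr++] = "bridge";
    env[nr++] = bridge;
    env[nr++] = NULL;
    assert(nr == COLO_ENV_SLOTS);

    return env;
}
```

Keeping the slot count in one constant and asserting on it catches a
forgotten pair as soon as the array layout changes.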
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
tools/hotplug/Linux/Makefile | 1 +
tools/hotplug/Linux/colo-proxy-setup | 131 +++++++++++++++
tools/libxl/Makefile | 1 +
tools/libxl/libxl_colo_nic.c | 317 +++++++++++++++++++++++++++++++++++
tools/libxl/libxl_internal.h | 5 +
tools/libxl/libxl_types.idl | 1 +
6 files changed, 456 insertions(+)
create mode 100755 tools/hotplug/Linux/colo-proxy-setup
create mode 100644 tools/libxl/libxl_colo_nic.c
diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
index d94a9cb..1c28bea 100644
--- a/tools/hotplug/Linux/Makefile
+++ b/tools/hotplug/Linux/Makefile
@@ -25,6 +25,7 @@ XEN_SCRIPTS += vscsi
XEN_SCRIPTS += block-iscsi
XEN_SCRIPTS += block-drbd-probe
XEN_SCRIPTS += $(XEN_SCRIPTS-y)
+XEN_SCRIPTS += colo-proxy-setup
SUBDIRS-$(CONFIG_SYSTEMD) += systemd
diff --git a/tools/hotplug/Linux/colo-proxy-setup b/tools/hotplug/Linux/colo-proxy-setup
new file mode 100755
index 0000000..3096a9c
--- /dev/null
+++ b/tools/hotplug/Linux/colo-proxy-setup
@@ -0,0 +1,131 @@
+#! /bin/bash
+
+dir=$(dirname "$0")
+. "$dir/xen-hotplug-common.sh"
+. "$dir/hotplugpath.sh"
+. "$dir/xen-network-ft.sh"
+
+findCommand "$@"
+
+if [ "$command" != "setup" -a "$command" != "teardown" ]
+then
+ echo "Invalid command: $command"
+ log err "Invalid command: $command"
+ exit 1
+fi
+
+evalVariables "$@"
+
+: ${vifname:?}
+: ${forwarddev:?}
+: ${mode:?}
+: ${index:?}
+: ${bridge:?}
+
+forwardbr="colobr0"
+
+if [ "$mode" != "primary" -a "$mode" != "secondary" ]
+then
+ echo "Invalid mode: $mode"
+ log err "Invalid mode: $mode"
+ exit 1
+fi
+
+if [ $index -lt 0 ] || [ $index -gt 100 ]; then
+ echo "index overflow"
+ exit 1
+fi
+
+function setup_primary()
+{
+ do_without_error tc qdisc add dev $vifname root handle 1: prio
+ do_without_error tc filter add dev $vifname parent 1: protocol ip prio 10 \
+ u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+ do_without_error tc filter add dev $vifname parent 1: protocol arp prio 11 \
+ u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+ do_without_error tc filter add dev $vifname parent 1: protocol ipv6 prio \
+ 12 u32 match u32 0 0 flowid 1:2 action mirred egress mirror \
+ dev $forwarddev
+
+ do_without_error modprobe nf_conntrack_ipv4
+ do_without_error modprobe xt_PMYCOLO sec_dev=$forwarddev
+
+ do_without_error iptables -t mangle -I PREROUTING -m physdev --physdev-in \
+ $vifname -j PMYCOLO --index $index
+ do_without_error ip6tables -t mangle -I PREROUTING -m physdev --physdev-in \
+ $vifname -j PMYCOLO --index $index
+ do_without_error arptables -I INPUT -i $forwarddev -j MARK --set-mark $index
+}
+
+function teardown_primary()
+{
+ do_without_error tc filter del dev $vifname parent 1: protocol ip prio 10 u32 match u32 \
+ 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+ do_without_error tc filter del dev $vifname parent 1: protocol arp prio 11 u32 match u32 \
+ 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+ do_without_error tc filter del dev $vifname parent 1: protocol ipv6 prio 12 u32 match u32 \
+ 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+ do_without_error tc qdisc del dev $vifname root handle 1: prio
+
+ do_without_error iptables -t mangle -F
+ do_without_error ip6tables -t mangle -F
+ do_without_error arptables -F
+ do_without_error rmmod xt_PMYCOLO
+}
+
+function setup_secondary()
+{
+ do_without_error brctl delif $bridge $vifname
+ do_without_error brctl addbr $forwardbr
+ do_without_error brctl addif $forwardbr $vifname
+ do_without_error brctl addif $forwardbr $forwarddev
+ do_without_error modprobe xt_SECCOLO
+
+ do_without_error iptables -t mangle -I PREROUTING -m physdev --physdev-in \
+ $vifname -j SECCOLO --index $index
+ do_without_error ip6tables -t mangle -I PREROUTING -m physdev --physdev-in \
+ $vifname -j SECCOLO --index $index
+}
+
+function teardown_secondary()
+{
+ do_without_error brctl delif $forwardbr $forwarddev
+ do_without_error brctl delif $forwardbr $vifname
+ do_without_error brctl delbr $forwardbr
+ do_without_error brctl addif $bridge $vifname
+
+ do_without_error iptables -t mangle -F
+ do_without_error ip6tables -t mangle -F
+ do_without_error rmmod xt_SECCOLO
+}
+
+case "$command" in
+ setup)
+ if [ "$mode" = "primary" ]
+ then
+ setup_primary
+ else
+ setup_secondary
+ fi
+
+ success
+ ;;
+ teardown)
+ if [ "$mode" = "primary" ]
+ then
+ teardown_primary
+ else
+ teardown_secondary
+ fi
+ ;;
+esac
+
+if [ "$mode" = "primary" ]
+then
+ log debug "Successful colo-proxy-setup $command for $vifname." \
+ " vifname: $vifname, index: $index, forwarddev: $forwarddev."
+else
+ log debug "Successful colo-proxy-setup $command for $vifname." \
+ " vifname: $vifname, index: $index, forwarddev: $forwarddev,"\
+ " forwardbr: $forwardbr."
+fi
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 1beef6c..907b195 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -60,6 +60,7 @@ LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
LIBXL_OBJS-y += libxl_colo_qdisk.o
LIBXL_OBJS-y += libxl_colo_proxy.o
+LIBXL_OBJS-y += libxl_colo_nic.o
LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo_nic.c b/tools/libxl/libxl_colo_nic.c
new file mode 100644
index 0000000..6bbbded
--- /dev/null
+++ b/tools/libxl/libxl_colo_nic.c
@@ -0,0 +1,317 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+typedef struct libxl__colo_device_nic {
+ int devid;
+ const char *vif;
+} libxl__colo_device_nic;
+
+enum {
+ primary,
+ secondary,
+};
+
+
+/* ========== init() and cleanup() ========== */
+int init_subkind_colo_nic(libxl__checkpoint_devices_state *cds)
+{
+ return 0;
+}
+
+void cleanup_subkind_colo_nic(libxl__checkpoint_devices_state *cds)
+{
+}
+
+/* ========== helper functions ========== */
+static void colo_save_setup_script_cb(libxl__egc *egc,
+ libxl__async_exec_state *aes,
+ int status);
+static void colo_save_teardown_script_cb(libxl__egc *egc,
+ libxl__async_exec_state *aes,
+ int status);
+
+/*
+ * If the device has a vifname, then use that instead of
+ * the vifX.Y format.
+ * it must ONLY be used for remus because if driver domains
+ * were in use it would constitute a security vulnerability.
+ */
+static const char *get_vifname(libxl__checkpoint_device *dev,
+ const libxl_device_nic *nic)
+{
+ const char *vifname = NULL;
+ const char *path;
+ int rc;
+
+ STATE_AO_GC(dev->cds->ao);
+
+ /* Convenience aliases */
+ const uint32_t domid = dev->cds->domid;
+
+ path = GCSPRINTF("%s/backend/vif/%d/%d/vifname",
+ libxl__xs_get_dompath(gc, 0), domid, nic->devid);
+ rc = libxl__xs_read_checked(gc, XBT_NULL, path, &vifname);
+ if (!rc && !vifname) {
+ vifname = libxl__device_nic_devname(gc, domid,
+ nic->devid,
+ nic->nictype);
+ }
+
+ return vifname;
+}
+
+/*
+ * the script needs the following env & args
+ * $vifname
+ * $forwarddev
+ * $mode(primary/secondary)
+ * $index
+ * $bridge
+ * setup/teardown as command line arg.
+ */
+static void setup_async_exec(libxl__checkpoint_device *dev, char *op, int side,
+ char *colo_proxy_script)
+{
+ int arraysize, nr = 0;
+ char **env = NULL, **args = NULL;
+ libxl__colo_device_nic *colo_nic = dev->concrete_data;
+ libxl__checkpoint_devices_state *cds = dev->cds;
+ libxl__async_exec_state *aes = &dev->aodev.aes;
+ const libxl_device_nic *nic = dev->backend_dev;
+ libxl__colo_save_state *css = CONTAINER_OF(dev->cds, *css, cds);
+
+ STATE_AO_GC(cds->ao);
+
+ /* Convenience aliases */
+ const char *const vif = colo_nic->vif;
+
+ arraysize = 11;
+ GCNEW_ARRAY(env, arraysize);
+ env[nr++] = "vifname";
+ env[nr++] = libxl__strdup(gc, vif);
+ env[nr++] = "forwarddev";
+ env[nr++] = libxl__strdup(gc, nic->forwarddev);
+ env[nr++] = "mode";
+ if (side == primary)
+ env[nr++] = "primary";
+ else
+ env[nr++] = "secondary";
+ env[nr++] = "index";
+ env[nr++] = GCSPRINTF("%d", css->cps.index);
+ env[nr++] = "bridge";
+ env[nr++] = libxl__strdup(gc, nic->bridge);
+ env[nr++] = NULL;
+ assert(nr == arraysize);
+
+ arraysize = 3; nr = 0;
+ GCNEW_ARRAY(args, arraysize);
+ args[nr++] = colo_proxy_script;
+ args[nr++] = op;
+ args[nr++] = NULL;
+ assert(nr == arraysize);
+
+ aes->ao = dev->cds->ao;
+ aes->what = GCSPRINTF("%s %s", args[0], args[1]);
+ aes->env = env;
+ aes->args = args;
+ aes->timeout_ms = LIBXL_HOTPLUG_TIMEOUT * 1000;
+ aes->stdfds[0] = -1;
+ aes->stdfds[1] = -1;
+ aes->stdfds[2] = -1;
+
+ if (!strcmp(op, "teardown"))
+ aes->callback = colo_save_teardown_script_cb;
+ else
+ aes->callback = colo_save_setup_script_cb;
+}
+
+/* ========== setup() and teardown() ========== */
+static void colo_nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev,
+ int side, char *colo_proxy_script)
+{
+ int rc;
+ libxl__colo_device_nic *colo_nic;
+ const libxl_device_nic *nic = dev->backend_dev;
+
+ STATE_AO_GC(dev->cds->ao);
+
+ /*
+ * There's no subkind of nic devices, so nic ops are always matched
+ * with nic devices; we begin setting up the nic device.
+ */
+ dev->matched = 1;
+
+ if (!nic->forwarddev) {
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ GCNEW(colo_nic);
+ dev->concrete_data = colo_nic;
+ colo_nic->devid = nic->devid;
+ colo_nic->vif = get_vifname(dev, nic);
+ if (!colo_nic->vif) {
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ setup_async_exec(dev, "setup", side, colo_proxy_script);
+ rc = libxl__async_exec_start(gc, &dev->aodev.aes);
+ if (rc)
+ goto out;
+
+ return;
+
+out:
+ dev->aodev.rc = rc;
+ dev->aodev.callback(egc, &dev->aodev);
+}
+
+static void colo_save_setup_script_cb(libxl__egc *egc,
+ libxl__async_exec_state *aes,
+ int status)
+{
+ libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
+ libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+ libxl__colo_device_nic *colo_nic = dev->concrete_data;
+ libxl__checkpoint_devices_state *cds = dev->cds;
+ const char *out_path_base, *hotplug_error = NULL;
+ int rc;
+
+ STATE_AO_GC(cds->ao);
+
+ /* Convenience aliases */
+ const uint32_t domid = cds->domid;
+ const int devid = colo_nic->devid;
+ const char *const vif = colo_nic->vif;
+
+ out_path_base = GCSPRINTF("%s/colo_proxy/%d",
+ libxl__xs_libxl_path(gc, domid), devid);
+
+ rc = libxl__xs_read_checked(gc, XBT_NULL,
+ GCSPRINTF("%s/hotplug-error", out_path_base),
+ &hotplug_error);
+ if (rc)
+ goto out;
+
+ if (hotplug_error) {
+ LOG(ERROR, "colo_proxy script %s setup failed for vif %s: %s",
+ aes->args[0], vif, hotplug_error);
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ if (status) {
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ rc = 0;
+
+out:
+ aodev->rc = rc;
+ aodev->callback(egc, aodev);
+}
+
+static void colo_nic_teardown(libxl__egc *egc, libxl__checkpoint_device *dev,
+ int side, char *colo_proxy_script)
+{
+ int rc;
+ libxl__colo_device_nic *colo_nic = dev->concrete_data;
+ STATE_AO_GC(dev->cds->ao);
+
+ if (!colo_nic || !colo_nic->vif) {
+ /* colo nic has not yet been set up, just return */
+ rc = 0;
+ goto out;
+ }
+
+ setup_async_exec(dev, "teardown", side, colo_proxy_script);
+
+ rc = libxl__async_exec_start(gc, &dev->aodev.aes);
+ if (rc)
+ goto out;
+
+ return;
+
+out:
+ dev->aodev.rc = rc;
+ dev->aodev.callback(egc, &dev->aodev);
+}
+
+static void colo_save_teardown_script_cb(libxl__egc *egc,
+ libxl__async_exec_state *aes,
+ int status)
+{
+ int rc;
+ libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
+
+ if (status)
+ rc = ERROR_FAIL;
+ else
+ rc = 0;
+
+ aodev->rc = rc;
+ aodev->callback(egc, aodev);
+}
+
+/* ======== primary ======== */
+static void colo_nic_save_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(dev->cds, *css, cds);
+
+ colo_nic_setup(egc, dev, primary, css->colo_proxy_script);
+}
+
+static void colo_nic_save_teardown(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(dev->cds, *css, cds);
+
+ colo_nic_teardown(egc, dev, primary, css->colo_proxy_script);
+}
+
+const libxl__checkpoint_device_instance_ops colo_save_device_nic = {
+ .kind = LIBXL__DEVICE_KIND_VIF,
+ .setup = colo_nic_save_setup,
+ .teardown = colo_nic_save_teardown,
+};
+
+/* ======== secondary ======== */
+static void colo_nic_restore_setup(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(dev->cds, *crs, cds);
+
+ colo_nic_setup(egc, dev, secondary, crs->colo_proxy_script);
+}
+
+static void colo_nic_restore_teardown(libxl__egc *egc,
+ libxl__checkpoint_device *dev)
+{
+ libxl__colo_restore_state *crs = CONTAINER_OF(dev->cds, *crs, cds);
+
+ colo_nic_teardown(egc, dev, secondary, crs->colo_proxy_script);
+}
+
+const libxl__checkpoint_device_instance_ops colo_restore_device_nic = {
+ .kind = LIBXL__DEVICE_KIND_VIF,
+ .setup = colo_nic_restore_setup,
+ .teardown = colo_nic_restore_teardown,
+};
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2e8b3d4..69306c0 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2746,6 +2746,8 @@ void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
int init_subkind_qdisk(libxl__checkpoint_devices_state *cds);
void cleanup_subkind_qdisk(libxl__checkpoint_devices_state *cds);
int colo_qdisk_preresume(libxl_ctx *ctx, domid_t domid);
+int init_subkind_colo_nic(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_colo_nic(libxl__checkpoint_devices_state *cds);
typedef void libxl__checkpoint_callback(libxl__egc *,
libxl__checkpoint_devices_state *,
@@ -2947,6 +2949,7 @@ struct libxl__colo_save_state {
libxl__checkpoint_devices_state cds;
int send_fd;
int recv_fd;
+ char *colo_proxy_script;
/* private */
libxl__stream_read_state srs;
@@ -3336,6 +3339,7 @@ struct libxl__colo_restore_state {
int recv_fd;
int hvm;
libxl__colo_callback *callback;
+ char *colo_proxy_script;
/* private, colo restore checkpoint state */
libxl__domain_create_cb *saved_cb;
@@ -3358,6 +3362,7 @@ struct libxl__domain_create_state {
libxl_asyncprogress_how aop_console_how;
/* private to domain_create */
int guest_domid;
+ const char *colo_proxy_script;
libxl__domain_build_state build_state;
libxl__colo_restore_state crs;
libxl__bootloader_state bl;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9adc3ce..399c75b 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -539,6 +539,7 @@ libxl_device_nic = Struct("device_nic", [
("rate_bytes_per_interval", uint64),
("rate_interval_usecs", uint32),
("gatewaydev", string),
+ ("forwarddev", string)
])
libxl_device_pci = Struct("device_pci", [
--
1.9.1
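
Note on the environment passed to the proxy script by setup_async_exec()
in the patch above: it is a flat, NULL-terminated array of alternating
names and values. A minimal standalone sketch of that layout follows.
colo_build_env, COLO_ENV_SLOTS and enum colo_side are illustrative names
and not libxl identifiers, and a caller-supplied buffer stands in for
libxl's gc allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative stand-in for the anonymous primary/secondary enum. */
enum colo_side { COLO_PRIMARY, COLO_SECONDARY };

/* Five name/value pairs plus the terminating NULL. */
#define COLO_ENV_SLOTS 11

/*
 * Fill env[] with alternating names and values, ending with NULL.
 * index_buf must outlive env[]; it holds the formatted proxy index.
 * Returns the number of slots written (always COLO_ENV_SLOTS).
 */
static int colo_build_env(const char *env[COLO_ENV_SLOTS],
                          char *index_buf, size_t index_buf_len,
                          const char *vif, const char *forwarddev,
                          enum colo_side side, int index,
                          const char *bridge)
{
    int nr = 0;

    snprintf(index_buf, index_buf_len, "%d", index);

    env[nr++] = "vifname";
    env[nr++] = vif;
    env[nr++] = "forwarddev";
    env[nr++] = forwarddev;
    env[nr++] = "mode";
    env[nr++] = (side == COLO_PRIMARY) ? "primary" : "secondary";
    env[nr++] = "index";
    env[nr++] = index_buf;
    env[nr++] = "bridge";
    env[nr++] = bridge;
    env[nr++] = NULL;

    /* Same sanity check as the patch: the count must match the array. */
    assert(nr == COLO_ENV_SLOTS);
    return nr;
}
```

Keeping the slot count in one constant and asserting on it catches the
easy mistake of adding a variable without growing the array.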
* [PATCH v7 COLO 16/18] setup and control colo proxy on primary side
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
setup and control colo proxy on primary side
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
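Note for reviewers: the "wait for a new checkpoint" step added below is a
polling loop run in a forked child. A minimal standalone sketch of that
loop, under the assumption that the probe returns <0 on error, 0 when no
checkpoint is needed and >0 when one is needed (as colo_proxy_checkpoint()
is used in this patch). colo_probe_fn, colo_wait_for_checkpoint,
countdown_state and countdown_probe are hypothetical names used only for
illustration.

```c
#include <unistd.h>

/* Hypothetical probe: <0 error, 0 no checkpoint needed, >0 needed. */
typedef int (*colo_probe_fn)(void *ctx);

/*
 * Poll probe() until it reports an error or a required checkpoint.
 * Returns 0 when a checkpoint is needed and 1 on error, mirroring the
 * exit status the forked child in the patch passes to _exit().
 */
static int colo_wait_for_checkpoint(colo_probe_fn probe, void *ctx,
                                    unsigned int poll_interval_us)
{
    for (;;) {
        int req = probe(ctx);

        if (req < 0)
            return 1;             /* the probe failed */
        if (req > 0)
            return 0;             /* outputs diverged: checkpoint now */

        usleep(poll_interval_us); /* nothing to do yet, retry shortly */
    }
}

/* Demo probe: idle for a fixed number of polls, then ask for a checkpoint. */
struct countdown_state {
    int idle_polls_left;          /* negative means "report an error" */
    int polls_seen;
};

static int countdown_probe(void *ctx)
{
    struct countdown_state *s = ctx;

    s->polls_seen++;
    if (s->idle_polls_left < 0)
        return -1;
    if (s->idle_polls_left == 0)
        return 1;
    s->idle_polls_left--;
    return 0;
}
```

Calling colo_wait_for_checkpoint(countdown_probe, &state, 1000) with
state.idle_polls_left = 3 polls four times before returning 0. The patch
hard-codes the 1ms interval; it is a parameter here only to keep the
sketch easy to exercise.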
tools/libxl/libxl_colo_save.c | 124 +++++++++++++++++++++++++++++++++++++++---
tools/libxl/libxl_internal.h | 1 +
2 files changed, 117 insertions(+), 8 deletions(-)
diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
index 633887b..1b9c1a8 100644
--- a/tools/libxl/libxl_colo_save.c
+++ b/tools/libxl/libxl_colo_save.c
@@ -19,9 +19,11 @@
#include "libxl_internal.h"
#include "libxl_colo.h"
+extern const libxl__checkpoint_device_instance_ops colo_save_device_nic;
extern const libxl__checkpoint_device_instance_ops colo_save_device_qdisk;
static const libxl__checkpoint_device_instance_ops *colo_ops[] = {
+ &colo_save_device_nic,
&colo_save_device_qdisk,
NULL,
};
@@ -33,9 +35,15 @@ static int init_device_subkind(libxl__checkpoint_devices_state *cds)
int rc;
STATE_AO_GC(cds->ao);
- rc = init_subkind_qdisk(cds);
+ rc = init_subkind_colo_nic(cds);
if (rc) goto out;
+ rc = init_subkind_qdisk(cds);
+ if (rc) {
+ cleanup_subkind_colo_nic(cds);
+ goto out;
+ }
+
rc = 0;
out:
return rc;
@@ -46,6 +54,7 @@ static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
/* cleanup device subkind-specific state in the libxl ctx */
STATE_AO_GC(cds->ao);
+ cleanup_subkind_colo_nic(cds);
cleanup_subkind_qdisk(cds);
}
@@ -76,9 +85,16 @@ void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
css->svm_running = false;
css->paused = true;
css->qdisk_setuped = false;
+ libxl__ev_child_init(&css->child);
- /* TODO: nic support */
- cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
+ if (dss->remus->netbufscript)
+ css->colo_proxy_script = libxl__strdup(gc, dss->remus->netbufscript);
+ else
+ css->colo_proxy_script = GCSPRINTF("%s/colo-proxy-setup",
+ libxl__xen_script_dir_path());
+
+ cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
+ (1 << LIBXL__DEVICE_KIND_VBD);
cds->ops = colo_ops;
cds->callback = colo_save_setup_done;
cds->ao = ao;
@@ -88,6 +104,12 @@ void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
css->srs.fd = css->recv_fd;
css->srs.back_channel = true;
libxl__stream_read_start(egc, &css->srs);
+ css->cps.ao = ao;
+ if (colo_proxy_setup(&css->cps)) {
+ LOG(ERROR, "COLO: failed to setup colo proxy for guest with domid %u",
+ cds->domid);
+ goto out;
+ }
if (init_device_subkind(cds))
goto out;
@@ -162,6 +184,7 @@ static void colo_teardown_done(libxl__egc *egc,
libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
cleanup_device_subkind(cds);
+ colo_proxy_teardown(&css->cps);
dss->callback(egc, dss, rc);
}
@@ -375,6 +398,8 @@ static void colo_read_svm_ready_done(libxl__egc *egc,
goto out;
}
+ colo_proxy_preresume(&css->cps);
+
css->svm_running = true;
css->cds.callback = colo_preresume_cb;
libxl__checkpoint_devices_preresume(egc, &css->cds);
@@ -451,6 +476,8 @@ static void colo_read_svm_resumed_done(libxl__egc *egc,
goto out;
}
+ colo_proxy_postresume(&css->cps);
+
ok = 1;
out:
@@ -459,6 +486,91 @@ out:
/* ===================== colo: wait new checkpoint ===================== */
+
+static void colo_start_new_checkpoint(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds,
+ int rc);
+static void colo_proxy_async_wait_for_checkpoint(libxl__colo_save_state *css);
+static void colo_proxy_async_call_done(libxl__egc *egc,
+ libxl__ev_child *child,
+ int pid,
+ int status);
+
+static void colo_proxy_async_call(libxl__egc *egc,
+ libxl__colo_save_state *css,
+ void func(libxl__colo_save_state *),
+ libxl__ev_child_callback callback)
+{
+ int pid = -1, rc;
+
+ STATE_AO_GC(css->cds.ao);
+
+ /* Fork and call */
+ pid = libxl__ev_child_fork(gc, &css->child, callback);
+ if (pid == -1) {
+ LOG(ERROR, "unable to fork");
+ rc = ERROR_FAIL;
+ goto out;
+ }
+
+ if (!pid) {
+ /* child */
+ func(css);
+ /* notreached */
+ abort();
+ }
+
+ return;
+
+out:
+ callback(egc, &css->child, -1, 1);
+}
+
+static void colo_proxy_wait_for_checkpoint(libxl__egc *egc,
+ libxl__colo_save_state *css)
+{
+ colo_proxy_async_call(egc, css,
+ colo_proxy_async_wait_for_checkpoint,
+ colo_proxy_async_call_done);
+}
+
+static void colo_proxy_async_wait_for_checkpoint(libxl__colo_save_state *css)
+{
+ int req;
+
+again:
+ req = colo_proxy_checkpoint(&css->cps);
+ if (req < 0) {
+ /* an error occurred */
+ _exit(1);
+ } else if (!req) {
+ /* no checkpoint is needed; wait for 1ms and then check again */
+ usleep(1000);
+ goto again;
+ } else {
+ /* net packets are not consistent; we need to start a checkpoint */
+ _exit(0);
+ }
+}
+
+static void colo_proxy_async_call_done(libxl__egc *egc,
+ libxl__ev_child *child,
+ int pid,
+ int status)
+{
+ libxl__colo_save_state *css = CONTAINER_OF(child, *css, child);
+
+ EGC_GC;
+
+ if (status) {
+ LOG(ERROR, "failed to wait for new checkpoint");
+ colo_start_new_checkpoint(egc, &css->cds, ERROR_FAIL);
+ return;
+ }
+
+ colo_start_new_checkpoint(egc, &css->cds, 0);
+}
+
/*
* Do the following things:
* 1. do commit
@@ -468,9 +580,6 @@ out:
static void colo_device_commit_cb(libxl__egc *egc,
libxl__checkpoint_devices_state *cds,
int rc);
-static void colo_start_new_checkpoint(libxl__egc *egc,
- libxl__checkpoint_devices_state *cds,
- int rc);
void libxl__colo_save_domain_should_checkpoint_callback(void *data)
{
@@ -499,8 +608,7 @@ static void colo_device_commit_cb(libxl__egc *egc,
goto out;
}
- /* TODO: wait a new checkpoint */
- colo_start_new_checkpoint(egc, cds, 0);
+ colo_proxy_wait_for_checkpoint(egc, css);
return;
out:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 69306c0..368b452 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2962,6 +2962,7 @@ struct libxl__colo_save_state {
/* private, used by colo-proxy */
libxl__colo_proxy_state cps;
+ libxl__ev_child child;
};
/*----- Domain suspend (save) state structure -----*/
--
1.9.1
* [PATCH v7 COLO 17/18] setup and control colo proxy on secondary side
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
setup and control colo proxy on secondary side
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
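Note for reviewers: init_device_subkind() below initialises the nic
subkind first and rolls it back if the qdisk subkind fails. The same
pattern, generalised to a table of subkinds, might look like the sketch
below. struct subkind, init_all_subkinds and the demo_* / trace helpers
are hypothetical names for illustration and are not part of libxl.

```c
#include <stddef.h>

typedef int (*subkind_init_fn)(void *state);
typedef void (*subkind_cleanup_fn)(void *state);

struct subkind {
    subkind_init_fn init;
    subkind_cleanup_fn cleanup;
};

/*
 * Initialise every subkind in order. If one fails, clean up the ones
 * that already succeeded, in reverse order, and return its error code.
 * The failing subkind itself is not cleaned up, as in the patch.
 */
static int init_all_subkinds(const struct subkind *kinds, int count,
                             void *state)
{
    for (int i = 0; i < count; i++) {
        int rc = kinds[i].init(state);

        if (rc) {
            while (--i >= 0)
                kinds[i].cleanup(state);
            return rc;
        }
    }
    return 0;
}

/* Demo subkinds that record each call as one character in a log. */
struct trace {
    char log[32];
    int len;
    int fail_qdisk;               /* non-zero makes demo_qdisk_init fail */
};

static void trace_add(struct trace *t, char c)
{
    if (t->len < (int)sizeof(t->log) - 1) {
        t->log[t->len++] = c;
        t->log[t->len] = '\0';
    }
}

static int demo_nic_init(void *s) { trace_add(s, 'N'); return 0; }
static void demo_nic_cleanup(void *s) { trace_add(s, 'n'); }

static int demo_qdisk_init(void *s)
{
    struct trace *t = s;

    trace_add(t, 'Q');
    return t->fail_qdisk ? -1 : 0;
}

static void demo_qdisk_cleanup(void *s) { trace_add(s, 'q'); }

static const struct subkind demo_kinds[] = {
    { demo_nic_init, demo_nic_cleanup },
    { demo_qdisk_init, demo_qdisk_cleanup },
};
```

With fail_qdisk set, the log reads "NQn": nic init, qdisk init (fails),
nic cleanup. With only two subkinds the explicit if/cleanup in the patch
is just as clear; a table only pays off if more subkinds are added.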
tools/libxl/libxl_colo_restore.c | 28 +++++++++++++++++++++++++---
tools/libxl/libxl_internal.h | 3 +++
2 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index 0a58b86..f8e5167 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -49,9 +49,11 @@ static void libxl__colo_restore_domain_checkpoint_callback(void *data);
static void libxl__colo_restore_domain_should_checkpoint_callback(void *data);
static void libxl__colo_restore_domain_suspend_callback(void *data);
+extern const libxl__checkpoint_device_instance_ops colo_restore_device_nic;
extern const libxl__checkpoint_device_instance_ops colo_restore_device_qdisk;
static const libxl__checkpoint_device_instance_ops *colo_restore_ops[] = {
+ &colo_restore_device_nic,
&colo_restore_device_qdisk,
NULL,
};
@@ -151,8 +153,14 @@ static int init_device_subkind(libxl__checkpoint_devices_state *cds)
int rc;
STATE_AO_GC(cds->ao);
+ rc = init_subkind_colo_nic(cds);
+ if (rc) goto out;
+
rc = init_subkind_qdisk(cds);
- if (rc) goto out;
+ if (rc) {
+ cleanup_subkind_colo_nic(cds);
+ goto out;
+ }
rc = 0;
out:
@@ -164,6 +172,7 @@ static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
/* cleanup device subkind-specific state in the libxl ctx */
STATE_AO_GC(cds->ao);
+ cleanup_subkind_colo_nic(cds);
cleanup_subkind_qdisk(cds);
}
@@ -351,6 +360,8 @@ static void colo_restore_teardown_done(libxl__egc *egc,
if (crcs->teardown_devices)
cleanup_device_subkind(cds);
+ colo_proxy_teardown(&crs->cps);
+
rc = crcs->saved_rc;
if (!rc) {
crcs->callback = do_failover_done;
@@ -534,6 +545,8 @@ static void colo_restore_preresume_cb(libxl__egc *egc,
goto out;
}
+ colo_proxy_preresume(&crs->cps);
+
colo_restore_resume_vm(egc, crcs);
return;
@@ -570,6 +583,8 @@ static void colo_resume_vm_done(libxl__egc *egc,
crcs->status = LIBXL_COLO_RESUMED;
+ colo_proxy_postresume(&crs->cps);
+
/* avoid calling libxl__xc_domain_restore_done() more than once */
if (crs->saved_cb) {
dcs->callback = crs->saved_cb;
@@ -689,13 +704,20 @@ static void colo_setup_checkpoint_devices(libxl__egc *egc,
STATE_AO_GC(crs->ao);
- /* TODO: nic support */
- cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
+ cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
+ (1 << LIBXL__DEVICE_KIND_VBD);
cds->callback = colo_restore_setup_cds_done;
cds->ao = ao;
cds->domid = crs->domid;
cds->ops = colo_restore_ops;
+ crs->cps.ao = ao;
+ if (colo_proxy_setup(&crs->cps)) {
+ LOG(ERROR, "COLO: failed to setup colo proxy for guest with domid %u",
+ cds->domid);
+ goto out;
+ }
+
if (init_device_subkind(cds))
goto out;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 368b452..29b0f64 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3349,6 +3349,9 @@ struct libxl__colo_restore_state {
/* private, used by qdisk block replication */
bool qdisk_setuped;
+
+ /* private, used by colo proxy */
+ libxl__colo_proxy_state cps;
};
struct libxl__domain_create_state {
--
1.9.1
* [PATCH v7 COLO 18/18] cmdline switches and config vars to control colo-proxy
From: Yang Hongyang @ 2015-06-25 6:31 UTC (permalink / raw)
To: xen-devel
Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
eddie.dong, guijianfeng, rshriram, ian.jackson
Add a command-line switch (-n) to the 'xl migrate-receive' command to
specify a domain-specific hotplug script to set up the COLO proxy.
Add a new config var 'colo.default.proxyscript' to xl.conf, which
allows the user to override the default global script used to
set up the COLO proxy.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
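Note for reviewers: taken together, the hunks below give the proxy script
three possible sources. As I read the patch, an explicitly supplied script
wins, then the xl.conf default, then the built-in
"<script dir>/colo-proxy-setup". A minimal standalone sketch of that
precedence follows; colo_pick_proxy_script and its parameters are
hypothetical, and in the real code the three steps are split between xl
and libxl rather than living in one function.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Pick the COLO proxy script path. Precedence, highest first:
 *   1. explicit per-domain script (e.g. from the command line)
 *   2. global default from xl.conf
 *   3. "<script_dir>/colo-proxy-setup"
 * The result is written to out. Returns 0 on success and -1 if the
 * path did not fit in out_len bytes.
 */
static int colo_pick_proxy_script(char *out, size_t out_len,
                                  const char *explicit_script,
                                  const char *conf_default,
                                  const char *script_dir)
{
    int n;

    if (explicit_script)
        n = snprintf(out, out_len, "%s", explicit_script);
    else if (conf_default)
        n = snprintf(out, out_len, "%s", conf_default);
    else
        n = snprintf(out, out_len, "%s/colo-proxy-setup", script_dir);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

For example, passing NULL for both explicit_script and conf_default with
script_dir "/etc/xen/scripts" yields "/etc/xen/scripts/colo-proxy-setup",
which matches the default documented in xl.conf.pod.5 below.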
docs/man/xl.conf.pod.5 | 6 ++++++
docs/man/xl.pod.1 | 1 -
tools/libxl/libxl.c | 6 ++++++
tools/libxl/libxl_create.c | 14 +++++++++++--
tools/libxl/libxl_types.idl | 1 +
tools/libxl/xl.c | 3 +++
tools/libxl/xl.h | 1 +
tools/libxl/xl_cmdimpl.c | 50 ++++++++++++++++++++++++++++++++++-----------
8 files changed, 67 insertions(+), 15 deletions(-)
diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
index 8ae19bb..8f7fd28 100644
--- a/docs/man/xl.conf.pod.5
+++ b/docs/man/xl.conf.pod.5
@@ -111,6 +111,12 @@ Configures the default script used by Remus to setup network buffering.
Default: C</etc/xen/scripts/remus-netbuf-setup>
+=item B<colo.default.proxyscript="PATH">
+
+Configures the default script used by COLO to set up colo-proxy.
+
+Default: C</etc/xen/scripts/colo-proxy-setup>
+
=item B<output_format="json|sxp">
Configures the default output format used by xl when printing "machine
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 600facb..d9d834d 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -454,7 +454,6 @@ N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
Disk replication support is limited to DRBD disks.
COLO support in xl is still in experimental (proof-of-concept) phase.
- There is no support for network at the moment.
B<OPTIONS>
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index db774e4..4e940dd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -3343,6 +3343,11 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
flexarray_append(back, nic->ifname);
}
+ if (nic->forwarddev) {
+ flexarray_append(back, "forwarddev");
+ flexarray_append(back, nic->forwarddev);
+ }
+
flexarray_append(back, "mac");
flexarray_append(back,libxl__sprintf(gc,
LIBXL_MAC_FMT, LIBXL_MAC_BYTES(nic->mac)));
@@ -3466,6 +3471,7 @@ static int libxl__device_nic_from_xs_be(libxl__gc *gc,
nic->ip = READ_BACKEND(NOGC, "ip");
nic->bridge = READ_BACKEND(NOGC, "bridge");
nic->script = READ_BACKEND(NOGC, "script");
+ nic->forwarddev = READ_BACKEND(NOGC, "forwarddev");
/* vif_ioemu nics use the same xenstore entries as vif interfaces */
tmp = READ_BACKEND(gc, "type");
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f7bf629..2c25d60 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1157,6 +1157,11 @@ static void domcreate_bootloader_done(libxl__egc *egc,
crs->recv_fd = restore_fd;
crs->hvm = (info->type == LIBXL_DOMAIN_TYPE_HVM);
crs->callback = libxl__colo_restore_setup_done;
+ if (dcs->colo_proxy_script)
+ crs->colo_proxy_script = libxl__strdup(gc, dcs->colo_proxy_script);
+ else
+ crs->colo_proxy_script = GCSPRINTF("%s/colo-proxy-setup",
+ libxl__xen_script_dir_path());
libxl__colo_restore_setup(egc, crs);
} else
libxl__stream_read_start(egc, &dcs->srs);
@@ -1676,6 +1681,7 @@ static void domain_create_cb(libxl__egc *egc,
static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
uint32_t *domid, int restore_fd, int send_fd,
const libxl_domain_restore_params *params,
+ const char *colo_proxy_script,
const libxl_asyncop_how *ao_how,
const libxl_asyncprogress_how *aop_console_how)
{
@@ -1691,6 +1697,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
if (params) cdcs->dcs.restore_params = *params;
cdcs->dcs.send_fd = send_fd;
cdcs->dcs.callback = domain_create_cb;
+ cdcs->dcs.colo_proxy_script = colo_proxy_script;
libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
cdcs->domid_out = domid;
@@ -1734,7 +1741,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
const libxl_asyncprogress_how *aop_console_how)
{
unset_disk_colo_restore(d_config);
- return do_domain_create(ctx, d_config, domid, -1, -1, NULL,
+ return do_domain_create(ctx, d_config, domid, -1, -1, NULL, NULL,
ao_how, aop_console_how);
}
@@ -1744,14 +1751,17 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
const libxl_asyncop_how *ao_how,
const libxl_asyncprogress_how *aop_console_how)
{
+ char *colo_proxy_script = NULL;
+
if (params->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
+ colo_proxy_script = params->colo_proxy_script;
set_disk_colo_restore(d_config);
} else {
unset_disk_colo_restore(d_config);
}
return do_domain_create(ctx, d_config, domid, restore_fd, send_fd, params,
- ao_how, aop_console_how);
+ colo_proxy_script, ao_how, aop_console_how);
}
/*
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 399c75b..a13c3f9 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -354,6 +354,7 @@ libxl_domain_restore_params = Struct("domain_restore_params", [
("checkpointed_stream", integer),
("stream_version", uint32, {'init_val': '1'}),
("legacy_width", uint32),
+ ("colo_proxy_script", string),
])
libxl_domain_sched_params = Struct("domain_sched_params",[
diff --git a/tools/libxl/xl.c b/tools/libxl/xl.c
index f014306..f44f04f 100644
--- a/tools/libxl/xl.c
+++ b/tools/libxl/xl.c
@@ -45,6 +45,7 @@ char *default_bridge = NULL;
char *default_gatewaydev = NULL;
char *default_vifbackend = NULL;
char *default_remus_netbufscript = NULL;
+char *default_colo_proxy_script = NULL;
enum output_format default_output_format = OUTPUT_FORMAT_JSON;
int claim_mode = 1;
bool progress_use_cr = 0;
@@ -179,6 +180,8 @@ static void parse_global_config(const char *configfile,
xlu_cfg_replace_string (config, "remus.default.netbufscript",
&default_remus_netbufscript, 0);
+ xlu_cfg_replace_string (config, "colo.default.proxyscript",
+ &default_colo_proxy_script, 0);
xlu_cfg_destroy(config);
}
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 5bc138c..33f25d1 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -178,6 +178,7 @@ extern char *default_bridge;
extern char *default_gatewaydev;
extern char *default_vifbackend;
extern char *default_remus_netbufscript;
+extern char *default_colo_proxy_script;
extern char *blkdev_start;
enum output_format {
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index eb1b45f..7515ff8 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -156,6 +156,7 @@ struct domain_create {
const char *config_file;
const char *extra_config; /* extra config string */
const char *restore_file;
+ char *colo_proxy_script;
int migrate_fd; /* -1 means none */
int send_fd; /* -1 means none */
char **migration_domname_r; /* from malloc */
@@ -985,6 +986,8 @@ static int parse_nic_config(libxl_device_nic *nic, XLU_Config **config, char *to
replace_string(&nic->model, oparg);
} else if (MATCH_OPTION("rate", token, oparg)) {
parse_vif_rate(config, oparg, nic);
+ } else if (MATCH_OPTION("forwarddev", token, oparg)) {
+ replace_string(&nic->forwarddev, oparg);
} else if (MATCH_OPTION("accel", token, oparg)) {
fprintf(stderr, "the accel parameter for vifs is currently not supported\n");
} else {
@@ -2731,6 +2734,7 @@ start:
params.checkpointed_stream = dom_info->checkpointed_stream;
params.stream_version =
(hdr.mandatory_flags & XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1;
+ params.colo_proxy_script = dom_info->colo_proxy_script;
ret = libxl_domain_create_restore(ctx, &d_config,
&domid, restore_fd,
@@ -4252,7 +4256,8 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
}
static void migrate_receive(int debug, int daemonize, int monitor,
- int send_fd, int recv_fd, int checkpointed)
+ int send_fd, int recv_fd, int checkpointed,
+ char *colo_proxy_script)
{
uint32_t domid;
int rc, rc2;
@@ -4281,6 +4286,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
dom_info.send_fd = send_fd;
dom_info.migration_domname_r = &migration_domname;
dom_info.checkpointed_stream = checkpointed;
+ dom_info.colo_proxy_script = colo_proxy_script;
if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
/* COLO uses stdout to send control message to master */
dom_info.quiet = 1;
@@ -4473,8 +4479,9 @@ int main_migrate_receive(int argc, char **argv)
int debug = 0, daemonize = 1, monitor = 1;
int checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
int opt;
+ char *script = NULL;
- SWITCH_FOREACH_OPT(opt, "Fedrc", NULL, "migrate-receive", 0) {
+ SWITCH_FOREACH_OPT(opt, "Fedrcn:", NULL, "migrate-receive", 0) {
case 'F':
daemonize = 0;
break;
@@ -4491,6 +4498,9 @@ int main_migrate_receive(int argc, char **argv)
case 'c':
checkpointed = LIBXL_CHECKPOINTED_STREAM_COLO;
break;
+ case 'n':
+ script = optarg;
+ break;
}
if (argc-optind != 0) {
@@ -4499,7 +4509,7 @@ int main_migrate_receive(int argc, char **argv)
}
migrate_receive(debug, daemonize, monitor,
STDOUT_FILENO, STDIN_FILENO,
- checkpointed);
+ checkpointed, script);
return 0;
}
@@ -8022,8 +8032,10 @@ int main_remus(int argc, char **argv)
r_info.interval = 200;
if (libxl_defbool_val(r_info.colo)) {
- if (r_info.interval || libxl_defbool_val(r_info.blackhole)) {
- perror("Option -c conflicts with -i or -b");
+ if (r_info.interval || libxl_defbool_val(r_info.blackhole) ||
+ !libxl_defbool_is_default(r_info.netbuf) ||
+ !libxl_defbool_is_default(r_info.diskbuf)) {
+ perror("Option -c conflicts with -i, -d, -n or -b");
exit(-1);
}
@@ -8034,8 +8046,12 @@ int main_remus(int argc, char **argv)
}
}
- if (!r_info.netbufscript)
- r_info.netbufscript = default_remus_netbufscript;
+ if (!r_info.netbufscript) {
+ if (libxl_defbool_val(r_info.colo))
+ r_info.netbufscript = default_colo_proxy_script;
+ else
+ r_info.netbufscript = default_remus_netbufscript;
+ }
if (libxl_defbool_val(r_info.blackhole)) {
send_fd = open("/dev/null", O_RDWR, 0644);
@@ -8048,11 +8064,21 @@ int main_remus(int argc, char **argv)
if (!ssh_command[0]) {
rune = host;
} else {
- if (asprintf(&rune, "exec %s %s xl migrate-receive %s %s",
- ssh_command, host,
- libxl_defbool_val(r_info.colo) ? "-c" : "-r",
- daemonize ? "" : " -e") < 0)
- return 1;
+ if (!libxl_defbool_val(r_info.colo)) {
+ if (asprintf(&rune, "exec %s %s xl migrate-receive %s %s",
+ ssh_command, host,
+ "-r",
+ daemonize ? "" : " -e") < 0)
+ return 1;
+ } else {
+ if (asprintf(&rune, "exec %s %s xl migrate-receive %s %s %s %s",
+ ssh_command, host,
+ "-c",
+ r_info.netbufscript ? "-n" : "",
+ r_info.netbufscript ? r_info.netbufscript : "",
+ daemonize ? "" : " -e") < 0)
+ return 1;
+ }
}
save_domain_core_begin(domid, NULL, &config_data, &config_len);
--
1.9.1
* Re: [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service
From: Ian Campbell @ 2015-07-14 15:55 UTC (permalink / raw)
To: Yang Hongyang
Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
xen-devel, guijianfeng, rshriram, ian.jackson
On Thu, 2015-06-25 at 14:30 +0800, Yang Hongyang wrote:
> This patchset implemented the COLO feature for Xen.
> For detail/install/use of COLO feature, refer to:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>
> This patchset is based on:
> [PATCH v3 COLOPre 00/26] Prerequisite patches for COLO
> and on:
> http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/libxl-migv2-v1
I started going over this but once I got to the meat I realised that it
was based on this older version and that quite a bit had changed in
migrv2 since (and I think migrv2 v4.1 includes bits of this series
already from the looks of it).
I tried to do the naive git rebase, but it resulted in more conflicts
than I could deal with.
So my question is, should I continue to look at this version or should I
await a v8 based on the latest migration stuff?
Have you been following the review of migrv2? In particular I think some
of Ian J's comments there on the state machines and the
comments/assumptions about states etc may apply to bits of this series
too. I'm thinking in particular of the comments on "tools/libxl:
Infrastructure for reading a libxl migration v2 stream" on v2 and v3 of
that series.
Ian.
* Re: [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service
2015-07-14 15:55 ` [PATCH v7 COLO 00/18] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Ian Campbell
@ 2015-07-15 0:41 ` Yang Hongyang
0 siblings, 0 replies; 24+ messages in thread
From: Yang Hongyang @ 2015-07-15 0:41 UTC (permalink / raw)
To: Ian Campbell
Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
xen-devel, guijianfeng, rshriram, ian.jackson
On 07/14/2015 11:55 PM, Ian Campbell wrote:
> On Thu, 2015-06-25 at 14:30 +0800, Yang Hongyang wrote:
>> This patchset implemented the COLO feature for Xen.
>> For detail/install/use of COLO feature, refer to:
>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>
>> This patchset is based on:
>> [PATCH v3 COLOPre 00/26] Prerequisite patches for COLO
>> and on:
>> http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/libxl-migv2-v1
>
> I started going over this but once I got to the meat I realised that it
> was based on this older version and that quite a bit had changed in
> migrv2 since (and I think migrv2 v4.1 includes bits of this series
> already from the looks of it).
>
> I tried to do the naive git rebase, but it resulted in more conflicts
> than I could deal with.
>
> So my question is, should I continue to look at this version or should I
> await a v8 based on the latest migration stuff?
Yes, there is quite a lot of rebase work to be done. I have already done the
rebase, but there are some problems with Remus on migration v2 (without COLO).
I have reported them to Andy and am looking into them too; hopefully I will
fix the problems and send v8, based on the latest migration code, today.
>
> Have you been following the review of migrv2? In particular I think some
> of Ian J's comments there on the state machines and the
> comments/assumptions about states etc may apply to bits of this series
> too. I'm thinking in particular of the comments on "tools/libxl:
> Infrastructure for reading a libxl migration v2 stream" on v2 and v3 of
> that series.
I'll read Ian's comments and see whether they apply to the COLO series. Thanks!
>
> Ian.
--
Thanks,
Yang.