[RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service

* [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
@ 2014-07-23 14:25 ` Yang Hongyang
  0 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Virtual machine (VM) replication is a well-known technique for
providing application-agnostic, software-implemented hardware fault
tolerance, also called "non-stop service". COLO is a high-availability
solution in which the primary VM (PVM) and the secondary VM (SVM) run
in parallel. Both receive the same requests from the client and
generate responses in parallel. If the response packets from the PVM
and the SVM are identical, they are released immediately. Otherwise, a
VM checkpoint is performed on demand. The idea was presented at the
Xen Summit in 2012 and 2013, in an academic paper at SOCC 2013, and at
KVM Forum 2013:
http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
Please refer to the document above for detailed information.
Please also refer to the previously posted RFC proposal:
http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
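
The compare-then-release rule described above can be illustrated with
a minimal C sketch. It is not taken from this patchset (which does not
yet include NIC replication); the Packet type and the function names
are hypothetical, and a byte-for-byte comparison is assumed only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet type for this sketch; not a QEMU structure. */
typedef struct {
    const unsigned char *data;
    size_t len;
} Packet;

typedef enum {
    COLO_RELEASE,    /* outputs match: release the response to the client */
    COLO_CHECKPOINT  /* outputs differ: checkpoint the PVM state to the SVM */
} ColoAction;

/* Two packets are identical if they have the same length and bytes. */
static bool packets_identical(const Packet *a, const Packet *b)
{
    return a->len == b->len &&
           (a->len == 0 || memcmp(a->data, b->data, a->len) == 0);
}

/* Decide what to do with one pair of PVM/SVM response packets. */
static ColoAction colo_compare(const Packet *pvm, const Packet *svm)
{
    return packets_identical(pvm, svm) ? COLO_RELEASE : COLO_CHECKPOINT;
}
```

In this sketch a checkpoint is only requested when the two outputs
diverge, which is the "on demand" part of the description.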

The patchset is also hosted on github:
https://github.com/macrosheep/qemu/tree/colo_v0.1

This patchset is an RFC. It implements the COLO framework, without
failover or NIC/disk replication, but it is ready for demonstrating
the COLO idea on top of QEMU-KVM.
Steps to get an overview of COLO using this patchset:
1. Configure the source with the --enable-colo option.
2. Compile.
3. As with QEMU's normal migration, run two QEMU VMs:
   - Primary VM
   - Secondary VM with the -incoming tcp:[IP]:[PORT] option
4. On the Primary VM's QEMU monitor, run the following commands:
   migrate_set_capability colo on
   migrate tcp:[IP]:[PORT]
5. Done.
You will see two running VMs; whenever you make changes to the PVM,
the SVM will be synced to the PVM's state.

TODO list:
1. failover
2. NIC replication
3. disk replication (COLO Disk manager)

Any comments and feedback are warmly welcome.

Thanks,
Yang

Yang Hongyang (17):
  configure: add CONFIG_COLO to switch COLO support
  COLO: introduce an api colo_supported() to indicate COLO support
  COLO migration: add a migration capability 'colo'
  COLO info: use colo info to tell migration target colo is enabled
  COLO save: integrate COLO checkpointed save into qemu migration
  COLO restore: integrate COLO checkpointed restore into qemu restore
  COLO buffer: implement colo buffer as well as QEMUFileOps based on it
  COLO: disable qdev hotplug
  COLO ctl: implement API's that communicate with colo agent
  COLO ctl: introduce is_slave() and is_master()
  COLO ctl: implement colo checkpoint protocol
  COLO ctl: add a RunState RUN_STATE_COLO
  COLO ctl: implement colo save
  COLO ctl: implement colo restore
  COLO save: reuse migration bitmap under colo checkpoint
  COLO ram cache: implement colo ram cache on slaver
  HACK: trigger checkpoint every 500ms

 Makefile.objs                      |   2 +
 arch_init.c                        | 174 +++++++++-
 configure                          |  14 +
 include/exec/cpu-all.h             |   1 +
 include/migration/migration-colo.h |  36 +++
 include/migration/migration.h      |  13 +
 include/qapi/qmp/qerror.h          |   3 +
 migration-colo-comm.c              |  78 +++++
 migration-colo.c                   | 643 +++++++++++++++++++++++++++++++++++++
 migration.c                        |  45 ++-
 qapi-schema.json                   |   9 +-
 stubs/Makefile.objs                |   1 +
 stubs/migration-colo.c             |  34 ++
 vl.c                               |  12 +
 14 files changed, 1044 insertions(+), 21 deletions(-)
 create mode 100644 include/migration/migration-colo.h
 create mode 100644 migration-colo-comm.c
 create mode 100644 migration-colo.c
 create mode 100644 stubs/migration-colo.c

-- 
1.9.1


* [RFC PATCH 01/17] configure: add CONFIG_COLO to switch COLO support
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Add ./configure --enable-colo/--disable-colo options to switch COLO
support on or off.
COLO support is off by default.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 configure | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/configure b/configure
index f7685b5..4071943 100755
--- a/configure
+++ b/configure
@@ -258,6 +258,7 @@ xfs=""
 vhost_net="no"
 vhost_scsi="no"
 kvm="no"
+colo="no"
 rdma=""
 gprof="no"
 debug_tcg="no"
@@ -921,6 +922,10 @@ for opt do
   ;;
   --enable-kvm) kvm="yes"
   ;;
+  --disable-colo) colo="no"
+  ;;
+  --enable-colo) colo="yes"
+  ;;
   --disable-tcg-interpreter) tcg_interpreter="no"
   ;;
   --enable-tcg-interpreter) tcg_interpreter="yes"
@@ -1314,6 +1319,10 @@ Advanced options (experts only):
   --disable-slirp          disable SLIRP userspace network connectivity
   --disable-kvm            disable KVM acceleration support
   --enable-kvm             enable KVM acceleration support
+  --disable-colo           disable COarse-grain LOck-stepping Virtual
+                           Machines for Non-stop Service(default)
+  --enable-colo            enable COarse-grain LOck-stepping Virtual
+                           Machines for Non-stop Service
   --disable-rdma           disable RDMA-based migration support
   --enable-rdma            enable RDMA-based migration support
   --enable-tcg-interpreter enable TCG with bytecode interpreter (TCI)
@@ -4215,6 +4224,7 @@ echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
 echo "Install blobs     $blobs"
 echo "KVM support       $kvm"
+echo "COLO support      $colo"
 echo "RDMA support      $rdma"
 echo "TCG interpreter   $tcg_interpreter"
 echo "fdt support       $fdt"
@@ -4751,6 +4761,10 @@ if have_backend "ftrace"; then
 fi
 echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
 
+if test "$colo" = "yes"; then
+  echo "CONFIG_COLO=y" >> $config_host_mak
+fi
+
 if test "$rdma" = "yes" ; then
   echo "CONFIG_RDMA=y" >> $config_host_mak
 fi
-- 
1.9.1


* [RFC PATCH 02/17] COLO: introduce an api colo_supported() to indicate COLO support
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Introduce an API, colo_supported(), to indicate COLO support. It
returns true if COLO is supported (configured with --enable-colo).

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 Makefile.objs                      |  1 +
 include/migration/migration-colo.h | 18 ++++++++++++++++++
 migration-colo.c                   | 16 ++++++++++++++++
 stubs/Makefile.objs                |  1 +
 stubs/migration-colo.c             | 16 ++++++++++++++++
 5 files changed, 52 insertions(+)
 create mode 100644 include/migration/migration-colo.h
 create mode 100644 migration-colo.c
 create mode 100644 stubs/migration-colo.c

diff --git a/Makefile.objs b/Makefile.objs
index 1f76cea..cab5824 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -50,6 +50,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
 common-obj-$(CONFIG_LINUX) += fsdev/
 
 common-obj-y += migration.o migration-tcp.o
+common-obj-$(CONFIG_COLO) += migration-colo.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o
 common-obj-$(CONFIG_RDMA) += migration-rdma.o
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
new file mode 100644
index 0000000..35b384c
--- /dev/null
+++ b/include/migration/migration-colo.h
@@ -0,0 +1,18 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (C) 2014 FUJITSU LIMITED
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_MIGRATION_COLO_H
+#define QEMU_MIGRATION_COLO_H
+
+#include "qemu-common.h"
+
+bool colo_supported(void);
+
+#endif
diff --git a/migration-colo.c b/migration-colo.c
new file mode 100644
index 0000000..1d3bef8
--- /dev/null
+++ b/migration-colo.c
@@ -0,0 +1,16 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (C) 2014 FUJITSU LIMITED
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "migration/migration-colo.h"
+
+bool colo_supported(void)
+{
+    return true;
+}
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 528e161..6810c89 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -39,3 +39,4 @@ stub-obj-$(CONFIG_WIN32) += fd-register.o
 stub-obj-y += cpus.o
 stub-obj-y += kvm.o
 stub-obj-y += qmp_pc_dimm_device_list.o
+stub-obj-y += migration-colo.o
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
new file mode 100644
index 0000000..b9ee6a0
--- /dev/null
+++ b/stubs/migration-colo.c
@@ -0,0 +1,16 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (C) 2014 FUJITSU LIMITED
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "migration/migration-colo.h"
+
+bool colo_supported(void)
+{
+    return false;
+}
-- 
1.9.1
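
As a rough illustration of the stub arrangement in patch 02 above, the
following single-file C sketch uses a preprocessor switch in place of
the build-time object selection. The switch and the
colo_build_status() helper are assumptions made for the example; the
patch itself selects between migration-colo.o and
stubs/migration-colo.o in the Makefiles.

```c
#include <stdbool.h>

/*
 * In the patch the choice is made by the build: migration-colo.o is
 * compiled in when CONFIG_COLO=y, otherwise stubs/migration-colo.o
 * supplies the symbol. Here a preprocessor switch stands in for that
 * so the example fits in one file (an assumption of this sketch).
 */
#ifdef CONFIG_COLO
bool colo_supported(void) { return true; }
#else
bool colo_supported(void) { return false; }
#endif

/* Hypothetical caller: it needs no #ifdef of its own. */
static const char *colo_build_status(void)
{
    return colo_supported() ? "COLO available" : "COLO not built in";
}
```

The point of the pattern is that callers can test colo_supported() at
run time instead of wrapping every call site in #ifdef CONFIG_COLO.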


* [RFC PATCH 03/17] COLO migration: add a migration capability 'colo'
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines, Yang Hongyang

Add a migration capability 'colo'. If this capability is on,
the migration will never end, and the VM will be continuously
checkpointed.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 include/qapi/qmp/qerror.h | 3 +++
 migration.c               | 6 ++++++
 qapi-schema.json          | 5 ++++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/qapi/qmp/qerror.h b/include/qapi/qmp/qerror.h
index 902d1a7..226b805 100644
--- a/include/qapi/qmp/qerror.h
+++ b/include/qapi/qmp/qerror.h
@@ -166,4 +166,7 @@ void qerror_report_err(Error *err);
 #define QERR_SOCKET_CREATE_FAILED \
     ERROR_CLASS_GENERIC_ERROR, "Failed to create socket"
 
+#define QERR_COLO_UNSUPPORTED \
+    ERROR_CLASS_GENERIC_ERROR, "COLO is not currently supported, please rerun configure with --enable-colo option in order to support COLO feature"
+
 #endif /* QERROR_H */
diff --git a/migration.c b/migration.c
index 8d675b3..ca83310 100644
--- a/migration.c
+++ b/migration.c
@@ -25,6 +25,7 @@
 #include "qemu/thread.h"
 #include "qmp-commands.h"
 #include "trace.h"
+#include "migration/migration-colo.h"
 
 enum {
     MIG_STATE_ERROR = -1,
@@ -277,6 +278,11 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
     }
 
     for (cap = params; cap; cap = cap->next) {
+        if (cap->value->capability == MIGRATION_CAPABILITY_COLO &&
+            cap->value->state && !colo_supported()) {
+            error_set(errp, QERR_COLO_UNSUPPORTED);
+            continue;
+        }
         s->enabled_capabilities[cap->value->capability] = cap->value->state;
     }
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index b11aad2..807f5a2 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -491,10 +491,13 @@
 # @auto-converge: If enabled, QEMU will automatically throttle down the guest
 #          to speed up convergence of RAM migration. (since 1.6)
 #
+# @colo: The migration will never end, and the VM will instead be continuously
+#        checkpointed. The feature is disabled by default. (since 2.1)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
-  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks'] }
+  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks', 'colo'] }
 
 ##
 # @MigrationCapabilityStatus
-- 
1.9.1
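
The capability check added to qmp_migrate_set_capabilities() in patch
03 above can be sketched in isolation as follows. The CapRequest list,
the CAP_* constants and the colo_built_in flag are hypothetical
stand-ins for QEMU's generated QAPI types and for colo_supported();
only the control flow is meant to mirror the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the generated MigrationCapability enum. */
enum {
    CAP_XBZRLE,
    CAP_RDMA_PIN_ALL,
    CAP_AUTO_CONVERGE,
    CAP_ZERO_BLOCKS,
    CAP_COLO,
    CAP_MAX
};

/* Hypothetical linked list of capability requests. */
typedef struct CapRequest {
    int capability;
    bool state;
    struct CapRequest *next;
} CapRequest;

/* Assumption: plays the role of colo_supported() in this sketch. */
static bool colo_built_in = false;

/* Apply each request; return how many requests were rejected. */
static int apply_capabilities(const CapRequest *params, bool enabled[CAP_MAX])
{
    int rejected = 0;

    for (const CapRequest *cap = params; cap; cap = cap->next) {
        if (cap->capability == CAP_COLO && cap->state && !colo_built_in) {
            rejected++;
            continue;   /* skip COLO but keep processing the other entries */
        }
        enabled[cap->capability] = cap->state;
    }
    return rejected;
}
```

As in the patch, a rejected 'colo' request does not stop the loop, so
the other capabilities in the same request are still applied.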

* [RFC PATCH 04/17] COLO info: use colo info to tell migration target colo is enabled
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines, Yang Hongyang

Migrate COLO info to the migration target to tell the target that
COLO is enabled.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 Makefile.objs                      |  1 +
 include/migration/migration-colo.h |  3 ++
 migration-colo-comm.c              | 68 ++++++++++++++++++++++++++++++++++++++
 vl.c                               |  4 +++
 4 files changed, 76 insertions(+)
 create mode 100644 migration-colo-comm.c

diff --git a/Makefile.objs b/Makefile.objs
index cab5824..1836a68 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -50,6 +50,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
 common-obj-$(CONFIG_LINUX) += fsdev/
 
 common-obj-y += migration.o migration-tcp.o
+common-obj-y += migration-colo-comm.o
 common-obj-$(CONFIG_COLO) += migration-colo.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index 35b384c..e3735d8 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -12,6 +12,9 @@
 #define QEMU_MIGRATION_COLO_H
 
 #include "qemu-common.h"
+#include "migration/migration.h"
+
+void colo_info_mig_init(void);
 
 bool colo_supported(void);
 
diff --git a/migration-colo-comm.c b/migration-colo-comm.c
new file mode 100644
index 0000000..ccbc246
--- /dev/null
+++ b/migration-colo-comm.c
@@ -0,0 +1,68 @@
+/*
+ *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ *  (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ *  Copyright (C) 2014 FUJITSU LIMITED
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <migration/migration-colo.h>
+
+#define DEBUG_COLO
+
+#ifdef DEBUG_COLO
+#define DPRINTF(fmt, ...) \
+    do { fprintf(stdout, "COLO: " fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+    do { } while (0)
+#endif
+
+static bool colo_requested;
+
+/* save */
+
+static bool migrate_use_colo(void)
+{
+    MigrationState *s = migrate_get_current();
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
+}
+
+static void colo_info_save(QEMUFile *f, void *opaque)
+{
+    qemu_put_byte(f, migrate_use_colo());
+}
+
+/* restore */
+
+static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
+{
+    int value = qemu_get_byte(f);
+
+    if (value && !colo_supported()) {
+        fprintf(stderr, "COLO is not supported\n");
+        return -EINVAL;
+    }
+
+    if (value && !colo_requested) {
+        DPRINTF("COLO requested!\n");
+    }
+
+    colo_requested = value;
+
+    return 0;
+}
+
+static SaveVMHandlers savevm_colo_info_handlers = {
+    .save_state = colo_info_save,
+    .load_state = colo_info_load,
+};
+
+void colo_info_mig_init(void)
+{
+    register_savevm_live(NULL, "colo info", -1, 1,
+                         &savevm_colo_info_handlers, NULL);
+}
diff --git a/vl.c b/vl.c
index fe451aa..1a282d8 100644
--- a/vl.c
+++ b/vl.c
@@ -89,6 +89,7 @@ int main(int argc, char **argv)
 #include "sysemu/dma.h"
 #include "audio/audio.h"
 #include "migration/migration.h"
+#include "migration/migration-colo.h"
 #include "sysemu/kvm.h"
 #include "qapi/qmp/qjson.h"
 #include "qemu/option.h"
@@ -4339,6 +4340,9 @@ int main(int argc, char **argv, char **envp)
 
     blk_mig_init();
     ram_mig_init();
+    if (colo_supported()) {
+        colo_info_mig_init();
+    }
 
     /* open the virtual block devices */
     if (snapshot)
-- 
1.9.1
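
The one-byte handshake in patch 04 above can be approximated with the
following C sketch. The ByteStream type and its put_byte()/get_byte()
helpers are hypothetical replacements for QEMUFile, and the
target_supports_colo parameter stands in for colo_supported(); the
sketch only shows the flag being written on the source and validated
on the target.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stream standing in for QEMUFile. */
typedef struct {
    uint8_t buf[16];
    size_t wpos;
    size_t rpos;
} ByteStream;

static void put_byte(ByteStream *s, uint8_t v)
{
    if (s->wpos < sizeof(s->buf)) {
        s->buf[s->wpos++] = v;
    }
}

static int get_byte(ByteStream *s)
{
    return s->rpos < s->wpos ? s->buf[s->rpos++] : 0;
}

/* Source side: send one byte saying whether the 'colo' capability is on. */
static void colo_info_save_sketch(ByteStream *s, bool colo_enabled)
{
    put_byte(s, colo_enabled ? 1 : 0);
}

/* Target side: read the byte; fail if COLO is requested but unsupported. */
static int colo_info_load_sketch(ByteStream *s, bool target_supports_colo,
                                 bool *colo_requested)
{
    int value = get_byte(s);

    if (value && !target_supports_colo) {
        return -EINVAL;
    }
    *colo_requested = value != 0;
    return 0;
}
```

On failure the sketch leaves *colo_requested untouched, matching the
early return in colo_info_load().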

* [RFC PATCH 05/17] COLO save: integrate COLO checkpointed save into qemu migration
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

  Integrate the COLO checkpointed save flow into QEMU migration.
  Add a migration state, MIG_STATE_COLO, which is entered after the
first live migration finishes successfully.
  Create a colo thread to do the checkpointed save.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
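A minimal sketch of the state flow described above, for readers who want
to see it outside QEMU. It assumes C11 <stdatomic.h> in place of QEMU's
atomic_cmpxchg(); SketchMigState, sketch_set_state() and
sketch_finish_migration() are hypothetical names that do not exist in
this series. Only the enum values are taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Same values as the enum this patch moves into migration.h. */
enum {
    MIG_STATE_ERROR = -1,
    MIG_STATE_NONE,
    MIG_STATE_SETUP,
    MIG_STATE_CANCELLING,
    MIG_STATE_CANCELLED,
    MIG_STATE_ACTIVE,
    MIG_STATE_COLO,
    MIG_STATE_COMPLETED,
};

/* Hypothetical stand-in for MigrationState: only the state field. */
typedef struct {
    atomic_int state;
} SketchMigState;

/*
 * Switch to new_state only if the current state is still old_state.
 * Returns true when the transition was applied.
 */
static bool sketch_set_state(SketchMigState *s, int old_state, int new_state)
{
    int expected = old_state;

    assert(s != NULL);
    return atomic_compare_exchange_strong(&s->state, &expected, new_state);
}

/*
 * Save-side lifecycle once the first live migration has finished.
 * Returns the final state.
 */
static int sketch_finish_migration(SketchMigState *s, bool use_colo)
{
    if (!use_colo) {
        sketch_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_COMPLETED);
        return atomic_load(&s->state);
    }

    /* In the patch this happens in the bottom half, before the colo
     * thread is created. */
    sketch_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_COLO);

    /* Placeholder for the checkpointed save loop (a TODO in the patch). */

    /* In the patch the colo thread does this when it leaves the loop. */
    sketch_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
    return atomic_load(&s->state);
}
```

Because every transition names the state it expects to leave, a
migration that has already moved to MIG_STATE_ERROR or
MIG_STATE_CANCELLING is not overwritten by a late COLO transition.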
 include/migration/migration-colo.h |  4 ++++
 include/migration/migration.h      | 13 +++++++++++
 migration-colo-comm.c              |  2 +-
 migration-colo.c                   | 48 ++++++++++++++++++++++++++++++++++++++
 migration.c                        | 36 ++++++++++++++++------------
 stubs/migration-colo.c             |  4 ++++
 6 files changed, 91 insertions(+), 16 deletions(-)

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index e3735d8..24589c0 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -18,4 +18,8 @@ void colo_info_mig_init(void);
 
 bool colo_supported(void);
 
+/* save */
+bool migrate_use_colo(void);
+void colo_init_checkpointer(MigrationState *s);
+
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3cb5ba8..3e81a27 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -64,6 +64,19 @@ struct MigrationState
     int64_t dirty_sync_count;
 };
 
+enum {
+    MIG_STATE_ERROR = -1,
+    MIG_STATE_NONE,
+    MIG_STATE_SETUP,
+    MIG_STATE_CANCELLING,
+    MIG_STATE_CANCELLED,
+    MIG_STATE_ACTIVE,
+    MIG_STATE_COLO,
+    MIG_STATE_COMPLETED,
+};
+
+void migrate_set_state(MigrationState *s, int old_state, int new_state);
+
 void process_incoming_migration(QEMUFile *f);
 
 void qemu_start_incoming_migration(const char *uri, Error **errp);
diff --git a/migration-colo-comm.c b/migration-colo-comm.c
index ccbc246..4504ceb 100644
--- a/migration-colo-comm.c
+++ b/migration-colo-comm.c
@@ -25,7 +25,7 @@ static bool colo_requested;
 
 /* save */
 
-static bool migrate_use_colo(void)
+bool migrate_use_colo(void)
 {
     MigrationState *s = migrate_get_current();
     return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
diff --git a/migration-colo.c b/migration-colo.c
index 1d3bef8..0cef8bd 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -8,9 +8,57 @@
  * the COPYING file in the top-level directory.
  */
 
+#include "qemu/main-loop.h"
+#include "qemu/thread.h"
 #include "migration/migration-colo.h"
 
+static QEMUBH *colo_bh;
+
 bool colo_supported(void)
 {
     return true;
 }
+
+/* save */
+
+static void *colo_thread(void *opaque)
+{
+    MigrationState *s = opaque;
+
+    /*TODO: COLO checkpointed save loop*/
+
+    if (s->state != MIG_STATE_ERROR) {
+        migrate_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
+    }
+
+    qemu_mutex_lock_iothread();
+    qemu_bh_schedule(s->cleanup_bh);
+    qemu_mutex_unlock_iothread();
+
+    return NULL;
+}
+
+static void colo_start_checkpointer(void *opaque)
+{
+    MigrationState *s = opaque;
+
+    if (colo_bh) {
+        qemu_bh_delete(colo_bh);
+        colo_bh = NULL;
+    }
+
+    qemu_mutex_unlock_iothread();
+    qemu_thread_join(&s->thread);
+    qemu_mutex_lock_iothread();
+
+    migrate_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_COLO);
+
+    qemu_thread_create(&s->thread, "colo", colo_thread, s,
+                       QEMU_THREAD_JOINABLE);
+}
+
+void colo_init_checkpointer(MigrationState *s)
+{
+    colo_bh = qemu_bh_new(colo_start_checkpointer, s);
+    qemu_bh_schedule(colo_bh);
+}
diff --git a/migration.c b/migration.c
index ca83310..b7f8e7e 100644
--- a/migration.c
+++ b/migration.c
@@ -27,16 +27,6 @@
 #include "trace.h"
 #include "migration/migration-colo.h"
 
-enum {
-    MIG_STATE_ERROR = -1,
-    MIG_STATE_NONE,
-    MIG_STATE_SETUP,
-    MIG_STATE_CANCELLING,
-    MIG_STATE_CANCELLED,
-    MIG_STATE_ACTIVE,
-    MIG_STATE_COMPLETED,
-};
-
 #define MAX_THROTTLE  (32 << 20)      /* Migration speed throttling */
 
 /* Amount of time to allocate to each "chunk" of bandwidth-throttled
@@ -229,6 +219,11 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 
         get_xbzrle_cache_stats(info);
         break;
+    case MIG_STATE_COLO:
+        info->has_status = true;
+        info->status = g_strdup("colo");
+        /* TODO: display COLO specific information (checkpoint info etc.) */
+        break;
     case MIG_STATE_COMPLETED:
         get_xbzrle_cache_stats(info);
 
@@ -272,7 +267,8 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
     MigrationState *s = migrate_get_current();
     MigrationCapabilityStatusList *cap;
 
-    if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP) {
+    if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP ||
+        s->state == MIG_STATE_COLO) {
         error_set(errp, QERR_MIGRATION_ACTIVE);
         return;
     }
@@ -289,7 +285,7 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
 
 /* shared migration helpers */
 
-static void migrate_set_state(MigrationState *s, int old_state, int new_state)
+void migrate_set_state(MigrationState *s, int old_state, int new_state)
 {
     if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
         trace_migrate_set_state(new_state);
@@ -423,7 +419,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
     params.shared = has_inc && inc;
 
     if (s->state == MIG_STATE_ACTIVE || s->state == MIG_STATE_SETUP ||
-        s->state == MIG_STATE_CANCELLING) {
+        s->state == MIG_STATE_CANCELLING || s->state == MIG_STATE_COLO) {
         error_set(errp, QERR_MIGRATION_ACTIVE);
         return;
     }
@@ -591,6 +587,7 @@ static void *migration_thread(void *opaque)
     int64_t max_size = 0;
     int64_t start_time = initial_time;
     bool old_vm_running = false;
+    bool use_colo = migrate_use_colo();
 
     qemu_savevm_state_begin(s->file, &s->params);
 
@@ -627,7 +624,10 @@ static void *migration_thread(void *opaque)
                 }
 
                 if (!qemu_file_get_error(s->file)) {
-                    migrate_set_state(s, MIG_STATE_ACTIVE, MIG_STATE_COMPLETED);
+                    if (!use_colo) {
+                        migrate_set_state(s, MIG_STATE_ACTIVE,
+                                          MIG_STATE_COMPLETED);
+                    }
                     break;
                 }
             }
@@ -677,11 +677,17 @@ static void *migration_thread(void *opaque)
         }
         runstate_set(RUN_STATE_POSTMIGRATE);
     } else {
+        if (s->state == MIG_STATE_ACTIVE && use_colo) {
+            colo_init_checkpointer(s);
+        }
         if (old_vm_running) {
             vm_start();
         }
     }
-    qemu_bh_schedule(s->cleanup_bh);
+
+    if (!use_colo) {
+        qemu_bh_schedule(s->cleanup_bh);
+    }
     qemu_mutex_unlock_iothread();
 
     return NULL;
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index b9ee6a0..9013c40 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -14,3 +14,7 @@ bool colo_supported(void)
 {
     return false;
 }
+
+void colo_init_checkpointer(MigrationState *s)
+{
+}
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH 06/17] COLO restore: integrate COLO checkpointed restore into qemu restore
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Enter the COLO checkpointed restore loop after live migration.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
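A rough sketch of the restore-side handshake, assuming an in-memory byte
stream in place of QEMUFile. SketchStream and all sketch_* functions are
hypothetical; they only mirror the roles of colo_info_load(),
restore_use_colo(), restore_exit_colo() and
colo_process_incoming_checkpoints() in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory replacement for QEMUFile. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} SketchStream;

/* Set while loading the "colo info" section, cleared on loop exit. */
static bool sketch_colo_requested;

/* Read one byte; returns 0 once the stream is exhausted. */
static int sketch_get_byte(SketchStream *st)
{
    assert(st != NULL);
    return st->pos < st->len ? st->data[st->pos++] : 0;
}

/* Counterpart of colo_info_load(): record whether the primary asked
 * for COLO, and fail if this build cannot honour the request. */
static int sketch_info_load(SketchStream *st, bool supported)
{
    int value = sketch_get_byte(st);

    if (value && !supported) {
        return -1;
    }
    sketch_colo_requested = value != 0;
    return 0;
}

/* Counterpart of colo_process_incoming_checkpoints(): a no-op unless
 * COLO was requested. Returns the number of checkpoints handled. */
static int sketch_process_incoming(int pending_checkpoints)
{
    int handled = 0;

    if (!sketch_colo_requested) {
        return 0;
    }
    while (handled < pending_checkpoints) {
        handled++;                  /* placeholder for the restore loop */
    }
    sketch_colo_requested = false;  /* same role as restore_exit_colo() */
    return handled;
}
```

The point of the flag is that a plain incoming migration and a COLO
secondary share the same load path; only the byte carried in the stream
decides whether the checkpoint loop runs after qemu_loadvm_state().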
 include/migration/migration-colo.h |  6 ++++++
 migration-colo-comm.c              | 10 ++++++++++
 migration-colo.c                   | 22 ++++++++++++++++++++++
 migration.c                        |  3 +++
 stubs/migration-colo.c             |  4 ++++
 5 files changed, 45 insertions(+)

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index 24589c0..861fa27 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -22,4 +22,10 @@ bool colo_supported(void);
 bool migrate_use_colo(void);
 void colo_init_checkpointer(MigrationState *s);
 
+/* restore */
+bool restore_use_colo(void);
+void restore_exit_colo(void);
+
+void colo_process_incoming_checkpoints(QEMUFile *f);
+
 #endif
diff --git a/migration-colo-comm.c b/migration-colo-comm.c
index 4504ceb..b12a57a 100644
--- a/migration-colo-comm.c
+++ b/migration-colo-comm.c
@@ -38,6 +38,16 @@ static void colo_info_save(QEMUFile *f, void *opaque)
 
 /* restore */
 
+bool restore_use_colo(void)
+{
+    return colo_requested;
+}
+
+void restore_exit_colo(void)
+{
+    colo_requested = false;
+}
+
 static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
 {
     int value = qemu_get_byte(f);
diff --git a/migration-colo.c b/migration-colo.c
index 0cef8bd..d566b9d 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -10,6 +10,7 @@
 
 #include "qemu/main-loop.h"
 #include "qemu/thread.h"
+#include "block/coroutine.h"
 #include "migration/migration-colo.h"
 
 static QEMUBH *colo_bh;
@@ -62,3 +63,24 @@ void colo_init_checkpointer(MigrationState *s)
     colo_bh = qemu_bh_new(colo_start_checkpointer, s);
     qemu_bh_schedule(colo_bh);
 }
+
+/* restore */
+
+static Coroutine *colo;
+
+void colo_process_incoming_checkpoints(QEMUFile *f)
+{
+    if (!restore_use_colo()) {
+        return;
+    }
+
+    colo = qemu_coroutine_self();
+    assert(colo != NULL);
+
+    /* TODO: COLO checkpointed restore loop */
+
+    colo = NULL;
+    restore_exit_colo();
+
+    return;
+}
diff --git a/migration.c b/migration.c
index b7f8e7e..190571d 100644
--- a/migration.c
+++ b/migration.c
@@ -86,6 +86,9 @@ static void process_incoming_migration_co(void *opaque)
     int ret;
 
     ret = qemu_loadvm_state(f);
+    if (!ret) {
+        colo_process_incoming_checkpoints(f);
+    }
     qemu_fclose(f);
     free_xbzrle_decoded_buf();
     if (ret < 0) {
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 9013c40..55f0d37 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -18,3 +18,7 @@ bool colo_supported(void)
 void colo_init_checkpointer(MigrationState *s)
 {
 }
+
+void colo_process_incoming_checkpoints(QEMUFile *f)
+{
+}
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH 07/17] COLO buffer: implement colo buffer as well as QEMUFileOps based on it
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

We need a buffer to store migration data.

On save side:
  All saved data is written into the colo buffer first, so that we know
the total size of the migration data. This also separates the data
transmission from the colo control data: the control data is sent over
the socket fd to synchronize the state of both sides.

On restore side:
  All migration data is read into the colo buffer first and then loaded
from the buffer. If a network error happens during data transmission,
the secondary VM can still function because the migration data has not
been loaded yet.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
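A simplified, standalone sketch of the buffering idea, assuming plain
realloc() instead of g_realloc() and a tiny growth unit so the behaviour
is easy to follow. SketchBuffer and the sketch_buffer_* helpers are
hypothetical names; the used/freed/size fields have the same meaning as
in colo_buffer_t, but the growth policy is not identical to
colo_buffer_extend().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BASE_SIZE 16u    /* growth unit; the patch uses 4 MB */

typedef struct {
    uint8_t *data;
    size_t used;    /* bytes written so far (write cursor) */
    size_t freed;   /* bytes already consumed (read cursor) */
    size_t size;    /* allocated capacity */
} SketchBuffer;

/* Make room for len more bytes, growing in SKETCH_BASE_SIZE units. */
static int sketch_buffer_reserve(SketchBuffer *b, size_t len)
{
    assert(b != NULL);
    if (len > b->size - b->used) {
        size_t need = b->used + len;
        size_t new_size = (need + SKETCH_BASE_SIZE - 1)
                          / SKETCH_BASE_SIZE * SKETCH_BASE_SIZE;
        uint8_t *p = realloc(b->data, new_size);

        if (!p) {
            return -1;
        }
        b->data = p;
        b->size = new_size;
    }
    return 0;
}

/* Save side: append data; afterwards b->used is the total size. */
static int sketch_buffer_put(SketchBuffer *b, const void *src, size_t len)
{
    if (len == 0) {
        return 0;
    }
    if (sketch_buffer_reserve(b, len) < 0) {
        return -1;
    }
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}

/* Restore side: copy out up to len unread bytes; returns the count. */
static size_t sketch_buffer_get(SketchBuffer *b, void *dst, size_t len)
{
    assert(b != NULL);
    if (len > b->used - b->freed) {
        len = b->used - b->freed;
    }
    if (len > 0) {
        memcpy(dst, b->data + b->freed, len);
        b->freed += len;
    }
    return len;
}

/* Release the storage and reset all cursors. */
static void sketch_buffer_destroy(SketchBuffer *b)
{
    assert(b != NULL);
    free(b->data);
    memset(b, 0, sizeof(*b));
}
```

On the save side the caller can read the used field after the last put
to learn the total checkpoint size before anything is sent; on the
restore side nothing is consumed from the buffer until the whole
checkpoint has arrived.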
 migration-colo.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)

diff --git a/migration-colo.c b/migration-colo.c
index d566b9d..b90d9b6 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -11,6 +11,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/thread.h"
 #include "block/coroutine.h"
+#include "qemu/error-report.h"
 #include "migration/migration-colo.h"
 
 static QEMUBH *colo_bh;
@@ -20,14 +21,122 @@ bool colo_supported(void)
     return true;
 }
 
+/* colo buffer */
+
+#define COLO_BUFFER_BASE_SIZE (1000*1000*4ULL)
+#define COLO_BUFFER_MAX_SIZE (1000*1000*1000*10ULL)
+
+typedef struct colo_buffer {
+    uint8_t *data;
+    uint64_t used;
+    uint64_t freed;
+    uint64_t size;
+} colo_buffer_t;
+
+static colo_buffer_t colo_buffer;
+
+static void colo_buffer_init(void)
+{
+    if (colo_buffer.size == 0) {
+        colo_buffer.data = g_malloc(COLO_BUFFER_BASE_SIZE);
+        colo_buffer.size = COLO_BUFFER_BASE_SIZE;
+    }
+    colo_buffer.used = 0;
+    colo_buffer.freed = 0;
+}
+
+static void colo_buffer_destroy(void)
+{
+    if (colo_buffer.data) {
+        g_free(colo_buffer.data);
+        colo_buffer.data = NULL;
+    }
+    colo_buffer.used = 0;
+    colo_buffer.freed = 0;
+    colo_buffer.size = 0;
+}
+
+static void colo_buffer_extend(uint64_t len)
+{
+    if (len > colo_buffer.size - colo_buffer.used) {
+        len = len + colo_buffer.used - colo_buffer.size;
+        len = ROUND_UP(len, COLO_BUFFER_BASE_SIZE) + COLO_BUFFER_BASE_SIZE;
+
+        colo_buffer.size += len;
+        if (colo_buffer.size > COLO_BUFFER_MAX_SIZE) {
+            error_report("colo_buffer overflow!");
+            exit(EXIT_FAILURE);
+        }
+        colo_buffer.data = g_realloc(colo_buffer.data, colo_buffer.size);
+    }
+}
+
+static int colo_put_buffer(void *opaque, const uint8_t *buf,
+                           int64_t pos, int size)
+{
+    colo_buffer_extend(size);
+    memcpy(colo_buffer.data + colo_buffer.used, buf, size);
+    colo_buffer.used += size;
+
+    return size;
+}
+
+static int colo_get_buffer_internal(uint8_t *buf, int size)
+{
+    if ((size + colo_buffer.freed) > colo_buffer.used) {
+        size = colo_buffer.used - colo_buffer.freed;
+    }
+    memcpy(buf, colo_buffer.data + colo_buffer.freed, size);
+    colo_buffer.freed += size;
+
+    return size;
+}
+
+static int colo_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
+{
+    return colo_get_buffer_internal(buf, size);
+}
+
+static int colo_close(void *opaque)
+{
+    colo_buffer_t *cb = opaque;
+
+    cb->used = 0;
+    cb->freed = 0;
+
+    return 0;
+}
+
+static int colo_get_fd(void *opaque)
+{
+    /* colo buffer, no fd */
+    return -1;
+}
+
+static const QEMUFileOps colo_write_ops = {
+    .put_buffer = colo_put_buffer,
+    .get_fd = colo_get_fd,
+    .close = colo_close,
+};
+
+static const QEMUFileOps colo_read_ops = {
+    .get_buffer = colo_get_buffer,
+    .get_fd = colo_get_fd,
+    .close = colo_close,
+};
+
 /* save */
 
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
 
+    colo_buffer_init();
+
     /*TODO: COLO checkpointed save loop*/
 
+    colo_buffer_destroy();
+
     if (s->state != MIG_STATE_ERROR) {
         migrate_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
     }
@@ -77,8 +186,11 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
+    colo_buffer_init();
+
     /* TODO: COLO checkpointed restore loop */
 
+    colo_buffer_destroy();
     colo = NULL;
     restore_exit_colo();
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH 08/17] COLO: disable qdev hotplug
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

COLO does not support qdev hotplug migration, so disable hotplug while
COLO is active and restore the previous setting on exit.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
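
The save/restore of the hotplug flag described above can be sketched in
isolation as follows. This is an illustration only: hotplug_enabled is
a hypothetical stand-in for QEMU's qdev_hotplug, and the helper name is
made up for the example.

```c
/* Hypothetical stand-in for QEMU's global qdev_hotplug flag. */
static int hotplug_enabled = 1;

/*
 * Run work() with hotplug disabled, then put the flag back to whatever
 * the caller had, so a previously disabled flag stays disabled.
 */
static int run_with_hotplug_disabled(int (*work)(void))
{
    int saved = hotplug_enabled;   /* remember the caller's setting */
    int ret;

    hotplug_enabled = 0;
    ret = work();
    hotplug_enabled = saved;       /* restore on the way out */

    return ret;
}

/* Example work function: reports the flag value it observes. */
static int observe_flag(void)
{
    return hotplug_enabled;
}
```

Saving the old value instead of unconditionally writing 1 on exit keeps
the helper correct when hotplug was already off before COLO started.
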
 migration-colo.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/migration-colo.c b/migration-colo.c
index b90d9b6..f295e56 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -12,6 +12,7 @@
 #include "qemu/thread.h"
 #include "block/coroutine.h"
 #include "qemu/error-report.h"
+#include "hw/qdev-core.h"
 #include "migration/migration-colo.h"
 
 static QEMUBH *colo_bh;
@@ -130,6 +131,9 @@ static const QEMUFileOps colo_read_ops = {
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
+    int dev_hotplug = qdev_hotplug;
+
+    qdev_hotplug = 0;
 
     colo_buffer_init();
 
@@ -145,6 +149,8 @@ static void *colo_thread(void *opaque)
     qemu_bh_schedule(s->cleanup_bh);
     qemu_mutex_unlock_iothread();
 
+    qdev_hotplug = dev_hotplug;
+
     return NULL;
 }
 
@@ -179,10 +185,14 @@ static Coroutine *colo;
 
 void colo_process_incoming_checkpoints(QEMUFile *f)
 {
+    int dev_hotplug = qdev_hotplug;
+
     if (!restore_use_colo()) {
         return;
     }
 
+    qdev_hotplug = 0;
+
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
@@ -194,5 +204,7 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
     colo = NULL;
     restore_exit_colo();
 
+    qdev_hotplug = dev_hotplug;
+
     return;
 }
-- 
1.9.1



* [RFC PATCH 09/17] COLO ctl: implement API's that communicate with colo agent
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

The COLO agent compares the packets returned by the Primary VM and
the Secondary VM and decides, according to its rules, whether to
start a checkpoint. It is a Linux kernel module that runs on the host.
The COLO controller communicates with the agent through ioctl().

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
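
The decision made in the save loop after each wait on the agent can be
sketched as a pure function. This is a hedged illustration, not the
code in the diff: colo_wait_decide() and the enum are hypothetical
names, and the caller is assumed to pass in the ioctl() return value,
the saved errno, and the time since the last checkpoint.

```c
#include <errno.h>
#include <stdint.h>

/* Same value as CHKPOINT_TIMER in the patch, in milliseconds. */
#define CHKPOINT_TIMER_MS 10000

enum colo_wait_action {
    COLO_DO_CHECKPOINT,  /* agent asked for one, or the timer expired */
    COLO_RETRY,          /* nothing to do yet, sleep briefly and wait again */
    COLO_FAIL,           /* unexpected error, leave the save loop */
};

/* ETIME and ERESTART mean "no checkpoint requested yet". */
static int colo_wait_is_retryable(int err)
{
#ifdef ERESTART
    if (err == ERESTART) {
        return 1;
    }
#endif
    return err == ETIME;
}

static enum colo_wait_action
colo_wait_decide(int ioctl_ret, int err, int64_t elapsed_ms)
{
    if (ioctl_ret == 0) {
        return COLO_DO_CHECKPOINT;
    }
    if (!colo_wait_is_retryable(err)) {
        return COLO_FAIL;
    }
    /* Force a periodic checkpoint even if the outputs never diverge. */
    return elapsed_ms < CHKPOINT_TIMER_MS ? COLO_RETRY : COLO_DO_CHECKPOINT;
}
```

Keeping the decision separate from the ioctl() call makes the timer
fallback easy to check without the kernel module loaded.
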
 migration-colo.c | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 112 insertions(+), 3 deletions(-)

diff --git a/migration-colo.c b/migration-colo.c
index f295e56..802f8b0 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -13,7 +13,16 @@
 #include "block/coroutine.h"
 #include "qemu/error-report.h"
 #include "hw/qdev-core.h"
+#include "qemu/timer.h"
 #include "migration/migration-colo.h"
+#include <sys/ioctl.h>
+
+/*
+ * checkpoint timer: unit ms
+ * this is large because COLO checkpoint will mostly depend on
+ * COLO compare module.
+ */
+#define CHKPOINT_TIMER 10000
 
 static QEMUBH *colo_bh;
 
@@ -22,6 +31,56 @@ bool colo_supported(void)
     return true;
 }
 
+/* colo compare */
+#define COMP_IOC_MAGIC 'k'
+#define COMP_IOCTWAIT   _IO(COMP_IOC_MAGIC, 0)
+#define COMP_IOCTFLUSH  _IO(COMP_IOC_MAGIC, 1)
+#define COMP_IOCTRESUME _IO(COMP_IOC_MAGIC, 2)
+
+#define COMPARE_DEV "/dev/HA_compare"
+/* COLO compare module FD */
+static int comp_fd = -1;
+
+static int colo_compare_init(void)
+{
+    comp_fd = open(COMPARE_DEV, O_RDONLY);
+    if (comp_fd < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static void colo_compare_destroy(void)
+{
+    if (comp_fd >= 0) {
+        close(comp_fd);
+        comp_fd = -1;
+    }
+}
+
+/*
+ * Communicate with COLO Agent through ioctl.
+ * return:
+ * 0: start a checkpoint
+ * other: errno == ETIME or ERESTART, try again
+ *        errno == other, error, quit colo save
+ */
+static int colo_compare(void)
+{
+    return ioctl(comp_fd, COMP_IOCTWAIT, 250);
+}
+
+static __attribute__((unused)) int colo_compare_flush(void)
+{
+    return ioctl(comp_fd, COMP_IOCTFLUSH, 1);
+}
+
+static __attribute__((unused)) int colo_compare_resume(void)
+{
+    return ioctl(comp_fd, COMP_IOCTRESUME, 1);
+}
+
 /* colo buffer */
 
 #define COLO_BUFFER_BASE_SIZE (1000*1000*4ULL)
@@ -131,15 +190,48 @@ static const QEMUFileOps colo_read_ops = {
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
-    int dev_hotplug = qdev_hotplug;
+    int dev_hotplug = qdev_hotplug, wait_cp = 0;
+    int64_t start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    int64_t current_time;
+
+    if (colo_compare_init() < 0) {
+        error_report("Init colo compare error\n");
+        goto out;
+    }
 
     qdev_hotplug = 0;
 
     colo_buffer_init();
 
-    /*TODO: COLO checkpointed save loop*/
+    while (s->state == MIG_STATE_COLO) {
+        /* wait for a colo checkpoint */
+        wait_cp = colo_compare();
+        if (wait_cp) {
+            if (errno != ETIME && errno != ERESTART) {
+                error_report("compare module failed(%s)", strerror(errno));
+                goto out;
+            }
+            /*
+             * no checkpoint is needed, wait for 1ms and then
+             * check if we need checkpoint
+             */
+            current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+            if (current_time - start_time < CHKPOINT_TIMER) {
+                usleep(1000);
+                continue;
+            }
+        }
+
+        /* start a colo checkpoint */
+
+        /*TODO: COLO save */
 
+        start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    }
+
+out:
     colo_buffer_destroy();
+    colo_compare_destroy();
 
     if (s->state != MIG_STATE_ERROR) {
         migrate_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
@@ -183,6 +275,17 @@ void colo_init_checkpointer(MigrationState *s)
 
 static Coroutine *colo;
 
+/*
+ * return:
+ * 0: start a checkpoint
+ * 1: some error happened, exit colo restore
+ */
+static int slave_wait_new_checkpoint(QEMUFile *f)
+{
+    /* TODO: wait checkpoint start command from master */
+    return 1;
+}
+
 void colo_process_incoming_checkpoints(QEMUFile *f)
 {
     int dev_hotplug = qdev_hotplug;
@@ -198,7 +301,13 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
 
     colo_buffer_init();
 
-    /* TODO: COLO checkpointed restore loop */
+    while (true) {
+        if (slave_wait_new_checkpoint(f)) {
+            break;
+        }
+
+        /* TODO: COLO restore */
+    }
 
     colo_buffer_destroy();
     colo = NULL;
-- 
1.9.1



* [RFC PATCH 10/17] COLO ctl: introduce is_slave() and is_master()
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

is_slave() determines at runtime whether the QEMU instance is a
slave (migration target).
is_master() determines at runtime whether the QEMU instance is a
master (the side that starts the migration).
These two APIs will be used by later patches.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration-colo.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/migration-colo.c b/migration-colo.c
index 802f8b0..2699e77 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -187,6 +187,12 @@ static const QEMUFileOps colo_read_ops = {
 
 /* save */
 
+static __attribute__((unused)) bool is_master(void)
+{
+    MigrationState *s = migrate_get_current();
+    return (s->state == MIG_STATE_COLO);
+}
+
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
@@ -275,6 +281,11 @@ void colo_init_checkpointer(MigrationState *s)
 
 static Coroutine *colo;
 
+static __attribute__((unused)) bool is_slave(void)
+{
+    return colo != NULL;
+}
+
 /*
  * return:
  * 0: start a checkpoint
-- 
1.9.1



* [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Implement the COLO checkpoint protocol.

Checkpoint synchronizing points:

                  Primary                 Secondary
  NEW             @
                                          Suspend
  SUSPENDED                               @
                  Suspend&Save state
  SEND            @
                  Send state              Receive state
  RECEIVED                                @
                  Flush network           Load state
  LOADED                                  @
                  Resume                  Resume

                  Start Comparing
NOTE:
 1) '@' marks the side that sends the message.
 2) Every sync-point is synchronized by the two sides with only
    one handshake (single direction) for low latency.
    If stricter synchronization is required, a sync-point in the
    opposite direction should be added.
 3) Since sync-points are single direction, the remote side may
    be far ahead by the time this side receives the sync-point.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
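
The message flow in the table above can be sketched as a small lookup
of who sends what, plus the 64-bit big-endian encoding used on the
wire by qemu_put_be64()/qemu_get_be64(). The enum values mirror the
ones added by this patch; the helper names and enum colo_side are
hypothetical and exist only for this illustration.

```c
#include <stdint.h>

enum {
    COLO_READY = 0x46,
    COLO_CHECKPOINT_NEW,
    COLO_CHECKPOINT_SUSPENDED,
    COLO_CHECKPOINT_SEND,
    COLO_CHECKPOINT_RECEIVED,
    COLO_CHECKPOINT_LOADED,
};

enum colo_side { COLO_PRIMARY, COLO_SECONDARY };

/* Which side sends a message ('@' in the table); -1 if unknown. */
static int colo_msg_sender(uint64_t msg)
{
    switch (msg) {
    case COLO_CHECKPOINT_NEW:
    case COLO_CHECKPOINT_SEND:
        return COLO_PRIMARY;
    case COLO_READY:
    case COLO_CHECKPOINT_SUSPENDED:
    case COLO_CHECKPOINT_RECEIVED:
    case COLO_CHECKPOINT_LOADED:
        return COLO_SECONDARY;
    default:
        return -1;
    }
}

/* Next message expected within one checkpoint round; -1 when done. */
static int64_t colo_next_msg(uint64_t msg)
{
    if (msg >= COLO_CHECKPOINT_NEW && msg < COLO_CHECKPOINT_LOADED) {
        return (int64_t)msg + 1;
    }
    return -1;
}

/* Encode a control word as 8 bytes, most significant byte first. */
static void put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Decode 8 big-endian bytes back into a control word. */
static uint64_t get_be64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}
```

Because each sync-point is a single one-way message, a receiver only
needs to check that the word it reads is the one expected next.
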
 migration-colo.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 262 insertions(+), 6 deletions(-)

diff --git a/migration-colo.c b/migration-colo.c
index 2699e77..a708872 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -24,6 +24,41 @@
  */
 #define CHKPOINT_TIMER 10000
 
+enum {
+    COLO_READY = 0x46,
+
+    /*
+     * Checkpoint synchronizing points.
+     *
+     *                  Primary                 Secondary
+     *  NEW             @
+     *                                          Suspend
+     *  SUSPENDED                               @
+     *                  Suspend&Save state
+     *  SEND            @
+     *                  Send state              Receive state
+     *  RECEIVED                                @
+     *                  Flush network           Load state
+     *  LOADED                                  @
+     *                  Resume                  Resume
+     *
+     *                  Start Comparing
+     * NOTE:
+     * 1) '@' marks the side that sends the message.
+     * 2) Every sync-point is synchronized by the two sides with only
+     *    one handshake (single direction) for low latency.
+     *    If stricter synchronization is required, a sync-point in the
+     *    opposite direction should be added.
+     * 3) Since sync-points are single direction, the remote side may
+     *    be far ahead by the time this side receives the sync-point.
+     */
+    COLO_CHECKPOINT_NEW,
+    COLO_CHECKPOINT_SUSPENDED,
+    COLO_CHECKPOINT_SEND,
+    COLO_CHECKPOINT_RECEIVED,
+    COLO_CHECKPOINT_LOADED,
+};
+
 static QEMUBH *colo_bh;
 
 bool colo_supported(void)
@@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
     .close = colo_close,
 };
 
+/* colo checkpoint control helper */
+static bool is_master(void);
+static bool is_slave(void);
+
+static void ctl_error_handler(void *opaque, int err)
+{
+    if (is_slave()) {
+        /* TODO: determine whether we need to failover */
+        /* FIXME: we will not failover currently, just kill slave */
+        error_report("error: colo transmission failed!\n");
+        exit(1);
+    } else if (is_master()) {
+        /* Master still alive, do not failover */
+        error_report("error: colo transmission failed!\n");
+        return;
+    } else {
+        error_report("COLO: Unexpected error happened!\n");
+        exit(EXIT_FAILURE);
+    }
+}
+
+static int colo_ctl_put(QEMUFile *f, uint64_t request)
+{
+    int ret = 0;
+
+    qemu_put_be64(f, request);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        ctl_error_handler(f, ret);
+        return 1;
+    }
+
+    return ret;
+}
+
+static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
+{
+    int ret = 0;
+    uint64_t temp;
+
+    temp = qemu_get_be64(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        ctl_error_handler(f, ret);
+        return 1;
+    }
+
+    *value = temp;
+    return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint64_t require)
+{
+    int ret;
+    uint64_t value;
+
+    ret = colo_ctl_get_value(f, &value);
+    if (ret) {
+        return ret;
+    }
+
+    if (value != require) {
+        error_report("unexpected state received!\n");
+        exit(1);
+    }
+
+    return ret;
+}
+
 /* save */
 
-static __attribute__((unused)) bool is_master(void)
+static bool is_master(void)
 {
     MigrationState *s = migrate_get_current();
     return (s->state == MIG_STATE_COLO);
 }
 
+static int do_colo_transaction(MigrationState *s, QEMUFile *control,
+                               QEMUFile *trans)
+{
+    int ret;
+
+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
+    if (ret) {
+        goto out;
+    }
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: suspend and save vm state to colo buffer */
+
+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: send vmstate to slave */
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: Flush network etc. */
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: resume master */
+
+out:
+    return ret;
+}
+
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
     int dev_hotplug = qdev_hotplug, wait_cp = 0;
     int64_t start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int64_t current_time;
+    QEMUFile *colo_control = NULL, *colo_trans = NULL;
+    int ret;
 
     if (colo_compare_init() < 0) {
         error_report("Init colo compare error\n");
         goto out;
     }
 
+    colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
+    if (!colo_control) {
+        error_report("open colo_control failed\n");
+        goto out;
+    }
+
     qdev_hotplug = 0;
 
     colo_buffer_init();
 
+    /*
+     * Wait for the slave to finish loading VM state and enter COLO
+     * restore.
+     */
+    ret = colo_ctl_get(colo_control, COLO_READY);
+    if (ret) {
+        goto out;
+    }
+
     while (s->state == MIG_STATE_COLO) {
         /* wait for a colo checkpoint */
         wait_cp = colo_compare();
@@ -230,13 +396,33 @@ static void *colo_thread(void *opaque)
 
         /* start a colo checkpoint */
 
-        /*TODO: COLO save */
+        /* open colo buffer for write */
+        colo_trans = qemu_fopen_ops(&colo_buffer, &colo_write_ops);
+        if (!colo_trans) {
+            error_report("open colo buffer failed\n");
+            goto out;
+        }
 
+        if (do_colo_transaction(s, colo_control, colo_trans)) {
+            goto out;
+        }
+
+        qemu_fclose(colo_trans);
+        colo_trans = NULL;
         start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     }
 
 out:
+    if (colo_trans) {
+        qemu_fclose(colo_trans);
+    }
+
     colo_buffer_destroy();
+
+    if (colo_control) {
+        qemu_fclose(colo_control);
+    }
+
     colo_compare_destroy();
 
     if (s->state != MIG_STATE_ERROR) {
@@ -281,7 +467,7 @@ void colo_init_checkpointer(MigrationState *s)
 
 static Coroutine *colo;
 
-static __attribute__((unused)) bool is_slave(void)
+static bool is_slave(void)
 {
     return colo != NULL;
 }
@@ -293,13 +479,32 @@ static __attribute__((unused)) bool is_slave(void)
  */
 static int slave_wait_new_checkpoint(QEMUFile *f)
 {
-    /* TODO: wait checkpoint start command from master */
-    return 1;
+    int fd = qemu_get_fd(f);
+    int ret;
+    uint64_t cmd;
+
+    yield_until_fd_readable(fd);
+
+    ret = colo_ctl_get_value(f, &cmd);
+    if (ret) {
+        return 1;
+    }
+
+    if (cmd == COLO_CHECKPOINT_NEW) {
+        return 0;
+    } else {
+        /* Unexpected data received */
+        ctl_error_handler(f, ret);
+        return 1;
+    }
 }
 
 void colo_process_incoming_checkpoints(QEMUFile *f)
 {
+    int fd = qemu_get_fd(f);
     int dev_hotplug = qdev_hotplug;
+    QEMUFile *ctl = NULL;
+    int ret;
 
     if (!restore_use_colo()) {
         return;
@@ -310,18 +515,69 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
+    ctl = qemu_fopen_socket(fd, "wb");
+    if (!ctl) {
+        error_report("can't open incoming channel\n");
+        goto out;
+    }
+
     colo_buffer_init();
 
+    ret = colo_ctl_put(ctl, COLO_READY);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: in COLO mode, slave is running, so start the vm */
+
     while (true) {
         if (slave_wait_new_checkpoint(f)) {
             break;
         }
 
-        /* TODO: COLO restore */
+        /* start colo checkpoint */
+
+        /* TODO: suspend guest */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: open colo buffer for read */
+
+        ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: read migration data into colo buffer */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: load vm state */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: resume guest */
+
+        /* TODO: close colo buffer */
     }
 
+out:
     colo_buffer_destroy();
     colo = NULL;
+
+    if (ctl) {
+        qemu_fclose(ctl);
+    }
+
     restore_exit_colo();
 
     qdev_hotplug = dev_hotplug;
-- 
1.9.1



* [Qemu-devel] [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
@ 2014-07-23 14:25   ` Yang Hongyang
  0 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines, Yang Hongyang

Implement the COLO checkpoint protocol.

Checkpoint synchronizing points:

                  Primary                 Secondary
  NEW             @
                                          Suspend
  SUSPENDED                               @
                  Suspend&Save state
  SEND            @
                  Send state              Receive state
  RECEIVED                                @
                  Flush network           Load state
  LOADED                                  @
                  Resume                  Resume

                  Start Comparing
NOTE:
 1) '@' marks the side that sends the message.
 2) Every sync-point is synchronized by the two sides with only
    one handshake (single direction) for low latency.
    If stricter synchronization is required, a sync-point in the
    opposite direction should be added.
 3) Since sync-points are single direction, the remote side may
    be far ahead by the time this side receives the sync-point.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration-colo.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 262 insertions(+), 6 deletions(-)

diff --git a/migration-colo.c b/migration-colo.c
index 2699e77..a708872 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -24,6 +24,41 @@
  */
 #define CHKPOINT_TIMER 10000
 
+enum {
+    COLO_READY = 0x46,
+
+    /*
+     * Checkpoint synchronizing points.
+     *
+     *                  Primary                 Secondary
+     *  NEW             @
+     *                                          Suspend
+     *  SUSPENDED                               @
+     *                  Suspend&Save state
+     *  SEND            @
+     *                  Send state              Receive state
+     *  RECEIVED                                @
+     *                  Flush network           Load state
+     *  LOADED                                  @
+     *                  Resume                  Resume
+     *
+     *                  Start Comparing
+     * NOTE:
+     * 1) '@' marks the side that sends the message.
+     * 2) Every sync-point is synchronized by the two sides with only
+     *    one handshake (single direction) for low latency.
+     *    If stricter synchronization is required, a sync-point in the
+     *    opposite direction should be added.
+     * 3) Since sync-points are single direction, the remote side may
+     *    be far ahead by the time this side receives the sync-point.
+     */
+    COLO_CHECKPOINT_NEW,
+    COLO_CHECKPOINT_SUSPENDED,
+    COLO_CHECKPOINT_SEND,
+    COLO_CHECKPOINT_RECEIVED,
+    COLO_CHECKPOINT_LOADED,
+};
+
 static QEMUBH *colo_bh;
 
 bool colo_supported(void)
@@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
     .close = colo_close,
 };
 
+/* colo checkpoint control helper */
+static bool is_master(void);
+static bool is_slave(void);
+
+static void ctl_error_handler(void *opaque, int err)
+{
+    if (is_slave()) {
+        /* TODO: determine whether we need to failover */
+        /* FIXME: we will not failover currently, just kill slave */
+        error_report("error: colo transmission failed!\n");
+        exit(1);
+    } else if (is_master()) {
+        /* Master still alive, do not failover */
+        error_report("error: colo transmission failed!\n");
+        return;
+    } else {
+        error_report("COLO: Unexpected error happened!\n");
+        exit(EXIT_FAILURE);
+    }
+}
+
+static int colo_ctl_put(QEMUFile *f, uint64_t request)
+{
+    int ret = 0;
+
+    qemu_put_be64(f, request);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        ctl_error_handler(f, ret);
+        return 1;
+    }
+
+    return ret;
+}
+
+static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
+{
+    int ret = 0;
+    uint64_t temp;
+
+    temp = qemu_get_be64(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        ctl_error_handler(f, ret);
+        return 1;
+    }
+
+    *value = temp;
+    return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint64_t require)
+{
+    int ret;
+    uint64_t value;
+
+    ret = colo_ctl_get_value(f, &value);
+    if (ret) {
+        return ret;
+    }
+
+    if (value != require) {
+        error_report("unexpected state received!\n");
+        exit(1);
+    }
+
+    return ret;
+}
+
 /* save */
 
-static __attribute__((unused)) bool is_master(void)
+static bool is_master(void)
 {
     MigrationState *s = migrate_get_current();
     return (s->state == MIG_STATE_COLO);
 }
 
+static int do_colo_transaction(MigrationState *s, QEMUFile *control,
+                               QEMUFile *trans)
+{
+    int ret;
+
+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
+    if (ret) {
+        goto out;
+    }
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: suspend and save vm state to colo buffer */
+
+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: send vmstate to slave */
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: Flush network etc. */
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: resume master */
+
+out:
+    return ret;
+}
+
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
     int dev_hotplug = qdev_hotplug, wait_cp = 0;
     int64_t start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int64_t current_time;
+    QEMUFile *colo_control = NULL, *colo_trans = NULL;
+    int ret;
 
     if (colo_compare_init() < 0) {
         error_report("Init colo compare error\n");
         goto out;
     }
 
+    colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
+    if (!colo_control) {
+        error_report("open colo_control failed\n");
+        goto out;
+    }
+
     qdev_hotplug = 0;
 
     colo_buffer_init();
 
+    /*
+     * Wait for slave finish loading vm states and enter COLO
+     * restore.
+     */
+    ret = colo_ctl_get(colo_control, COLO_READY);
+    if (ret) {
+        goto out;
+    }
+
     while (s->state == MIG_STATE_COLO) {
         /* wait for a colo checkpoint */
         wait_cp = colo_compare();
@@ -230,13 +396,33 @@ static void *colo_thread(void *opaque)
 
         /* start a colo checkpoint */
 
-        /*TODO: COLO save */
+        /* open colo buffer for write */
+        colo_trans = qemu_fopen_ops(&colo_buffer, &colo_write_ops);
+        if (!colo_trans) {
+            error_report("open colo buffer failed\n");
+            goto out;
+        }
 
+        if (do_colo_transaction(s, colo_control, colo_trans)) {
+            goto out;
+        }
+
+        qemu_fclose(colo_trans);
+        colo_trans = NULL;
         start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     }
 
 out:
+    if (colo_trans) {
+        qemu_fclose(colo_trans);
+    }
+
     colo_buffer_destroy();
+
+    if (colo_control) {
+        qemu_fclose(colo_control);
+    }
+
     colo_compare_destroy();
 
     if (s->state != MIG_STATE_ERROR) {
@@ -281,7 +467,7 @@ void colo_init_checkpointer(MigrationState *s)
 
 static Coroutine *colo;
 
-static __attribute__((unused)) bool is_slave(void)
+static bool is_slave(void)
 {
     return colo != NULL;
 }
@@ -293,13 +479,32 @@ static __attribute__((unused)) bool is_slave(void)
  */
 static int slave_wait_new_checkpoint(QEMUFile *f)
 {
-    /* TODO: wait checkpoint start command from master */
-    return 1;
+    int fd = qemu_get_fd(f);
+    int ret;
+    uint64_t cmd;
+
+    yield_until_fd_readable(fd);
+
+    ret = colo_ctl_get_value(f, &cmd);
+    if (ret) {
+        return 1;
+    }
+
+    if (cmd == COLO_CHECKPOINT_NEW) {
+        return 0;
+    } else {
+        /* Unexpected data received */
+        ctl_error_handler(f, ret);
+        return 1;
+    }
 }
 
 void colo_process_incoming_checkpoints(QEMUFile *f)
 {
+    int fd = qemu_get_fd(f);
     int dev_hotplug = qdev_hotplug;
+    QEMUFile *ctl = NULL;
+    int ret;
 
     if (!restore_use_colo()) {
         return;
@@ -310,18 +515,69 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
+    ctl = qemu_fopen_socket(fd, "wb");
+    if (!ctl) {
+        error_report("can't open incoming channel\n");
+        goto out;
+    }
+
     colo_buffer_init();
 
+    ret = colo_ctl_put(ctl, COLO_READY);
+    if (ret) {
+        goto out;
+    }
+
+    /* TODO: in COLO mode, slave is runing, so start the vm */
+
     while (true) {
         if (slave_wait_new_checkpoint(f)) {
             break;
         }
 
-        /* TODO: COLO restore */
+        /* start colo checkpoint */
+
+        /* TODO: suspend guest */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: open colo buffer for read */
+
+        ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: read migration data into colo buffer */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: load vm state */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
+        if (ret) {
+            goto out;
+        }
+
+        /* TODO: resume guest */
+
+        /* TODO: close colo buffer */
     }
 
+out:
     colo_buffer_destroy();
     colo = NULL;
+
+    if (ctl) {
+        qemu_fclose(ctl);
+    }
+
     restore_exit_colo();
 
     qdev_hotplug = dev_hotplug;
-- 
1.9.1


* [RFC PATCH 12/17] COLO ctl: add a RunState RUN_STATE_COLO
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

The guest enters this state when it is paused to save/restore VM
state during a COLO checkpoint.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 qapi-schema.json | 4 +++-
 vl.c             | 8 ++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)
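
Note for reviewers (not part of the commit): the vl.c hunks below add
entries to a from/to transition table. The following is a minimal
standalone sketch of how such a table can be expanded into a lookup
matrix and queried. All Demo* and demo_* names are hypothetical, the
enum is a reduced stand-in rather than QEMU's generated RunState, and
only a subset of the transitions visible in this patch is listed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of states; not QEMU's RunState enum. */
typedef enum {
    DEMO_STATE_INMIGRATE,
    DEMO_STATE_PAUSED,
    DEMO_STATE_RUNNING,
    DEMO_STATE_COLO,
    DEMO_STATE_MAX
} DemoRunState;

typedef struct {
    DemoRunState from;
    DemoRunState to;
} DemoTransition;

/* Subset of the transitions that appear in the vl.c hunks of this patch. */
static const DemoTransition demo_transitions[] = {
    { DEMO_STATE_INMIGRATE, DEMO_STATE_RUNNING },
    { DEMO_STATE_INMIGRATE, DEMO_STATE_PAUSED },
    { DEMO_STATE_INMIGRATE, DEMO_STATE_COLO },
    { DEMO_STATE_PAUSED, DEMO_STATE_RUNNING },
    { DEMO_STATE_PAUSED, DEMO_STATE_COLO },
    { DEMO_STATE_RUNNING, DEMO_STATE_COLO },
    { DEMO_STATE_COLO, DEMO_STATE_RUNNING },
};

/* demo_valid[from][to] is true when the transition is allowed. */
static bool demo_valid[DEMO_STATE_MAX][DEMO_STATE_MAX];

/* Expand the transition list into the lookup matrix. */
static void demo_transitions_init(void)
{
    size_t i;
    size_t n = sizeof(demo_transitions) / sizeof(demo_transitions[0]);

    for (i = 0; i < n; i++) {
        demo_valid[demo_transitions[i].from][demo_transitions[i].to] = true;
    }
}

/* Return true if moving from 'from' to 'to' is listed in the table. */
static bool demo_transition_ok(DemoRunState from, DemoRunState to)
{
    return demo_valid[from][to];
}
```

In this sketch the only way out of the COLO state is back to running,
which mirrors the single { RUN_STATE_COLO, RUN_STATE_RUNNING } entry
added by the patch.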

diff --git a/qapi-schema.json b/qapi-schema.json
index 807f5a2..b42171c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -145,12 +145,14 @@
 # @watchdog: the watchdog action is configured to pause and has been triggered
 #
 # @guest-panicked: guest has been panicked as a result of guest OS panic
+#
+# @colo: guest is paused to save/restore VM state under colo checkpoint
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
             'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
             'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
-            'guest-panicked' ] }
+            'guest-panicked', 'colo' ] }
 
 ##
 # @StatusInfo:
diff --git a/vl.c b/vl.c
index 1a282d8..545155d 100644
--- a/vl.c
+++ b/vl.c
@@ -597,6 +597,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
+    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },
 
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -606,6 +607,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
     { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_PAUSED, RUN_STATE_COLO},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -616,9 +618,12 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
 
+    { RUN_STATE_COLO, RUN_STATE_RUNNING },
+
     { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
     { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
     { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
@@ -629,6 +634,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
     { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
     { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
+    { RUN_STATE_RUNNING, RUN_STATE_COLO},
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
@@ -639,9 +645,11 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},
 
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
-- 
1.9.1


* [RFC PATCH 13/17] COLO ctl: implement colo save
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

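Implement COLO save on the master. At each checkpoint the master
stops the VM in RUN_STATE_COLO, saves the VM state (with block
migration disabled) into the colo buffer, sends the total vmstate
size followed by the buffer contents to the slave, flushes the
network once the slave reports the state as received, and resumes
after the slave reports it as loaded. The master is resumed on both
the success and the error path.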

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration-colo.c | 44 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 6 deletions(-)
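
Note for reviewers (not part of the commit): the send step in this
patch is a length-prefixed transfer, i.e. the used size of the colo
buffer goes out first and the buffer contents follow. Below is a
minimal standalone sketch of that framing into a plain byte array. It
assumes the size is encoded as a big-endian 64-bit value (matching the
qemu_get_be64() used by colo_ctl_get_value()); demo_put_be64() and
demo_frame_checkpoint() are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store v into out[0..7] in big-endian byte order. */
static void demo_put_be64(uint8_t *out, uint64_t v)
{
    int i;

    for (i = 0; i < 8; i++) {
        out[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/*
 * Write an 8-byte size header followed by 'used' bytes of state into
 * 'out'. Returns the number of bytes written, or 0 if 'out' is too
 * small to hold the header plus the payload.
 */
static size_t demo_frame_checkpoint(uint8_t *out, size_t out_cap,
                                    const uint8_t *state, uint64_t used)
{
    if (out_cap < 8 || used > out_cap - 8) {
        return 0;
    }

    demo_put_be64(out, used);
    if (used > 0) {
        memcpy(out + 8, state, (size_t)used);
    }
    return 8 + (size_t)used;
}
```

Sending the size first lets the receiver grow its buffer once and read
the whole vmstate with a single bounded read.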

diff --git a/migration-colo.c b/migration-colo.c
index a708872..03ac157 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -14,6 +14,7 @@
 #include "qemu/error-report.h"
 #include "hw/qdev-core.h"
 #include "qemu/timer.h"
+#include "sysemu/sysemu.h"
 #include "migration/migration-colo.h"
 #include <sys/ioctl.h>
 
@@ -106,12 +107,12 @@ static int colo_compare(void)
     return ioctl(comp_fd, COMP_IOCTWAIT, 250);
 }
 
-static __attribute__((unused)) int colo_compare_flush(void)
+static int colo_compare_flush(void)
 {
     return ioctl(comp_fd, COMP_IOCTFLUSH, 1);
 }
 
-static __attribute__((unused)) int colo_compare_resume(void)
+static int colo_compare_resume(void)
 {
     return ioctl(comp_fd, COMP_IOCTRESUME, 1);
 }
@@ -315,30 +316,61 @@ static int do_colo_transaction(MigrationState *s, QEMUFile *control,
         goto out;
     }
 
-    /* TODO: suspend and save vm state to colo buffer */
+    /* suspend and save vm state to colo buffer */
+
+    qemu_mutex_lock_iothread();
+    vm_stop_force_state(RUN_STATE_COLO);
+    qemu_mutex_unlock_iothread();
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(trans, &s->params);
+    qemu_savevm_state_complete(trans);
+
+    qemu_fflush(trans);
 
     ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
     if (ret) {
         goto out;
     }
 
-    /* TODO: send vmstate to slave */
+    /* send vmstate to slave */
+
+    /* we send the total size of the vmstate first */
+    ret = colo_ctl_put(s->file, colo_buffer.used);
+    if (ret) {
+        goto out;
+    }
+
+    qemu_put_buffer_async(s->file, colo_buffer.data, colo_buffer.used);
+    ret = qemu_file_get_error(s->file);
+    if (ret < 0) {
+        goto out;
+    }
+    qemu_fflush(s->file);
 
     ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
     if (ret) {
         goto out;
     }
 
-    /* TODO: Flush network etc. */
+    /* Flush network etc. */
+    colo_compare_flush();
 
     ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
     if (ret) {
         goto out;
     }
 
-    /* TODO: resume master */
+    colo_compare_resume();
+    ret = 0;
 
 out:
+    /* resume master */
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+
     return ret;
 }
 
-- 
1.9.1


* [RFC PATCH 14/17] COLO ctl: implement colo restore
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

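Implement COLO restore on the slave. The slave starts the VM once it
enters COLO mode. At each checkpoint it stops the guest in
RUN_STATE_COLO, reads the total vmstate size and then the vmstate
into the colo buffer, loads the VM state from that buffer, and
resumes the guest.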

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration-colo.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)
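
Note for reviewers (not part of the commit): the receive step in this
patch reads a 64-bit size and then that many bytes into the colo
buffer. The sketch below shows the same idea on a plain byte array,
with two additions that are assumptions of this sketch rather than
something the patch does: the announced size is checked against an
upper bound and against the bytes actually available, and a failed
buffer extension is reported. DemoBuffer and the demo_* helpers are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, loosely modelled on the colo buffer. */
typedef struct {
    uint8_t *data;
    size_t size;   /* allocated bytes */
    size_t used;   /* valid bytes */
} DemoBuffer;

/* Decode a big-endian 64-bit value from in[0..7]. */
static uint64_t demo_get_be64(const uint8_t *in)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

/* Make sure the buffer can hold at least 'len' bytes. */
static int demo_buffer_extend(DemoBuffer *b, size_t len)
{
    uint8_t *p;

    if (len <= b->size) {
        return 0;
    }
    p = realloc(b->data, len);
    if (!p) {
        return -1;
    }
    b->data = p;
    b->size = len;
    return 0;
}

/*
 * Parse "size header + payload" from 'wire' into 'b'.
 * Returns 0 on success, -1 on a short or oversized message.
 */
static int demo_recv_checkpoint(DemoBuffer *b, const uint8_t *wire,
                                size_t wire_len, uint64_t max_size)
{
    uint64_t total;

    if (wire_len < 8) {
        return -1;
    }
    total = demo_get_be64(wire);
    if (total > max_size || total > wire_len - 8) {
        return -1;
    }
    if (demo_buffer_extend(b, (size_t)total) < 0) {
        return -1;
    }
    if (total > 0) {
        memcpy(b->data, wire + 8, (size_t)total);
    }
    b->used = (size_t)total;
    return 0;
}
```

Whether a size limit is wanted in the real code is a design question
for the series; the sketch only shows where such a check would sit.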

diff --git a/migration-colo.c b/migration-colo.c
index 03ac157..8596845 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -535,8 +535,9 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
 {
     int fd = qemu_get_fd(f);
     int dev_hotplug = qdev_hotplug;
-    QEMUFile *ctl = NULL;
+    QEMUFile *ctl = NULL, *fb = NULL;
     int ret;
+    uint64_t total_size;
 
     if (!restore_use_colo()) {
         return;
@@ -560,7 +561,8 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
         goto out;
     }
 
-    /* TODO: in COLO mode, slave is runing, so start the vm */
+    /* in COLO mode, slave is runing, so start the vm */
+    vm_start();
 
     while (true) {
         if (slave_wait_new_checkpoint(f)) {
@@ -569,43 +571,68 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
 
         /* start colo checkpoint */
 
-        /* TODO: suspend guest */
+        /* suspend guest */
+        vm_stop_force_state(RUN_STATE_COLO);
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
         if (ret) {
             goto out;
         }
 
-        /* TODO: open colo buffer for read */
+        /* open colo buffer for read */
+        fb = qemu_fopen_ops(&colo_buffer, &colo_read_ops);
+        if (!fb) {
+            error_report("can't open colo buffer\n");
+            goto out;
+        }
 
         ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
         if (ret) {
             goto out;
         }
 
-        /* TODO: read migration data into colo buffer */
+        /* read migration data into colo buffer */
+
+        /* read the vmstate total size first */
+        ret = colo_ctl_get_value(f, &total_size);
+        if (ret) {
+            goto out;
+        }
+        colo_buffer_extend(total_size);
+        qemu_get_buffer(f, colo_buffer.data, total_size);
+        colo_buffer.used = total_size;
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
         if (ret) {
             goto out;
         }
 
-        /* TODO: load vm state */
+        /* load vm state */
+        if (qemu_loadvm_state(fb) < 0) {
+            error_report("COLO: loadvm failed\n");
+            goto out;
+        }
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
         if (ret) {
             goto out;
         }
 
-        /* TODO: resume guest */
+        /* resume guest */
+        vm_start();
 
-        /* TODO: close colo buffer */
+        qemu_fclose(fb);
+        fb = NULL;
     }
 
 out:
     colo_buffer_destroy();
     colo = NULL;
 
+    if (fb) {
+        qemu_fclose(fb);
+    }
+
     if (ctl) {
         qemu_fclose(ctl);
     }
-- 
1.9.1


* [RFC PATCH 15/17] COLO save: reuse migration bitmap under colo checkpoint
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Reuse the migration bitmap under COLO checkpoint so that only the
pages dirtied since the previous checkpoint are sent at each
checkpoint.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 arch_init.c                        | 20 +++++++++++++++++++-
 include/migration/migration-colo.h |  2 ++
 migration-colo.c                   |  6 ++----
 stubs/migration-colo.c             | 10 ++++++++++
 4 files changed, 33 insertions(+), 5 deletions(-)
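
Note for reviewers (not part of the commit): the idea is that the
dirty bitmap set up by the initial migration is kept alive, so each
checkpoint only walks and clears the bits set since the last one. A
minimal standalone sketch with a single 64-bit word standing in for
the bitmap follows. DEMO_PAGES and the demo_* helpers are hypothetical
and do not correspond to functions in arch_init.c.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGES 64   /* one bit per page in a single 64-bit word */

/* Initial migration: every page is considered dirty. */
static void demo_bitmap_fill(uint64_t *bm)
{
    *bm = ~(uint64_t)0;
}

/* Record a guest write to 'page'; out-of-range pages are ignored. */
static void demo_mark_dirty(uint64_t *bm, unsigned page)
{
    if (page < DEMO_PAGES) {
        *bm |= (uint64_t)1 << page;
    }
}

/*
 * One checkpoint: collect every dirty page number into 'sent' (which
 * must have room for DEMO_PAGES entries), clear its bit, and return
 * how many pages were collected. The bitmap itself is not freed, so
 * the next checkpoint can reuse it.
 */
static unsigned demo_send_dirty(uint64_t *bm, unsigned *sent)
{
    unsigned page;
    unsigned n = 0;

    for (page = 0; page < DEMO_PAGES; page++) {
        uint64_t mask = (uint64_t)1 << page;

        if (*bm & mask) {
            sent[n++] = page;
            *bm &= ~mask;
        }
    }
    return n;
}
```

The first call after demo_bitmap_fill() sends all pages; later calls
send only what was marked dirty in between.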

diff --git a/arch_init.c b/arch_init.c
index 8ddaf35..c84e6c8 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -52,6 +52,7 @@
 #include "exec/ram_addr.h"
 #include "hw/acpi/acpi.h"
 #include "qemu/host-utils.h"
+#include "migration/migration-colo.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -769,6 +770,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     RAMBlock *block;
     int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
 
+    /*
+     * migration has already setup the bitmap, reuse it.
+     */
+    if (is_master()) {
+        qemu_mutex_lock_ramlist();
+        reset_ram_globals();
+        goto out_setup;
+    }
+
     mig_throttle_on = false;
     dirty_rate_high_cnt = 0;
     bitmap_sync_count = 0;
@@ -828,6 +838,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     migration_bitmap_sync();
     qemu_mutex_unlock_iothread();
 
+out_setup:
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
@@ -937,7 +948,14 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     }
 
     ram_control_after_iterate(f, RAM_CONTROL_FINISH);
-    migration_end();
+
+    /*
+     * Since we need to reuse dirty bitmap in colo,
+     * don't cleanup the bitmap.
+     */
+    if (!migrate_use_colo() || migration_has_failed(migrate_get_current())) {
+        migration_end();
+    }
 
     qemu_mutex_unlock_ramlist();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index 861fa27..c286a60 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -21,10 +21,12 @@ bool colo_supported(void);
 /* save */
 bool migrate_use_colo(void);
 void colo_init_checkpointer(MigrationState *s);
+bool is_master(void);
 
 /* restore */
 bool restore_use_colo(void);
 void restore_exit_colo(void);
+bool is_slave(void);
 
 void colo_process_incoming_checkpoints(QEMUFile *f);
 
diff --git a/migration-colo.c b/migration-colo.c
index 8596845..13a6a57 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -222,8 +222,6 @@ static const QEMUFileOps colo_read_ops = {
 };
 
 /* colo checkpoint control helper */
-static bool is_master(void);
-static bool is_slave(void);
 
 static void ctl_error_handler(void *opaque, int err)
 {
@@ -295,7 +293,7 @@ static int colo_ctl_get(QEMUFile *f, uint64_t require)
 
 /* save */
 
-static bool is_master(void)
+bool is_master(void)
 {
     MigrationState *s = migrate_get_current();
     return (s->state == MIG_STATE_COLO);
@@ -499,7 +497,7 @@ void colo_init_checkpointer(MigrationState *s)
 
 static Coroutine *colo;
 
-static bool is_slave(void)
+bool is_slave(void)
 {
     return colo != NULL;
 }
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 55f0d37..ef65be6 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -22,3 +22,13 @@ void colo_init_checkpointer(MigrationState *s)
 void colo_process_incoming_checkpoints(QEMUFile *f)
 {
 }
+
+bool is_master(void)
+{
+    return false;
+}
+
+bool is_slave(void)
+{
+    return false;
+}
-- 
1.9.1


* [RFC PATCH 16/17] COLO ram cache: implement colo ram cache on slave
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

The ram cache is initially identical to the PVM's memory. At each
checkpoint, we cache the PVM's dirty memory in the ram cache (so
that the ram cache always matches the PVM's memory at every
checkpoint), then flush the cached memory to the SVM once all of
the PVM's dirty memory has been received. Only memory that was
dirtied on the PVM or on the SVM since the last checkpoint needs
to be flushed.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 arch_init.c                        | 154 ++++++++++++++++++++++++++++++++++++-
 include/exec/cpu-all.h             |   1 +
 include/migration/migration-colo.h |   3 +
 migration-colo.c                   |   4 +
 4 files changed, 159 insertions(+), 3 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index c84e6c8..009bcb5 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -1013,6 +1013,7 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
     return 0;
 }
 
+static void *memory_region_get_ram_cache_ptr(MemoryRegion *mr, RAMBlock *block);
 static inline void *host_from_stream_offset(QEMUFile *f,
                                             ram_addr_t offset,
                                             int flags)
@@ -1027,7 +1028,12 @@ static inline void *host_from_stream_offset(QEMUFile *f,
             return NULL;
         }
 
-        return memory_region_get_ram_ptr(block->mr) + offset;
+        if (is_slave()) {
+            migration_bitmap_set_dirty(block->mr->ram_addr + offset);
+            return memory_region_get_ram_cache_ptr(block->mr, block) + offset;
+        } else {
+            return memory_region_get_ram_ptr(block->mr) + offset;
+        }
     }
 
     len = qemu_get_byte(f);
@@ -1035,8 +1041,15 @@ static inline void *host_from_stream_offset(QEMUFile *f,
     id[len] = 0;
 
     QTAILQ_FOREACH(block, &ram_list.blocks, next) {
-        if (!strncmp(id, block->idstr, sizeof(id)))
-            return memory_region_get_ram_ptr(block->mr) + offset;
+        if (!strncmp(id, block->idstr, sizeof(id))) {
+            if (is_slave()) {
+                migration_bitmap_set_dirty(block->mr->ram_addr + offset);
+                return memory_region_get_ram_cache_ptr(block->mr, block)
+                       + offset;
+            } else {
+                return memory_region_get_ram_ptr(block->mr) + offset;
+            }
+        }
     }
 
     error_report("Can't find block %s!", id);
@@ -1054,11 +1067,13 @@ void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
     }
 }
 
+static void ram_flush_cache(void);
 static int ram_load(QEMUFile *f, void *opaque, int version_id)
 {
     ram_addr_t addr;
     int flags, ret = 0;
     static uint64_t seq_iter;
+    bool need_flush = false;
 
     seq_iter++;
 
@@ -1121,6 +1136,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 break;
             }
 
+            need_flush = true;
             ch = qemu_get_byte(f);
             ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
         } else if (flags & RAM_SAVE_FLAG_PAGE) {
@@ -1133,6 +1149,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 break;
             }
 
+            need_flush = true;
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
         } else if (flags & RAM_SAVE_FLAG_XBZRLE) {
             void *host = host_from_stream_offset(f, addr, flags);
@@ -1148,6 +1165,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+            need_flush = true;
         } else if (flags & RAM_SAVE_FLAG_HOOK) {
             ram_control_load_hook(f, flags);
         } else if (flags & RAM_SAVE_FLAG_EOS) {
@@ -1161,11 +1179,141 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
         ret = qemu_file_get_error(f);
     }
 
+    if (!ret && is_slave() && need_flush) {
+        ram_flush_cache();
+    }
+
     DPRINTF("Completed load of VM with exit code %d seq iteration "
             "%" PRIu64 "\n", ret, seq_iter);
     return ret;
 }
 
+/*
+ * colo cache: this is for secondary VM, we cache the whole
+ * memory of the secondary VM.
+ */
+void create_and_init_ram_cache(void)
+{
+    /*
+     * called after first migration
+     */
+    RAMBlock *block;
+    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
+
+    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+        block->host_cache = g_malloc(block->length);
+        memcpy(block->host_cache, block->host, block->length);
+    }
+
+    migration_bitmap = bitmap_new(ram_cache_pages);
+    migration_dirty_pages = 0;
+    memory_global_dirty_log_start();
+}
+
+void release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    if (migration_bitmap) {
+        memory_global_dirty_log_stop();
+        g_free(migration_bitmap);
+        migration_bitmap = NULL;
+    }
+
+    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+        g_free(block->host_cache);
+    }
+}
+
+static void *memory_region_get_ram_cache_ptr(MemoryRegion *mr, RAMBlock *block)
+{
+    if (mr->alias) {
+        return memory_region_get_ram_cache_ptr(mr->alias, block) +
+               mr->alias_offset;
+    }
+
+    assert(mr->terminates);
+
+    ram_addr_t addr = mr->ram_addr & TARGET_PAGE_MASK;
+
+    assert(addr - block->offset < block->length);
+
+    return block->host_cache + (addr - block->offset);
+}
+
+static inline
+ram_addr_t host_bitmap_find_and_reset_dirty(MemoryRegion *mr,
+                                                 ram_addr_t start)
+{
+    unsigned long base = mr->ram_addr >> TARGET_PAGE_BITS;
+    unsigned long nr = base + (start >> TARGET_PAGE_BITS);
+    unsigned long size = base + (int128_get64(mr->size) >> TARGET_PAGE_BITS);
+
+    unsigned long next;
+
+    next = find_next_bit(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION],
+                         size, nr);
+    if (next < size) {
+        clear_bit(next, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+    }
+    return (next - base) << TARGET_PAGE_BITS;
+}
+
+static void ram_flush_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    ram_addr_t ca  = 0, ha = 0;
+    bool got_ca = 0, got_ha = 0;
+    int64_t host_dirty = 0, both_dirty = 0;
+
+    address_space_sync_dirty_bitmap(&address_space_memory);
+
+    block = QTAILQ_FIRST(&ram_list.blocks);
+    while (true) {
+        if (ca < block->length && ca <= ha) {
+            ca = migration_bitmap_find_and_reset_dirty(block->mr, ca);
+            if (ca < block->length) {
+                got_ca = 1;
+            }
+        }
+        if (ha < block->length && ha <= ca) {
+            ha = host_bitmap_find_and_reset_dirty(block->mr, ha);
+            if (ha < block->length && ha != ca) {
+                got_ha = 1;
+            }
+            host_dirty += (ha < block->length ? 1 : 0);
+            both_dirty += (ha < block->length && ha == ca ? 1 : 0);
+        }
+        if (ca >= block->length && ha >= block->length) {
+            ca = 0;
+            ha = 0;
+            block = QTAILQ_NEXT(block, next);
+            if (!block) {
+                break;
+            }
+        } else {
+            if (got_ha) {
+                got_ha = 0;
+                dst_host = memory_region_get_ram_ptr(block->mr) + ha;
+                src_host = memory_region_get_ram_cache_ptr(block->mr, block)
+                           + ha;
+                memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+            }
+            if (got_ca) {
+                got_ca = 0;
+                dst_host = memory_region_get_ram_ptr(block->mr) + ca;
+                src_host = memory_region_get_ram_cache_ptr(block->mr, block)
+                           + ca;
+                memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+            }
+        }
+    }
+
+    assert(migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index f91581f..029c984 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -300,6 +300,7 @@ CPUArchState *cpu_copy(CPUArchState *env);
 typedef struct RAMBlock {
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *host_cache;
     ram_addr_t offset;
     ram_addr_t length;
     uint32_t flags;
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index c286a60..52187dd 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -29,5 +29,8 @@ void restore_exit_colo(void);
 bool is_slave(void);
 
 void colo_process_incoming_checkpoints(QEMUFile *f);
+/* ram cache */
+void create_and_init_ram_cache(void);
+void release_ram_cache(void);
 
 #endif
diff --git a/migration-colo.c b/migration-colo.c
index 13a6a57..52156e7 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -554,6 +554,8 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
 
     colo_buffer_init();
 
+    create_and_init_ram_cache();
+
     ret = colo_ctl_put(ctl, COLO_READY);
     if (ret) {
         goto out;
@@ -631,6 +633,8 @@ out:
         qemu_fclose(fb);
     }
 
+    release_ram_cache();
+
     if (ctl) {
         qemu_fclose(ctl);
     }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [RFC PATCH 17/17] HACK: trigger checkpoint every 500ms
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:25   ` Yang Hongyang
  -1 siblings, 0 replies; 80+ messages in thread
From: Yang Hongyang @ 2014-07-23 14:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency, Yang Hongyang

Because the COLO Agent is still under development, we add this hack
for testing purposes: trigger a checkpoint every 500ms so that we can
test the COLO save/restore process.
NOTE:
  This is only a hack and will be removed eventually.
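
For illustration only, the timing rule can be sketched as below. The
helper name checkpoint_due() and the idea that the caller passes in
millisecond timestamps are assumptions made for this sketch; only the
CHKPOINT_TIMER value of 500 comes from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CHKPOINT_TIMER 500 /* milliseconds, the value set by this patch */

/*
 * Hypothetical helper: returns true once at least CHKPOINT_TIMER
 * milliseconds have passed since the last checkpoint started.
 */
static bool checkpoint_due(int64_t now_ms, int64_t last_checkpoint_ms)
{
    return now_ms - last_checkpoint_ms >= CHKPOINT_TIMER;
}
```

With the compare module stubbed out, a rule like this is the only thing
that decides when a checkpoint happens.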

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration-colo.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/migration-colo.c b/migration-colo.c
index 52156e7..4be037e 100644
--- a/migration-colo.c
+++ b/migration-colo.c
@@ -23,7 +23,7 @@
  * this is large because COLO checkpoint will mostly depend on
  * COLO compare module.
  */
-#define CHKPOINT_TIMER 10000
+#define CHKPOINT_TIMER 500
 
 enum {
     COLO_READY = 0x46,
@@ -79,11 +79,6 @@ static int comp_fd = -1;
 
 static int colo_compare_init(void)
 {
-    comp_fd = open(COMPARE_DEV, O_RDONLY);
-    if (comp_fd < 0) {
-        return -1;
-    }
-
     return 0;
 }
 
@@ -104,17 +99,18 @@ static void colo_compare_destroy(void)
  */
 static int colo_compare(void)
 {
-    return ioctl(comp_fd, COMP_IOCTWAIT, 250);
+    errno = ERESTART;
+    return 1;
 }
 
 static int colo_compare_flush(void)
 {
-    return ioctl(comp_fd, COMP_IOCTFLUSH, 1);
+    return 0;
 }
 
 static int colo_compare_resume(void)
 {
-    return ioctl(comp_fd, COMP_IOCTRESUME, 1);
+    return 0;
 }
 
 /* colo buffer */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 03/17] COLO migration: add a migration capability 'colo'
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 14:41     ` Eric Blake
  -1 siblings, 0 replies; 80+ messages in thread
From: Eric Blake @ 2014-07-23 14:41 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines

[-- Attachment #1: Type: text/plain, Size: 1658 bytes --]

On 07/23/2014 08:25 AM, Yang Hongyang wrote:
> Add a migration capability 'colo'. If this capability is on,
> The migration will never end, and the VM will be continuously
> checkpointed.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  include/qapi/qmp/qerror.h | 3 +++
>  migration.c               | 6 ++++++
>  qapi-schema.json          | 5 ++++-
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/include/qapi/qmp/qerror.h b/include/qapi/qmp/qerror.h
> index 902d1a7..226b805 100644
> --- a/include/qapi/qmp/qerror.h
> +++ b/include/qapi/qmp/qerror.h
> @@ -166,4 +166,7 @@ void qerror_report_err(Error *err);
>  #define QERR_SOCKET_CREATE_FAILED \
>      ERROR_CLASS_GENERIC_ERROR, "Failed to create socket"
>  
> +#define QERR_COLO_UNSUPPORTED \
> +    ERROR_CLASS_GENERIC_ERROR, "COLO is not currently supported, please rerun configure with --enable-colo option in order to support COLO feature"

Unless you plan on using this message in more than one place, we prefer
that you don't add new #defines here.  Instead, just use error_setg with
the message inline.
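
A rough sketch of the inline form, so it is clear what is being asked
for. To keep it compilable outside the QEMU tree, the Error type and
error_setg() below are simplified stand-ins written for this example,
not QEMU's real definitions; colo_supported() is hard-coded to false as
an assumed stub, and enable_colo_capability() is a hypothetical caller,
not a function from this series.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for QEMU's Error object (assumption). */
typedef struct Error {
    char msg[256];
} Error;

/* Simplified stand-in for QEMU's error_setg() (assumption). */
static void error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return; /* caller is not interested in the error */
    }
    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

/* Assumed stub behaviour for a build without COLO. */
static bool colo_supported(void)
{
    return false;
}

/* Hypothetical caller: the message lives at its single point of use. */
static bool enable_colo_capability(Error **errp)
{
    if (!colo_supported()) {
        error_setg(errp, "COLO is not currently supported, please rerun "
                   "configure with --enable-colo option in order to "
                   "support COLO feature");
        return false;
    }
    return true;
}
```

No new QERR_ macro is needed this way, and the message text stays next
to the check that produces it.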


> +++ b/qapi-schema.json
> @@ -491,10 +491,13 @@
>  # @auto-converge: If enabled, QEMU will automatically throttle down the guest
>  #          to speed up convergence of RAM migration. (since 1.6)
>  #
> +# @colo: The migration will never end, and the VM will instead be continuously
> +#        checkpointed. The feature is disabled by default. (since 2.1)

You missed 2.1.  This has to be since 2.2.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 15:44   ` Eric Blake
  -1 siblings, 0 replies; 80+ messages in thread
From: Eric Blake @ 2014-07-23 15:44 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines

[-- Attachment #1: Type: text/plain, Size: 1863 bytes --]

On 07/23/2014 08:25 AM, Yang Hongyang wrote:
> Virtual machine (VM) replication is a well known technique for
> providing application-agnostic software-implemented hardware fault
> tolerance "non-stop service". COLO is a high availability solution.
> Both primary VM (PVM) and secondary VM (SVM) run in parallel. They
> receive the same request from client, and generate response in parallel
> too. If the response packets from PVM and SVM are identical, they are
> released immediately. Otherwise, a VM checkpoint (on demand) is
> conducted. The idea is presented in Xen summit 2012, and 2013,
> and academia paper in SOCC 2013. It's also presented in KVM forum
> 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to above document for detailed information. 
> Please also refer to previous posted RFC proposal:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
> 
> The patchset is also hosted on github:
> https://github.com/macrosheep/qemu/tree/colo_v0.1
> 
> This patchset is RFC, implements the frame of colo, without
> failover and nic/disk replication. But it is ready for demo
> the COLO idea above QEMU-Kvm.
> Steps using this patchset to get an overview of COLO:
> 1. configure the source with --enable-colo option

Code that has to be opt-in tends to bitrot, because people don't
configure their build-bots to opt in.  What sort of penalties does
opting in cause to the code if colo is not used?  I'd much rather make
the default to compile colo unless configured --disable-colo.  Are there
any pre-req libraries required for it to work?  That would be the only
reason to make the default of on or off conditional, rather than
defaulting to on.


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 02/17] COLO: introduce an api colo_supported() to indicate COLO support
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 15:47     ` Eric Blake
  -1 siblings, 0 replies; 80+ messages in thread
From: Eric Blake @ 2014-07-23 15:47 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines

[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]

On 07/23/2014 08:25 AM, Yang Hongyang wrote:
> introduce an api colo_supported() to indicate COLO support, returns
> true if colo supported(configured with --enable-colo).

Space before () in English sentences:
 s/supported(configured/supported (configured/

As I mentioned in the cover letter, defaulting to off is probably a bad
idea; I'd rather default to on or even make it unconditional if it
doesn't negatively affect the code base when not used.

> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  Makefile.objs                      |  1 +
>  include/migration/migration-colo.h | 18 ++++++++++++++++++
>  migration-colo.c                   | 16 ++++++++++++++++
>  stubs/Makefile.objs                |  1 +
>  stubs/migration-colo.c             | 16 ++++++++++++++++
>  5 files changed, 52 insertions(+)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 migration-colo.c
>  create mode 100644 stubs/migration-colo.c
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 12/17] COLO ctl: add a RunState RUN_STATE_COLO
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 15:48     ` Eric Blake
  -1 siblings, 0 replies; 80+ messages in thread
From: Eric Blake @ 2014-07-23 15:48 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines

On 07/23/2014 08:25 AM, Yang Hongyang wrote:
> Guest will enter this state when paused to save/resore VM state

s/resore/restore/

> under colo checkpoint.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  qapi-schema.json | 4 +++-
>  vl.c             | 8 ++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 807f5a2..b42171c 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -145,12 +145,14 @@
>  # @watchdog: the watchdog action is configured to pause and has been triggered
>  #
>  # @guest-panicked: guest has been panicked as a result of guest OS panic
> +#
> +# @colo: guest is paused to save/restore VM state under colo checkpoint

Missing a '(since 2.2)' designation.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 07/17] COLO buffer: implement colo buffer as well as QEMUFileOps based on it
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-07-23 18:24     ` Eric Blake
  -1 siblings, 0 replies; 80+ messages in thread
From: Eric Blake @ 2014-07-23 18:24 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel; +Cc: kvm, GuiJianfeng, eddie.dong, dgilbert, mrhines

On 07/23/2014 08:25 AM, Yang Hongyang wrote:
> We need a buffer to store migration data.
> 
> On save side:
>   all saved data was write into colo buffer first, so that we can know

s/was write/is written/

> the total size of the migration data. this can also separate the data
> transmission from colo control data, we use colo control data over
> socket fd to synchronous both side's stat.
> 
> On restore side:
>   all migration data was read into colo buffer first, then load data
> from the buffer: If network error happens while data transmission,

s/while/during/

> the slaver can still functinal because the migration data are not yet

s/slaver/slave/
s/functinal/function/
s/are/is/

> loaded.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  migration-colo.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 112 insertions(+)
> 

> +/* colo buffer */
> +
> +#define COLO_BUFFER_BASE_SIZE (1000*1000*4ULL)
> +#define COLO_BUFFER_MAX_SIZE (1000*1000*1000*10ULL)

Spaces around binary operators.

> +
> +typedef struct colo_buffer {

For consistency with the rest of the code base, name this ColoBuffer,
not colo_buffer.

> +    uint8_t *data;
> +    uint64_t used;
> +    uint64_t freed;
> +    uint64_t size;
> +} colo_buffer_t;

HACKING says to NOT name types with a trailing _t.  Just name the
typedef ColoBuffer.
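
(Illustration only, not part of the patch: applying the two style comments
above, spaces around binary operators and a CamelCase typedef without a
trailing _t, might look roughly like this. The values and fields are copied
from the quoted hunk; nothing else is assumed about the rest of the file.)

```c
#include <stdint.h>

/* Same values as in the quoted patch, with spaces around the operators. */
#define COLO_BUFFER_BASE_SIZE (1000 * 1000 * 4ULL)
#define COLO_BUFFER_MAX_SIZE (1000 * 1000 * 1000 * 10ULL)

/* CamelCase struct tag and typedef, no trailing _t. */
typedef struct ColoBuffer {
    uint8_t *data;
    uint64_t used;
    uint64_t freed;
    uint64_t size;
} ColoBuffer;
```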


> +static void colo_buffer_destroy(void)
> +{
> +    if (colo_buffer.data) {
> +        g_free(colo_buffer.data);
> +        colo_buffer.data = NULL;

g_free(NULL) is a safe no-op, so just make these two lines unconditional.
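
(A minimal sketch of the unconditional form. It uses the standard free() as a
stand-in for g_free() so that it builds without GLib; free(NULL) is likewise
defined to do nothing. The reduced ColoBuffer type and the global below are
placeholders for illustration, not the patch's definitions.)

```c
#include <stdint.h>
#include <stdlib.h>

/* Reduced placeholder type; the real struct has more fields. */
typedef struct ColoBuffer {
    uint8_t *data;
    uint64_t size;
} ColoBuffer;

static ColoBuffer colo_buffer;

static void colo_buffer_destroy(void)
{
    /* No NULL check needed: freeing a NULL pointer does nothing. */
    free(colo_buffer.data);
    colo_buffer.data = NULL;
}
```

Calling the function twice in a row is then harmless, because the second call
only ever sees a NULL pointer.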


> +static void colo_buffer_extend(uint64_t len)
> +{
> +    if (len > colo_buffer.size - colo_buffer.used) {
> +        len = len + colo_buffer.used - colo_buffer.size;
> +        len = ROUND_UP(len, COLO_BUFFER_BASE_SIZE) + COLO_BUFFER_BASE_SIZE;
> +
> +        colo_buffer.size += len;
> +        if (colo_buffer.size > COLO_BUFFER_MAX_SIZE) {
> +            error_report("colo_buffer overflow!\n");

No trailing \n in error_report().

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2014-07-23 15:44   ` Eric Blake
@ 2014-07-24  2:24     ` Hongyang Yang
  -1 siblings, 0 replies; 80+ messages in thread
From: Hongyang Yang @ 2014-07-24  2:24 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: GuiJianfeng, mrhines, eddie.dong, dgilbert, kvm

On 07/23/2014 11:44 PM, Eric Blake wrote:
> On 07/23/2014 08:25 AM, Yang Hongyang wrote:
>> Virtual machine (VM) replication is a well known technique for
>> providing application-agnostic software-implemented hardware fault
>> tolerance "non-stop service". COLO is a high availability solution.
>> Both primary VM (PVM) and secondary VM (SVM) run in parallel. They
>> receive the same request from client, and generate response in parallel
>> too. If the response packets from PVM and SVM are identical, they are
>> released immediately. Otherwise, a VM checkpoint (on demand) is
>> conducted. The idea is presented in Xen summit 2012, and 2013,
>> and academia paper in SOCC 2013. It's also presented in KVM forum
>> 2013:
>> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
>> Please refer to above document for detailed information.
>> Please also refer to previous posted RFC proposal:
>> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
>>
>> The patchset is also hosted on github:
>> https://github.com/macrosheep/qemu/tree/colo_v0.1
>>
>> This patchset is RFC, implements the frame of colo, without
>> failover and nic/disk replication. But it is ready for demo
>> the COLO idea above QEMU-Kvm.
>> Steps using this patchset to get an overview of COLO:
>> 1. configure the source with --enable-colo option
>
> Code that has to be opt-in tends to bitrot, because people don't
> configure their build-bots to opt in.  What sort of penalties does
> opting in cause to the code if colo is not used?  I'd much rather make
> the default to compile colo unless configured --disable-colo.  Are there
> any pre-req libraries required for it to work?  That would be the only
> reason to make the default of on or off conditional, rather than
> defaulting to on.

Thanks for all your comments on this patchset; I will address them.
As for this one: compiling COLO in does not affect the rest of the code when
it is not used, and it currently has no prerequisite libraries, so we can
make COLO support default to on in the next version.

>
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 04/17] COLO info: use colo info to tell migration target colo is enabled
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 14:43     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 14:43 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> migrate colo info to migration target to tell the target colo is
> enabled.

If I understand this correctly, this means that you send the 'colo info' device
section even for migrations that don't have COLO enabled. That's bad because
it breaks migration to any destination that lacks it. I guess it would be OK if
you guarded it so that it isn't sent for old machine types.

You could use the QEMU_VM_COMMAND sections I've created for postcopy
( http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00889.html ):
add a QEMU_VM_CMD_COLO to indicate that you want the destination to become an SVM,
then check the capability near the start of migration and send the command.

Or perhaps there's a way to add the colo-info device on the command line so it's
not always there.
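
(A toy sketch of the compatibility concern, for illustration only. The section
ids, the byte-per-section stream, and the toy_* functions are all hypothetical
and are not QEMU's wire format or API. It only shows why emitting the COLO
section solely when the capability is enabled keeps a plain migration readable
by a destination that knows nothing about COLO.)

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section ids for this toy stream; not QEMU's wire format. */
enum { SEC_END = 0, SEC_RAM = 1, SEC_COLO_INFO = 2 };

/* Sender: emit the COLO section only when the capability is enabled. */
static size_t toy_save(uint8_t *buf, bool colo_enabled)
{
    size_t n = 0;

    buf[n++] = SEC_RAM;
    if (colo_enabled) {
        buf[n++] = SEC_COLO_INFO;
    }
    buf[n++] = SEC_END;
    return n;
}

/* Receiver built without COLO: any unknown section id is a hard error. */
static int toy_load_without_colo(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == SEC_END) {
            return 0;
        }
        if (buf[i] != SEC_RAM) {
            return -1;  /* unknown section: the migration fails */
        }
    }
    return -1;  /* stream ended without an end marker */
}
```

With the capability off, the old receiver loads the stream; with it on, the
same receiver rejects it, which is the expected outcome when COLO was
explicitly requested against a destination that cannot do it.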

Dave

> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  Makefile.objs                      |  1 +
>  include/migration/migration-colo.h |  3 ++
>  migration-colo-comm.c              | 68 ++++++++++++++++++++++++++++++++++++++
>  vl.c                               |  4 +++
>  4 files changed, 76 insertions(+)
>  create mode 100644 migration-colo-comm.c
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index cab5824..1836a68 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -50,6 +50,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
>  common-obj-$(CONFIG_LINUX) += fsdev/
>  
>  common-obj-y += migration.o migration-tcp.o
> +common-obj-y += migration-colo-comm.o
>  common-obj-$(CONFIG_COLO) += migration-colo.o
>  common-obj-y += vmstate.o
>  common-obj-y += qemu-file.o
> diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
> index 35b384c..e3735d8 100644
> --- a/include/migration/migration-colo.h
> +++ b/include/migration/migration-colo.h
> @@ -12,6 +12,9 @@
>  #define QEMU_MIGRATION_COLO_H
>  
>  #include "qemu-common.h"
> +#include "migration/migration.h"
> +
> +void colo_info_mig_init(void);
>  
>  bool colo_supported(void);
>  
> diff --git a/migration-colo-comm.c b/migration-colo-comm.c
> new file mode 100644
> index 0000000..ccbc246
> --- /dev/null
> +++ b/migration-colo-comm.c
> @@ -0,0 +1,68 @@
> +/*
> + *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + *  (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + *  Copyright (C) 2014 FUJITSU LIMITED
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include <migration/migration-colo.h>
> +
> +#define DEBUG_COLO
> +
> +#ifdef DEBUG_COLO
> +#define DPRINTF(fmt, ...) \
> +    do { fprintf(stdout, "COLO: " fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +    do { } while (0)
> +#endif
> +
> +static bool colo_requested;
> +
> +/* save */
> +
> +static bool migrate_use_colo(void)
> +{
> +    MigrationState *s = migrate_get_current();
> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
> +}
> +
> +static void colo_info_save(QEMUFile *f, void *opaque)
> +{
> +    qemu_put_byte(f, migrate_use_colo());
> +}
> +
> +/* restore */
> +
> +static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
> +{
> +    int value = qemu_get_byte(f);
> +
> +    if (value && !colo_supported()) {
> +        fprintf(stderr, "COLO is not supported\n");
> +        return -EINVAL;
> +    }
> +
> +    if (value && !colo_requested) {
> +        DPRINTF("COLO requested!\n");
> +    }
> +
> +    colo_requested = value;
> +
> +    return 0;
> +}
> +
> +static SaveVMHandlers savevm_colo_info_handlers = {
> +    .save_state = colo_info_save,
> +    .load_state = colo_info_load,
> +};
> +
> +void colo_info_mig_init(void)
> +{
> +    register_savevm_live(NULL, "colo info", -1, 1,
> +                         &savevm_colo_info_handlers, NULL);
> +}
> diff --git a/vl.c b/vl.c
> index fe451aa..1a282d8 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -89,6 +89,7 @@ int main(int argc, char **argv)
>  #include "sysemu/dma.h"
>  #include "audio/audio.h"
>  #include "migration/migration.h"
> +#include "migration/migration-colo.h"
>  #include "sysemu/kvm.h"
>  #include "qapi/qmp/qjson.h"
>  #include "qemu/option.h"
> @@ -4339,6 +4340,9 @@ int main(int argc, char **argv, char **envp)
>  
>      blk_mig_init();
>      ram_mig_init();
> +    if (colo_supported()) {
> +        colo_info_mig_init();
> +    }
>  
>      /* open the virtual block devices */
>      if (snapshot)
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 05/17] COLO save: integrate COLO checkpointed save into qemu migration
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 14:46     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 14:46 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, dgilbert, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>   Integrate COLO checkpointed save flow into qemu migration.
>   Add a migrate state: MIG_STATE_COLO, enter this migrate state
> after the first live migration successfully finished.
>   Create a colo thread to do the checkpointed save.

In postcopy I added a 'migration_already_active' function
to merge all the different places that check for ACTIVE/SETUP etc.
( http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00850.html )

> +    /*TODO: COLO checkpointed save loop*/
> +
> +    if (s->state != MIG_STATE_ERROR) {
> +        migrate_set_state(s, MIG_STATE_COLO, MIG_STATE_COMPLETED);
> +    }

I thought migrate_set_state only changed the state if the old state
matched the first argument, i.e. I think it'll only change to COMPLETED
if the state is COLO, so I don't think you need the if.
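
(A sketch of the compare-and-swap behaviour described above, under the
assumption that the state update works this way. It is written with C11
atomics; the ToyMigrationState type and toy_migrate_set_state() are
hypothetical stand-ins, not the QEMU function, and the enum values are only
named after the states in the quoted hunk.)

```c
#include <stdatomic.h>

/* Hypothetical state values, named after the ones in the quoted hunk. */
enum { MIG_STATE_ERROR, MIG_STATE_COLO, MIG_STATE_COMPLETED };

typedef struct ToyMigrationState {
    atomic_int state;
} ToyMigrationState;

/* Move to new_state only if the current state is still old_state. */
static void toy_migrate_set_state(ToyMigrationState *s, int old_state,
                                  int new_state)
{
    int expected = old_state;

    /* On mismatch the state is left untouched. */
    atomic_compare_exchange_strong(&s->state, &expected, new_state);
}
```

If the state has already become ERROR, the COLO to COMPLETED transition simply
does not happen, so an extra check before the call would be redundant.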

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 07/17] COLO buffer: implement colo buffer as well as QEMUFileOps based on it
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 14:52     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 14:52 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> We need a buffer to store migration data.
> 
> On save side:
>   all saved data was write into colo buffer first, so that we can know
> the total size of the migration data. this can also separate the data
> transmission from colo control data, we use colo control data over
> socket fd to synchronous both side's stat.
> 
> On restore side:
>   all migration data was read into colo buffer first, then load data
> from the buffer: If network error happens while data transmission,
> the slaver can still functinal because the migration data are not yet
> loaded.

This is very similar to the QEMUSizedBuffer-based QEMUFiles that Stefan Berger
wrote and that I use in both my postcopy and BER patchsets:

 http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00846.html

 (and to the similar code from Isaku Yamahata).

I think we should be able to use a shared version even if we need some changes.

> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  migration-colo.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 112 insertions(+)
> 
> diff --git a/migration-colo.c b/migration-colo.c
> index d566b9d..b90d9b6 100644
> --- a/migration-colo.c
> +++ b/migration-colo.c
> @@ -11,6 +11,7 @@
>  #include "qemu/main-loop.h"
>  #include "qemu/thread.h"
>  #include "block/coroutine.h"
> +#include "qemu/error-report.h"
>  #include "migration/migration-colo.h"
>  
>  static QEMUBH *colo_bh;
> @@ -20,14 +21,122 @@ bool colo_supported(void)
>      return true;
>  }
>  
> +/* colo buffer */
> +
> +#define COLO_BUFFER_BASE_SIZE (1000*1000*4ULL)
> +#define COLO_BUFFER_MAX_SIZE (1000*1000*1000*10ULL)

Powers of 2 are nicer!
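
(Illustration only: the sizes below, 4 MiB and 8 GiB, are hypothetical
power-of-two replacements, not values proposed in this thread. One practical
reason to prefer them is that a mask-based round-up, as sketched here with a
local helper, only gives correct results when the alignment is a power of
two.)

```c
#include <stdint.h>

/* Hypothetical power-of-two sizes: 4 MiB base, 8 GiB cap. */
#define COLO_BUFFER_BASE_SIZE (4ULL << 20)
#define COLO_BUFFER_MAX_SIZE (8ULL << 30)

/* Mask-based round-up; only valid when align is a power of two. */
static uint64_t round_up_pow2(uint64_t n, uint64_t align)
{
    return (n + align - 1) & ~(align - 1);
}
```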

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 10/17] COLO ctl: introduce is_slave() and is_master()
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 14:55     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 14:55 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> is_slaver is to determine whether the QEMU instance is a
> slaver(migration target) at runtime.
> is_master is to determine whether the QEMU instance is a
> master(migration starter) at runtime.
> This 2 APIs will be used later.

Since the names are made global in patch 15, I think it's best to
do it here, but also use a more specific name for them, like
colo_is_master.

Dave

> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  migration-colo.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/migration-colo.c b/migration-colo.c
> index 802f8b0..2699e77 100644
> --- a/migration-colo.c
> +++ b/migration-colo.c
> @@ -187,6 +187,12 @@ static const QEMUFileOps colo_read_ops = {
>  
>  /* save */
>  
> +static __attribute__((unused)) bool is_master(void)
> +{
> +    MigrationState *s = migrate_get_current();
> +    return (s->state == MIG_STATE_COLO);
> +}
> +
>  static void *colo_thread(void *opaque)
>  {
>      MigrationState *s = opaque;
> @@ -275,6 +281,11 @@ void colo_init_checkpointer(MigrationState *s)
>  
>  static Coroutine *colo;
>  
> +static __attribute__((unused)) bool is_slave(void)
> +{
> +    return colo != NULL;
> +}
> +
>  /*
>   * return:
>   * 0: start a checkpoint
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 15:03     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 15:03 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> implement colo checkpoint protocol.
> 
> Checkpoint synchronzing points.
> 
>                   Primary                 Secondary
>   NEW             @
>                                           Suspend
>   SUSPENDED                               @
>                   Suspend&Save state
>   SEND            @
>                   Send state              Receive state
>   RECEIVED                                @
>                   Flush network           Load state
>   LOADED                                  @
>                   Resume                  Resume
> 
>                   Start Comparing
> NOTE:
>  1) '@' who sends the message
>  2) Every sync-point is synchronized by two sides with only
>     one handshake(single direction) for low-latency.
>     If more strict synchronization is required, a opposite direction
>     sync-point should be added.
>  3) Since sync-points are single direction, the remote side may
>     go forward a lot when this side just receives the sync-point.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  migration-colo.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 262 insertions(+), 6 deletions(-)
> 
> diff --git a/migration-colo.c b/migration-colo.c
> index 2699e77..a708872 100644
> --- a/migration-colo.c
> +++ b/migration-colo.c
> @@ -24,6 +24,41 @@
>   */
>  #define CHKPOINT_TIMER 10000
>  
> +enum {
> +    COLO_READY = 0x46,
> +
> +    /*
> +     * Checkpoint synchronzing points.
> +     *
> +     *                  Primary                 Secondary
> +     *  NEW             @
> +     *                                          Suspend
> +     *  SUSPENDED                               @
> +     *                  Suspend&Save state
> +     *  SEND            @
> +     *                  Send state              Receive state
> +     *  RECEIVED                                @
> +     *                  Flush network           Load state
> +     *  LOADED                                  @
> +     *                  Resume                  Resume
> +     *
> +     *                  Start Comparing
> +     * NOTE:
> +     * 1) '@' who sends the message
> +     * 2) Every sync-point is synchronized by two sides with only
> +     *    one handshake(single direction) for low-latency.
> +     *    If more strict synchronization is required, a opposite direction
> +     *    sync-point should be added.
> +     * 3) Since sync-points are single direction, the remote side may
> +     *    go forward a lot when this side just receives the sync-point.
> +     */
> +    COLO_CHECKPOINT_NEW,
> +    COLO_CHECKPOINT_SUSPENDED,
> +    COLO_CHECKPOINT_SEND,
> +    COLO_CHECKPOINT_RECEIVED,
> +    COLO_CHECKPOINT_LOADED,
> +};
> +
>  static QEMUBH *colo_bh;
>  
>  bool colo_supported(void)
> @@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
>      .close = colo_close,
>  };
>  
> +/* colo checkpoint control helper */
> +static bool is_master(void);
> +static bool is_slave(void);
> +
> +static void ctl_error_handler(void *opaque, int err)
> +{
> +    if (is_slave()) {
> +        /* TODO: determine whether we need to failover */
> +        /* FIXME: we will not failover currently, just kill slave */
> +        error_report("error: colo transmission failed!\n");
> +        exit(1);
> +    } else if (is_master()) {
> +        /* Master still alive, do not failover */
> +        error_report("error: colo transmission failed!\n");
> +        return;
> +    } else {
> +        error_report("COLO: Unexpected error happend!\n");
> +        exit(EXIT_FAILURE);
> +    }
> +}
> +
> +static int colo_ctl_put(QEMUFile *f, uint64_t request)
> +{
> +    int ret = 0;
> +
> +    qemu_put_be64(f, request);
> +    qemu_fflush(f);
> +
> +    ret = qemu_file_get_error(f);
> +    if (ret < 0) {
> +        ctl_error_handler(f, ret);
> +        return 1;
> +    }
> +
> +    return ret;
> +}
> +
> +static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
> +{
> +    int ret = 0;
> +    uint64_t temp;
> +
> +    temp = qemu_get_be64(f);
> +
> +    ret = qemu_file_get_error(f);
> +    if (ret < 0) {
> +        ctl_error_handler(f, ret);
> +        return 1;
> +    }
> +
> +    *value = temp;
> +    return 0;
> +}
> +
> +static int colo_ctl_get(QEMUFile *f, uint64_t require)
> +{
> +    int ret;
> +    uint64_t value;
> +
> +    ret = colo_ctl_get_value(f, &value);
> +    if (ret) {
> +        return ret;
> +    }
> +
> +    if (value != require) {
> +        error_report("unexpected state received!\n");

I find it useful to print the expected/received state to
be able to figure out what went wrong.
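
Something along these lines, as a standalone sketch. The enum values are
copied from the patch; colo_ctl_name() and colo_format_mismatch() are
hypothetical helpers, and in QEMU the formatted text would go through
error_report() rather than into a caller-supplied buffer.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the enum in the quoted patch. */
enum {
    COLO_READY = 0x46,
    COLO_CHECKPOINT_NEW,
    COLO_CHECKPOINT_SUSPENDED,
    COLO_CHECKPOINT_SEND,
    COLO_CHECKPOINT_RECEIVED,
    COLO_CHECKPOINT_LOADED,
};

/* Hypothetical helper: map a control value to a printable name. */
static const char *colo_ctl_name(uint64_t v)
{
    switch (v) {
    case COLO_READY:                return "READY";
    case COLO_CHECKPOINT_NEW:       return "NEW";
    case COLO_CHECKPOINT_SUSPENDED: return "SUSPENDED";
    case COLO_CHECKPOINT_SEND:      return "SEND";
    case COLO_CHECKPOINT_RECEIVED:  return "RECEIVED";
    case COLO_CHECKPOINT_LOADED:    return "LOADED";
    default:                        return "UNKNOWN";
    }
}

/*
 * Hypothetical helper: describe a mismatch between the expected and the
 * received control value. Returns what snprintf() returns.
 */
static int colo_format_mismatch(char *buf, size_t len,
                                uint64_t expected, uint64_t received)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "unexpected state: expected %s (0x%" PRIx64
                    "), received %s (0x%" PRIx64 ")",
                    colo_ctl_name(expected), expected,
                    colo_ctl_name(received), received);
}
```
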

> +        exit(1);
> +    }
> +
> +    return ret;
> +}
> +
>  /* save */
>  
> -static __attribute__((unused)) bool is_master(void)
> +static bool is_master(void)
>  {
>      MigrationState *s = migrate_get_current();
>      return (s->state == MIG_STATE_COLO);
>  }
>  
> +static int do_colo_transaction(MigrationState *s, QEMUFile *control,
> +                               QEMUFile *trans)
> +{
> +    int ret;
> +
> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);

What happens at this point if the slave just doesn't respond?
(i.e. the socket doesn't drop - you just don't get the byte).
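
One way to bound the wait, sketched with plain POSIX poll() rather than
the QEMUFile API. read_with_timeout() and its return convention are
assumptions for illustration; how a timeout would plug into
colo_ctl_get() is not something the patch defines.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd, but give up after timeout_ms.
 * Returns bytes read (> 0), 0 on EOF, -1 on error, -2 on timeout.
 * Note: a retry after EINTR restarts the full timeout.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    assert(fd >= 0 && buf != NULL);

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0) {
        return -1;      /* poll() itself failed */
    }
    if (ret == 0) {
        return -2;      /* peer is connected but silent */
    }
    /* Data, EOF or a socket error is pending; read() reports which. */
    return read(fd, buf, len);
}
```

The -2 case is exactly the one asked about: the connection is still up,
but nothing arrives, so the caller gets a chance to decide what to do.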

> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* TODO: suspend and save vm state to colo buffer */
> +
> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* TODO: send vmstate to slave */
> +
> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* TODO: Flush network etc. */
> +
> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* TODO: resume master */
> +
> +out:
> +    return ret;
> +}
> +
>  static void *colo_thread(void *opaque)
>  {
>      MigrationState *s = opaque;
>      int dev_hotplug = qdev_hotplug, wait_cp = 0;
>      int64_t start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      int64_t current_time;
> +    QEMUFile *colo_control = NULL, *colo_trans = NULL;
> +    int ret;
>  
>      if (colo_compare_init() < 0) {
>          error_report("Init colo compare error\n");
>          goto out;
>      }
>  
> +    colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
> +    if (!colo_control) {
> +        error_report("open colo_control failed\n");
> +        goto out;
> +    }

In my postcopy world I'm trying to abstract this type of thing into a 'return path'
so that the QEMUFile can implement it however it wants and you don't
need to assume it's a socket.  But I'm still fighting some of those details.

Dave

> +
>      qdev_hotplug = 0;
>  
>      colo_buffer_init();
>  
> +    /*
> +     * Wait for slave finish loading vm states and enter COLO
> +     * restore.
> +     */
> +    ret = colo_ctl_get(colo_control, COLO_READY);
> +    if (ret) {
> +        goto out;
> +    }
> +
>      while (s->state == MIG_STATE_COLO) {
>          /* wait for a colo checkpoint */
>          wait_cp = colo_compare();
> @@ -230,13 +396,33 @@ static void *colo_thread(void *opaque)
>  
>          /* start a colo checkpoint */
>  
> -        /*TODO: COLO save */
> +        /* open colo buffer for write */
> +        colo_trans = qemu_fopen_ops(&colo_buffer, &colo_write_ops);
> +        if (!colo_trans) {
> +            error_report("open colo buffer failed\n");
> +            goto out;
> +        }
>  
> +        if (do_colo_transaction(s, colo_control, colo_trans)) {
> +            goto out;
> +        }
> +
> +        qemu_fclose(colo_trans);
> +        colo_trans = NULL;
>          start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      }
>  
>  out:
> +    if (colo_trans) {
> +        qemu_fclose(colo_trans);
> +    }
> +
>      colo_buffer_destroy();
> +
> +    if (colo_control) {
> +        qemu_fclose(colo_control);
> +    }
> +
>      colo_compare_destroy();
>  
>      if (s->state != MIG_STATE_ERROR) {
> @@ -281,7 +467,7 @@ void colo_init_checkpointer(MigrationState *s)
>  
>  static Coroutine *colo;
>  
> -static __attribute__((unused)) bool is_slave(void)
> +static bool is_slave(void)
>  {
>      return colo != NULL;
>  }
> @@ -293,13 +479,32 @@ static __attribute__((unused)) bool is_slave(void)
>   */
>  static int slave_wait_new_checkpoint(QEMUFile *f)
>  {
> -    /* TODO: wait checkpoint start command from master */
> -    return 1;
> +    int fd = qemu_get_fd(f);
> +    int ret;
> +    uint64_t cmd;
> +
> +    yield_until_fd_readable(fd);
> +
> +    ret = colo_ctl_get_value(f, &cmd);
> +    if (ret) {
> +        return 1;
> +    }
> +
> +    if (cmd == COLO_CHECKPOINT_NEW) {
> +        return 0;
> +    } else {
> +        /* Unexpected data received */
> +        ctl_error_handler(f, ret);
> +        return 1;
> +    }
>  }
>  
>  void colo_process_incoming_checkpoints(QEMUFile *f)
>  {
> +    int fd = qemu_get_fd(f);
>      int dev_hotplug = qdev_hotplug;
> +    QEMUFile *ctl = NULL;
> +    int ret;
>  
>      if (!restore_use_colo()) {
>          return;
> @@ -310,18 +515,69 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
>      colo = qemu_coroutine_self();
>      assert(colo != NULL);
>  
> +    ctl = qemu_fopen_socket(fd, "wb");
> +    if (!ctl) {
> +        error_report("can't open incoming channel\n");
> +        goto out;
> +    }
> +
>      colo_buffer_init();
>  
> +    ret = colo_ctl_put(ctl, COLO_READY);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    /* TODO: in COLO mode, slave is runing, so start the vm */
> +
>      while (true) {
>          if (slave_wait_new_checkpoint(f)) {
>              break;
>          }
>  
> -        /* TODO: COLO restore */
> +        /* start colo checkpoint */
> +
> +        /* TODO: suspend guest */
> +
> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
> +        if (ret) {
> +            goto out;
> +        }
> +
> +        /* TODO: open colo buffer for read */
> +
> +        ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
> +        if (ret) {
> +            goto out;
> +        }
> +
> +        /* TODO: read migration data into colo buffer */
> +
> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
> +        if (ret) {
> +            goto out;
> +        }
> +
> +        /* TODO: load vm state */
> +
> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
> +        if (ret) {
> +            goto out;
> +        }
> +
> +        /* TODO: resume guest */
> +
> +        /* TODO: close colo buffer */
>      }
>  
> +out:
>      colo_buffer_destroy();
>      colo = NULL;
> +
> +    if (ctl) {
> +        qemu_fclose(ctl);
> +    }
> +
>      restore_exit_colo();
>  
>      qdev_hotplug = dev_hotplug;
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 13/17] COLO ctl: implement colo save
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 15:07     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 15:07 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> implement colo save

My postcopy 'QEMU_VM_CMD_PACKAGED' does something similar to
parts of this with the QEMUSizedBuffer, we might be able to share some more:
https://lists.nongnu.org/archive/html/qemu-devel/2014-07/msg00886.html

> +    /* we send the total size of the vmstate first */
> +    ret = colo_ctl_put(s->file, colo_buffer.used);
> +    if (ret) {
> +        goto out;
> +    }
> +
> +    qemu_put_buffer_async(s->file, colo_buffer.data, colo_buffer.used);
> +    ret = qemu_file_get_error(s->file);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +    qemu_fflush(s->file);

Is there a reason to use _async here?  I thought the only gain is
if you were going to do other writes in the shadow of the async; with the fflush
immediately after, I'm not sure it helps.
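
For reference, the framing in the quoted hunk (total size first, then the
payload) can be sketched outside QEMU as below. frame_encode() and
frame_decode() are hypothetical names; the 8-byte big-endian header
mirrors what qemu_put_be64() puts on the wire.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v at p in big-endian byte order. */
static void put_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Load a big-endian 64-bit value from p. */
static uint64_t get_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Hypothetical helper: write an 8-byte length header followed by the
 * payload. Returns total bytes written, or 0 if out is too small.
 */
static size_t frame_encode(uint8_t *out, size_t out_len,
                           const void *payload, size_t len)
{
    assert(out != NULL && (payload != NULL || len == 0));
    if (out_len < 8 || out_len - 8 < len) {
        return 0;
    }
    put_be64(out, len);
    if (len) {
        memcpy(out + 8, payload, len);
    }
    return 8 + len;
}

/*
 * Hypothetical helper: parse one frame. Returns 0 and sets *payload and
 * *len on success, or -1 if the header or payload is not fully present.
 */
static int frame_decode(const uint8_t *in, size_t in_len,
                        const uint8_t **payload, uint64_t *len)
{
    uint64_t n;

    assert(in != NULL && payload != NULL && len != NULL);
    if (in_len < 8) {
        return -1;
    }
    n = get_be64(in);
    if (n > in_len - 8) {
        return -1;
    }
    *payload = in + 8;
    *len = n;
    return 0;
}
```

Because the receiver learns the size up front, it can refuse to start
loading until the whole payload has arrived.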

Dave

>  
>      ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
>      if (ret) {
>          goto out;
>      }
>  
> -    /* TODO: Flush network etc. */
> +    /* Flush network etc. */
> +    colo_compare_flush();
>  
>      ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
>      if (ret) {
>          goto out;
>      }
>  
> -    /* TODO: resume master */
> +    colo_compare_resume();
> +    ret = 0;
>  
>  out:
> +    /* resume master */
> +    qemu_mutex_lock_iothread();
> +    vm_start();
> +    qemu_mutex_unlock_iothread();
> +
>      return ret;
>  }
>  
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 15/17] COLO save: reuse migration bitmap under colo checkpoint
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 15:09     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 15:09 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> reuse migration bitmap under colo checkpoint, only send dirty pages
> per-checkpoint.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  arch_init.c                        | 20 +++++++++++++++++++-
>  include/migration/migration-colo.h |  2 ++
>  migration-colo.c                   |  6 ++----
>  stubs/migration-colo.c             | 10 ++++++++++
>  4 files changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 8ddaf35..c84e6c8 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -52,6 +52,7 @@
>  #include "exec/ram_addr.h"
>  #include "hw/acpi/acpi.h"
>  #include "qemu/host-utils.h"
> +#include "migration/migration-colo.h"
>  
>  #ifdef DEBUG_ARCH_INIT
>  #define DPRINTF(fmt, ...) \
> @@ -769,6 +770,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      RAMBlock *block;
>      int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
>  
> +    /*
> +     * migration has already setup the bitmap, reuse it.
> +     */
> +    if (is_master()) {
> +        qemu_mutex_lock_ramlist();
> +        reset_ram_globals();
> +        goto out_setup;
> +    }
> +
>      mig_throttle_on = false;
>      dirty_rate_high_cnt = 0;
>      bitmap_sync_count = 0;
> @@ -828,6 +838,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      migration_bitmap_sync();
>      qemu_mutex_unlock_iothread();
>  
> +out_setup:
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
>      QTAILQ_FOREACH(block, &ram_list.blocks, next) {

Is it necessary to send the block list for each of your snapshots?

Dave

> @@ -937,7 +948,14 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>      }
>  
>      ram_control_after_iterate(f, RAM_CONTROL_FINISH);
> -    migration_end();
> +
> +    /*
> +     * Since we need to reuse dirty bitmap in colo,
> +     * don't cleanup the bitmap.
> +     */
> +    if (!migrate_use_colo() || migration_has_failed(migrate_get_current())) {
> +        migration_end();
> +    }
>  
>      qemu_mutex_unlock_ramlist();
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
> index 861fa27..c286a60 100644
> --- a/include/migration/migration-colo.h
> +++ b/include/migration/migration-colo.h
> @@ -21,10 +21,12 @@ bool colo_supported(void);
>  /* save */
>  bool migrate_use_colo(void);
>  void colo_init_checkpointer(MigrationState *s);
> +bool is_master(void);
>  
>  /* restore */
>  bool restore_use_colo(void);
>  void restore_exit_colo(void);
> +bool is_slave(void);
>  
>  void colo_process_incoming_checkpoints(QEMUFile *f);
>  
> diff --git a/migration-colo.c b/migration-colo.c
> index 8596845..13a6a57 100644
> --- a/migration-colo.c
> +++ b/migration-colo.c
> @@ -222,8 +222,6 @@ static const QEMUFileOps colo_read_ops = {
>  };
>  
>  /* colo checkpoint control helper */
> -static bool is_master(void);
> -static bool is_slave(void);
>  
>  static void ctl_error_handler(void *opaque, int err)
>  {
> @@ -295,7 +293,7 @@ static int colo_ctl_get(QEMUFile *f, uint64_t require)
>  
>  /* save */
>  
> -static bool is_master(void)
> +bool is_master(void)
>  {
>      MigrationState *s = migrate_get_current();
>      return (s->state == MIG_STATE_COLO);
> @@ -499,7 +497,7 @@ void colo_init_checkpointer(MigrationState *s)
>  
>  static Coroutine *colo;
>  
> -static bool is_slave(void)
> +bool is_slave(void)
>  {
>      return colo != NULL;
>  }
> diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
> index 55f0d37..ef65be6 100644
> --- a/stubs/migration-colo.c
> +++ b/stubs/migration-colo.c
> @@ -22,3 +22,13 @@ void colo_init_checkpointer(MigrationState *s)
>  void colo_process_incoming_checkpoints(QEMUFile *f)
>  {
>  }
> +
> +bool is_master(void)
> +{
> +    return false;
> +}
> +
> +bool is_slave(void)
> +{
> +    return false;
> +}
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [RFC PATCH 16/17] COLO ram cache: implement colo ram cache on slaver
  2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 15:10     ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 15:10 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> The ram cache was initially the same as PVM's memory. At
> checkpoint, we cache the dirty memory of PVM into ram cache
> (so that ram cache always the same as PVM's memory at every
> checkpoint), flush cached memory to SVM after we received
> all PVM dirty memory(only needed to flush memory that was
> both dirty on PVM and SVM since last checkpoint).

(Typo: 'r' on the end of the title)

I think I understand the need for the cache: to be able to restore pages
that the SVM has modified but the PVM hasn't. However, if I understand
the change here (to host_from_stream_offset), the SVM will load the
snapshot into the ram_cache rather than directly into host memory - why
is this necessary?  If the SVM's CPU is stopped at this point, couldn't
it load snapshot pages directly into host memory, clearing pages in the SVM's
bitmap, so that the only pages that then get copied in flush_cache are
the pages that the SVM modified but the PVM *didn't* include in the snapshot?
I can see that you would need to do it the way you've done it if the
snapshot-load could fail (at the same time the PVM failed) and thus the old SVM
state would be the surviving state, but how could it fail at this point
given the whole stream is in the colo-buffer?
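
The alternative asked about above can be sketched with a toy model. This
is not QEMU code: the eight one-word "pages", the struct layout, and the
names load_snapshot_direct() and flush_cache_remaining() are all
hypothetical, and the sketch assumes the SVM's CPUs are stopped while the
snapshot is applied. It also assumes the cache is updated alongside guest
memory so that it keeps mirroring the PVM at each checkpoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8                 /* toy guest with 8 one-word "pages" */

struct toy_svm {
    int  mem[NPAGES];            /* SVM guest memory */
    int  cache[NPAGES];          /* ram cache: PVM memory at last checkpoint */
    bool dirty[NPAGES];          /* pages the SVM wrote since last checkpoint */
};

struct snap_page {               /* one dirty page sent by the PVM */
    size_t index;
    int    value;
};

/* Apply the PVM snapshot straight to SVM memory (SVM CPUs stopped). */
static void load_snapshot_direct(struct toy_svm *s,
                                 const struct snap_page *snap, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t p = snap[i].index;

        assert(p < NPAGES);
        s->mem[p] = snap[i].value;
        s->cache[p] = snap[i].value;  /* assumption: keep the cache in step */
        s->dirty[p] = false;          /* page is already correct, skip flush */
    }
}

/*
 * Restore the pages that the SVM dirtied but the PVM did not send.
 * Returns the number of pages copied out of the cache.
 */
static size_t flush_cache_remaining(struct toy_svm *s)
{
    size_t copied = 0;

    for (size_t p = 0; p < NPAGES; p++) {
        if (s->dirty[p]) {
            s->mem[p] = s->cache[p];
            s->dirty[p] = false;
            copied++;
        }
    }
    return copied;
}
```

In this model, if the SVM dirtied pages 1 and 2 and the PVM sent pages 2
and 3, only page 1 is copied from the cache at flush time. Whether the
same saving holds in the real code depends on the failure case raised
above, which the toy model does not cover.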

> +static void ram_flush_cache(void);
>  static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  {
>      ram_addr_t addr;
>      int flags, ret = 0;
>      static uint64_t seq_iter;
> +    bool need_flush = false;

Probably better as 'ram_cache_needs_flush'

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
@ 2014-08-01 16:02   ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-08-01 16:02 UTC (permalink / raw)
  To: Yang Hongyang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> Virtual machine (VM) replication is a well known technique for
> providing application-agnostic software-implemented hardware fault
> tolerance "non-stop service". COLO is a high availability solution.
> Both primary VM (PVM) and secondary VM (SVM) run in parallel. They
> receive the same request from client, and generate response in parallel
> too. If the response packets from PVM and SVM are identical, they are
> released immediately. Otherwise, a VM checkpoint (on demand) is
> conducted. The idea is presented in Xen summit 2012, and 2013,
> and academia paper in SOCC 2013. It's also presented in KVM forum
> 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to above document for detailed information. 
> Please also refer to previous posted RFC proposal:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html

Hi Yang,
  Thanks for this set of patches (and I've replied to many individually).

> The patchset is also hosted on github:
> https://github.com/macrosheep/qemu/tree/colo_v0.1
> 
> This patchset is RFC, implements the frame of colo, without
> failover and nic/disk replication. But it is ready for demo
> the COLO idea above QEMU-Kvm.
> Steps using this patchset to get an overview of COLO:
> 1. configure the source with --enable-colo option
> 2. compile
> 3. just like QEMU's normal migration, run 2 QEMU VM:
>    - Primary VM 
>    - Secondary VM with -incoming tcp:[IP]:[PORT] option
> 4. on Primary VM's QEMU monitor, run following command:
>    migrate_set_capability colo on
>    migrate tcp:[IP]:[PORT]
> 5. done
> you will see two runing VMs, whenever you make changes to PVM, SVM
> will be synced to PVM's state.
> 
> TODO list:
> 1. failover
> 2. nic replication
> 3. disk replication[COLO Disk manager]

I wonder if there are any parts that can be borrowed from other code
to get it going; I notice that the reverse execution patchset
has a network packet record/replay mode:

https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00157.html

What was used for the NIC comparison in the 2013 KVM Forum paper?

Dave

> 
> Any comments/feedbacks are warmly welcomed.
> 
> Thanks,
> Yang
> 
> Yang Hongyang (17):
>   configure: add CONFIG_COLO to switch COLO support
>   COLO: introduce an api colo_supported() to indicate COLO support
>   COLO migration: add a migration capability 'colo'
>   COLO info: use colo info to tell migration target colo is enabled
>   COLO save: integrate COLO checkpointed save into qemu migration
>   COLO restore: integrate COLO checkpointed restore into qemu restore
>   COLO buffer: implement colo buffer as well as QEMUFileOps based on it
>   COLO: disable qdev hotplug
>   COLO ctl: implement API's that communicate with colo agent
>   COLO ctl: introduce is_slave() and is_master()
>   COLO ctl: implement colo checkpoint protocol
>   COLO ctl: add a RunState RUN_STATE_COLO
>   COLO ctl: implement colo save
>   COLO ctl: implement colo restore
>   COLO save: reuse migration bitmap under colo checkpoint
>   COLO ram cache: implement colo ram cache on slaver
>   HACK: trigger checkpoint every 500ms
> 
>  Makefile.objs                      |   2 +
>  arch_init.c                        | 174 +++++++++-
>  configure                          |  14 +
>  include/exec/cpu-all.h             |   1 +
>  include/migration/migration-colo.h |  36 +++
>  include/migration/migration.h      |  13 +
>  include/qapi/qmp/qerror.h          |   3 +
>  migration-colo-comm.c              |  78 +++++
>  migration-colo.c                   | 643 +++++++++++++++++++++++++++++++++++++
>  migration.c                        |  45 ++-
>  qapi-schema.json                   |   9 +-
>  stubs/Makefile.objs                |   1 +
>  stubs/migration-colo.c             |  34 ++
>  vl.c                               |  12 +
>  14 files changed, 1044 insertions(+), 21 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 migration-colo-comm.c
>  create mode 100644 migration-colo.c
>  create mode 100644 stubs/migration-colo.c
> 
> -- 
> 1.9.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
  2014-08-01 15:03     ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2014-09-12  6:20       ` Hongyang Yang
  -1 siblings, 0 replies; 80+ messages in thread
From: Hongyang Yang @ 2014-09-12  6:20 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency



On 08/01/2014 11:03 PM, Dr. David Alan Gilbert wrote:
> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>> implement colo checkpoint protocol.
>>
>> Checkpoint synchronzing points.
>>
>>                    Primary                 Secondary
>>    NEW             @
>>                                            Suspend
>>    SUSPENDED                               @
>>                    Suspend&Save state
>>    SEND            @
>>                    Send state              Receive state
>>    RECEIVED                                @
>>                    Flush network           Load state
>>    LOADED                                  @
>>                    Resume                  Resume
>>
>>                    Start Comparing
>> NOTE:
>>   1) '@' who sends the message
>>   2) Every sync-point is synchronized by two sides with only
>>      one handshake(single direction) for low-latency.
>>      If more strict synchronization is required, a opposite direction
>>      sync-point should be added.
>>   3) Since sync-points are single direction, the remote side may
>>      go forward a lot when this side just receives the sync-point.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   migration-colo.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 262 insertions(+), 6 deletions(-)
>>
>> diff --git a/migration-colo.c b/migration-colo.c
>> index 2699e77..a708872 100644
>> --- a/migration-colo.c
>> +++ b/migration-colo.c
>> @@ -24,6 +24,41 @@
>>    */
>>   #define CHKPOINT_TIMER 10000
>>
>> +enum {
>> +    COLO_READY = 0x46,
>> +
>> +    /*
>> +     * Checkpoint synchronzing points.
>> +     *
>> +     *                  Primary                 Secondary
>> +     *  NEW             @
>> +     *                                          Suspend
>> +     *  SUSPENDED                               @
>> +     *                  Suspend&Save state
>> +     *  SEND            @
>> +     *                  Send state              Receive state
>> +     *  RECEIVED                                @
>> +     *                  Flush network           Load state
>> +     *  LOADED                                  @
>> +     *                  Resume                  Resume
>> +     *
>> +     *                  Start Comparing
>> +     * NOTE:
>> +     * 1) '@' who sends the message
>> +     * 2) Every sync-point is synchronized by two sides with only
>> +     *    one handshake(single direction) for low-latency.
>> +     *    If more strict synchronization is required, a opposite direction
>> +     *    sync-point should be added.
>> +     * 3) Since sync-points are single direction, the remote side may
>> +     *    go forward a lot when this side just receives the sync-point.
>> +     */
>> +    COLO_CHECKPOINT_NEW,
>> +    COLO_CHECKPOINT_SUSPENDED,
>> +    COLO_CHECKPOINT_SEND,
>> +    COLO_CHECKPOINT_RECEIVED,
>> +    COLO_CHECKPOINT_LOADED,
>> +};
>> +
>>   static QEMUBH *colo_bh;
>>
>>   bool colo_supported(void)
>> @@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
>>       .close = colo_close,
>>   };
>>
>> +/* colo checkpoint control helper */
>> +static bool is_master(void);
>> +static bool is_slave(void);
>> +
>> +static void ctl_error_handler(void *opaque, int err)
>> +{
>> +    if (is_slave()) {
>> +        /* TODO: determine whether we need to failover */
>> +        /* FIXME: we will not failover currently, just kill slave */
>> +        error_report("error: colo transmission failed!\n");
>> +        exit(1);
>> +    } else if (is_master()) {
>> +        /* Master still alive, do not failover */
>> +        error_report("error: colo transmission failed!\n");
>> +        return;
>> +    } else {
>> +        error_report("COLO: Unexpected error happend!\n");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +}
>> +
>> +static int colo_ctl_put(QEMUFile *f, uint64_t request)
>> +{
>> +    int ret = 0;
>> +
>> +    qemu_put_be64(f, request);
>> +    qemu_fflush(f);
>> +
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        ctl_error_handler(f, ret);
>> +        return 1;
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
>> +{
>> +    int ret = 0;
>> +    uint64_t temp;
>> +
>> +    temp = qemu_get_be64(f);
>> +
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        ctl_error_handler(f, ret);
>> +        return 1;
>> +    }
>> +
>> +    *value = temp;
>> +    return 0;
>> +}
>> +
>> +static int colo_ctl_get(QEMUFile *f, uint64_t require)
>> +{
>> +    int ret;
>> +    uint64_t value;
>> +
>> +    ret = colo_ctl_get_value(f, &value);
>> +    if (ret) {
>> +        return ret;
>> +    }
>> +
>> +    if (value != require) {
>> +        error_report("unexpected state received!\n");
>
> I find it useful to print the expected/received state to
> be able to figure out what went wrong.

Good idea!
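
A minimal sketch of what such a message could look like, written as
standalone C rather than as a change to the patch. The enum values are
copied from the quoted patch; the helper name check_ctl_state() and its
buffer-based interface are hypothetical, chosen only so the example runs
outside QEMU. In the patch itself the same format string could presumably
be passed to error_report().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Values as defined in the quoted patch. */
enum {
    COLO_READY = 0x46,
    COLO_CHECKPOINT_NEW,
    COLO_CHECKPOINT_SUSPENDED,
    COLO_CHECKPOINT_SEND,
    COLO_CHECKPOINT_RECEIVED,
    COLO_CHECKPOINT_LOADED,
};

/*
 * Hypothetical helper: returns 0 when the received state matches the
 * expected one. Otherwise returns -1 and writes a message naming both
 * values into msg, so the log shows where the protocol went out of step.
 */
static int check_ctl_state(uint64_t expected, uint64_t received,
                           char *msg, size_t len)
{
    assert(msg != NULL && len > 0);
    msg[0] = '\0';

    if (received == expected) {
        return 0;
    }

    snprintf(msg, len,
             "unexpected state received: expected 0x%" PRIx64
             ", got 0x%" PRIx64, expected, received);
    return -1;
}
```

Printing both values in hex keeps them easy to match against the enum in
the patch.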

>
>> +        exit(1);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>   /* save */
>>
>> -static __attribute__((unused)) bool is_master(void)
>> +static bool is_master(void)
>>   {
>>       MigrationState *s = migrate_get_current();
>>       return (s->state == MIG_STATE_COLO);
>>   }
>>
>> +static int do_colo_transaction(MigrationState *s, QEMUFile *control,
>> +                               QEMUFile *trans)
>> +{
>> +    int ret;
>> +
>> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
>
> What happens at this point if the slave just doesn't respond?
> (i.e. the socket doesn't drop - you just don't get the byte).

If the socket returns bytes that were not expected, we exit. If the
socket returns an error, we do some cleanup and quit the COLO process.
See colo_ctl_get() and colo_ctl_get_value().
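
The two cases above are unexpected data and a socket error. For the
remaining case in the question, a peer that stays connected but never
sends anything, one possible approach is a bounded wait before the read.
The following is only a sketch under stated assumptions: it works on a
raw file descriptor with POSIX poll() instead of a QEMUFile, and the
names ctl_read_u64_timeout(), CTL_OK, CTL_TIMEOUT and CTL_ERROR are
hypothetical. It decodes the value as big-endian to match what
qemu_put_be64() writes.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

enum ctl_wait { CTL_OK = 0, CTL_TIMEOUT = 1, CTL_ERROR = -1 };

/*
 * Read one big-endian 64-bit control word from fd.
 * timeout_ms bounds each individual wait for data, not the total time.
 */
static int ctl_read_u64_timeout(int fd, uint64_t *value, int timeout_ms)
{
    unsigned char raw[8];
    size_t got = 0;

    while (got < sizeof(raw)) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc == 0) {
            return CTL_TIMEOUT;          /* peer is silent */
        }
        if (rc < 0) {
            if (errno == EINTR) {
                continue;                /* interrupted, wait again */
            }
            return CTL_ERROR;
        }

        ssize_t n = read(fd, raw + got, sizeof(raw) - got);
        if (n == 0) {
            return CTL_ERROR;            /* peer closed the connection */
        }
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN) {
                continue;
            }
            return CTL_ERROR;
        }
        got += (size_t)n;
    }

    uint64_t v = 0;
    for (size_t i = 0; i < sizeof(raw); i++) {
        v = (v << 8) | raw[i];           /* big-endian decode */
    }
    *value = v;
    return CTL_OK;
}
```

A caller could treat CTL_TIMEOUT like a transmission error and leave the
checkpoint loop; what the right timeout is, and whether a timeout should
trigger failover, is left open here.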

>
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: suspend and save vm state to colo buffer */
>> +
>> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: send vmstate to slave */
>> +
>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: Flush network etc. */
>> +
>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: resume master */
>> +
>> +out:
>> +    return ret;
>> +}
>> +
>>   static void *colo_thread(void *opaque)
>>   {
>>       MigrationState *s = opaque;
>>       int dev_hotplug = qdev_hotplug, wait_cp = 0;
>>       int64_t start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       int64_t current_time;
>> +    QEMUFile *colo_control = NULL, *colo_trans = NULL;
>> +    int ret;
>>
>>       if (colo_compare_init() < 0) {
>>           error_report("Init colo compare error\n");
>>           goto out;
>>       }
>>
>> +    colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
>> +    if (!colo_control) {
>> +        error_report("open colo_control failed\n");
>> +        goto out;
>> +    }
>
> In my postcopy world I'm trying to abstract this type of thing into a 'return path'
> so that the QEMUFile can implement it however it wants and you don't
> need to assume it's a socket.  But I'm still fighting some of those details.
>
> Dave
>
>> +
>>       qdev_hotplug = 0;
>>
>>       colo_buffer_init();
>>
>> +    /*
>> +     * Wait for slave finish loading vm states and enter COLO
>> +     * restore.
>> +     */
>> +    ret = colo_ctl_get(colo_control, COLO_READY);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>>       while (s->state == MIG_STATE_COLO) {
>>           /* wait for a colo checkpoint */
>>           wait_cp = colo_compare();
>> @@ -230,13 +396,33 @@ static void *colo_thread(void *opaque)
>>
>>           /* start a colo checkpoint */
>>
>> -        /*TODO: COLO save */
>> +        /* open colo buffer for write */
>> +        colo_trans = qemu_fopen_ops(&colo_buffer, &colo_write_ops);
>> +        if (!colo_trans) {
>> +            error_report("open colo buffer failed\n");
>> +            goto out;
>> +        }
>>
>> +        if (do_colo_transaction(s, colo_control, colo_trans)) {
>> +            goto out;
>> +        }
>> +
>> +        qemu_fclose(colo_trans);
>> +        colo_trans = NULL;
>>           start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       }
>>
>>   out:
>> +    if (colo_trans) {
>> +        qemu_fclose(colo_trans);
>> +    }
>> +
>>       colo_buffer_destroy();
>> +
>> +    if (colo_control) {
>> +        qemu_fclose(colo_control);
>> +    }
>> +
>>       colo_compare_destroy();
>>
>>       if (s->state != MIG_STATE_ERROR) {
>> @@ -281,7 +467,7 @@ void colo_init_checkpointer(MigrationState *s)
>>
>>   static Coroutine *colo;
>>
>> -static __attribute__((unused)) bool is_slave(void)
>> +static bool is_slave(void)
>>   {
>>       return colo != NULL;
>>   }
>> @@ -293,13 +479,32 @@ static __attribute__((unused)) bool is_slave(void)
>>    */
>>   static int slave_wait_new_checkpoint(QEMUFile *f)
>>   {
>> -    /* TODO: wait checkpoint start command from master */
>> -    return 1;
>> +    int fd = qemu_get_fd(f);
>> +    int ret;
>> +    uint64_t cmd;
>> +
>> +    yield_until_fd_readable(fd);
>> +
>> +    ret = colo_ctl_get_value(f, &cmd);
>> +    if (ret) {
>> +        return 1;
>> +    }
>> +
>> +    if (cmd == COLO_CHECKPOINT_NEW) {
>> +        return 0;
>> +    } else {
>> +        /* Unexpected data received */
>> +        ctl_error_handler(f, ret);
>> +        return 1;
>> +    }
>>   }
>>
>>   void colo_process_incoming_checkpoints(QEMUFile *f)
>>   {
>> +    int fd = qemu_get_fd(f);
>>       int dev_hotplug = qdev_hotplug;
>> +    QEMUFile *ctl = NULL;
>> +    int ret;
>>
>>       if (!restore_use_colo()) {
>>           return;
>> @@ -310,18 +515,69 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
>>       colo = qemu_coroutine_self();
>>       assert(colo != NULL);
>>
>> +    ctl = qemu_fopen_socket(fd, "wb");
>> +    if (!ctl) {
>> +        error_report("can't open incoming channel\n");
>> +        goto out;
>> +    }
>> +
>>       colo_buffer_init();
>>
>> +    ret = colo_ctl_put(ctl, COLO_READY);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: in COLO mode, slave is running, so start the vm */
>> +
>>       while (true) {
>>           if (slave_wait_new_checkpoint(f)) {
>>               break;
>>           }
>>
>> -        /* TODO: COLO restore */
>> +        /* start colo checkpoint */
>> +
>> +        /* TODO: suspend guest */
>> +
>> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: open colo buffer for read */
>> +
>> +        ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: read migration data into colo buffer */
>> +
>> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: load vm state */
>> +
>> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: resume guest */
>> +
>> +        /* TODO: close colo buffer */
>>       }
>>
>> +out:
>>       colo_buffer_destroy();
>>       colo = NULL;
>> +
>> +    if (ctl) {
>> +        qemu_fclose(ctl);
>> +    }
>> +
>>       restore_exit_colo();
>>
>>       qdev_hotplug = dev_hotplug;
>> --
>> 1.9.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Qemu-devel] [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
@ 2014-09-12  6:20       ` Hongyang Yang
  0 siblings, 0 replies; 80+ messages in thread
From: Hongyang Yang @ 2014-09-12  6:20 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: kvm, GuiJianfeng, eddie.dong, qemu-devel, mrhines



On 08/01/2014 11:03 PM, Dr. David Alan Gilbert wrote:
> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>> implement colo checkpoint protocol.
>>
>> Checkpoint synchronizing points.
>>
>>                    Primary                 Secondary
>>    NEW             @
>>                                            Suspend
>>    SUSPENDED                               @
>>                    Suspend&Save state
>>    SEND            @
>>                    Send state              Receive state
>>    RECEIVED                                @
>>                    Flush network           Load state
>>    LOADED                                  @
>>                    Resume                  Resume
>>
>>                    Start Comparing
>> NOTE:
>>   1) '@' marks the side that sends the message
>>   2) Every sync-point is synchronized by the two sides with only
>>      one handshake (single direction) for low latency.
>>      If stricter synchronization is required, a sync-point in the
>>      opposite direction should be added.
>>   3) Since sync-points are single direction, the remote side may
>>      have moved far ahead by the time this side receives the sync-point.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   migration-colo.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 262 insertions(+), 6 deletions(-)
>>
>> diff --git a/migration-colo.c b/migration-colo.c
>> index 2699e77..a708872 100644
>> --- a/migration-colo.c
>> +++ b/migration-colo.c
>> @@ -24,6 +24,41 @@
>>    */
>>   #define CHKPOINT_TIMER 10000
>>
>> +enum {
>> +    COLO_READY = 0x46,
>> +
>> +    /*
>> +     * Checkpoint synchronizing points.
>> +     *
>> +     *                  Primary                 Secondary
>> +     *  NEW             @
>> +     *                                          Suspend
>> +     *  SUSPENDED                               @
>> +     *                  Suspend&Save state
>> +     *  SEND            @
>> +     *                  Send state              Receive state
>> +     *  RECEIVED                                @
>> +     *                  Flush network           Load state
>> +     *  LOADED                                  @
>> +     *                  Resume                  Resume
>> +     *
>> +     *                  Start Comparing
>> +     * NOTE:
>> +     * 1) '@' marks the side that sends the message
>> +     * 2) Every sync-point is synchronized by the two sides with only
>> +     *    one handshake (single direction) for low latency.
>> +     *    If stricter synchronization is required, a sync-point in the
>> +     *    opposite direction should be added.
>> +     * 3) Since sync-points are single direction, the remote side may
>> +     *    have moved far ahead by the time this side receives the sync-point.
>> +     */
>> +    COLO_CHECKPOINT_NEW,
>> +    COLO_CHECKPOINT_SUSPENDED,
>> +    COLO_CHECKPOINT_SEND,
>> +    COLO_CHECKPOINT_RECEIVED,
>> +    COLO_CHECKPOINT_LOADED,
>> +};
>> +
>>   static QEMUBH *colo_bh;
>>
>>   bool colo_supported(void)
>> @@ -185,30 +220,161 @@ static const QEMUFileOps colo_read_ops = {
>>       .close = colo_close,
>>   };
>>
>> +/* colo checkpoint control helper */
>> +static bool is_master(void);
>> +static bool is_slave(void);
>> +
>> +static void ctl_error_handler(void *opaque, int err)
>> +{
>> +    if (is_slave()) {
>> +        /* TODO: determine whether we need to failover */
>> +        /* FIXME: we will not failover currently, just kill slave */
>> +        error_report("error: colo transmission failed!\n");
>> +        exit(1);
>> +    } else if (is_master()) {
>> +        /* Master still alive, do not failover */
>> +        error_report("error: colo transmission failed!\n");
>> +        return;
>> +    } else {
>> +        error_report("COLO: Unexpected error happened!\n");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +}
>> +
>> +static int colo_ctl_put(QEMUFile *f, uint64_t request)
>> +{
>> +    int ret = 0;
>> +
>> +    qemu_put_be64(f, request);
>> +    qemu_fflush(f);
>> +
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        ctl_error_handler(f, ret);
>> +        return 1;
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
>> +{
>> +    int ret = 0;
>> +    uint64_t temp;
>> +
>> +    temp = qemu_get_be64(f);
>> +
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        ctl_error_handler(f, ret);
>> +        return 1;
>> +    }
>> +
>> +    *value = temp;
>> +    return 0;
>> +}
>> +
>> +static int colo_ctl_get(QEMUFile *f, uint64_t require)
>> +{
>> +    int ret;
>> +    uint64_t value;
>> +
>> +    ret = colo_ctl_get_value(f, &value);
>> +    if (ret) {
>> +        return ret;
>> +    }
>> +
>> +    if (value != require) {
>> +        error_report("unexpected state received!\n");
>
> I find it useful to print the expected/received state to
> be able to figure out what went wrong.

Good idea!
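
A minimal sketch of what such a message could look like, assuming the
enum values from the patch above (COLO_READY = 0x46, then sequential).
The helpers colo_ctl_name() and colo_ctl_format_mismatch() are
hypothetical names used only for illustration; they are not part of the
patch or of QEMU:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Values mirror the enum in the patch: COLO_READY = 0x46, then sequential. */
enum {
    COLO_READY = 0x46,
    COLO_CHECKPOINT_NEW,
    COLO_CHECKPOINT_SUSPENDED,
    COLO_CHECKPOINT_SEND,
    COLO_CHECKPOINT_RECEIVED,
    COLO_CHECKPOINT_LOADED,
};

/* Hypothetical helper: map a control value to a printable name. */
static const char *colo_ctl_name(uint64_t v)
{
    switch (v) {
    case COLO_READY:                return "READY";
    case COLO_CHECKPOINT_NEW:       return "NEW";
    case COLO_CHECKPOINT_SUSPENDED: return "SUSPENDED";
    case COLO_CHECKPOINT_SEND:      return "SEND";
    case COLO_CHECKPOINT_RECEIVED:  return "RECEIVED";
    case COLO_CHECKPOINT_LOADED:    return "LOADED";
    default:                        return "UNKNOWN";
    }
}

/*
 * Hypothetical helper: format "expected X, received Y" into buf.
 * Returns what snprintf() returns (length that was or would be written).
 */
static int colo_ctl_format_mismatch(char *buf, size_t len,
                                    uint64_t expected, uint64_t received)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "unexpected state: expected %s (0x%" PRIx64
                    "), received %s (0x%" PRIx64 ")",
                    colo_ctl_name(expected), expected,
                    colo_ctl_name(received), received);
}
```

In colo_ctl_get() the formatted string would be passed to error_report()
on the "value != require" path, so the log shows which handshake step
went wrong rather than only that something did.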

>
>> +        exit(1);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>   /* save */
>>
>> -static __attribute__((unused)) bool is_master(void)
>> +static bool is_master(void)
>>   {
>>       MigrationState *s = migrate_get_current();
>>       return (s->state == MIG_STATE_COLO);
>>   }
>>
>> +static int do_colo_transaction(MigrationState *s, QEMUFile *control,
>> +                               QEMUFile *trans)
>> +{
>> +    int ret;
>> +
>> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
>
> What happens at this point if the slave just doesn't respond?
> (i.e. the socket doesn't drop - you just don't get the byte).

If the socket returns bytes that were not expected, we exit. If the
socket returns an error, we do some cleanup and quit the COLO process.
Refer to colo_ctl_get() and colo_ctl_get_value().

>
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: suspend and save vm state to colo buffer */
>> +
>> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: send vmstate to slave */
>> +
>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: Flush network etc. */
>> +
>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: resume master */
>> +
>> +out:
>> +    return ret;
>> +}
>> +
>>   static void *colo_thread(void *opaque)
>>   {
>>       MigrationState *s = opaque;
>>       int dev_hotplug = qdev_hotplug, wait_cp = 0;
>>       int64_t start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       int64_t current_time;
>> +    QEMUFile *colo_control = NULL, *colo_trans = NULL;
>> +    int ret;
>>
>>       if (colo_compare_init() < 0) {
>>           error_report("Init colo compare error\n");
>>           goto out;
>>       }
>>
>> +    colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
>> +    if (!colo_control) {
>> +        error_report("open colo_control failed\n");
>> +        goto out;
>> +    }
>
> In my postcopy world I'm trying to abstract this type of thing into a 'return path'
> so that the QEMUFile can implement it however it wants and you don't
> need to assume it's a socket.  But I'm still fighting some of those details.
>
> Dave
>
>> +
>>       qdev_hotplug = 0;
>>
>>       colo_buffer_init();
>>
>> +    /*
>> +     * Wait for the slave to finish loading vm states and enter COLO
>> +     * restore.
>> +     */
>> +    ret = colo_ctl_get(colo_control, COLO_READY);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>>       while (s->state == MIG_STATE_COLO) {
>>           /* wait for a colo checkpoint */
>>           wait_cp = colo_compare();
>> @@ -230,13 +396,33 @@ static void *colo_thread(void *opaque)
>>
>>           /* start a colo checkpoint */
>>
>> -        /*TODO: COLO save */
>> +        /* open colo buffer for write */
>> +        colo_trans = qemu_fopen_ops(&colo_buffer, &colo_write_ops);
>> +        if (!colo_trans) {
>> +            error_report("open colo buffer failed\n");
>> +            goto out;
>> +        }
>>
>> +        if (do_colo_transaction(s, colo_control, colo_trans)) {
>> +            goto out;
>> +        }
>> +
>> +        qemu_fclose(colo_trans);
>> +        colo_trans = NULL;
>>           start_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       }
>>
>>   out:
>> +    if (colo_trans) {
>> +        qemu_fclose(colo_trans);
>> +    }
>> +
>>       colo_buffer_destroy();
>> +
>> +    if (colo_control) {
>> +        qemu_fclose(colo_control);
>> +    }
>> +
>>       colo_compare_destroy();
>>
>>       if (s->state != MIG_STATE_ERROR) {
>> @@ -281,7 +467,7 @@ void colo_init_checkpointer(MigrationState *s)
>>
>>   static Coroutine *colo;
>>
>> -static __attribute__((unused)) bool is_slave(void)
>> +static bool is_slave(void)
>>   {
>>       return colo != NULL;
>>   }
>> @@ -293,13 +479,32 @@ static __attribute__((unused)) bool is_slave(void)
>>    */
>>   static int slave_wait_new_checkpoint(QEMUFile *f)
>>   {
>> -    /* TODO: wait checkpoint start command from master */
>> -    return 1;
>> +    int fd = qemu_get_fd(f);
>> +    int ret;
>> +    uint64_t cmd;
>> +
>> +    yield_until_fd_readable(fd);
>> +
>> +    ret = colo_ctl_get_value(f, &cmd);
>> +    if (ret) {
>> +        return 1;
>> +    }
>> +
>> +    if (cmd == COLO_CHECKPOINT_NEW) {
>> +        return 0;
>> +    } else {
>> +        /* Unexpected data received */
>> +        ctl_error_handler(f, ret);
>> +        return 1;
>> +    }
>>   }
>>
>>   void colo_process_incoming_checkpoints(QEMUFile *f)
>>   {
>> +    int fd = qemu_get_fd(f);
>>       int dev_hotplug = qdev_hotplug;
>> +    QEMUFile *ctl = NULL;
>> +    int ret;
>>
>>       if (!restore_use_colo()) {
>>           return;
>> @@ -310,18 +515,69 @@ void colo_process_incoming_checkpoints(QEMUFile *f)
>>       colo = qemu_coroutine_self();
>>       assert(colo != NULL);
>>
>> +    ctl = qemu_fopen_socket(fd, "wb");
>> +    if (!ctl) {
>> +        error_report("can't open incoming channel\n");
>> +        goto out;
>> +    }
>> +
>>       colo_buffer_init();
>>
>> +    ret = colo_ctl_put(ctl, COLO_READY);
>> +    if (ret) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: in COLO mode, slave is running, so start the vm */
>> +
>>       while (true) {
>>           if (slave_wait_new_checkpoint(f)) {
>>               break;
>>           }
>>
>> -        /* TODO: COLO restore */
>> +        /* start colo checkpoint */
>> +
>> +        /* TODO: suspend guest */
>> +
>> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: open colo buffer for read */
>> +
>> +        ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: read migration data into colo buffer */
>> +
>> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: load vm state */
>> +
>> +        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
>> +        if (ret) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: resume guest */
>> +
>> +        /* TODO: close colo buffer */
>>       }
>>
>> +out:
>>       colo_buffer_destroy();
>>       colo = NULL;
>> +
>> +    if (ctl) {
>> +        qemu_fclose(ctl);
>> +    }
>> +
>>       restore_exit_colo();
>>
>>       qdev_hotplug = dev_hotplug;
>> --
>> 1.9.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 16/17] COLO ram cache: implement colo ram cache on slaver
  2014-08-01 15:10     ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2014-09-12  6:30       ` Hongyang Yang
  -1 siblings, 0 replies; 80+ messages in thread
From: Hongyang Yang @ 2014-09-12  6:30 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency



On 08/01/2014 11:10 PM, Dr. David Alan Gilbert wrote:
> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>> The ram cache is initially the same as the PVM's memory. At each
>> checkpoint, we cache the PVM's dirty memory into the ram cache
>> (so that the ram cache is always the same as the PVM's memory at
>> every checkpoint), then flush the cached memory to the SVM after we
>> have received all of the PVM's dirty memory (we only need to flush
>> memory that was both dirty on PVM and SVM since the last checkpoint).
>
> (Typo: 'r' on the end of the title)
>
> I think I understand the need for the cache, to be able to restore pages
> that the SVM has modified that the PVM hadn't; however, if I understand
> the change here, (to host_from_stream_offset) the SVM will load the
> snapshot into the ram_cache rather than directly into host memory - why
> is this necessary?  If the SVM's CPU is stopped at this point couldn't
> it load snapshot pages directly into host memory, clearing pages in the SVM's
> bitmap, so that the only pages that then get copied in flush_cache are
> the pages that the SVM modified but the PVM *didn't* include in the snapshot?
> I can see that you would need to do it the way you've done it if the
> snapshot-load could fail (at the same time the PVM failed) and thus the old SVM
> state would be the surviving state, but how could it fail at this point
> given the whole stream is in the colo-buffer?

I can see the source of the confusion. Yes, you are right, we could do it the
way you describe, but in the end we would still need to copy the dirty pages
into the ram cache as well (the ram cache is a snapshot of PVM memory and has
to be kept up to date). So the question is only whether we load the dirty
pages into the snapshot first or into host memory first. I think both methods
work and make no practical difference...
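
The cache-first ordering can be sketched roughly as follows. This is an
illustration only, not the code from the patch: the struct, the tiny
page size and the sketch_* function names are all hypothetical, and it
assumes the flush copies every page that was dirtied on either side
since the last checkpoint:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES    8
#define PAGE_SIZE 16   /* tiny pages, for illustration only */

typedef struct {
    unsigned char ram_cache[NPAGES][PAGE_SIZE]; /* PVM memory at last checkpoint */
    unsigned char svm_ram[NPAGES][PAGE_SIZE];   /* memory the SVM runs on */
    bool pvm_dirty[NPAGES];  /* pages received in the current checkpoint */
    bool svm_dirty[NPAGES];  /* pages the SVM wrote since the last checkpoint */
} ColoRamSketch;

/* Checkpoint, step 1: a dirty PVM page arrives and is stored in the cache. */
static void sketch_load_page(ColoRamSketch *c, size_t page,
                             const unsigned char *data)
{
    assert(page < NPAGES);
    memcpy(c->ram_cache[page], data, PAGE_SIZE);
    c->pvm_dirty[page] = true;
}

/* Between checkpoints: the running SVM modifies its own memory. */
static void sketch_svm_write(ColoRamSketch *c, size_t page, size_t off,
                             unsigned char v)
{
    assert(page < NPAGES && off < PAGE_SIZE);
    c->svm_ram[page][off] = v;
    c->svm_dirty[page] = true;
}

/*
 * Checkpoint, step 2: copy every page dirtied on either side from the
 * cache into SVM memory, then clear both bitmaps.
 * Returns the number of pages copied.
 */
static size_t sketch_flush_cache(ColoRamSketch *c)
{
    size_t copied = 0;

    for (size_t i = 0; i < NPAGES; i++) {
        if (c->pvm_dirty[i] || c->svm_dirty[i]) {
            memcpy(c->svm_ram[i], c->ram_cache[i], PAGE_SIZE);
            copied++;
        }
        c->pvm_dirty[i] = false;
        c->svm_dirty[i] = false;
    }
    return copied;
}
```

With the host-memory-first ordering, step 1 would write into svm_ram and
the cache would be updated afterwards; the number of page copies per
checkpoint is the same in this simplified model.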

>
>
>> +static void ram_flush_cache(void);
>>   static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>   {
>>       ram_addr_t addr;
>>       int flags, ret = 0;
>>       static uint64_t seq_iter;
>> +    bool need_flush = false;
>
> Probably better as 'ram_cache_needs_flush'
>
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 04/17] COLO info: use colo info to tell migration target colo is enabled
  2014-08-01 14:43     ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2014-09-12  6:36       ` Hongyang Yang
  -1 siblings, 0 replies; 80+ messages in thread
From: Hongyang Yang @ 2014-09-12  6:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency



On 08/01/2014 10:43 PM, Dr. David Alan Gilbert wrote:
> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>> migrate colo info to migration target to tell the target colo is
>> enabled.
>
> If I understand this correctly this means that you send a 'colo info' device
> information for migrations that don't have COLO enabled; that's bad because
> it breaks migration unless the destination has it; I guess it's OK if you
> were to guard it with a thing so it didn't do it for old machine-types.
>
> You could use the QEMU_VM_COMMAND sections I've created for postcopy;
> ( http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00889.html ) and
> add a QEMU_VM_CMD_COLO to indicate you want the destination to become an SVM,
>    then check the capability near the start of migration and send the command.

Thank you for the reference. I've read part of your postcopy patches, but
haven't gone into the detailed implementation yet. I will use
QEMUSizedBuffer/QEMUFile in the next version. For the QEMU_VM_COMMAND sections,
can you split them out so that I can make use of them? Do you have a public
git tree or something?
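
To make the compatibility concern concrete, here is a toy model of the
"send a command only when the capability is on" idea. It does not use
the real QEMU stream format or the postcopy QEMU_VM_COMMAND code; the
section values and the sketch_* functions are hypothetical and exist
only to show why an unconditional extra section breaks an older
destination while a capability-guarded command does not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section markers, not QEMU's real values. */
enum {
    SKETCH_EOF      = 0x00,
    SKETCH_STATE    = 0x01,  /* ordinary device state */
    SKETCH_CMD_COLO = 0x7f,  /* "become an SVM" command */
};

/*
 * Source side: emit the COLO command only if the capability is enabled.
 * Returns the number of bytes written.
 */
static size_t sketch_save(uint8_t *buf, size_t len, bool colo_enabled)
{
    size_t n = 0;

    assert(len >= 3);
    if (colo_enabled) {
        buf[n++] = SKETCH_CMD_COLO;
    }
    buf[n++] = SKETCH_STATE;
    buf[n++] = SKETCH_EOF;
    return n;
}

/*
 * Destination side; knows_colo == false models an older QEMU.
 * Returns 0 on success, -1 on an unknown section or a truncated stream.
 * *is_svm is set when a COLO command was accepted.
 */
static int sketch_load(const uint8_t *buf, size_t len, bool knows_colo,
                       bool *is_svm)
{
    *is_svm = false;
    for (size_t i = 0; i < len; i++) {
        switch (buf[i]) {
        case SKETCH_EOF:
            return 0;
        case SKETCH_STATE:
            break;
        case SKETCH_CMD_COLO:
            if (!knows_colo) {
                return -1;   /* older destination rejects the stream */
            }
            *is_svm = true;
            break;
        default:
            return -1;
        }
    }
    return -1;   /* no EOF marker seen */
}
```

In this model a plain migration (capability off) produces the same bytes
as before, so an older destination still loads it; only a stream that
actually requests COLO requires a COLO-aware destination.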

>
> Or perhaps there's a way to add the colo-info device on the command line so it's
> not always there.
>
> Dave
>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   Makefile.objs                      |  1 +
>>   include/migration/migration-colo.h |  3 ++
>>   migration-colo-comm.c              | 68 ++++++++++++++++++++++++++++++++++++++
>>   vl.c                               |  4 +++
>>   4 files changed, 76 insertions(+)
>>   create mode 100644 migration-colo-comm.c
>>
>> diff --git a/Makefile.objs b/Makefile.objs
>> index cab5824..1836a68 100644
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -50,6 +50,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
>>   common-obj-$(CONFIG_LINUX) += fsdev/
>>
>>   common-obj-y += migration.o migration-tcp.o
>> +common-obj-y += migration-colo-comm.o
>>   common-obj-$(CONFIG_COLO) += migration-colo.o
>>   common-obj-y += vmstate.o
>>   common-obj-y += qemu-file.o
>> diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
>> index 35b384c..e3735d8 100644
>> --- a/include/migration/migration-colo.h
>> +++ b/include/migration/migration-colo.h
>> @@ -12,6 +12,9 @@
>>   #define QEMU_MIGRATION_COLO_H
>>
>>   #include "qemu-common.h"
>> +#include "migration/migration.h"
>> +
>> +void colo_info_mig_init(void);
>>
>>   bool colo_supported(void);
>>
>> diff --git a/migration-colo-comm.c b/migration-colo-comm.c
>> new file mode 100644
>> index 0000000..ccbc246
>> --- /dev/null
>> +++ b/migration-colo-comm.c
>> @@ -0,0 +1,68 @@
>> +/*
>> + *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>> + *  (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + *  Copyright (C) 2014 FUJITSU LIMITED
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include <migration/migration-colo.h>
>> +
>> +#define DEBUG_COLO
>> +
>> +#ifdef DEBUG_COLO
>> +#define DPRINTF(fmt, ...) \
>> +    do { fprintf(stdout, "COLO: " fmt, ## __VA_ARGS__); } while (0)
>> +#else
>> +#define DPRINTF(fmt, ...) \
>> +    do { } while (0)
>> +#endif
>> +
>> +static bool colo_requested;
>> +
>> +/* save */
>> +
>> +static bool migrate_use_colo(void)
>> +{
>> +    MigrationState *s = migrate_get_current();
>> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
>> +}
>> +
>> +static void colo_info_save(QEMUFile *f, void *opaque)
>> +{
>> +    qemu_put_byte(f, migrate_use_colo());
>> +}
>> +
>> +/* restore */
>> +
>> +static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
>> +{
>> +    int value = qemu_get_byte(f);
>> +
>> +    if (value && !colo_supported()) {
>> +        fprintf(stderr, "COLO is not supported\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if (value && !colo_requested) {
>> +        DPRINTF("COLO requested!\n");
>> +    }
>> +
>> +    colo_requested = value;
>> +
>> +    return 0;
>> +}
>> +
>> +static SaveVMHandlers savevm_colo_info_handlers = {
>> +    .save_state = colo_info_save,
>> +    .load_state = colo_info_load,
>> +};
>> +
>> +void colo_info_mig_init(void)
>> +{
>> +    register_savevm_live(NULL, "colo info", -1, 1,
>> +                         &savevm_colo_info_handlers, NULL);
>> +}
>> diff --git a/vl.c b/vl.c
>> index fe451aa..1a282d8 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -89,6 +89,7 @@ int main(int argc, char **argv)
>>   #include "sysemu/dma.h"
>>   #include "audio/audio.h"
>>   #include "migration/migration.h"
>> +#include "migration/migration-colo.h"
>>   #include "sysemu/kvm.h"
>>   #include "qapi/qmp/qjson.h"
>>   #include "qemu/option.h"
>> @@ -4339,6 +4340,9 @@ int main(int argc, char **argv, char **envp)
>>
>>       blk_mig_init();
>>       ram_mig_init();
>> +    if (colo_supported()) {
>> +        colo_info_mig_init();
>> +    }
>>
>>       /* open the virtual block devices */
>>       if (snapshot)
>> --
>> 1.9.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
  2014-09-12  6:20       ` [Qemu-devel] " Hongyang Yang
@ 2014-09-12 11:17         ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 80+ messages in thread
From: Dr. David Alan Gilbert @ 2014-09-12 11:17 UTC (permalink / raw)
  To: Hongyang Yang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Hongyang Yang (yanghy@cn.fujitsu.com) wrote:
> 
> 
> On 08/01/2014 11:03 PM, Dr. David Alan Gilbert wrote:
> >* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:

<snip>

> >>+static int do_colo_transaction(MigrationState *s, QEMUFile *control,
> >>+                               QEMUFile *trans)
> >>+{
> >>+    int ret;
> >>+
> >>+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
> >>+    if (ret) {
> >>+        goto out;
> >>+    }
> >>+
> >>+    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
> >
> >What happens at this point if the slave just doesn't respond?
> >(i.e. the socket doesn't drop - you just don't get the byte).
> 
> If the socket returns bytes that were not expected, we exit. If the
> socket returns an error, we do some cleanup and quit the COLO process.
> Refer to colo_ctl_get() and colo_ctl_get_value().

But what happens if the slave just doesn't respond at all? E.g.
if the slave host loses power, it'll take a while (many seconds)
before the socket times out.
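
One possible way to bound the wait, sketched under the assumption that
the control channel is a plain file descriptor: poll() the descriptor
with a timeout before reading the sync-point. sketch_read_timeout() and
SKETCH_TIMEOUT are hypothetical names, not existing QEMU functions, and
the timeout value and the reaction to a timeout (failover or abort) are
left open here:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define SKETCH_TIMEOUT (-2)

/*
 * Hypothetical helper: read up to len bytes from fd, but give up if no
 * data arrives within timeout_ms milliseconds.
 * Returns bytes read (> 0), 0 on EOF, -1 on error, SKETCH_TIMEOUT on timeout.
 */
static ssize_t sketch_read_timeout(int fd, void *buf, size_t len,
                                   int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    assert(timeout_ms >= 0);   /* a negative timeout would block forever */

    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);   /* note: restarts the full timeout */

    if (ret < 0) {
        return -1;
    }
    if (ret == 0) {
        return SKETCH_TIMEOUT;
    }
    /* Readable, hung up, or in error: read() reports which one it is. */
    return read(fd, buf, len);
}
```

A caller waiting for an 8-byte sync-point would still have to loop on
short reads; the sketch only shows how a silent peer can be told apart
from a closed or failed socket without waiting for the TCP timeout.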

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
  2014-09-12 11:17         ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2014-09-12 11:40           ` Hongyang Yang
From: Hongyang Yang @ 2014-09-12 11:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency



On 09/12/2014 07:17 PM, Dr. David Alan Gilbert wrote:
> * Hongyang Yang (yanghy@cn.fujitsu.com) wrote:
>>
>>
>> On 08/01/2014 11:03 PM, Dr. David Alan Gilbert wrote:
>>> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>
> <snip>
>
>>>> +static int do_colo_transaction(MigrationState *s, QEMUFile *control,
>>>> +                               QEMUFile *trans)
>>>> +{
>>>> +    int ret;
>>>> +
>>>> +    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
>>>> +    if (ret) {
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
>>>
>>> What happens at this point if the slave just doesn't respond?
>>> (i.e. the socket doesn't drop - you just don't get the byte).
>>
>> If the socket returns bytes that were not expected, exit. If the
>> socket returns an error, do some cleanup and quit the COLO process.
>> Refer to colo_ctl_get() and colo_ctl_get_value().
>
> But what happens if the slave just doesn't respond at all; e.g.
> if the slave host loses power, it'll take a while (many seconds)
> before the socket will timeout.

It will wait until the call returns a timeout error, and then do some
cleanup and quit the COLO process. Is there a better way to handle
this?

>
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>

-- 
Thanks,
Yang.

* Re: [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol
  2014-09-12 11:40           ` [Qemu-devel] " Hongyang Yang
@ 2014-09-12 11:57             ` Dr. David Alan Gilbert
From: Dr. David Alan Gilbert @ 2014-09-12 11:57 UTC (permalink / raw)
  To: Hongyang Yang; +Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

* Hongyang Yang (yanghy@cn.fujitsu.com) wrote:
> 
> 
> On 09/12/2014 07:17 PM, Dr. David Alan Gilbert wrote:
> >* Hongyang Yang (yanghy@cn.fujitsu.com) wrote:
> >>
> >>
> >>On 08/01/2014 11:03 PM, Dr. David Alan Gilbert wrote:
> >>>* Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
> >
> ><snip>
> >
> >>>>+static int do_colo_transaction(MigrationState *s, QEMUFile *control,
> >>>>+                               QEMUFile *trans)
> >>>>+{
> >>>>+    int ret;
> >>>>+
> >>>>+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
> >>>>+    if (ret) {
> >>>>+        goto out;
> >>>>+    }
> >>>>+
> >>>>+    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
> >>>
> >>>What happens at this point if the slave just doesn't respond?
> >>>(i.e. the socket doesn't drop - you just don't get the byte).
> >>
> >>If the socket returns bytes that were not expected, exit. If the
> >>socket returns an error, do some cleanup and quit the COLO process.
> >>Refer to colo_ctl_get() and colo_ctl_get_value().
> >
> >But what happens if the slave just doesn't respond at all; e.g.
> >if the slave host loses power, it'll take a while (many seconds)
> >before the socket will timeout.
> 
> It will wait until the call returns a timeout error, and then do some
> cleanup and quit the COLO process.

If it were to wait here for ~30 seconds for the timeout, what would happen
to the primary? Would it be stopped from sending any network traffic
for those 30 seconds? I think that's too long to fail over.

> Is there a better way to handle this?

In postcopy I always take reads coming back from the destination
in a separate thread, so that a stalled read can't block the main thread
going out (I originally did that using async reads but the thread
is nicer).  You could also use something like a poll() with a timeout
set to however long you are happy for COLO to wait before it fails.
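
The poll() suggestion above could be sketched roughly as follows. This is
only an illustration, not code from the patchset: ctl_get_byte_timeout()
is a hypothetical helper, it works on a raw file descriptor rather than a
QEMUFile, and the choice of error codes is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the COLO patches): wait at most
 * timeout_ms for a single control byte on fd.
 * Returns 0 on success, -ETIMEDOUT if nothing arrived in time,
 * -ECONNRESET if the peer closed, or -errno on other errors.
 */
static int ctl_get_byte_timeout(int fd, int timeout_ms, uint8_t *out)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;
    int rc;

    assert(out != NULL);

    do {
        /* A retry after EINTR restarts the full timeout in this sketch. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -errno;
    }
    if (rc == 0) {
        return -ETIMEDOUT;      /* the other side did not respond in time */
    }

    do {
        n = read(fd, out, 1);
    } while (n < 0 && errno == EINTR);

    if (n < 0) {
        return -errno;
    }
    if (n == 0) {
        return -ECONNRESET;     /* end of file: peer closed the connection */
    }
    return 0;
}
```

With something like this, the caller decides how long the primary may
stall before treating the secondary as gone, instead of depending on the
much longer socket timeout.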

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [RFC PATCH 07/17] COLO buffer: implement colo buffer as well as QEMUFileOps based on it
  2014-08-01 14:52     ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2014-09-17  1:43       ` Hongyang Yang
From: Hongyang Yang @ 2014-09-17  1:43 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, kvm, eddie.dong, GuiJianfeng, mrhines, wency

Hi

On 08/01/2014 10:52 PM, Dr. David Alan Gilbert wrote:
> * Yang Hongyang (yanghy@cn.fujitsu.com) wrote:
>> We need a buffer to store migration data.
>>
>> On save side:
>>    all saved data is written into the colo buffer first, so that we know
>> the total size of the migration data. This also separates the data
>> transmission from the colo control data; we send colo control data over
>> the socket fd to synchronize both sides' state.
>>
>> On restore side:
>>    all migration data is read into the colo buffer first, then loaded
>> from the buffer. If a network error happens during data transmission,
>> the slave can still function because the migration data has not yet
>> been loaded.
>
> This is very similar to the QEMUSizedBuffer based QEMUFile's that Stefan Berger
> wrote and that I use in both my postcopy and BER patchsets:
>
>   http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg00846.html
>
>   (and to the similar code from Isaku Yamahata).
>
> I think we should be able to use a shared version even if we need some changes.

I've been using this patch as the COLO buffer; see my latest branch:
https://github.com/macrosheep/qemu/tree/colo_v0.4

This QEMUSizedBuffer works, but it does not exactly match our needs.
We need a static buffer that lives through the whole COLO process so that
we don't need to alloc/free buffers at every checkpoint. Currently, opening
the buffer for write always allocates a new buffer, so I think I need the
following modifications:

1. the ability to open an existing QEMUSizedBuffer for write
2. a reset_buf() API that clears the buffered data; just rewinding the
   QEMUFile related to the buffer may be enough.
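
The two requirements above can be illustrated with a small standalone
sketch. Everything here is hypothetical: ReusableBuf, rbuf_put(),
rbuf_reset() and rbuf_free() are made-up names and are not the
QEMUSizedBuffer API. The point is only that a reset rewinds the write
position while keeping the allocation, so the next checkpoint reuses the
same memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable buffer; NOT the QEMUSizedBuffer API. */
typedef struct ReusableBuf {
    unsigned char *data;
    size_t used;        /* bytes currently buffered */
    size_t capacity;    /* bytes allocated */
} ReusableBuf;

/* Append len bytes, growing the allocation by doubling when needed. */
static int rbuf_put(ReusableBuf *b, const void *src, size_t len)
{
    assert(b != NULL && (src != NULL || len == 0));

    if (len > b->capacity - b->used) {
        size_t cap = b->capacity ? b->capacity : 4096;
        unsigned char *p;

        while (cap - b->used < len) {
            if (cap > SIZE_MAX / 2) {
                return -1;          /* would overflow size_t */
            }
            cap *= 2;
        }
        p = realloc(b->data, cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->capacity = cap;
    }
    if (len > 0) {
        memcpy(b->data + b->used, src, len);
        b->used += len;
    }
    return 0;
}

/* Discard buffered data but keep the allocation for the next checkpoint. */
static void rbuf_reset(ReusableBuf *b)
{
    assert(b != NULL);
    b->used = 0;
}

/* Release the allocation when the COLO process ends. */
static void rbuf_free(ReusableBuf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->used = 0;
    b->capacity = 0;
}
```

Once the buffer has grown to the size of the largest checkpoint, later
checkpoints that fit need no further allocation.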

>
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   migration-colo.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 112 insertions(+)
>>
>> diff --git a/migration-colo.c b/migration-colo.c
>> index d566b9d..b90d9b6 100644
>> --- a/migration-colo.c
>> +++ b/migration-colo.c
>> @@ -11,6 +11,7 @@
>>   #include "qemu/main-loop.h"
>>   #include "qemu/thread.h"
>>   #include "block/coroutine.h"
>> +#include "qemu/error-report.h"
>>   #include "migration/migration-colo.h"
>>
>>   static QEMUBH *colo_bh;
>> @@ -20,14 +21,122 @@ bool colo_supported(void)
>>       return true;
>>   }
>>
>> +/* colo buffer */
>> +
>> +#define COLO_BUFFER_BASE_SIZE (1000*1000*4ULL)
>> +#define COLO_BUFFER_MAX_SIZE (1000*1000*1000*10ULL)
>
> Powers of 2 are nicer!
>
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> .
>
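
For the "powers of 2" remark on the buffer size macros, one possible
shape is sketched below. The values are assumptions for illustration
only: 4 MiB stands in for the decimal 4 MB base, and the 8 GiB cap is an
arbitrary power-of-two replacement for the decimal 10 GB limit.
colo_buffer_grow() is a hypothetical helper, not a function from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the patch itself uses decimal 4 MB / 10 GB. */
#define COLO_BUFFER_BASE_SIZE (4ULL << 20)   /* 4 MiB */
#define COLO_BUFFER_MAX_SIZE  (8ULL << 30)   /* 8 GiB, assumed cap */

/*
 * Hypothetical helper: smallest power-of-two size, starting from the
 * base size, that holds `needed` bytes. Returns 0 if that would exceed
 * the maximum.
 */
static uint64_t colo_buffer_grow(uint64_t needed)
{
    uint64_t size = COLO_BUFFER_BASE_SIZE;

    assert(needed > 0);

    while (size < needed) {
        if (size >= COLO_BUFFER_MAX_SIZE) {
            return 0;               /* request is larger than the cap */
        }
        size <<= 1;                 /* doubling keeps the size a power of 2 */
    }
    return size;
}
```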

-- 
Thanks,
Yang.

Thread overview: 80+ messages
2014-07-23 14:25 [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service Yang Hongyang
2014-07-23 14:25 ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:25 ` [RFC PATCH 01/17] configure: add CONFIG_COLO to switch COLO support Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:25 ` [RFC PATCH 02/17] COLO: introduce an api colo_supported() to indicate " Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 15:47   ` Eric Blake
2014-07-23 15:47     ` Eric Blake
2014-07-23 14:25 ` [RFC PATCH 03/17] COLO migration: add a migration capability 'colo' Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:41   ` Eric Blake
2014-07-23 14:41     ` Eric Blake
2014-07-23 14:25 ` [RFC PATCH 04/17] COLO info: use colo info to tell migration target colo is enabled Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 14:43   ` Dr. David Alan Gilbert
2014-08-01 14:43     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-09-12  6:36     ` Hongyang Yang
2014-09-12  6:36       ` [Qemu-devel] " Hongyang Yang
2014-07-23 14:25 ` [RFC PATCH 05/17] COLO save: integrate COLO checkpointed save into qemu migration Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 14:46   ` Dr. David Alan Gilbert
2014-08-01 14:46     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-07-23 14:25 ` [RFC PATCH 06/17] COLO restore: integrate COLO checkpointed restore into qemu restore Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:25 ` [RFC PATCH 07/17] COLO buffer: implement colo buffer as well as QEMUFileOps based on it Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 18:24   ` Eric Blake
2014-07-23 18:24     ` Eric Blake
2014-08-01 14:52   ` Dr. David Alan Gilbert
2014-08-01 14:52     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-09-17  1:43     ` Hongyang Yang
2014-09-17  1:43       ` [Qemu-devel] " Hongyang Yang
2014-07-23 14:25 ` [RFC PATCH 08/17] COLO: disable qdev hotplug Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:25 ` [RFC PATCH 09/17] COLO ctl: implement API's that communicate with colo agent Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:25 ` [RFC PATCH 10/17] COLO ctl: introduce is_slave() and is_master() Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 14:55   ` Dr. David Alan Gilbert
2014-08-01 14:55     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-07-23 14:25 ` [RFC PATCH 11/17] COLO ctl: implement colo checkpoint protocol Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 15:03   ` Dr. David Alan Gilbert
2014-08-01 15:03     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-09-12  6:20     ` Hongyang Yang
2014-09-12  6:20       ` [Qemu-devel] " Hongyang Yang
2014-09-12 11:17       ` Dr. David Alan Gilbert
2014-09-12 11:17         ` [Qemu-devel] " Dr. David Alan Gilbert
2014-09-12 11:40         ` Hongyang Yang
2014-09-12 11:40           ` [Qemu-devel] " Hongyang Yang
2014-09-12 11:57           ` Dr. David Alan Gilbert
2014-09-12 11:57             ` [Qemu-devel] " Dr. David Alan Gilbert
2014-07-23 14:25 ` [RFC PATCH 12/17] COLO ctl: add a RunState RUN_STATE_COLO Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 15:48   ` Eric Blake
2014-07-23 15:48     ` Eric Blake
2014-07-23 14:25 ` [RFC PATCH 13/17] COLO ctl: implement colo save Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 15:07   ` Dr. David Alan Gilbert
2014-08-01 15:07     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-07-23 14:25 ` [RFC PATCH 14/17] COLO ctl: implement colo restore Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 14:25 ` [RFC PATCH 15/17] COLO save: reuse migration bitmap under colo checkpoint Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 15:09   ` Dr. David Alan Gilbert
2014-08-01 15:09     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-07-23 14:25 ` [RFC PATCH 16/17] COLO ram cache: implement colo ram cache on slaver Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-08-01 15:10   ` Dr. David Alan Gilbert
2014-08-01 15:10     ` [Qemu-devel] " Dr. David Alan Gilbert
2014-09-12  6:30     ` Hongyang Yang
2014-09-12  6:30       ` [Qemu-devel] " Hongyang Yang
2014-07-23 14:25 ` [RFC PATCH 17/17] HACK: trigger checkpoint every 500ms Yang Hongyang
2014-07-23 14:25   ` [Qemu-devel] " Yang Hongyang
2014-07-23 15:44 ` [Qemu-devel] [RFC PATCH 00/17] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service Eric Blake
2014-07-23 15:44   ` Eric Blake
2014-07-24  2:24   ` Hongyang Yang
2014-07-24  2:24     ` [Qemu-devel] " Hongyang Yang
2014-08-01 16:02 ` Dr. David Alan Gilbert
2014-08-01 16:02   ` [Qemu-devel] " Dr. David Alan Gilbert
