All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL 00/18] migration: COLO
@ 2016-10-30 10:46 Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 01/18] migration: Introduce capability 'x-colo' to migration Amit Shah
                   ` (18 more replies)
  0 siblings, 19 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: Amit Shah <amit@amitshah.net>

The following changes since commit 5b2ecabaeabc17f032197246c4846b9ba95ba8a6:

  Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20161028-1' into staging (2016-10-28 17:59:04 +0100)

are available in the git repository at:

  https://git.kernel.org/pub/scm/virt/qemu/amit/migration.git tags/migration-for-2.8

for you to fetch changes up to a4cc318e15955b4b9c7231c37128d4e7466d4307:

  MAINTAINERS: Add maintainer for COLO framework related files (2016-10-30 15:17:39 +0530)

----------------------------------------------------------------
Migration bits from the COLO project

----------------------------------------------------------------

zhanghailiang (18):
  migration: Introduce capability 'x-colo' to migration
  COLO: migrate COLO related info to secondary node
  migration: Enter into COLO mode after migration if COLO is enabled
  migration: Switch to COLO process after finishing loadvm
  COLO: Establish a new communicating path for COLO
  COLO: Introduce checkpointing protocol
  COLO: Add a new RunState RUN_STATE_COLO
  COLO: Send PVM state to secondary side when do checkpoint
  COLO: Load VMState into QIOChannelBuffer before restore it
  COLO: Add checkpoint-delay parameter for migrate-set-parameters
  COLO: Synchronize PVM's state to SVM periodically
  COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
  COLO: Introduce state to record failover process
  COLO: Implement the process of failover for primary VM
  COLO: Implement failover work for secondary VM
  docs: Add documentation for COLO feature
  configure: Support enable/disable COLO feature
  MAINTAINERS: Add maintainer for COLO framework related files

 MAINTAINERS                   |   8 +
 configure                     |  11 +
 docs/COLO-FT.txt              | 189 +++++++++++++++
 docs/qmp-commands.txt         |  17 +-
 hmp-commands.hx               |  15 ++
 hmp.c                         |  16 ++
 hmp.h                         |   1 +
 include/migration/colo.h      |  38 +++
 include/migration/failover.h  |  26 +++
 include/migration/migration.h |   8 +
 migration/Makefile.objs       |   2 +
 migration/colo-comm.c         |  72 ++++++
 migration/colo-failover.c     |  83 +++++++
 migration/colo.c              | 529 ++++++++++++++++++++++++++++++++++++++++++
 migration/migration.c         |  84 ++++++-
 migration/ram.c               |  37 ++-
 migration/trace-events        |   6 +
 qapi-schema.json              | 100 +++++++-
 stubs/Makefile.objs           |   1 +
 stubs/migration-colo.c        |  46 ++++
 vl.c                          |  11 +
 21 files changed, 1279 insertions(+), 21 deletions(-)
 create mode 100644 docs/COLO-FT.txt
 create mode 100644 include/migration/colo.h
 create mode 100644 include/migration/failover.h
 create mode 100644 migration/colo-comm.c
 create mode 100644 migration/colo-failover.c
 create mode 100644 migration/colo.c
 create mode 100644 stubs/migration-colo.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 01/18] migration: Introduce capability 'x-colo' to migration
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 02/18] COLO: migrate COLO related info to secondary node Amit Shah
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.

The default value for COLO (COarse-Grain LOck Stepping) is disabled.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 docs/qmp-commands.txt         |  5 ++++-
 include/migration/colo.h      | 20 ++++++++++++++++++++
 include/migration/migration.h |  1 +
 migration/Makefile.objs       |  1 +
 migration/colo.c              | 19 +++++++++++++++++++
 migration/migration.c         | 18 ++++++++++++++++++
 qapi-schema.json              |  7 ++++++-
 stubs/Makefile.objs           |  1 +
 stubs/migration-colo.c        | 19 +++++++++++++++++++
 9 files changed, 89 insertions(+), 2 deletions(-)
 create mode 100644 include/migration/colo.h
 create mode 100644 migration/colo.c
 create mode 100644 stubs/migration-colo.c

diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
index 284576d..9bdbbfe 100644
--- a/docs/qmp-commands.txt
+++ b/docs/qmp-commands.txt
@@ -2861,6 +2861,7 @@ Enable/Disable migration capabilities
 - "compress": use multiple compression threads to accelerate live migration
 - "events": generate events for each migration state change
 - "postcopy-ram": postcopy mode for live migration
+- "x-colo": COarse-Grain LOck Stepping (COLO) for Non-stop Service
 
 Arguments:
 
@@ -2882,6 +2883,7 @@ Query current migration capabilities
          - "compress": Multiple compression threads state (json-bool)
          - "events": Migration state change event state (json-bool)
          - "postcopy-ram": postcopy ram state (json-bool)
+         - "x-colo": COarse-Grain LOck Stepping for Non-stop Service (json-bool)
 
 Arguments:
 
@@ -2895,7 +2897,8 @@ Example:
      {"state": false, "capability": "zero-blocks"},
      {"state": false, "capability": "compress"},
      {"state": true, "capability": "events"},
-     {"state": false, "capability": "postcopy-ram"}
+     {"state": false, "capability": "postcopy-ram"},
+     {"state": false, "capability": "x-colo"}
    ]}
 
 migrate-set-parameters
diff --git a/include/migration/colo.h b/include/migration/colo.h
new file mode 100644
index 0000000..59a632a
--- /dev/null
+++ b/include/migration/colo.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_H
+#define QEMU_COLO_H
+
+#include "qemu-common.h"
+
+bool colo_supported(void);
+
+#endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 2791b90..34f442b 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -298,6 +298,7 @@ int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
+bool migrate_colo_enabled(void);
 
 int64_t xbzrle_cache_resize(int64_t new_size);
 
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 30ad945..cff96f0 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,5 +1,6 @@
 common-obj-y += migration.o socket.o fd.o exec.o
 common-obj-y += tls.o
+common-obj-$(CONFIG_COLO) += colo.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o
 common-obj-y += qemu-file-channel.o
diff --git a/migration/colo.c b/migration/colo.c
new file mode 100644
index 0000000..d215057
--- /dev/null
+++ b/migration/colo.c
@@ -0,0 +1,19 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "migration/colo.h"
+
+bool colo_supported(void)
+{
+    return false;
+}
diff --git a/migration/migration.c b/migration/migration.c
index 156e707..c7869e5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -36,6 +36,7 @@
 #include "exec/address-spaces.h"
 #include "io/channel-buffer.h"
 #include "io/channel-tls.h"
+#include "migration/colo.h"
 
 #define MAX_THROTTLE  (32 << 20)      /* Migration transfer speed throttling */
 
@@ -531,6 +532,9 @@ MigrationCapabilityStatusList *qmp_query_migrate_capabilities(Error **errp)
 
     caps = NULL; /* silence compiler warning */
     for (i = 0; i < MIGRATION_CAPABILITY__MAX; i++) {
+        if (i == MIGRATION_CAPABILITY_X_COLO && !colo_supported()) {
+            continue;
+        }
         if (head == NULL) {
             head = g_malloc0(sizeof(*caps));
             caps = head;
@@ -733,6 +737,14 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
     }
 
     for (cap = params; cap; cap = cap->next) {
+        if (cap->value->capability == MIGRATION_CAPABILITY_X_COLO) {
+            if (!colo_supported()) {
+                error_setg(errp, "COLO is not currently supported, please"
+                             " configure with --enable-colo option in order to"
+                             " support COLO feature");
+                continue;
+            }
+        }
         s->enabled_capabilities[cap->value->capability] = cap->value->state;
     }
 
@@ -1713,6 +1725,12 @@ fail:
                       MIGRATION_STATUS_FAILED);
 }
 
+bool migrate_colo_enabled(void)
+{
+    MigrationState *s = migrate_get_current();
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_COLO];
+}
+
 /*
  * Master migration thread on the source VM.
  * It drives the migration and pumps the data down the outgoing channel.
diff --git a/qapi-schema.json b/qapi-schema.json
index d6a43a1..0fb4d7e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -574,11 +574,16 @@
 #          been migrated, pulling the remaining pages along as needed. NOTE: If
 #          the migration fails during postcopy the VM will fail.  (since 2.6)
 #
+# @x-colo: If enabled, migration will never end, and the state of the VM on the
+#        primary side will be migrated continuously to the VM on secondary
+#        side, this process is called COarse-Grain LOck Stepping (COLO) for
+#        Non-stop Service. (since 2.8)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
-           'compress', 'events', 'postcopy-ram'] }
+           'compress', 'events', 'postcopy-ram', 'x-colo'] }
 
 ##
 # @MigrationCapabilityStatus
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index c5850e8..005c03f 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -48,3 +48,4 @@ stub-obj-y += iohandler.o
 stub-obj-y += smbios_type_38.o
 stub-obj-y += ipmi.o
 stub-obj-y += pc_madt_cpu_entry.o
+stub-obj-y += migration-colo.o
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
new file mode 100644
index 0000000..d215057
--- /dev/null
+++ b/stubs/migration-colo.c
@@ -0,0 +1,19 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "migration/colo.h"
+
+bool colo_supported(void)
+{
+    return false;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 02/18] COLO: migrate COLO related info to secondary node
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 01/18] migration: Introduce capability 'x-colo' to migration Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled Amit Shah
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

We can determine whether or not VM in destination should go into COLO mode
by referring to the info that was migrated.

We skip this section if COLO is not enabled (i.e.
migrate_set_capability colo off), so that, It doesn't break compatibility
with migration no matter whether users configure the --enable-colo/disable-colo
on the source/destination side or not;

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 include/migration/colo.h |  2 ++
 migration/Makefile.objs  |  1 +
 migration/colo-comm.c    | 51 ++++++++++++++++++++++++++++++++++++++++++++++++
 vl.c                     |  3 +++
 4 files changed, 57 insertions(+)
 create mode 100644 migration/colo-comm.c

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 59a632a..1c899a0 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -14,7 +14,9 @@
 #define QEMU_COLO_H
 
 #include "qemu-common.h"
+#include "migration/migration.h"
 
 bool colo_supported(void);
+void colo_info_init(void);
 
 #endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index cff96f0..4bbe9ab 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,6 +1,7 @@
 common-obj-y += migration.o socket.o fd.o exec.o
 common-obj-y += tls.o
 common-obj-$(CONFIG_COLO) += colo.o
+common-obj-y += colo-comm.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o
 common-obj-y += qemu-file-channel.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
new file mode 100644
index 0000000..a2d5185
--- /dev/null
+++ b/migration/colo-comm.c
@@ -0,0 +1,51 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include <migration/colo.h>
+#include "trace.h"
+
+typedef struct {
+     bool colo_requested;
+} COLOInfo;
+
+static COLOInfo colo_info;
+
+static void colo_info_pre_save(void *opaque)
+{
+    COLOInfo *s = opaque;
+
+    s->colo_requested = migrate_colo_enabled();
+}
+
+static bool colo_info_need(void *opaque)
+{
+   return migrate_colo_enabled();
+}
+
+static const VMStateDescription colo_state = {
+    .name = "COLOState",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .pre_save = colo_info_pre_save,
+    .needed = colo_info_need,
+    .fields = (VMStateField[]) {
+        VMSTATE_BOOL(colo_requested, COLOInfo),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+void colo_info_init(void)
+{
+    vmstate_register(NULL, 0, &colo_state, &colo_info);
+}
diff --git a/vl.c b/vl.c
index 74dfe4e..a1cee4f 100644
--- a/vl.c
+++ b/vl.c
@@ -90,6 +90,7 @@ int main(int argc, char **argv)
 #include "audio/audio.h"
 #include "migration/migration.h"
 #include "sysemu/cpus.h"
+#include "migration/colo.h"
 #include "sysemu/kvm.h"
 #include "qapi/qmp/qjson.h"
 #include "qemu/option.h"
@@ -4426,6 +4427,8 @@ int main(int argc, char **argv, char **envp)
 #endif
     }
 
+    colo_info_init();
+
     if (net_init_clients() < 0) {
         exit(1);
     }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 01/18] migration: Introduce capability 'x-colo' to migration Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 02/18] COLO: migrate COLO related info to secondary node Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-31 22:27   ` Eric Blake
  2016-10-30 10:46 ` [Qemu-devel] [PULL 04/18] migration: Switch to COLO process after finishing loadvm Amit Shah
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.

We reuse migration thread, so the process of checkpointing will be handled
in migration thread.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 include/migration/colo.h |  3 +++
 migration/colo.c         | 29 +++++++++++++++++++++++++++++
 migration/migration.c    | 38 +++++++++++++++++++++++++++++++++-----
 migration/trace-events   |  3 +++
 qapi-schema.json         |  4 +++-
 stubs/migration-colo.c   |  9 +++++++++
 6 files changed, 80 insertions(+), 6 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 1c899a0..bf84b99 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -19,4 +19,7 @@
 bool colo_supported(void);
 void colo_info_init(void);
 
+void migrate_start_colo_process(MigrationState *s);
+bool migration_in_colo_state(void);
+
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index d215057..f480431 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -11,9 +11,38 @@
  */
 
 #include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
 #include "migration/colo.h"
+#include "trace.h"
 
 bool colo_supported(void)
 {
     return false;
 }
+
+bool migration_in_colo_state(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return (s->state == MIGRATION_STATUS_COLO);
+}
+
+static void colo_process_checkpoint(MigrationState *s)
+{
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("stop", "run");
+
+    /* TODO: COLO checkpoint savevm loop */
+
+}
+
+void migrate_start_colo_process(MigrationState *s)
+{
+    qemu_mutex_unlock_iothread();
+    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COLO);
+    colo_process_checkpoint(s);
+    qemu_mutex_lock_iothread();
+}
diff --git a/migration/migration.c b/migration/migration.c
index c7869e5..aff3eb4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -695,6 +695,10 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 
         get_xbzrle_cache_stats(info);
         break;
+    case MIGRATION_STATUS_COLO:
+        info->has_status = true;
+        /* TODO: display COLO specific information (checkpoint info etc.) */
+        break;
     case MIGRATION_STATUS_COMPLETED:
         get_xbzrle_cache_stats(info);
 
@@ -1113,7 +1117,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
     params.shared = has_inc && inc;
 
     if (migration_is_setup_or_active(s->state) ||
-        s->state == MIGRATION_STATUS_CANCELLING) {
+        s->state == MIGRATION_STATUS_CANCELLING ||
+        s->state == MIGRATION_STATUS_COLO) {
         error_setg(errp, QERR_MIGRATION_ACTIVE);
         return;
     }
@@ -1661,7 +1666,11 @@ static void migration_completion(MigrationState *s, int current_active_state,
 
         if (!ret) {
             ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
-            if (ret >= 0) {
+            /*
+             * Don't mark the image with BDRV_O_INACTIVE flag if
+             * we will go into COLO stage later.
+             */
+            if (ret >= 0 && !migrate_colo_enabled()) {
                 ret = bdrv_inactivate_all();
             }
             if (ret >= 0) {
@@ -1703,8 +1712,11 @@ static void migration_completion(MigrationState *s, int current_active_state,
         goto fail_invalidate;
     }
 
-    migrate_set_state(&s->state, current_active_state,
-                      MIGRATION_STATUS_COMPLETED);
+    if (!migrate_colo_enabled()) {
+        migrate_set_state(&s->state, current_active_state,
+                          MIGRATION_STATUS_COMPLETED);
+    }
+
     return;
 
 fail_invalidate:
@@ -1749,6 +1761,7 @@ static void *migration_thread(void *opaque)
     bool entered_postcopy = false;
     /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
     enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
+    bool enable_colo = migrate_colo_enabled();
 
     rcu_register_thread();
 
@@ -1857,7 +1870,13 @@ static void *migration_thread(void *opaque)
     end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 
     qemu_mutex_lock_iothread();
-    qemu_savevm_state_cleanup();
+    /*
+     * The resource has been allocated by migration will be reused in COLO
+     * process, so don't release them.
+     */
+    if (!enable_colo) {
+        qemu_savevm_state_cleanup();
+    }
     if (s->state == MIGRATION_STATUS_COMPLETED) {
         uint64_t transferred_bytes = qemu_ftell(s->to_dst_file);
         s->total_time = end_time - s->total_time;
@@ -1870,6 +1889,15 @@ static void *migration_thread(void *opaque)
         }
         runstate_set(RUN_STATE_POSTMIGRATE);
     } else {
+        if (s->state == MIGRATION_STATUS_ACTIVE && enable_colo) {
+            migrate_start_colo_process(s);
+            qemu_savevm_state_cleanup();
+            /*
+            * Fixme: we will run VM in COLO no matter its old running state.
+            * After exited COLO, we will keep running.
+            */
+            old_vm_running = true;
+        }
         if (old_vm_running && !entered_postcopy) {
             vm_start();
         } else {
diff --git a/migration/trace-events b/migration/trace-events
index dfee75a..a29e5a0 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -207,3 +207,6 @@ migration_tls_outgoing_handshake_complete(void) ""
 migration_tls_incoming_handshake_start(void) ""
 migration_tls_incoming_handshake_error(const char *err) "err=%s"
 migration_tls_incoming_handshake_complete(void) ""
+
+# migration/colo.c
+colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
diff --git a/qapi-schema.json b/qapi-schema.json
index 0fb4d7e..8b21215 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -459,12 +459,14 @@
 #
 # @failed: some error occurred during migration process.
 #
+# @colo: VM is in the process of fault tolerance. (since 2.8)
+#
 # Since: 2.3
 #
 ##
 { 'enum': 'MigrationStatus',
   'data': [ 'none', 'setup', 'cancelling', 'cancelled',
-            'active', 'postcopy-active', 'completed', 'failed' ] }
+            'active', 'postcopy-active', 'completed', 'failed', 'colo' ] }
 
 ##
 # @MigrationInfo
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index d215057..0c8eef4 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -17,3 +17,12 @@ bool colo_supported(void)
 {
     return false;
 }
+
+bool migration_in_colo_state(void)
+{
+    return false;
+}
+
+void migrate_start_colo_process(MigrationState *s)
+{
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 04/18] migration: Switch to COLO process after finishing loadvm
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (2 preceding siblings ...)
  2016-10-30 10:46 ` [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 05/18] COLO: Establish a new communicating path for COLO Amit Shah
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.

We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 include/migration/colo.h      |  7 +++++++
 include/migration/migration.h |  7 +++++++
 migration/colo-comm.c         | 10 ++++++++++
 migration/colo.c              | 21 +++++++++++++++++++++
 migration/migration.c         | 12 ++++++++++++
 stubs/migration-colo.c        | 10 ++++++++++
 6 files changed, 67 insertions(+)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index bf84b99..b40676c 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -15,6 +15,8 @@
 
 #include "qemu-common.h"
 #include "migration/migration.h"
+#include "qemu/coroutine_int.h"
+#include "qemu/thread.h"
 
 bool colo_supported(void);
 void colo_info_init(void);
@@ -22,4 +24,9 @@ void colo_info_init(void);
 void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
+/* loadvm */
+bool migration_incoming_enable_colo(void);
+void migration_incoming_exit_colo(void);
+void *colo_process_incoming_thread(void *opaque);
+bool migration_incoming_in_colo_state(void);
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 34f442b..c309d23 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -21,6 +21,7 @@
 #include "migration/vmstate.h"
 #include "qapi-types.h"
 #include "exec/cpu-common.h"
+#include "qemu/coroutine_int.h"
 
 #define QEMU_VM_FILE_MAGIC           0x5145564d
 #define QEMU_VM_FILE_VERSION_COMPAT  0x00000002
@@ -107,6 +108,12 @@ struct MigrationIncomingState {
     QEMUBH *bh;
 
     int state;
+
+    bool have_colo_incoming_thread;
+    QemuThread colo_incoming_thread;
+    /* The coroutine we should enter (back) after failover */
+    Coroutine *migration_incoming_co;
+
     /* See savevm.c */
     LoadStateEntry_Head loadvm_handlers;
 };
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
index a2d5185..bf44f76 100644
--- a/migration/colo-comm.c
+++ b/migration/colo-comm.c
@@ -49,3 +49,13 @@ void colo_info_init(void)
 {
     vmstate_register(NULL, 0, &colo_state, &colo_info);
 }
+
+bool migration_incoming_enable_colo(void)
+{
+    return colo_info.colo_requested;
+}
+
+void migration_incoming_exit_colo(void)
+{
+    colo_info.colo_requested = false;
+}
diff --git a/migration/colo.c b/migration/colo.c
index f480431..550cc5f 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -27,6 +27,13 @@ bool migration_in_colo_state(void)
     return (s->state == MIGRATION_STATUS_COLO);
 }
 
+bool migration_incoming_in_colo_state(void)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    return mis && (mis->state == MIGRATION_STATUS_COLO);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     qemu_mutex_lock_iothread();
@@ -46,3 +53,17 @@ void migrate_start_colo_process(MigrationState *s)
     colo_process_checkpoint(s);
     qemu_mutex_lock_iothread();
 }
+
+void *colo_process_incoming_thread(void *opaque)
+{
+    MigrationIncomingState *mis = opaque;
+
+    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COLO);
+
+    /* TODO: COLO checkpoint restore loop */
+
+    migration_incoming_exit_colo();
+
+    return NULL;
+}
diff --git a/migration/migration.c b/migration/migration.c
index aff3eb4..d8421b5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -407,6 +407,18 @@ static void process_incoming_migration_co(void *opaque)
         /* Else if something went wrong then just fall out of the normal exit */
     }
 
+    /* we get COLO info, and know if we are in COLO mode */
+    if (!ret && migration_incoming_enable_colo()) {
+        mis->migration_incoming_co = qemu_coroutine_self();
+        qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
+             colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
+        mis->have_colo_incoming_thread = true;
+        qemu_coroutine_yield();
+
+        /* Wait checkpoint incoming thread exit before free resource */
+        qemu_thread_join(&mis->colo_incoming_thread);
+    }
+
     qemu_fclose(f);
     free_xbzrle_decoded_buf();
 
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 0c8eef4..7b72395 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -23,6 +23,16 @@ bool migration_in_colo_state(void)
     return false;
 }
 
+bool migration_incoming_in_colo_state(void)
+{
+    return false;
+}
+
 void migrate_start_colo_process(MigrationState *s)
 {
 }
+
+void *colo_process_incoming_thread(void *opaque)
+{
+    return NULL;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 05/18] COLO: Establish a new communicating path for COLO
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (3 preceding siblings ...)
  2016-10-30 10:46 ` [Qemu-devel] [PULL 04/18] migration: Switch to COLO process after finishing loadvm Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-30 10:46 ` [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol Amit Shah
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

This new communication path will be used for returning messages
from Secondary side to Primary side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 migration/colo.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 550cc5f..3a0d804 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -14,6 +14,7 @@
 #include "sysemu/sysemu.h"
 #include "migration/colo.h"
 #include "trace.h"
+#include "qemu/error-report.h"
 
 bool colo_supported(void)
 {
@@ -36,6 +37,12 @@ bool migration_incoming_in_colo_state(void)
 
 static void colo_process_checkpoint(MigrationState *s)
 {
+    s->rp_state.from_dst_file = qemu_file_get_return_path(s->to_dst_file);
+    if (!s->rp_state.from_dst_file) {
+        error_report("Open QEMUFile from_dst_file failed");
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
@@ -43,6 +50,10 @@ static void colo_process_checkpoint(MigrationState *s)
 
     /* TODO: COLO checkpoint savevm loop */
 
+out:
+    if (s->rp_state.from_dst_file) {
+        qemu_fclose(s->rp_state.from_dst_file);
+    }
 }
 
 void migrate_start_colo_process(MigrationState *s)
@@ -61,8 +72,25 @@ void *colo_process_incoming_thread(void *opaque)
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
 
+    mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
+    if (!mis->to_src_file) {
+        error_report("COLO incoming thread: Open QEMUFile to_src_file failed");
+        goto out;
+    }
+    /*
+     * Note: the communication between Primary side and Secondary side
+     * should be sequential, we set the fd to unblocked in migration incoming
+     * coroutine, and here we are in the COLO incoming thread, so it is ok to
+     * set the fd back to blocked.
+     */
+    qemu_file_set_blocking(mis->from_src_file, true);
+
     /* TODO: COLO checkpoint restore loop */
 
+out:
+    if (mis->to_src_file) {
+        qemu_fclose(mis->to_src_file);
+    }
     migration_incoming_exit_colo();
 
     return NULL;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (4 preceding siblings ...)
  2016-10-30 10:46 ` [Qemu-devel] [PULL 05/18] COLO: Establish a new communicating path for COLO Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-31 18:25   ` Eduardo Habkost
  2016-10-30 10:46 ` [Qemu-devel] [PULL 07/18] COLO: Add a new RunState RUN_STATE_COLO Amit Shah
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

We need communications protocol of user-defined to control
the checkpointing process.

The new checkpointing request is started by Primary VM,
and the interactive process like below:

Checkpoint synchronizing points:

                   Primary               Secondary
                                            initial work
'checkpoint-ready'    <-------------------- @

'checkpoint-request'  @ -------------------->
                                            Suspend (Only in hybrid mode)
'checkpoint-reply'    <-------------------- @
                      Suspend&Save state
'vmstate-send'        @ -------------------->
                      Send state            Receive state
'vmstate-received'    <-------------------- @
                      Release packets       Load state
'vmstate-load'        <-------------------- @
                      Resume                Resume (Only in hybrid mode)

                      Start Comparing (Only in hybrid mode)
NOTE:
 1) '@' who sends the message
 2) Every sync-point is synchronized by two sides with only
    one handshake(single direction) for low-latency.
    If more strict synchronization is required, a opposite direction
    sync-point should be added.
 3) Since sync-points are single direction, the remote side may
    go forward a lot when this side just receives the sync-point.
 4) For now, we only support 'periodic' checkpoint, for which
   the Secondary VM is not running, later we will support 'hybrid' mode.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 migration/colo.c       | 201 ++++++++++++++++++++++++++++++++++++++++++++++++-
 migration/trace-events |   2 +
 qapi-schema.json       |  25 ++++++
 3 files changed, 226 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 3a0d804..fcc9047 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -15,6 +15,7 @@
 #include "migration/colo.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "qapi/error.h"
 
 bool colo_supported(void)
 {
@@ -35,22 +36,147 @@ bool migration_incoming_in_colo_state(void)
     return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+static void colo_send_message(QEMUFile *f, COLOMessage msg,
+                              Error **errp)
+{
+    int ret;
+
+    if (msg >= COLO_MESSAGE__MAX) {
+        error_setg(errp, "%s: Invalid message", __func__);
+        return;
+    }
+    qemu_put_be32(f, msg);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Can't send COLO message");
+    }
+    trace_colo_send_message(COLOMessage_lookup[msg]);
+}
+
+static COLOMessage colo_receive_message(QEMUFile *f, Error **errp)
+{
+    COLOMessage msg;
+    int ret;
+
+    msg = qemu_get_be32(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Can't receive COLO message");
+        return msg;
+    }
+    if (msg >= COLO_MESSAGE__MAX) {
+        error_setg(errp, "%s: Invalid message", __func__);
+        return msg;
+    }
+    trace_colo_receive_message(COLOMessage_lookup[msg]);
+    return msg;
+}
+
+static void colo_receive_check_message(QEMUFile *f, COLOMessage expect_msg,
+                                       Error **errp)
+{
+    COLOMessage msg;
+    Error *local_err = NULL;
+
+    msg = colo_receive_message(f, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    if (msg != expect_msg) {
+        error_setg(errp, "Unexpected COLO message %d, expected %d",
+                          msg, expect_msg);
+    }
+}
+
+static int colo_do_checkpoint_transaction(MigrationState *s)
+{
+    Error *local_err = NULL;
+
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
+                      &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                    COLO_MESSAGE_CHECKPOINT_REPLY, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    /* TODO: suspend and save vm state to colo buffer */
+
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    /* TODO: send vmstate to Secondary */
+
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                       COLO_MESSAGE_VMSTATE_RECEIVED, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                       COLO_MESSAGE_VMSTATE_LOADED, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    /* TODO: resume Primary */
+
+    return 0;
+out:
+    if (local_err) {
+        error_report_err(local_err);
+    }
+    return -EINVAL;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
+    Error *local_err = NULL;
+    int ret;
+
     s->rp_state.from_dst_file = qemu_file_get_return_path(s->to_dst_file);
     if (!s->rp_state.from_dst_file) {
         error_report("Open QEMUFile from_dst_file failed");
         goto out;
     }
 
+    /*
+     * Wait for Secondary finish loading VM states and enter COLO
+     * restore.
+     */
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                       COLO_MESSAGE_CHECKPOINT_READY, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
 
-    /* TODO: COLO checkpoint savevm loop */
+    while (s->state == MIGRATION_STATUS_COLO) {
+        ret = colo_do_checkpoint_transaction(s);
+        if (ret < 0) {
+            goto out;
+        }
+    }
 
 out:
+    /* Throw the unreported error message after exited from loop */
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     if (s->rp_state.from_dst_file) {
         qemu_fclose(s->rp_state.from_dst_file);
     }
@@ -65,9 +191,33 @@ void migrate_start_colo_process(MigrationState *s)
     qemu_mutex_lock_iothread();
 }
 
+static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
+                                     Error **errp)
+{
+    COLOMessage msg;
+    Error *local_err = NULL;
+
+    msg = colo_receive_message(f, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    switch (msg) {
+    case COLO_MESSAGE_CHECKPOINT_REQUEST:
+        *checkpoint_request = 1;
+        break;
+    default:
+        *checkpoint_request = 0;
+        error_setg(errp, "Got unknown COLO message: %d", msg);
+        break;
+    }
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
+    Error *local_err = NULL;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
@@ -85,9 +235,56 @@ void *colo_process_incoming_thread(void *opaque)
      */
     qemu_file_set_blocking(mis->from_src_file, true);
 
-    /* TODO: COLO checkpoint restore loop */
+    colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
+                      &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    while (mis->state == MIGRATION_STATUS_COLO) {
+        int request;
+
+        colo_wait_handle_message(mis->from_src_file, &request, &local_err);
+        if (local_err) {
+            goto out;
+        }
+        assert(request);
+        /* FIXME: This is unnecessary for periodic checkpoint mode */
+        colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
+                     &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        colo_receive_check_message(mis->from_src_file,
+                           COLO_MESSAGE_VMSTATE_SEND, &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        /* TODO: read migration data into colo buffer */
+
+        colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_RECEIVED,
+                     &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        /* TODO: load vm state */
+
+        colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_LOADED,
+                     &local_err);
+        if (local_err) {
+            goto out;
+        }
+    }
 
 out:
+    /* Throw the unreported error message after exited from loop */
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
diff --git a/migration/trace-events b/migration/trace-events
index a29e5a0..f374c8c 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -210,3 +210,5 @@ migration_tls_incoming_handshake_complete(void) ""
 
 # migration/colo.c
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
+colo_send_message(const char *msg) "Send '%s' message"
+colo_receive_message(const char *msg) "Receive '%s' message"
diff --git a/qapi-schema.json b/qapi-schema.json
index 8b21215..cafada8 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -787,6 +787,31 @@
 { 'command': 'migrate-start-postcopy' }
 
 ##
+# @COLOMessage
+#
+# The message transmission between Primary side and Secondary side.
+#
+# @checkpoint-ready: Secondary VM (SVM) is ready for checkpointing
+#
+# @checkpoint-request: Primary VM (PVM) tells SVM to prepare for checkpointing
+#
+# @checkpoint-reply: SVM gets PVM's checkpoint request
+#
+# @vmstate-send: VM's state will be sent by PVM.
+#
+# @vmstate-size: The total size of VMstate.
+#
+# @vmstate-received: VM's state has been received by SVM.
+#
+# @vmstate-loaded: VM's state has been loaded by SVM.
+#
+# Since: 2.8
+##
+{ 'enum': 'COLOMessage',
+  'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
+            'vmstate-send', 'vmstate-size', 'vmstate-received',
+            'vmstate-loaded' ] }
+
 # @MouseInfo:
 #
 # Information about a mouse device.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 07/18] COLO: Add a new RunState RUN_STATE_COLO
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (5 preceding siblings ...)
  2016-10-30 10:46 ` [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol Amit Shah
@ 2016-10-30 10:46 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 08/18] COLO: Send PVM state to secondary side when do checkpoint Amit Shah
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:46 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Guest will enter this state when paused to save/restore VM state
under COLO checkpoint.

Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 qapi-schema.json | 5 ++++-
 vl.c             | 8 ++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index cafada8..ee58fe6 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -175,12 +175,15 @@
 # @watchdog: the watchdog action is configured to pause and has been triggered
 #
 # @guest-panicked: guest has been panicked as a result of guest OS panic
+#
+# @colo: guest is paused to save/restore VM state under colo checkpoint (since
+# 2.8)
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
             'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
             'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
-            'guest-panicked' ] }
+            'guest-panicked', 'colo' ] }
 
 ##
 # @StatusInfo:
diff --git a/vl.c b/vl.c
index a1cee4f..0360808 100644
--- a/vl.c
+++ b/vl.c
@@ -614,6 +614,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_INMIGRATE, RUN_STATE_FINISH_MIGRATE },
     { RUN_STATE_INMIGRATE, RUN_STATE_PRELAUNCH },
     { RUN_STATE_INMIGRATE, RUN_STATE_POSTMIGRATE },
+    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },
 
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -626,6 +627,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
     { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
     { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_PAUSED, RUN_STATE_COLO},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -638,10 +640,13 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
 
+    { RUN_STATE_COLO, RUN_STATE_RUNNING },
+
     { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
     { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
     { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
@@ -652,6 +657,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
     { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
     { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
+    { RUN_STATE_RUNNING, RUN_STATE_COLO},
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
@@ -664,10 +670,12 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
     { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
     { RUN_STATE_WATCHDOG, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},
 
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 08/18] COLO: Send PVM state to secondary side when do checkpoint
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (6 preceding siblings ...)
  2016-10-30 10:46 ` [Qemu-devel] [PULL 07/18] COLO: Add a new RunState RUN_STATE_COLO Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 09/18] COLO: Load VMState into QIOChannelBuffer before restore it Amit Shah
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

VM checkpointing is to synchronize the state of PVM to SVM, just
like migration does, we re-use save helpers to achieve migrating
PVM's state to Secondary side.

COLO need to cache the data of VM's state in the secondary side before
synchronize it to SVM. COLO need the size of the data to determine
how much data should be read in the secondary side.
So here, we can get the size of the data by saving it into I/O channel
before send it to the secondary side.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 migration/colo.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
 migration/ram.c  | 37 +++++++++++++++++-------
 2 files changed, 105 insertions(+), 17 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index fcc9047..9dc54fc 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -13,10 +13,13 @@
 #include "qemu/osdep.h"
 #include "sysemu/sysemu.h"
 #include "migration/colo.h"
+#include "io/channel-buffer.h"
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 
+#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
+
 bool colo_supported(void)
 {
     return false;
@@ -55,6 +58,27 @@ static void colo_send_message(QEMUFile *f, COLOMessage msg,
     trace_colo_send_message(COLOMessage_lookup[msg]);
 }
 
+static void colo_send_message_value(QEMUFile *f, COLOMessage msg,
+                                    uint64_t value, Error **errp)
+{
+    Error *local_err = NULL;
+    int ret;
+
+    colo_send_message(f, msg, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    qemu_put_be64(f, value);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to send value for message:%s",
+                         COLOMessage_lookup[msg]);
+    }
+}
+
 static COLOMessage colo_receive_message(QEMUFile *f, Error **errp)
 {
     COLOMessage msg;
@@ -91,9 +115,12 @@ static void colo_receive_check_message(QEMUFile *f, COLOMessage expect_msg,
     }
 }
 
-static int colo_do_checkpoint_transaction(MigrationState *s)
+static int colo_do_checkpoint_transaction(MigrationState *s,
+                                          QIOChannelBuffer *bioc,
+                                          QEMUFile *fb)
 {
     Error *local_err = NULL;
+    int ret = -1;
 
     colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
                       &local_err);
@@ -106,15 +133,46 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
     if (local_err) {
         goto out;
     }
+    /* Reset channel-buffer directly */
+    qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
+    bioc->usage = 0;
 
-    /* TODO: suspend and save vm state to colo buffer */
+    qemu_mutex_lock_iothread();
+    vm_stop_force_state(RUN_STATE_COLO);
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("run", "stop");
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_header(fb);
+    qemu_savevm_state_begin(fb, &s->params);
+    qemu_mutex_lock_iothread();
+    qemu_savevm_state_complete_precopy(fb, false);
+    qemu_mutex_unlock_iothread();
+
+    qemu_fflush(fb);
 
     colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
     if (local_err) {
         goto out;
     }
+    /*
+     * We need the size of the VMstate data in Secondary side,
+     * With which we can decide how much data should be read.
+     */
+    colo_send_message_value(s->to_dst_file, COLO_MESSAGE_VMSTATE_SIZE,
+                            bioc->usage, &local_err);
+    if (local_err) {
+        goto out;
+    }
 
-    /* TODO: send vmstate to Secondary */
+    qemu_put_buffer(s->to_dst_file, bioc->data, bioc->usage);
+    qemu_fflush(s->to_dst_file);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        goto out;
+    }
 
     colo_receive_check_message(s->rp_state.from_dst_file,
                        COLO_MESSAGE_VMSTATE_RECEIVED, &local_err);
@@ -128,18 +186,24 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
         goto out;
     }
 
-    /* TODO: resume Primary */
+    ret = 0;
+
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("stop", "run");
 
-    return 0;
 out:
     if (local_err) {
         error_report_err(local_err);
     }
-    return -EINVAL;
+    return ret;
 }
 
 static void colo_process_checkpoint(MigrationState *s)
 {
+    QIOChannelBuffer *bioc;
+    QEMUFile *fb = NULL;
     Error *local_err = NULL;
     int ret;
 
@@ -158,6 +222,9 @@ static void colo_process_checkpoint(MigrationState *s)
     if (local_err) {
         goto out;
     }
+    bioc = qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE);
+    fb = qemu_fopen_channel_output(QIO_CHANNEL(bioc));
+    object_unref(OBJECT(bioc));
 
     qemu_mutex_lock_iothread();
     vm_start();
@@ -165,7 +232,7 @@ static void colo_process_checkpoint(MigrationState *s)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
-        ret = colo_do_checkpoint_transaction(s);
+        ret = colo_do_checkpoint_transaction(s, bioc, fb);
         if (ret < 0) {
             goto out;
         }
@@ -177,6 +244,10 @@ out:
         error_report_err(local_err);
     }
 
+    if (fb) {
+        qemu_fclose(fb);
+    }
+
     if (s->rp_state.from_dst_file) {
         qemu_fclose(s->rp_state.from_dst_file);
     }
diff --git a/migration/ram.c b/migration/ram.c
index d032d38..fb9252d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -43,6 +43,7 @@
 #include "trace.h"
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
+#include "migration/colo.h"
 
 #ifdef DEBUG_MIGRATION_RAM
 #define DPRINTF(fmt, ...) \
@@ -1871,16 +1872,8 @@ err:
     return ret;
 }
 
-
-/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
- * long-running RCU critical section.  When rcu-reclaims in the code
- * start to become numerous it will be necessary to reduce the
- * granularity of these critical sections.
- */
-
-static int ram_save_setup(QEMUFile *f, void *opaque)
+static int ram_save_init_globals(void)
 {
-    RAMBlock *block;
     int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
 
     dirty_rate_high_cnt = 0;
@@ -1947,6 +1940,29 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     migration_bitmap_sync();
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
+    rcu_read_unlock();
+
+    return 0;
+}
+
+/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
+ * long-running RCU critical section.  When rcu-reclaims in the code
+ * start to become numerous it will be necessary to reduce the
+ * granularity of these critical sections.
+ */
+
+static int ram_save_setup(QEMUFile *f, void *opaque)
+{
+    RAMBlock *block;
+
+    /* migration has already setup the bitmap, reuse it. */
+    if (!migration_in_colo_state()) {
+        if (ram_save_init_globals() < 0) {
+            return -1;
+         }
+    }
+
+    rcu_read_lock();
 
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
@@ -2048,7 +2064,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     while (true) {
         int pages;
 
-        pages = ram_find_and_save_block(f, true, &bytes_transferred);
+        pages = ram_find_and_save_block(f, !migration_in_colo_state(),
+                                        &bytes_transferred);
         /* no more blocks to sent */
         if (pages == 0) {
             break;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 09/18] COLO: Load VMState into QIOChannelBuffer before restore it
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (7 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 08/18] COLO: Send PVM state to secondary side when do checkpoint Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters Amit Shah
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

We should not destroy the state of SVM (Secondary VM) until we receive
the complete data of PVM's state, in case the primary fails in the process
of sending the state, so we cache the VM's state in secondary side before
load it into SVM.

Besides, we should call qemu_system_reset() before load VM state,
which can ensure the data is intact.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 migration/colo.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 65 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 9dc54fc..bb9b870 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -115,6 +115,28 @@ static void colo_receive_check_message(QEMUFile *f, COLOMessage expect_msg,
     }
 }
 
+static uint64_t colo_receive_message_value(QEMUFile *f, uint32_t expect_msg,
+                                           Error **errp)
+{
+    Error *local_err = NULL;
+    uint64_t value;
+    int ret;
+
+    colo_receive_check_message(f, expect_msg, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return 0;
+    }
+
+    value = qemu_get_be64(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to get value for COLO message: %s",
+                         COLOMessage_lookup[expect_msg]);
+    }
+    return value;
+}
+
 static int colo_do_checkpoint_transaction(MigrationState *s,
                                           QIOChannelBuffer *bioc,
                                           QEMUFile *fb)
@@ -288,6 +310,10 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
+    QEMUFile *fb = NULL;
+    QIOChannelBuffer *bioc = NULL; /* Cache incoming device state */
+    uint64_t total_size;
+    uint64_t value;
     Error *local_err = NULL;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -306,6 +332,10 @@ void *colo_process_incoming_thread(void *opaque)
      */
     qemu_file_set_blocking(mis->from_src_file, true);
 
+    bioc = qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE);
+    fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
+    object_unref(OBJECT(bioc));
+
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -333,7 +363,29 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        /* TODO: read migration data into colo buffer */
+        value = colo_receive_message_value(mis->from_src_file,
+                                 COLO_MESSAGE_VMSTATE_SIZE, &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        /*
+         * Read VM device state data into channel buffer,
+         * It's better to re-use the memory allocated.
+         * Here we need to handle the channel buffer directly.
+         */
+        if (value > bioc->capacity) {
+            bioc->capacity = value;
+            bioc->data = g_realloc(bioc->data, bioc->capacity);
+        }
+        total_size = qemu_get_buffer(mis->from_src_file, bioc->data, value);
+        if (total_size != value) {
+            error_report("Got %" PRIu64 " VMState data, less than expected"
+                        " %" PRIu64, total_size, value);
+            goto out;
+        }
+        bioc->usage = total_size;
+        qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
 
         colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_RECEIVED,
                      &local_err);
@@ -341,7 +393,14 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        /* TODO: load vm state */
+        qemu_mutex_lock_iothread();
+        qemu_system_reset(VMRESET_SILENT);
+        if (qemu_loadvm_state(fb) < 0) {
+            error_report("COLO: loadvm failed");
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+        qemu_mutex_unlock_iothread();
 
         colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_LOADED,
                      &local_err);
@@ -356,6 +415,10 @@ out:
         error_report_err(local_err);
     }
 
+    if (fb) {
+        qemu_fclose(fb);
+    }
+
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (8 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 09/18] COLO: Load VMState into QIOChannelBuffer before restore it Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-31 17:17   ` Juan Quintela
  2016-10-30 10:47 ` [Qemu-devel] [PULL 11/18] COLO: Synchronize PVM's state to SVM periodically Amit Shah
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Add checkpoint-delay parameter for migrate-set-parameters, so that
we can control the checkpoint frequency when COLO is in periodic mode.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 docs/qmp-commands.txt |  2 ++
 hmp.c                 |  8 ++++++++
 migration/migration.c | 16 ++++++++++++++++
 qapi-schema.json      | 12 ++++++++++--
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
index 9bdbbfe..2e42929 100644
--- a/docs/qmp-commands.txt
+++ b/docs/qmp-commands.txt
@@ -2916,6 +2916,8 @@ Set migration parameters
 - "max-bandwidth": set maximum speed for migrations (in bytes/sec) (json-int)
 - "downtime-limit": set maximum tolerated downtime (in milliseconds) for
                     migrations (json-int)
+- "x-checkpoint-delay": set the delay time for periodic checkpoint (json-int)
+
 Arguments:
 
 Example:
diff --git a/hmp.c b/hmp.c
index 3d60259..f2b831a 100644
--- a/hmp.c
+++ b/hmp.c
@@ -318,6 +318,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, " %s: %" PRId64 " milliseconds",
             MigrationParameter_lookup[MIGRATION_PARAMETER_DOWNTIME_LIMIT],
             params->downtime_limit);
+        monitor_printf(mon, " %s: %" PRId64,
+            MigrationParameter_lookup[MIGRATION_PARAMETER_X_CHECKPOINT_DELAY],
+            params->x_checkpoint_delay);
         monitor_printf(mon, "\n");
     }
 
@@ -1386,6 +1389,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                 p.has_downtime_limit = true;
                 use_int_value = true;
                 break;
+            case MIGRATION_PARAMETER_X_CHECKPOINT_DELAY:
+                p.has_x_checkpoint_delay = true;
+                use_int_value = true;
+                break;
             }
 
             if (use_int_value) {
@@ -1402,6 +1409,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                 p.cpu_throttle_initial = valueint;
                 p.cpu_throttle_increment = valueint;
                 p.downtime_limit = valueint;
+                p.x_checkpoint_delay = valueint;
             }
 
             qmp_migrate_set_parameters(&p, &err);
diff --git a/migration/migration.c b/migration/migration.c
index d8421b5..d87065a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -63,6 +63,11 @@
 /* Migration XBZRLE default cache size */
 #define DEFAULT_MIGRATE_CACHE_SIZE (64 * 1024 * 1024)
 
+/* The delay time (in ms) between two COLO checkpoints
+ * Note: Please change this default value to 10000 when we support hybrid mode.
+ */
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
 
@@ -95,6 +100,7 @@ MigrationState *migrate_get_current(void)
             .cpu_throttle_increment = DEFAULT_MIGRATE_CPU_THROTTLE_INCREMENT,
             .max_bandwidth = MAX_THROTTLE,
             .downtime_limit = DEFAULT_MIGRATE_SET_DOWNTIME,
+            .x_checkpoint_delay = DEFAULT_MIGRATE_X_CHECKPOINT_DELAY,
         },
     };
 
@@ -587,6 +593,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->max_bandwidth = s->parameters.max_bandwidth;
     params->has_downtime_limit = true;
     params->downtime_limit = s->parameters.downtime_limit;
+    params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
 
     return params;
 }
@@ -845,6 +852,11 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
                    "an integer in the range of 0 to 2000000 milliseconds");
         return;
     }
+    if (params->has_x_checkpoint_delay && (params->x_checkpoint_delay < 0)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                    "x_checkpoint_delay",
+                    "is invalid, it should be positive");
+    }
 
     if (params->has_compress_level) {
         s->parameters.compress_level = params->compress_level;
@@ -879,6 +891,10 @@ void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
     if (params->has_downtime_limit) {
         s->parameters.downtime_limit = params->downtime_limit;
     }
+
+    if (params->has_x_checkpoint_delay) {
+        s->parameters.x_checkpoint_delay = params->x_checkpoint_delay;
+    }
 }
 
 
diff --git a/qapi-schema.json b/qapi-schema.json
index ee58fe6..05fed9a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -674,19 +674,24 @@
 # @downtime-limit: set maximum tolerated downtime for migration. maximum
 #                  downtime in milliseconds (Since 2.8)
 #
+# @x-checkpoint-delay: The delay time (in ms) between two COLO checkpoints in
+#          periodic mode. (Since 2.8)
+#
 # Since: 2.4
 ##
 { 'enum': 'MigrationParameter',
   'data': ['compress-level', 'compress-threads', 'decompress-threads',
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'tls-creds', 'tls-hostname', 'max-bandwidth',
-           'downtime-limit'] }
+           'downtime-limit', 'x-checkpoint-delay' ] }
 
 #
 # @migrate-set-parameters
 #
 # Set various migration parameters.  See MigrationParameters for details.
 #
+# @x-checkpoint-delay: the delay time between two checkpoints. (Since 2.8)
+#
 # Since: 2.4
 ##
 { 'command': 'migrate-set-parameters', 'boxed': true,
@@ -735,6 +740,8 @@
 # @downtime-limit: set maximum tolerated downtime for migration. maximum
 #                  downtime in milliseconds (Since 2.8)
 #
+# @x-checkpoint-delay: the delay time between two COLO checkpoints. (Since 2.8)
+#
 # Since: 2.4
 ##
 { 'struct': 'MigrationParameters',
@@ -746,7 +753,8 @@
             '*tls-creds': 'str',
             '*tls-hostname': 'str',
             '*max-bandwidth': 'int',
-            '*downtime-limit': 'int'} }
+            '*downtime-limit': 'int',
+            '*x-checkpoint-delay': 'int'} }
 
 ##
 # @query-migrate-parameters
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 11/18] COLO: Synchronize PVM's state to SVM periodically
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (9 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 12/18] COLO: Add 'x-colo-lost-heartbeat' command to trigger failover Amit Shah
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Do checkpoint periodically, the default interval is 200ms.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 migration/colo.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index bb9b870..a35cc59 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -11,6 +11,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/timer.h"
 #include "sysemu/sysemu.h"
 #include "migration/colo.h"
 #include "io/channel-buffer.h"
@@ -226,6 +227,7 @@ static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
     QEMUFile *fb = NULL;
+    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     Error *local_err = NULL;
     int ret;
 
@@ -254,10 +256,20 @@ static void colo_process_checkpoint(MigrationState *s)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+        if (current_time - checkpoint_time <
+            s->parameters.x_checkpoint_delay) {
+            int64_t delay_ms;
+
+            delay_ms = s->parameters.x_checkpoint_delay -
+                       (current_time - checkpoint_time);
+            g_usleep(delay_ms * 1000);
+        }
         ret = colo_do_checkpoint_transaction(s, bioc, fb);
         if (ret < 0) {
             goto out;
         }
+        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     }
 
 out:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 12/18] COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (10 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 11/18] COLO: Synchronize PVM's state to SVM periodically Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 13/18] COLO: Introduce state to record failover process Amit Shah
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

We leave users to choose whatever heartbeat solution they want,
if the heartbeat is lost, or other errors they detect, they can use
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
COLO will do operations accordingly.

For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode, does cleanup work,
and then, PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run failover work, then takes over PVM's service work.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 docs/qmp-commands.txt        | 10 ++++++++++
 hmp-commands.hx              | 15 +++++++++++++++
 hmp.c                        |  8 ++++++++
 hmp.h                        |  1 +
 include/migration/colo.h     |  3 +++
 include/migration/failover.h | 20 ++++++++++++++++++++
 migration/Makefile.objs      |  2 +-
 migration/colo-comm.c        | 11 +++++++++++
 migration/colo-failover.c    | 42 ++++++++++++++++++++++++++++++++++++++++++
 migration/colo.c             |  1 +
 qapi-schema.json             | 29 +++++++++++++++++++++++++++++
 stubs/migration-colo.c       |  8 ++++++++
 12 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 include/migration/failover.h
 create mode 100644 migration/colo-failover.c

diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
index 2e42929..a4732a5 100644
--- a/docs/qmp-commands.txt
+++ b/docs/qmp-commands.txt
@@ -554,6 +554,16 @@ Example:
 -> { "execute": "migrate_set_downtime", "arguments": { "value": 0.1 } }
 <- { "return": {} }
 
+x-colo-lost-heartbeat
+--------------------
+
+Tell COLO that heartbeat is lost, a failover or takeover is needed.
+
+Example:
+
+-> { "execute": "x-colo-lost-heartbeat" }
+<- { "return": {} }
+
 client_migrate_info
 -------------------
 
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 06bef47..8819281 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1040,6 +1040,21 @@ migration (or once already in postcopy).
 ETEXI
 
     {
+        .name       = "x_colo_lost_heartbeat",
+        .args_type  = "",
+        .params     = "",
+        .help       = "Tell COLO that heartbeat is lost,\n\t\t\t"
+                      "a failover or takeover is needed.",
+        .cmd = hmp_x_colo_lost_heartbeat,
+    },
+
+STEXI
+@item x_colo_lost_heartbeat
+@findex x_colo_lost_heartbeat
+Tell COLO that heartbeat is lost, a failover or takeover is needed.
+ETEXI
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/hmp.c b/hmp.c
index f2b831a..00af423 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1451,6 +1451,14 @@ void hmp_migrate_start_postcopy(Monitor *mon, const QDict *qdict)
     hmp_handle_error(mon, &err);
 }
 
+void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict)
+{
+    Error *err = NULL;
+
+    qmp_x_colo_lost_heartbeat(&err);
+    hmp_handle_error(mon, &err);
+}
+
 void hmp_set_password(Monitor *mon, const QDict *qdict)
 {
     const char *protocol  = qdict_get_str(qdict, "protocol");
diff --git a/hmp.h b/hmp.h
index 184769c..05daf7c 100644
--- a/hmp.h
+++ b/hmp.h
@@ -72,6 +72,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_cache_size(Monitor *mon, const QDict *qdict);
 void hmp_client_migrate_info(Monitor *mon, const QDict *qdict);
 void hmp_migrate_start_postcopy(Monitor *mon, const QDict *qdict);
+void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
 void hmp_set_password(Monitor *mon, const QDict *qdict);
 void hmp_expire_password(Monitor *mon, const QDict *qdict);
 void hmp_eject(Monitor *mon, const QDict *qdict);
diff --git a/include/migration/colo.h b/include/migration/colo.h
index b40676c..e9ac2c3 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -17,6 +17,7 @@
 #include "migration/migration.h"
 #include "qemu/coroutine_int.h"
 #include "qemu/thread.h"
+#include "qemu/main-loop.h"
 
 bool colo_supported(void);
 void colo_info_init(void);
@@ -29,4 +30,6 @@ bool migration_incoming_enable_colo(void);
 void migration_incoming_exit_colo(void);
 void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
+
+COLOMode get_colo_mode(void);
 #endif
diff --git a/include/migration/failover.h b/include/migration/failover.h
new file mode 100644
index 0000000..3274735
--- /dev/null
+++ b/include/migration/failover.h
@@ -0,0 +1,20 @@
+/*
+ *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ *  (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO.,LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_FAILOVER_H
+#define QEMU_FAILOVER_H
+
+#include "qemu-common.h"
+
+void failover_request_active(Error **errp);
+
+#endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 4bbe9ab..3f3e237 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,7 +1,7 @@
 common-obj-y += migration.o socket.o fd.o exec.o
 common-obj-y += tls.o
-common-obj-$(CONFIG_COLO) += colo.o
 common-obj-y += colo-comm.o
+common-obj-$(CONFIG_COLO) += colo.o colo-failover.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o
 common-obj-y += qemu-file-channel.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
index bf44f76..20b60ec 100644
--- a/migration/colo-comm.c
+++ b/migration/colo-comm.c
@@ -21,6 +21,17 @@ typedef struct {
 
 static COLOInfo colo_info;
 
+COLOMode get_colo_mode(void)
+{
+    if (migration_in_colo_state()) {
+        return COLO_MODE_PRIMARY;
+    } else if (migration_incoming_in_colo_state()) {
+        return COLO_MODE_SECONDARY;
+    } else {
+        return COLO_MODE_UNKNOWN;
+    }
+}
+
 static void colo_info_pre_save(void *opaque)
 {
     COLOInfo *s = opaque;
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
new file mode 100644
index 0000000..e31fc10
--- /dev/null
+++ b/migration/colo-failover.c
@@ -0,0 +1,42 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "migration/colo.h"
+#include "migration/failover.h"
+#include "qmp-commands.h"
+#include "qapi/qmp/qerror.h"
+
+static QEMUBH *failover_bh;
+
+static void colo_failover_bh(void *opaque)
+{
+    qemu_bh_delete(failover_bh);
+    failover_bh = NULL;
+    /* TODO: Do failover work */
+}
+
+void failover_request_active(Error **errp)
+{
+    failover_bh = qemu_bh_new(colo_failover_bh, NULL);
+    qemu_bh_schedule(failover_bh);
+}
+
+void qmp_x_colo_lost_heartbeat(Error **errp)
+{
+    if (get_colo_mode() == COLO_MODE_UNKNOWN) {
+        error_setg(errp, QERR_FEATURE_DISABLED, "colo");
+        return;
+    }
+
+    failover_request_active(errp);
+}
diff --git a/migration/colo.c b/migration/colo.c
index a35cc59..12fa0b4 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -18,6 +18,7 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
+#include "migration/failover.h"
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 05fed9a..a184841 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -823,6 +823,35 @@
             'vmstate-send', 'vmstate-size', 'vmstate-received',
             'vmstate-loaded' ] }
 
+##
+# @COLOMode
+#
+# The colo mode
+#
+# @unknown: unknown mode
+#
+# @primary: master side
+#
+# @secondary: slave side
+#
+# Since: 2.8
+##
+{ 'enum': 'COLOMode',
+  'data': [ 'unknown', 'primary', 'secondary'] }
+
+##
+# @x-colo-lost-heartbeat
+#
+# Tell qemu that heartbeat is lost, request it to do takeover procedures.
+# If this command is sent to the PVM, the Primary side will exit COLO mode.
+# If sent to the Secondary, the Secondary side will run failover work,
+# then takes over server operation to become the service VM.
+#
+# Since: 2.8
+##
+{ 'command': 'x-colo-lost-heartbeat' }
+
+##
 # @MouseInfo:
 #
 # Information about a mouse device.
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 7b72395..7811764 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -12,6 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "migration/colo.h"
+#include "qmp-commands.h"
 
 bool colo_supported(void)
 {
@@ -36,3 +37,10 @@ void *colo_process_incoming_thread(void *opaque)
 {
     return NULL;
 }
+
+void qmp_x_colo_lost_heartbeat(Error **errp)
+{
+    error_setg(errp, "COLO is not supported, please rerun configure"
+                     " with --enable-colo option in order to support"
+                     " COLO feature");
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 13/18] COLO: Introduce state to record failover process
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (11 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 12/18] COLO: Add 'x-colo-lost-heartbeat' command to trigger failover Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 14/18] COLO: Implement the process of failover for primary VM Amit Shah
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

When handling failover, COLO processes differently according to
the different stage of failover process, here we introduce a global
atomic variable to record the status of failover.

We add four failover status to indicate the different stage of failover process.
You should use the helpers to get and set the value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 include/migration/failover.h |  5 +++++
 migration/colo-failover.c    | 41 +++++++++++++++++++++++++++++++++++++++++
 migration/colo.c             |  4 ++++
 migration/trace-events       |  1 +
 qapi-schema.json             | 18 ++++++++++++++++++
 5 files changed, 69 insertions(+)

diff --git a/include/migration/failover.h b/include/migration/failover.h
index 3274735..7e0f36a 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -14,7 +14,12 @@
 #define QEMU_FAILOVER_H
 
 #include "qemu-common.h"
+#include "qapi-types.h"
 
+void failover_init_state(void);
+FailoverStatus failover_set_state(FailoverStatus old_state,
+                                     FailoverStatus new_state);
+FailoverStatus failover_get_state(void);
 void failover_request_active(Error **errp);
 
 #endif
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index e31fc10..6cca039 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -15,22 +15,63 @@
 #include "migration/failover.h"
 #include "qmp-commands.h"
 #include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+#include "trace.h"
 
 static QEMUBH *failover_bh;
+static FailoverStatus failover_state;
 
 static void colo_failover_bh(void *opaque)
 {
+    int old_state;
+
     qemu_bh_delete(failover_bh);
     failover_bh = NULL;
+
+    old_state = failover_set_state(FAILOVER_STATUS_REQUIRE,
+                                   FAILOVER_STATUS_ACTIVE);
+    if (old_state != FAILOVER_STATUS_REQUIRE) {
+        error_report("Unknown error for failover, old_state = %s",
+                    FailoverStatus_lookup[old_state]);
+        return;
+    }
+
     /* TODO: Do failover work */
 }
 
 void failover_request_active(Error **errp)
 {
+   if (failover_set_state(FAILOVER_STATUS_NONE,
+        FAILOVER_STATUS_REQUIRE) != FAILOVER_STATUS_NONE) {
+        error_setg(errp, "COLO failover is already actived");
+        return;
+    }
     failover_bh = qemu_bh_new(colo_failover_bh, NULL);
     qemu_bh_schedule(failover_bh);
 }
 
+void failover_init_state(void)
+{
+    failover_state = FAILOVER_STATUS_NONE;
+}
+
+FailoverStatus failover_set_state(FailoverStatus old_state,
+                    FailoverStatus new_state)
+{
+    FailoverStatus old;
+
+    old = atomic_cmpxchg(&failover_state, old_state, new_state);
+    if (old == old_state) {
+        trace_colo_failover_set_state(FailoverStatus_lookup[new_state]);
+    }
+    return old;
+}
+
+FailoverStatus failover_get_state(void)
+{
+    return atomic_read(&failover_state);
+}
+
 void qmp_x_colo_lost_heartbeat(Error **errp)
 {
     if (get_colo_mode() == COLO_MODE_UNKNOWN) {
diff --git a/migration/colo.c b/migration/colo.c
index 12fa0b4..6b32c91 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -232,6 +232,8 @@ static void colo_process_checkpoint(MigrationState *s)
     Error *local_err = NULL;
     int ret;
 
+    failover_init_state();
+
     s->rp_state.from_dst_file = qemu_file_get_return_path(s->to_dst_file);
     if (!s->rp_state.from_dst_file) {
         error_report("Open QEMUFile from_dst_file failed");
@@ -332,6 +334,8 @@ void *colo_process_incoming_thread(void *opaque)
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
 
+    failover_init_state();
+
     mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
     if (!mis->to_src_file) {
         error_report("COLO incoming thread: Open QEMUFile to_src_file failed");
diff --git a/migration/trace-events b/migration/trace-events
index f374c8c..94134f7 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -212,3 +212,4 @@ migration_tls_incoming_handshake_complete(void) ""
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_send_message(const char *msg) "Send '%s' message"
 colo_receive_message(const char *msg) "Receive '%s' message"
+colo_failover_set_state(const char *new_state) "new state %s"
diff --git a/qapi-schema.json b/qapi-schema.json
index a184841..8a7b527 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -840,6 +840,24 @@
   'data': [ 'unknown', 'primary', 'secondary'] }
 
 ##
+# @FailoverStatus
+#
+# An enumeration of COLO failover status
+#
+# @none: no failover has ever happened
+#
+# @require: got failover requirement but not handled
+#
+# @active: in the process of doing failover
+#
+# @completed: finish the process of failover
+#
+# Since: 2.8
+##
+{ 'enum': 'FailoverStatus',
+  'data': [ 'none', 'require', 'active', 'completed'] }
+
+##
 # @x-colo-lost-heartbeat
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 14/18] COLO: Implement the process of failover for primary VM
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (12 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 13/18] COLO: Introduce state to record failover process Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 15/18] COLO: Implement failover work for secondary VM Amit Shah
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

For primary side, if COLO gets failover request from users.
To be exact, gets 'x_colo_lost_heartbeat' command.
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes VM.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 include/migration/colo.h     |  3 +++
 include/migration/failover.h |  1 +
 migration/colo-failover.c    |  2 +-
 migration/colo.c             | 50 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index e9ac2c3..e32eef4 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -32,4 +32,7 @@ void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
 
 COLOMode get_colo_mode(void);
+
+/* failover */
+void colo_do_failover(MigrationState *s);
 #endif
diff --git a/include/migration/failover.h b/include/migration/failover.h
index 7e0f36a..ad91ef2 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -21,5 +21,6 @@ FailoverStatus failover_set_state(FailoverStatus old_state,
                                      FailoverStatus new_state);
 FailoverStatus failover_get_state(void);
 void failover_request_active(Error **errp);
+bool failover_request_is_active(void);
 
 #endif
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index 6cca039..cc229f5 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -36,7 +36,7 @@ static void colo_failover_bh(void *opaque)
         return;
     }
 
-    /* TODO: Do failover work */
+    colo_do_failover(NULL);
 }
 
 void failover_request_active(Error **errp)
diff --git a/migration/colo.c b/migration/colo.c
index 6b32c91..04f7c55 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -41,6 +41,40 @@ bool migration_incoming_in_colo_state(void)
     return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+static bool colo_runstate_is_stopped(void)
+{
+    return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
+}
+
+static void primary_vm_do_failover(void)
+{
+    MigrationState *s = migrate_get_current();
+    int old_state;
+
+    migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
+                      MIGRATION_STATUS_COMPLETED);
+
+    old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
+                                   FAILOVER_STATUS_COMPLETED);
+    if (old_state != FAILOVER_STATUS_ACTIVE) {
+        error_report("Incorrect state (%s) while doing failover for Primary VM",
+                     FailoverStatus_lookup[old_state]);
+        return;
+    }
+}
+
+void colo_do_failover(MigrationState *s)
+{
+    /* Make sure VM stopped while failover happened. */
+    if (!colo_runstate_is_stopped()) {
+        vm_stop_force_state(RUN_STATE_COLO);
+    }
+
+    if (get_colo_mode() == COLO_MODE_PRIMARY) {
+        primary_vm_do_failover();
+    }
+}
+
 static void colo_send_message(QEMUFile *f, COLOMessage msg,
                               Error **errp)
 {
@@ -162,9 +196,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     bioc->usage = 0;
 
     qemu_mutex_lock_iothread();
+    if (failover_get_state() != FAILOVER_STATUS_NONE) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("run", "stop");
+    /*
+     * Failover request bh could be called after vm_stop_force_state(),
+     * So we need check failover_request_is_active() again.
+     */
+    if (failover_get_state() != FAILOVER_STATUS_NONE) {
+        goto out;
+    }
 
     /* Disable block migration */
     s->params.blk = 0;
@@ -259,6 +304,11 @@ static void colo_process_checkpoint(MigrationState *s)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        if (failover_get_state() != FAILOVER_STATUS_NONE) {
+            error_report("failover request");
+            goto out;
+        }
+
         current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
         if (current_time - checkpoint_time <
             s->parameters.x_checkpoint_delay) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 15/18] COLO: Implement failover work for secondary VM
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (13 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 14/18] COLO: Implement the process of failover for primary VM Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 16/18] docs: Add documentation for COLO feature Amit Shah
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

If users require SVM to takeover work, COLO incoming thread should
exit from loop while failover BH helps backing to migration incoming
coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 migration/colo.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 04f7c55..1f9c699 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -46,6 +46,33 @@ static bool colo_runstate_is_stopped(void)
     return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
+static void secondary_vm_do_failover(void)
+{
+    int old_state;
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
+                      MIGRATION_STATUS_COMPLETED);
+
+    if (!autostart) {
+        error_report("\"-S\" qemu option will be ignored in secondary side");
+        /* recover runstate to normal migration finish state */
+        autostart = true;
+    }
+
+    old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,
+                                   FAILOVER_STATUS_COMPLETED);
+    if (old_state != FAILOVER_STATUS_ACTIVE) {
+        error_report("Incorrect state (%s) while doing failover for "
+                     "secondary VM", FailoverStatus_lookup[old_state]);
+        return;
+    }
+    /* For Secondary VM, jump to incoming co */
+    if (mis->migration_incoming_co) {
+        qemu_coroutine_enter(mis->migration_incoming_co);
+    }
+}
+
 static void primary_vm_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
@@ -72,6 +99,8 @@ void colo_do_failover(MigrationState *s)
 
     if (get_colo_mode() == COLO_MODE_PRIMARY) {
         primary_vm_do_failover();
+    } else {
+        secondary_vm_do_failover();
     }
 }
 
@@ -417,6 +446,11 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
         assert(request);
+        if (failover_get_state() != FAILOVER_STATUS_NONE) {
+            error_report("failover request");
+            goto out;
+        }
+
         /* FIXME: This is unnecessary for periodic checkpoint mode */
         colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
                      &local_err);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 16/18] docs: Add documentation for COLO feature
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (14 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 15/18] COLO: Implement failover work for secondary VM Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 17/18] configure: Support enable/disable " Amit Shah
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Introduce the design of COLO, and how to test it.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 docs/COLO-FT.txt | 189 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 189 insertions(+)
 create mode 100644 docs/COLO-FT.txt

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
new file mode 100644
index 0000000..6282938
--- /dev/null
+++ b/docs/COLO-FT.txt
@@ -0,0 +1,189 @@
+COarse-grained LOck-stepping Virtual Machines for Non-stop Service
+----------------------------------------
+Copyright (c) 2016 Intel Corporation
+Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+Copyright (c) 2016 Fujitsu, Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+This document gives an overview of COLO's design and how to use it.
+
+== Background ==
+Virtual machine (VM) replication is a well known technique for providing
+application-agnostic software-implemented hardware fault tolerance,
+also known as "non-stop service".
+
+COLO (COarse-grained LOck-stepping) is a high availability solution.
+Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
+same request from client, and generate response in parallel too.
+If the response packets from PVM and SVM are identical, they are released
+immediately. Otherwise, a VM checkpoint (on demand) is conducted.
+
+== Architecture ==
+
+The architecture of COLO is shown in the diagram below.
+It consists of a pair of networked physical nodes:
+The primary node running the PVM, and the secondary node running the SVM
+to maintain a valid replica of the PVM.
+PVM and SVM execute in parallel and generate output of response packets for
+client requests according to the application semantics.
+
+The incoming packets from the client or external network are received by the
+primary node, and then forwarded to the secondary node, so that both the PVM
+and the SVM are stimulated with the same requests.
+
+COLO receives the outbound packets from both the PVM and SVM and compares them
+before allowing the output to be sent to clients.
+
+The SVM is qualified as a valid replica of the PVM, as long as it generates
+identical responses to all client requests. Once the differences in the outputs
+are detected between the PVM and SVM, COLO withholds transmission of the
+outbound packets until it has successfully synchronized the PVM state to the SVM.
+
+   Primary Node                                                            Secondary Node
+ +------------+  +-----------------------+       +------------------------+  +------------+
+ |            |  |       HeartBeat       |<----->|       HeartBeat        |  |            |
+ | Primary VM |  +-----------|-----------+       +-----------|------------+  |Secondary VM|
+ |            |              |                               |               |            |
+ |            |  +-----------|-----------+       +-----------|------------+  |            |
+ |            |  |QEMU   +---v----+      |       |QEMU  +----v---+        |  |            |
+ |            |  |       |Failover|      |       |      |Failover|        |  |            |
+ |            |  |       +--------+      |       |      +--------+        |  |            |
+ |            |  |   +---------------+   |       |   +---------------+    |  |            |
+ |            |  |   | VM Checkpoint |-------------->| VM Checkpoint |    |  |            |
+ |            |  |   +---------------+   |       |   +---------------+    |  |            |
+ |            |  |                       |       |                        |  |            |
+ |Requests<---------------------------^------------------------------------------>Requests|
+ |Responses----------------------\ /--|--------------\  /------------------------Responses|
+ |            |  |               | |  |  |       |   |  |                 |  |            |
+ |            |  | +-----------+ | |  |  |       |   |  |  +------------+ |  |            |
+ |            |  | | COLO disk | | |  |  |       |   |  |  | COLO disk  | |  |            |
+ |            |  | |   Manager |-|-|--|--------------|--|->| Manager    | |  |            |
+ |            |  | +|----------+ | |  |  |       |   |  |  +-----------|+ |  |            |
+ |            |  |  |            | |  |  |       |   |  |              |  |  |            |
+ +------------+  +--|------------|-|--|--+       +---|--|--------------|--+  +------------+
+                    |            | |  |              |  |              |
+ +-------------+    | +----------v-v--|--+       +---|--v-----------+  |    +-------------+
+ |  VM Monitor |    | |  COLO Proxy      |       |    COLO Proxy    |  |    | VM Monitor  |
+ |             |    | |(compare packet)  |       | (adjust sequence)|  |    |             |
+ +-------------+    | +----------|----^--+       +------------------+  |    +-------------+
+                    |            |    |                                |
+ +------------------|------------|----|--+       +---------------------|------------------+
+ |   Kernel         |            |    |  |       |   Kernel            |                  |
+ +------------------|------------|----|--+       +---------------------|------------------+
+                    |            |    |                                |
+     +--------------v+  +--------v----|--+       +------------------+ +v-------------+
+     |   Storage     |  |External Network|       | External Network | |   Storage    |
+     +---------------+  +----------------+       +------------------+ +--------------+
+
+== Components introduction ==
+
+You can see there are several components in COLO's diagram of architecture.
+Their functions are described below.
+
+HeartBeat:
+Runs on both the primary and secondary nodes, to periodically check platform
+availability. When the primary node suffers a hardware fail-stop failure,
+the heartbeat stops responding, the secondary node will trigger a failover
+as soon as it determines the absence.
+
+COLO disk Manager:
+When primary VM writes data into image, the colo disk manger captures this data
+and sends it to secondary VM's which makes sure the context of secondary VM's
+image is consistent with the context of primary VM 's image.
+For more details, please refer to docs/block-replication.txt.
+
+Checkpoint/Failover Controller:
+Modifications of save/restore flow to realize continuous migration,
+to make sure the state of VM in Secondary side is always consistent with VM in
+Primary side.
+
+COLO Proxy:
+Delivers packets to Primary and Seconday, and then compare the responses from
+both side. Then decide whether to start a checkpoint according to some rules.
+Please refer to docs/colo-proxy.txt for more informations.
+
+Note:
+HeartBeat has not been implemented yet, so you need to trigger failover process
+by using 'x-colo-lost-heartbeat' command.
+
+== Test procedure ==
+1. Startup qemu
+Primary:
+# qemu-kvm -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc :7 -name primary \
+  -device piix3-usb-uhci \
+  -device usb-tablet -netdev tap,id=hn0,vhost=off \
+  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
+  -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+         children.0.file.filename=1.raw,\
+         children.0.driver=raw -S
+Secondary:
+# qemu-kvm -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc :7 -name secondary \
+  -device piix3-usb-uhci \
+  -device usb-tablet -netdev tap,id=hn0,vhost=off \
+  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
+  -drive if=none,id=secondary-disk0,file.filename=1.raw,driver=raw,node-name=node0 \
+  -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
+         file.driver=qcow2,top-id=active-disk0,\
+         file.file.filename=/mnt/ramfs/active_disk.img,\
+         file.backing.driver=qcow2,\
+         file.backing.file.filename=/mnt/ramfs/hidden_disk.img,\
+         file.backing.backing=secondary-disk0 \
+  -incoming tcp:0:8888
+
+2. On Secondary VM's QEMU monitor, issue command
+{'execute':'qmp_capabilities'}
+{ 'execute': 'nbd-server-start',
+  'arguments': {'addr': {'type': 'inet', 'data': {'host': 'xx.xx.xx.xx', 'port': '8889'} } }
+}
+{'execute': 'nbd-server-add', 'arguments': {'device': 'secondeary-disk0', 'writable': true } }
+
+Note:
+  a. The qmp command nbd-server-start and nbd-server-add must be run
+     before running the qmp command migrate on primary QEMU
+  b. Active disk, hidden disk and nbd target's length should be the
+     same.
+  c. It is better to put active disk and hidden disk in ramdisk.
+
+3. On Primary VM's QEMU monitor, issue command:
+{'execute':'qmp_capabilities'}
+{ 'execute': 'human-monitor-command',
+  'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=secondary-disk0,node-name=nbd_client0'}}
+{ 'execute':'x-blockdev-change', 'arguments':{'parent': 'primary-disk0', 'node': 'nbd_client0' } }
+{ 'execute': 'migrate-set-capabilities',
+      'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
+{ 'execute': 'migrate', 'arguments': {'uri': 'tcp:xx.xx.xx.xx:8888' } }
+
+  Note:
+  a. There should be only one NBD Client for each primary disk.
+  b. xx.xx.xx.xx is the secondary physical machine's hostname or IP
+  c. The qmp command line must be run after running qmp command line in
+     secondary qemu.
+
+4. After the above steps, you will see, whenever you make changes to PVM, SVM will be synced.
+You can issue command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
+to change the checkpoint period time
+
+5. Failover test
+You can kill Primary VM and run 'x_colo_lost_heartbeat' in Secondary VM's
+monitor at the same time, then SVM will failover and client will not detect this
+change.
+
+Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
+issue block related command to stop block replication.
+Primary:
+  Remove the nbd child from the quorum:
+  { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
+  { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}}
+  Note: there is no qmp command to remove the blockdev now
+
+Secondary:
+  The primary host is down, so we should do the following thing:
+  { 'execute': 'nbd-server-stop' }
+
+== TODO ==
+1. Support continuous VM replication.
+2. Support shared storage.
+3. Develop the heartbeat part.
+4. Reduce checkpoint VM’s downtime while doing checkpoint.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 17/18] configure: Support enable/disable COLO feature
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (15 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 16/18] docs: Add documentation for COLO feature Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-30 10:47 ` [Qemu-devel] [PULL 18/18] MAINTAINERS: Add maintainer for COLO framework related files Amit Shah
  2016-10-31 13:57 ` [Qemu-devel] [PULL 00/18] migration: COLO Peter Maydell
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

configure --enable-colo/--disable-colo to switch COLO
support on/off.

COLO feature doesn't depend on any other external libraries,
So here it is reasonable to enable COLO by default, to
avoid re-compile QEMU if users want to use this capability.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 configure        | 11 +++++++++++
 migration/colo.c |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index f83cdf8..6b7acb1 100755
--- a/configure
+++ b/configure
@@ -230,6 +230,7 @@ vhost_net="no"
 vhost_scsi="no"
 vhost_vsock="no"
 kvm="no"
+colo="yes"
 rdma=""
 gprof="no"
 debug_tcg="no"
@@ -918,6 +919,10 @@ for opt do
   ;;
   --enable-kvm) kvm="yes"
   ;;
+  --disable-colo) colo="no"
+  ;;
+  --enable-colo) colo="yes"
+  ;;
   --disable-tcg-interpreter) tcg_interpreter="no"
   ;;
   --enable-tcg-interpreter) tcg_interpreter="yes"
@@ -1366,6 +1371,7 @@ disabled with --disable-FEATURE, default is enabled if available:
   fdt             fdt device tree
   bluez           bluez stack connectivity
   kvm             KVM acceleration support
+  colo            COarse-grain LOck-stepping VM for Non-stop Service
   rdma            RDMA-based migration support
   vde             support for vde network
   netmap          support for netmap network
@@ -5004,6 +5010,7 @@ echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
 echo "Install blobs     $blobs"
 echo "KVM support       $kvm"
+echo "COLO support      $colo"
 echo "RDMA support      $rdma"
 echo "TCG interpreter   $tcg_interpreter"
 echo "fdt support       $fdt"
@@ -5639,6 +5646,10 @@ if have_backend "syslog"; then
 fi
 echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
 
+if test "$colo" = "yes"; then
+  echo "CONFIG_COLO=y" >> $config_host_mak
+fi
+
 if test "$rdma" = "yes" ; then
   echo "CONFIG_RDMA=y" >> $config_host_mak
 fi
diff --git a/migration/colo.c b/migration/colo.c
index 1f9c699..e7224b8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -24,7 +24,7 @@
 
 bool colo_supported(void)
 {
-    return false;
+    return true;
 }
 
 bool migration_in_colo_state(void)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PULL 18/18] MAINTAINERS: Add maintainer for COLO framework related files
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (16 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 17/18] configure: Support enable/disable " Amit Shah
@ 2016-10-30 10:47 ` Amit Shah
  2016-10-31 13:57 ` [Qemu-devel] [PULL 00/18] migration: COLO Peter Maydell
  18 siblings, 0 replies; 26+ messages in thread
From: Amit Shah @ 2016-10-30 10:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Add myself as co-maintainer of COLO framework, so that
I can get CC'ed on future patches and bugs for this feature.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
 MAINTAINERS | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 82d4d00..859f262 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1415,6 +1415,14 @@ F: util/uuid.c
 F: include/qemu/uuid.h
 F: tests/test-uuid.c
 
+COLO Framework
+M: zhanghailiang <zhang.zhanghailiang@huawei.com>
+S: Maintained
+F: migration/colo*
+F: include/migration/colo.h
+F: include/migration/failover.h
+F: docs/COLO-FT.txt
+
 COLO Proxy
 M: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
 M: Li Zhijian <lizhijian@cn.fujitsu.com>
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 00/18] migration: COLO
  2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
                   ` (17 preceding siblings ...)
  2016-10-30 10:47 ` [Qemu-devel] [PULL 18/18] MAINTAINERS: Add maintainer for COLO framework related files Amit Shah
@ 2016-10-31 13:57 ` Peter Maydell
  18 siblings, 0 replies; 26+ messages in thread
From: Peter Maydell @ 2016-10-31 13:57 UTC (permalink / raw)
  To: Amit Shah; +Cc: Juan Quintela, Dr. David Alan Gilbert, qemu list, zhanghailiang

On 30 October 2016 at 10:46, Amit Shah <amit.shah@redhat.com> wrote:
> From: Amit Shah <amit@amitshah.net>
>
> The following changes since commit 5b2ecabaeabc17f032197246c4846b9ba95ba8a6:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20161028-1' into staging (2016-10-28 17:59:04 +0100)
>
> are available in the git repository at:
>
>   https://git.kernel.org/pub/scm/virt/qemu/amit/migration.git tags/migration-for-2.8
>
> for you to fetch changes up to a4cc318e15955b4b9c7231c37128d4e7466d4307:
>
>   MAINTAINERS: Add maintainer for COLO framework related files (2016-10-30 15:17:39 +0530)
>
> ----------------------------------------------------------------
> Migration bits from the COLO project
>

Applied, thanks.

-- PMM

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters
  2016-10-30 10:47 ` [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters Amit Shah
@ 2016-10-31 17:17   ` Juan Quintela
  2016-11-01  2:10     ` Hailiang Zhang
  0 siblings, 1 reply; 26+ messages in thread
From: Juan Quintela @ 2016-10-31 17:17 UTC (permalink / raw)
  To: Amit Shah
  Cc: Peter Maydell, Dr. David Alan Gilbert, qemu list, zhang.zhanghailiang

Amit Shah <amit.shah@redhat.com> wrote:
> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
>
> Add checkpoint-delay parameter for migrate-set-parameters, so that
> we can control the checkpoint frequency when COLO is in periodic mode.
>
>  
> @@ -587,6 +593,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>      params->max_bandwidth = s->parameters.max_bandwidth;
>      params->has_downtime_limit = true;
>      params->downtime_limit = s->parameters.downtime_limit;
> +    params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
>  
>      return params;
>  }

I think you are missing here:

params->has_x_checkpoint_delay = true;

No?

Later, Juan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol
  2016-10-30 10:46 ` [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol Amit Shah
@ 2016-10-31 18:25   ` Eduardo Habkost
  2016-11-01  1:48     ` Hailiang Zhang
  0 siblings, 1 reply; 26+ messages in thread
From: Eduardo Habkost @ 2016-10-31 18:25 UTC (permalink / raw)
  To: Amit Shah
  Cc: Peter Maydell, qemu list, zhang.zhanghailiang,
	Dr. David Alan Gilbert, Juan Quintela

On Sun, Oct 30, 2016 at 04:16:58PM +0530, Amit Shah wrote:
[...]
> +static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
> +                                     Error **errp)
> +{
> +    COLOMessage msg;
> +    Error *local_err = NULL;
> +
> +    msg = colo_receive_message(f, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +
> +    switch (msg) {
> +    case COLO_MESSAGE_CHECKPOINT_REQUEST:
> +        *checkpoint_request = 1;
> +        break;
> +    default:
> +        *checkpoint_request = 0;
> +        error_setg(errp, "Got unknown COLO message: %d", msg);
> +        break;
> +    }
> +}
[...]
> +        colo_wait_handle_message(mis->from_src_file, &request, &local_err);
> +        if (local_err) {
> +            goto out;
> +        }
> +        assert(request);

GCC 4.8.5 doesn't seem to be smart enough to notice that request
will be always initialized:

  /root/qemu/migration/colo.c: In function ‘colo_process_incoming_thread’:
  /root/qemu/migration/colo.c:448:33: error: ‘request’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
           assert(request);
                                   ^
  cc1: all warnings being treated as errors

  $ gcc --version
  gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)

-- 
Eduardo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled
  2016-10-30 10:46 ` [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled Amit Shah
@ 2016-10-31 22:27   ` Eric Blake
  2016-11-01  3:39     ` Hailiang Zhang
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2016-10-31 22:27 UTC (permalink / raw)
  To: Amit Shah, Peter Maydell
  Cc: qemu list, zhang.zhanghailiang, Dr. David Alan Gilbert, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 1147 bytes --]

On 10/30/2016 05:46 AM, Amit Shah wrote:
> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
> 
> Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
> enters this state after the first live migration successfully finished
> if COLO is enabled by command 'migrate_set_capability x-colo on'.
> 
> We reuse migration thread, so the process of checkpointing will be handled
> in migration thread.
> 

> +++ b/qapi-schema.json
> @@ -459,12 +459,14 @@
>  #
>  # @failed: some error occurred during migration process.
>  #
> +# @colo: VM is in the process of fault tolerance. (since 2.8)
> +#

The commit message mentions it, but the public documentation does not,
so it might be worth a followup patch that clarifies in the schema that
the 'colo' state is not possible without using the 'x-colo' migration
property (so that applications like libvirt know that they don't have to
worry about handling an unexpected state, at least not in 2.8 or until
the 'x-colo' is promoted to plain 'colo').

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol
  2016-10-31 18:25   ` Eduardo Habkost
@ 2016-11-01  1:48     ` Hailiang Zhang
  0 siblings, 0 replies; 26+ messages in thread
From: Hailiang Zhang @ 2016-11-01  1:48 UTC (permalink / raw)
  To: Eduardo Habkost, Amit Shah
  Cc: Peter Maydell, qemu list, Dr. David Alan Gilbert, Juan Quintela

On 2016/11/1 2:25, Eduardo Habkost wrote:
> On Sun, Oct 30, 2016 at 04:16:58PM +0530, Amit Shah wrote:
> [...]
>> +static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
>> +                                     Error **errp)
>> +{
>> +    COLOMessage msg;
>> +    Error *local_err = NULL;
>> +
>> +    msg = colo_receive_message(f, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +
>> +    switch (msg) {
>> +    case COLO_MESSAGE_CHECKPOINT_REQUEST:
>> +        *checkpoint_request = 1;
>> +        break;
>> +    default:
>> +        *checkpoint_request = 0;
>> +        error_setg(errp, "Got unknown COLO message: %d", msg);
>> +        break;
>> +    }
>> +}
> [...]
>> +        colo_wait_handle_message(mis->from_src_file, &request, &local_err);
>> +        if (local_err) {
>> +            goto out;
>> +        }
>> +        assert(request);
>
> GCC 4.8.5 doesn't seem to be smart enough to notice that request
> will be always initialized:
>
>    /root/qemu/migration/colo.c: In function ‘colo_process_incoming_thread’:
>    /root/qemu/migration/colo.c:448:33: error: ‘request’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>             assert(request);
>                                     ^
>    cc1: all warnings being treated as errors
>
>    $ gcc --version
>    gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
>

Yes, you are right, Jeff has sent a patch to fix this
'[PATCH 1/1] migration: fix compiler warning on uninitialized variable'

Thanks,
Hailiang

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters
  2016-10-31 17:17   ` Juan Quintela
@ 2016-11-01  2:10     ` Hailiang Zhang
  0 siblings, 0 replies; 26+ messages in thread
From: Hailiang Zhang @ 2016-11-01  2:10 UTC (permalink / raw)
  To: quintela, Amit Shah; +Cc: Peter Maydell, Dr. David Alan Gilbert, qemu list

On 2016/11/1 1:17, Juan Quintela wrote:
> Amit Shah <amit.shah@redhat.com> wrote:
>> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>
>> Add checkpoint-delay parameter for migrate-set-parameters, so that
>> we can control the checkpoint frequency when COLO is in periodic mode.
>>
>>
>> @@ -587,6 +593,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>>       params->max_bandwidth = s->parameters.max_bandwidth;
>>       params->has_downtime_limit = true;
>>       params->downtime_limit = s->parameters.downtime_limit;
>> +    params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
>>
>>       return params;
>>   }
>
> I think you are missing here:
>
> params->has_x_checkpoint_delay = true;
>

Good catch, this will influence output of
qmp command 'query-migrate-parameters'.
(Without this, 'query-migrate-parameters'
command will shows no "x-checkpoint-delay": 200)

I will send a patch to fix it, thanks very much.

Hailiang

> No?
>
> Later, Juan
>
> .
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled
  2016-10-31 22:27   ` Eric Blake
@ 2016-11-01  3:39     ` Hailiang Zhang
  0 siblings, 0 replies; 26+ messages in thread
From: Hailiang Zhang @ 2016-11-01  3:39 UTC (permalink / raw)
  To: Eric Blake, Amit Shah, Peter Maydell
  Cc: qemu list, Dr. David Alan Gilbert, Juan Quintela

On 2016/11/1 6:27, Eric Blake wrote:
> On 10/30/2016 05:46 AM, Amit Shah wrote:
>> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>
>> Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
>> enters this state after the first live migration successfully finished
>> if COLO is enabled by command 'migrate_set_capability x-colo on'.
>>
>> We reuse migration thread, so the process of checkpointing will be handled
>> in migration thread.
>>
>
>> +++ b/qapi-schema.json
>> @@ -459,12 +459,14 @@
>>   #
>>   # @failed: some error occurred during migration process.
>>   #
>> +# @colo: VM is in the process of fault tolerance. (since 2.8)
>> +#
>
> The commit message mentions it, but the public documentation does not,
> so it might be worth a followup patch that clarifies in the schema that
> the 'colo' state is not possible without using the 'x-colo' migration
> property (so that applications like libvirt know that they don't have to
> worry about handling an unexpected state, at least not in 2.8 or until
> the 'x-colo' is promoted to plain 'colo').
>

Sounds reasonable, I'll send a patch to add such clarification.

Thanks,
Hailiang

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2016-11-01  3:39 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-30 10:46 [Qemu-devel] [PULL 00/18] migration: COLO Amit Shah
2016-10-30 10:46 ` [Qemu-devel] [PULL 01/18] migration: Introduce capability 'x-colo' to migration Amit Shah
2016-10-30 10:46 ` [Qemu-devel] [PULL 02/18] COLO: migrate COLO related info to secondary node Amit Shah
2016-10-30 10:46 ` [Qemu-devel] [PULL 03/18] migration: Enter into COLO mode after migration if COLO is enabled Amit Shah
2016-10-31 22:27   ` Eric Blake
2016-11-01  3:39     ` Hailiang Zhang
2016-10-30 10:46 ` [Qemu-devel] [PULL 04/18] migration: Switch to COLO process after finishing loadvm Amit Shah
2016-10-30 10:46 ` [Qemu-devel] [PULL 05/18] COLO: Establish a new communicating path for COLO Amit Shah
2016-10-30 10:46 ` [Qemu-devel] [PULL 06/18] COLO: Introduce checkpointing protocol Amit Shah
2016-10-31 18:25   ` Eduardo Habkost
2016-11-01  1:48     ` Hailiang Zhang
2016-10-30 10:46 ` [Qemu-devel] [PULL 07/18] COLO: Add a new RunState RUN_STATE_COLO Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 08/18] COLO: Send PVM state to secondary side when do checkpoint Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 09/18] COLO: Load VMState into QIOChannelBuffer before restore it Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 10/18] COLO: Add checkpoint-delay parameter for migrate-set-parameters Amit Shah
2016-10-31 17:17   ` Juan Quintela
2016-11-01  2:10     ` Hailiang Zhang
2016-10-30 10:47 ` [Qemu-devel] [PULL 11/18] COLO: Synchronize PVM's state to SVM periodically Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 12/18] COLO: Add 'x-colo-lost-heartbeat' command to trigger failover Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 13/18] COLO: Introduce state to record failover process Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 14/18] COLO: Implement the process of failover for primary VM Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 15/18] COLO: Implement failover work for secondary VM Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 16/18] docs: Add documentation for COLO feature Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 17/18] configure: Support enable/disable " Amit Shah
2016-10-30 10:47 ` [Qemu-devel] [PULL 18/18] MAINTAINERS: Add maintainer for COLO framework related files Amit Shah
2016-10-31 13:57 ` [Qemu-devel] [PULL 00/18] migration: COLO Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.