* [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: quintela, dgilbert, amit.shah, eblake, berrange, peter.huangpeng,
	eddie.dong, yunhong.jiang, wency, lizhijian, arei.gonglei, david,
	zhanghailiang, netfilter-devel

This is the 5th version of COLO. It contains only the COLO framework part, including VM checkpoint,
failover, the proxy API, and the block replication API; block replication itself is not included.
The block part has been sent by Wen Congyang:
"[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"

We have finished some new features and optimizations for COLO (available in a development branch
on GitHub), but for ease of review it is better to keep this series simple for now, so we will
not add much new code to this framework patch set before it has been fully reviewed.

You can get the latest integrated QEMU COLO patches from GitHub (including the block part):
https://github.com/coloft/qemu/commits/colo-v1.2-basic
https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)

Please NOTE the difference between these two branches:
colo-v1.2-basic is exactly the same as this patch series and contains the basic features of COLO.
Compared with colo-v1.2-basic, colo-v1.2-developing has some optimizations in the
checkpoint process, including:
   1) separating the RAM and device save/load processes to reduce the amount of extra
      memory used during a checkpoint
   2) live migrating part of the dirty pages to the slave during the sleep time
Besides, colo-v1.2-developing adds some statistics, which you can view with the
command 'info migrate'.

You can test either of the branches above.
For instructions on how to test COLO, please refer to the following link:
http://wiki.qemu.org/Features/COLO

COLO is still at an early stage;
your comments and feedback are warmly welcomed.

Cc: netfilter-devel@vger.kernel.org

TODO:
1. Strengthen failover
2. Add a switch to turn the COLO function on/off
3. Optimize the proxy part, including the proxy script:
  1) Remove the limitation on the forward network link
  2) Reuse nfqueue_entry and NF_STOLEN to enqueue skbs
4. The capability of continuous FT

v5:
- Replace the previous communication channel between the proxy and qemu with nfnetlink
- Remove the 'forward device' parameter of xt_PMYCOLO; the 'forward device' is now set
  with an iptables command
- Turn DPRINTF into trace_ calls, as Dave suggested

v4:
- New block replication scheme (use image-fleecing for the secondary side)
- Address some comments from Eric Blake and Dave
- Add the command colo-set-checkpoint-period to set the period of periodic checkpoints
- Add a delay (100ms) between continuous checkpoint requests to ensure the VM
  runs for at least 100ms since the last pause
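The 100ms floor described in the last v4 item can be sketched as a small helper; this is a hypothetical illustration of the delay logic, not code from the series:

```c
#include <stdint.h>

#define CHECKPOINT_MIN_PERIOD_MS 100 /* minimum VM run time between pauses */

/*
 * Return how many milliseconds a new checkpoint request must be delayed
 * so that the VM has run for at least CHECKPOINT_MIN_PERIOD_MS since the
 * end of the last pause. Hypothetical sketch, not the actual QEMU code.
 */
static int64_t checkpoint_delay_ms(int64_t now_ms, int64_t last_pause_end_ms)
{
    int64_t elapsed = now_ms - last_pause_end_ms;

    if (elapsed >= CHECKPOINT_MIN_PERIOD_MS) {
        return 0; /* VM already ran long enough; checkpoint immediately */
    }
    return CHECKPOINT_MIN_PERIOD_MS - elapsed;
}
```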
v3:
- Use a proxy instead of the colo agent to compare network packets
- Add block replication
- Optimize failover handling
- Handle shutdown

v2:
- Use QEMUSizedBuffer/QEMUFile as the COLO buffer
- Enable colo support by default
- Add NIC replication support
- Address comments from Eric Blake and Dr. David Alan Gilbert

v1:
- Implement the framework of COLO

Wen Congyang (1):
  COLO: Add block replication into colo process

zhanghailiang (28):
  configure: Add parameter for configure to enable/disable COLO support
  migration: Introduce capability 'colo' to migration
  COLO: migrate colo related info to slave
  migration: Integrate COLO checkpoint process into migration
  migration: Integrate COLO checkpoint process into loadvm
  COLO: Implement colo checkpoint protocol
  COLO: Add a new RunState RUN_STATE_COLO
  QEMUSizedBuffer: Introduce two help functions for qsb
  COLO: Save VM state to slave when do checkpoint
  COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
  COLO VMstate: Load VM state into qsb before restore it
  arch_init: Start to trace dirty pages of SVM
  COLO RAM: Flush cached RAM into SVM's memory
  COLO failover: Introduce a new command to trigger a failover
  COLO failover: Implement COLO master/slave failover work
  COLO failover: Don't do failover during loading VM's state
  COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
  COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
  COLO NIC: Implement colo nic device interface configure()
  COLO NIC : Implement colo nic init/destroy function
  COLO NIC: Some init work related with proxy module
  COLO: Handle nfnetlink message from proxy module
  COLO: Do checkpoint according to the result of packets comparation
  COLO: Improve checkpoint efficiency by do additional periodic
    checkpoint
  COLO: Add colo-set-checkpoint-period command
  COLO NIC: Implement NIC checkpoint and failover
  COLO: Disable qdev hotplug when VM is in COLO mode
  COLO: Implement shutdown checkpoint

 arch_init.c                            | 243 +++++++++-
 configure                              |  36 +-
 hmp-commands.hx                        |  30 ++
 hmp.c                                  |  14 +
 hmp.h                                  |   2 +
 include/exec/cpu-all.h                 |   1 +
 include/migration/migration-colo.h     |  57 +++
 include/migration/migration-failover.h |  22 +
 include/migration/migration.h          |   3 +
 include/migration/qemu-file.h          |   3 +-
 include/net/colo-nic.h                 |  27 ++
 include/net/net.h                      |   3 +
 include/sysemu/sysemu.h                |   3 +
 migration/Makefile.objs                |   2 +
 migration/colo-comm.c                  |  68 +++
 migration/colo-failover.c              |  48 ++
 migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
 migration/migration.c                  |  60 ++-
 migration/qemu-file-buf.c              |  58 +++
 net/Makefile.objs                      |   1 +
 net/colo-nic.c                         | 420 +++++++++++++++++
 net/tap.c                              |  45 +-
 qapi-schema.json                       |  42 +-
 qemu-options.hx                        |  10 +-
 qmp-commands.hx                        |  41 ++
 savevm.c                               |   2 +-
 scripts/colo-proxy-script.sh           |  88 ++++
 stubs/Makefile.objs                    |   1 +
 stubs/migration-colo.c                 |  58 +++
 trace-events                           |  11 +
 vl.c                                   |  39 +-
 31 files changed, 2235 insertions(+), 39 deletions(-)
 create mode 100644 include/migration/migration-colo.h
 create mode 100644 include/migration/migration-failover.h
 create mode 100644 include/net/colo-nic.h
 create mode 100644 migration/colo-comm.c
 create mode 100644 migration/colo-failover.c
 create mode 100644 migration/colo.c
 create mode 100644 net/colo-nic.c
 create mode 100755 scripts/colo-proxy-script.sh
 create mode 100644 stubs/migration-colo.c

-- 
1.7.12.4





* [Qemu-devel] [PATCH COLO-Frame v5 01/29] configure: Add parameter for configure to enable/disable COLO support
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

Add configure options --enable-colo/--disable-colo to switch COLO
support on/off.
COLO support is off by default.
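The flag handling follows configure's usual last-one-wins pattern. A standalone sketch of what the patch adds (not the real configure script):

```shell
# Standalone sketch of the --enable/--disable-colo handling added to
# configure; the last flag on the command line wins.
colo_config() {
    colo="no"   # COLO support is off by default
    for opt in "$@"; do
        case "$opt" in
            --enable-colo)  colo="yes" ;;
            --disable-colo) colo="no"  ;;
        esac
    done
    if test "$colo" = "yes"; then
        echo "CONFIG_COLO=y"   # appended to $config_host_mak in the patch
    fi
}
```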

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 configure | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/configure b/configure
index 1f0f485..793fd12 100755
--- a/configure
+++ b/configure
@@ -258,6 +258,7 @@ xfs=""
 vhost_net="no"
 vhost_scsi="no"
 kvm="no"
+colo="no"
 rdma=""
 gprof="no"
 debug_tcg="no"
@@ -924,6 +925,10 @@ for opt do
   ;;
   --enable-kvm) kvm="yes"
   ;;
+  --disable-colo) colo="no"
+  ;;
+  --enable-colo) colo="yes"
+  ;;
   --disable-tcg-interpreter) tcg_interpreter="no"
   ;;
   --enable-tcg-interpreter) tcg_interpreter="yes"
@@ -1328,6 +1333,10 @@ Advanced options (experts only):
   --disable-slirp          disable SLIRP userspace network connectivity
   --disable-kvm            disable KVM acceleration support
   --enable-kvm             enable KVM acceleration support
+  --disable-colo           disable COarse-grain LOck-stepping Virtual
+                           Machines for Non-stop Service (default)
+  --enable-colo            enable COarse-grain LOck-stepping Virtual
+                           Machines for Non-stop Service
   --disable-rdma           disable RDMA-based migration support
   --enable-rdma            enable RDMA-based migration support
   --enable-tcg-interpreter enable TCG with bytecode interpreter (TCI)
@@ -4427,6 +4436,7 @@ echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
 echo "Install blobs     $blobs"
 echo "KVM support       $kvm"
+echo "COLO support      $colo"
 echo "RDMA support      $rdma"
 echo "TCG interpreter   $tcg_interpreter"
 echo "fdt support       $fdt"
@@ -4987,6 +4997,10 @@ if have_backend "ftrace"; then
 fi
 echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
 
+if test "$colo" = "yes"; then
+  echo "CONFIG_COLO=y" >> $config_host_mak
+fi
+
 if test "$rdma" = "yes" ; then
   echo "CONFIG_RDMA=y" >> $config_host_mak
 fi
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 02/29] migration: Introduce capability 'colo' to migration
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

Add a helper function colo_supported() to indicate whether colo is
supported, and use it to control whether the 'colo' capability string
is shown to users. Users can run the qmp command
'query-migrate-capabilities' or the hmp command 'info migrate_capabilities'
to learn whether colo is supported.
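The filtering done in qmp_query_migrate_capabilities() can be shown standalone; the enum and array below are simplified stand-ins for QEMU's generated QAPI types:

```c
#include <stdbool.h>
#include <stddef.h>

enum { CAP_XBZRLE, CAP_AUTO_CONVERGE, CAP_COLO, CAP_MAX };

/* Stub build: colo_supported() returns false when QEMU was configured
 * without --enable-colo (see stubs/migration-colo.c). */
static bool colo_supported(void)
{
    return false;
}

/*
 * Fill 'out' with the capability ids to advertise. 'colo' is skipped
 * when the binary lacks COLO support, mirroring the loop in
 * qmp_query_migrate_capabilities().
 */
static size_t visible_capabilities(int out[CAP_MAX])
{
    size_t n = 0;

    for (int i = 0; i < CAP_MAX; i++) {
        if (i == CAP_COLO && !colo_supported()) {
            continue; /* hide 'colo' from users */
        }
        out[n++] = i;
    }
    return n;
}
```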

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/migration/migration-colo.h | 20 ++++++++++++++++++++
 include/migration/migration.h      |  1 +
 migration/Makefile.objs            |  1 +
 migration/colo.c                   | 18 ++++++++++++++++++
 migration/migration.c              | 17 +++++++++++++++++
 qapi-schema.json                   |  5 ++++-
 stubs/Makefile.objs                |  1 +
 stubs/migration-colo.c             | 18 ++++++++++++++++++
 8 files changed, 80 insertions(+), 1 deletion(-)
 create mode 100644 include/migration/migration-colo.h
 create mode 100644 migration/colo.c
 create mode 100644 stubs/migration-colo.c

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
new file mode 100644
index 0000000..c6d0c51
--- /dev/null
+++ b/include/migration/migration-colo.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_MIGRATION_COLO_H
+#define QEMU_MIGRATION_COLO_H
+
+#include "qemu-common.h"
+
+bool colo_supported(void);
+
+#endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index a6e025a..06767b2 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -154,6 +154,7 @@ int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
+bool migrate_enable_colo(void);
 
 int64_t xbzrle_cache_resize(int64_t new_size);
 
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index d929e96..5a25d39 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,4 +1,5 @@
 common-obj-y += migration.o tcp.o
+common-obj-$(CONFIG_COLO) += colo.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
 common-obj-y += xbzrle.o
diff --git a/migration/colo.c b/migration/colo.c
new file mode 100644
index 0000000..bcd753b
--- /dev/null
+++ b/migration/colo.c
@@ -0,0 +1,18 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "migration/migration-colo.h"
+
+bool colo_supported(void)
+{
+    return true;
+}
diff --git a/migration/migration.c b/migration/migration.c
index 732d229..579524f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -25,6 +25,7 @@
 #include "qemu/thread.h"
 #include "qmp-commands.h"
 #include "trace.h"
+#include "migration/migration-colo.h"
 
 #define MAX_THROTTLE  (32 << 20)      /* Migration speed throttling */
 
@@ -172,6 +173,9 @@ MigrationCapabilityStatusList *qmp_query_migrate_capabilities(Error **errp)
 
     caps = NULL; /* silence compiler warning */
     for (i = 0; i < MIGRATION_CAPABILITY_MAX; i++) {
+        if (i == MIGRATION_CAPABILITY_COLO && !colo_supported()) {
+            continue;
+        }
         if (head == NULL) {
             head = g_malloc0(sizeof(*caps));
             caps = head;
@@ -312,6 +316,13 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
     }
 
     for (cap = params; cap; cap = cap->next) {
+        if (cap->value->capability == MIGRATION_CAPABILITY_COLO &&
+            cap->value->state && !colo_supported()) {
+            error_setg(errp, "COLO is not currently supported; please"
+                             " reconfigure with the --enable-colo option to"
+                             " enable the COLO feature");
+            continue;
+        }
         s->enabled_capabilities[cap->value->capability] = cap->value->state;
     }
 }
@@ -726,6 +737,12 @@ int64_t migrate_xbzrle_cache_size(void)
     return s->xbzrle_cache_size;
 }
 
+bool migrate_enable_colo(void)
+{
+    MigrationState *s = migrate_get_current();
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_COLO];
+}
+
 /* migration thread support */
 
 static void *migration_thread(void *opaque)
diff --git a/qapi-schema.json b/qapi-schema.json
index f97ffa1..e856d44 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -526,11 +526,14 @@
 # @auto-converge: If enabled, QEMU will automatically throttle down the guest
 #          to speed up convergence of RAM migration. (since 1.6)
 #
+# @colo: If enabled, migration will never end; the state of the VM on the primary
+#        side will be migrated continuously to the VM on the secondary side. (since 2.4)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
-           'compress'] }
+           'compress', 'colo'] }
 
 ##
 # @MigrationCapabilityStatus
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 8beff4c..65a7171 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -39,3 +39,4 @@ stub-obj-$(CONFIG_WIN32) += fd-register.o
 stub-obj-y += cpus.o
 stub-obj-y += kvm.o
 stub-obj-y += qmp_pc_dimm_device_list.o
+stub-obj-y += migration-colo.o
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
new file mode 100644
index 0000000..cd0903f
--- /dev/null
+++ b/stubs/migration-colo.c
@@ -0,0 +1,18 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "migration/migration-colo.h"
+
+bool colo_supported(void)
+{
+    return false;
+}
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 03/29] COLO: migrate colo related info to slave
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

We can tell whether the VM on the destination should go into COLO mode
by referring to the info that has been migrated from the PVM.
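The migrated info is just one byte. A toy version with a plain buffer standing in for QEMUFile (the real handlers use qemu_put_byte()/qemu_get_byte() on the migration stream):

```c
#include <stdbool.h>
#include <stdint.h>

/* Destination-side flag: did the source request COLO? */
static bool colo_requested;

/* Source side: write one byte recording whether COLO is enabled. */
static void colo_info_save(uint8_t *stream, bool colo_enabled)
{
    stream[0] = colo_enabled;
}

/* Destination side: read the byte and remember whether to enter COLO mode. */
static int colo_info_load(const uint8_t *stream)
{
    colo_requested = stream[0];
    return 0;
}
```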

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 include/migration/migration-colo.h |  2 ++
 migration/Makefile.objs            |  1 +
 migration/colo-comm.c              | 47 ++++++++++++++++++++++++++++++++++++++
 trace-events                       |  3 +++
 vl.c                               |  5 +++-
 5 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100644 migration/colo-comm.c

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index c6d0c51..e20a0c1 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -14,7 +14,9 @@
 #define QEMU_MIGRATION_COLO_H
 
 #include "qemu-common.h"
+#include "migration/migration.h"
 
 bool colo_supported(void);
+void colo_info_mig_init(void);
 
 #endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 5a25d39..cb7bd30 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,5 +1,6 @@
 common-obj-y += migration.o tcp.o
 common-obj-$(CONFIG_COLO) += colo.o
+common-obj-y += colo-comm.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
 common-obj-y += xbzrle.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
new file mode 100644
index 0000000..0b76eb4
--- /dev/null
+++ b/migration/colo-comm.c
@@ -0,0 +1,47 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "migration/migration-colo.h"
+#include "trace.h"
+
+static bool colo_requested;
+
+/* save */
+static void colo_info_save(QEMUFile *f, void *opaque)
+{
+    qemu_put_byte(f, migrate_enable_colo());
+}
+
+/* restore */
+static int colo_info_load(QEMUFile *f, void *opaque, int version_id)
+{
+    int value = qemu_get_byte(f);
+
+    if (value && !colo_requested) {
+        trace_colo_info_load("COLO request!");
+    }
+    colo_requested = value;
+
+    return 0;
+}
+
+static SaveVMHandlers savevm_colo_info_handlers = {
+    .save_state = colo_info_save,
+    .load_state = colo_info_load,
+};
+
+void colo_info_mig_init(void)
+{
+    register_savevm_live(NULL, "colo", -1, 1,
+                         &savevm_colo_info_handlers, NULL);
+}
diff --git a/trace-events b/trace-events
index 11387c3..d927cf3 100644
--- a/trace-events
+++ b/trace-events
@@ -1443,6 +1443,9 @@ rdma_start_incoming_migration_after_rdma_listen(void) ""
 rdma_start_outgoing_migration_after_rdma_connect(void) ""
 rdma_start_outgoing_migration_after_rdma_source_init(void) ""
 
+# migration/colo-comm.c
+colo_info_load(const char *msg) "%s"
+
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
 kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
diff --git a/vl.c b/vl.c
index 15bccc4..c42cef3 100644
--- a/vl.c
+++ b/vl.c
@@ -90,6 +90,7 @@ int main(int argc, char **argv)
 #include "sysemu/dma.h"
 #include "audio/audio.h"
 #include "migration/migration.h"
+#include "migration/migration-colo.h"
 #include "sysemu/kvm.h"
 #include "qapi/qmp/qjson.h"
 #include "qemu/option.h"
@@ -4174,7 +4175,9 @@ int main(int argc, char **argv, char **envp)
 
     blk_mig_init();
     ram_mig_init();
-
+#ifdef CONFIG_COLO
+    colo_info_mig_init();
+#endif
     /* If the currently selected machine wishes to override the units-per-bus
      * property of its default HBA interface type, do so now. */
     if (machine_class->units_per_default_bus) {
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 04/29] migration: Integrate COLO checkpoint process into migration
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	david

Add a new migration state, MIGRATION_STATUS_COLO; we enter this state
after the first live migration has finished successfully.
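migrate_set_state() transitions the state atomically so a racing transition (e.g. a cancel) cannot be silently overwritten. A simplified C11 sketch of that compare-and-swap pattern (QEMU uses its own atomic_cmpxchg() wrapper, not stdatomic.h):

```c
#include <stdatomic.h>

enum { MIG_ACTIVE, MIG_CANCELLING, MIG_COLO, MIG_COMPLETED };

/*
 * Move the migration state from old_state to new_state only if it still
 * equals old_state; if another thread changed it first, the transition
 * is dropped rather than clobbering the newer state.
 */
static void migrate_set_state(_Atomic int *state, int old_state, int new_state)
{
    int expected = old_state;

    atomic_compare_exchange_strong(state, &expected, new_state);
}
```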

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/migration/migration-colo.h |  3 +++
 include/migration/migration.h      |  2 ++
 migration/colo.c                   | 55 ++++++++++++++++++++++++++++++++++++++
 migration/migration.c              | 22 +++++++++++----
 qapi-schema.json                   |  2 +-
 stubs/migration-colo.c             |  9 +++++++
 trace-events                       |  3 +++
 7 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index e20a0c1..b4f75c2 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -19,4 +19,7 @@
 bool colo_supported(void);
 void colo_info_mig_init(void);
 
+void colo_init_checkpointer(MigrationState *s);
+bool migrate_in_colo_state(void);
+
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 06767b2..63259c9 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -66,6 +66,8 @@ struct MigrationState
     int64_t dirty_sync_count;
 };
 
+void migrate_set_state(MigrationState *s, int old_state, int new_state);
+
 void process_incoming_migration(QEMUFile *f);
 
 void qemu_start_incoming_migration(const char *uri, Error **errp);
diff --git a/migration/colo.c b/migration/colo.c
index bcd753b..1ff4e55 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,9 +10,64 @@
  * later.  See the COPYING file in the top-level directory.
  */
 
+#include "sysemu/sysemu.h"
 #include "migration/migration-colo.h"
+#include "trace.h"
+
+static QEMUBH *colo_bh;
 
 bool colo_supported(void)
 {
     return true;
 }
+
+bool migrate_in_colo_state(void)
+{
+    MigrationState *s = migrate_get_current();
+    return (s->state == MIGRATION_STATUS_COLO);
+}
+
+static void *colo_thread(void *opaque)
+{
+    MigrationState *s = opaque;
+
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("stop", "run");
+
+    /*TODO: COLO checkpoint savevm loop*/
+
+    migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
+
+    qemu_mutex_lock_iothread();
+    qemu_bh_schedule(s->cleanup_bh);
+    qemu_mutex_unlock_iothread();
+
+    return NULL;
+}
+
+static void colo_start_checkpointer(void *opaque)
+{
+    MigrationState *s = opaque;
+
+    if (colo_bh) {
+        qemu_bh_delete(colo_bh);
+        colo_bh = NULL;
+    }
+
+    qemu_mutex_unlock_iothread();
+    qemu_thread_join(&s->thread);
+    qemu_mutex_lock_iothread();
+
+    migrate_set_state(s, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_COLO);
+
+    qemu_thread_create(&s->thread, "colo", colo_thread, s,
+                       QEMU_THREAD_JOINABLE);
+}
+
+void colo_init_checkpointer(MigrationState *s)
+{
+    colo_bh = qemu_bh_new(colo_start_checkpointer, s);
+    qemu_bh_schedule(colo_bh);
+}
diff --git a/migration/migration.c b/migration/migration.c
index 579524f..eea31f4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -268,6 +268,10 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 
         get_xbzrle_cache_stats(info);
         break;
+    case MIGRATION_STATUS_COLO:
+        info->has_status = true;
+        /* TODO: display COLO specific information (checkpoint info etc.) */
+        break;
     case MIGRATION_STATUS_COMPLETED:
         get_xbzrle_cache_stats(info);
 
@@ -370,7 +374,7 @@ void qmp_migrate_set_parameters(bool has_compress_level,
 
 /* shared migration helpers */
 
-static void migrate_set_state(MigrationState *s, int old_state, int new_state)
+void migrate_set_state(MigrationState *s, int old_state, int new_state)
 {
     if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
         trace_migrate_set_state(new_state);
@@ -754,6 +758,7 @@ static void *migration_thread(void *opaque)
     int64_t max_size = 0;
     int64_t start_time = initial_time;
     bool old_vm_running = false;
+    bool enable_colo = migrate_enable_colo();
 
     qemu_savevm_state_begin(s->file, &s->params);
 
@@ -791,8 +796,10 @@ static void *migration_thread(void *opaque)
                 }
 
                 if (!qemu_file_get_error(s->file)) {
-                    migrate_set_state(s, MIGRATION_STATUS_ACTIVE,
-                                      MIGRATION_STATUS_COMPLETED);
+                    if (!enable_colo) {
+                        migrate_set_state(s, MIGRATION_STATUS_ACTIVE,
+                                          MIGRATION_STATUS_COMPLETED);
+                    }
                     break;
                 }
             }
@@ -843,11 +850,16 @@ static void *migration_thread(void *opaque)
         }
         runstate_set(RUN_STATE_POSTMIGRATE);
     } else {
-        if (old_vm_running) {
+        if (s->state == MIGRATION_STATUS_ACTIVE && enable_colo) {
+            colo_init_checkpointer(s);
+        } else if (old_vm_running) {
             vm_start();
         }
     }
-    qemu_bh_schedule(s->cleanup_bh);
+
+    if (!enable_colo) {
+        qemu_bh_schedule(s->cleanup_bh);
+    }
     qemu_mutex_unlock_iothread();
 
     return NULL;
diff --git a/qapi-schema.json b/qapi-schema.json
index e856d44..87f14a7 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -433,7 +433,7 @@
 ##
 { 'enum': 'MigrationStatus',
   'data': [ 'none', 'setup', 'cancelling', 'cancelled',
-            'active', 'completed', 'failed' ] }
+            'active', 'completed', 'failed', 'colo' ] }
 
 ##
 # @MigrationInfo
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index cd0903f..c0bb8d8 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -16,3 +16,12 @@ bool colo_supported(void)
 {
     return false;
 }
+
+bool migrate_in_colo_state(void)
+{
+    return false;
+}
+
+void colo_init_checkpointer(MigrationState *s)
+{
+}
diff --git a/trace-events b/trace-events
index d927cf3..6787009 100644
--- a/trace-events
+++ b/trace-events
@@ -1446,6 +1446,9 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
 # migration/colo-comm.c
 colo_info_load(const char *msg) "%s"
 
+# migration/colo.c
+colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
+
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
 kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 05/29] migration: Integrate COLO checkpoint process into loadvm
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (4 preceding siblings ...)
  (?)
@ 2015-05-21  8:12 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

Switch from the normal migration loadvm process to the COLO checkpoint
process when COLO mode is enabled.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 include/migration/migration-colo.h | 13 +++++++++++++
 migration/colo-comm.c              | 10 ++++++++++
 migration/colo.c                   | 15 +++++++++++++++
 migration/migration.c              | 21 ++++++++++++++++++++-
 stubs/migration-colo.c             |  5 +++++
 trace-events                       |  1 +
 6 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index b4f75c2..b2798f7 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -15,11 +15,24 @@
 
 #include "qemu-common.h"
 #include "migration/migration.h"
+#include "block/coroutine.h"
+#include "qemu/thread.h"
 
 bool colo_supported(void);
 void colo_info_mig_init(void);
 
+struct colo_incoming {
+    QEMUFile *file;
+    QemuThread thread;
+};
+
 void colo_init_checkpointer(MigrationState *s);
 bool migrate_in_colo_state(void);
 
+/* loadvm */
+extern Coroutine *migration_incoming_co;
+bool loadvm_enable_colo(void);
+void loadvm_exit_colo(void);
+void *colo_process_incoming_checkpoints(void *opaque);
+bool loadvm_in_colo_state(void);
 #endif
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
index 0b76eb4..f8be027 100644
--- a/migration/colo-comm.c
+++ b/migration/colo-comm.c
@@ -45,3 +45,13 @@ void colo_info_mig_init(void)
     register_savevm_live(NULL, "colo", -1, 1,
                          &savevm_colo_info_handlers, NULL);
 }
+
+bool loadvm_enable_colo(void)
+{
+    return colo_requested;
+}
+
+void loadvm_exit_colo(void)
+{
+    colo_requested = false;
+}
diff --git a/migration/colo.c b/migration/colo.c
index 1ff4e55..33d3105 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -13,8 +13,10 @@
 #include "sysemu/sysemu.h"
 #include "migration/migration-colo.h"
 #include "trace.h"
+#include "qemu/error-report.h"
 
 static QEMUBH *colo_bh;
+static Coroutine *colo;
 
 bool colo_supported(void)
 {
@@ -71,3 +73,16 @@ void colo_init_checkpointer(MigrationState *s)
     colo_bh = qemu_bh_new(colo_start_checkpointer, s);
     qemu_bh_schedule(colo_bh);
 }
+
+void *colo_process_incoming_checkpoints(void *opaque)
+{
+    colo = qemu_coroutine_self();
+    assert(colo != NULL);
+
+    /* TODO: COLO checkpoint restore loop */
+
+    colo = NULL;
+    loadvm_exit_colo();
+
+    return NULL;
+}
diff --git a/migration/migration.c b/migration/migration.c
index eea31f4..fc79ab8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -110,6 +110,7 @@ void qemu_start_incoming_migration(const char *uri, Error **errp)
     }
 }
 
+Coroutine *migration_incoming_co;
 static void process_incoming_migration_co(void *opaque)
 {
     QEMUFile *f = opaque;
@@ -117,7 +118,25 @@ static void process_incoming_migration_co(void *opaque)
     int ret;
 
     ret = qemu_loadvm_state(f);
-    qemu_fclose(f);
+
+    /* we get colo info, and know if we are in colo mode */
+    if (loadvm_enable_colo()) {
+        struct colo_incoming *colo_in = g_malloc0(sizeof(*colo_in));
+
+        colo_in->file = f;
+        migration_incoming_co = qemu_coroutine_self();
+        qemu_thread_create(&colo_in->thread, "colo incoming",
+             colo_process_incoming_checkpoints, colo_in, QEMU_THREAD_JOINABLE);
+        qemu_coroutine_yield();
+        migration_incoming_co = NULL;
+#if 0
+        /* FIXME  wait checkpoint incoming thread exit, and free resource */
+        qemu_thread_join(&colo_in->thread);
+        g_free(colo_in);
+#endif
+    } else {
+        qemu_fclose(f);
+    }
     free_xbzrle_decoded_buf();
     if (ret < 0) {
         error_report("load of migration failed: %s", strerror(-ret));
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index c0bb8d8..45b992a 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -25,3 +25,8 @@ bool migrate_in_colo_state(void)
 void colo_init_checkpointer(MigrationState *s)
 {
 }
+
+void *colo_process_incoming_checkpoints(void *opaque)
+{
+    return NULL;
+}
diff --git a/trace-events b/trace-events
index 6787009..2b95743 100644
--- a/trace-events
+++ b/trace-events
@@ -1448,6 +1448,7 @@ colo_info_load(const char *msg) "%s"
 
 # migration/colo.c
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
+colo_receive_message(const char *msg) "Receive '%s'"
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 06/29] COLO: Implement colo checkpoint protocol
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (5 preceding siblings ...)
  (?)
@ 2015-05-21  8:12 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

We need a user-defined communication protocol to control the checkpoint
process.

A new checkpoint request is started by the Primary VM, and the interaction
proceeds as below:
Checkpoint synchronizing points,

                  Primary                 Secondary
  NEW             @
                                          Suspend
  SUSPENDED                               @
                  Suspend&Save state
  SEND            @
                  Send state              Receive state
  RECEIVED                                @
                  Flush network           Load state
  LOADED                                  @
                  Resume                  Resume

                  Start Comparing
NOTE:
 1) '@' marks who sends the message
 2) Every sync-point is synchronized by the two sides with only
    one handshake (single direction) for low latency.
    If stricter synchronization is required, an opposite-direction
    sync-point should be added.
 3) Since sync-points are single direction, the remote side may
    be far ahead by the time this side receives the sync-point.
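
The primary-side ordering described above can be sketched as follows. This is an illustrative model only, not QEMU's code: the message names mirror the patch's enum, but the "channel" here is just an in-memory queue so the handshake ordering can be checked without sockets.

```c
/* Minimal sketch (illustrative only, not QEMU's code) of one checkpoint
 * transaction as seen from the primary: each put/get pair below stands
 * for one single-direction sync-point of the protocol. */
#include <assert.h>
#include <stdint.h>

enum { CP_NEW = 1, CP_SUSPENDED, CP_SEND, CP_RECEIVED, CP_LOADED };

#define QCAP 16
typedef struct { uint64_t m[QCAP]; int head, tail; } Chan;

static void ctl_put(Chan *c, uint64_t msg) { c->m[c->tail++ % QCAP] = msg; }
static uint64_t ctl_get(Chan *c)           { return c->m[c->head++ % QCAP]; }

/* primary-side transaction: returns 0 if every ack arrives in order */
static int checkpoint_transaction(Chan *to_secondary, Chan *from_secondary)
{
    ctl_put(to_secondary, CP_NEW);                 /* request checkpoint */
    if (ctl_get(from_secondary) != CP_SUSPENDED) {
        return -1;
    }
    /* ...suspend the primary and save VM state here... */
    ctl_put(to_secondary, CP_SEND);                /* state follows */
    if (ctl_get(from_secondary) != CP_RECEIVED) {
        return -1;
    }
    if (ctl_get(from_secondary) != CP_LOADED) {
        return -1;
    }
    /* ...resume the primary here... */
    return 0;
}
```

Note how the primary never waits for an ack of CP_NEW itself; the secondary's CP_SUSPENDED message doubles as that ack, which is the "only one handshake per sync-point" property the commit message describes.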

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 migration/colo.c | 237 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 235 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 33d3105..7663144 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -15,6 +15,41 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 
+enum {
+    COLO_CHECPOINT_READY = 0x46,
+
+    /*
+    * Checkpoint synchronizing points.
+    *
+    *                  Primary                 Secondary
+    *  NEW             @
+    *                                          Suspend
+    *  SUSPENDED                               @
+    *                  Suspend&Save state
+    *  SEND            @
+    *                  Send state              Receive state
+    *  RECEIVED                                @
+    *                  Flush network           Load state
+    *  LOADED                                  @
+    *                  Resume                  Resume
+    *
+    *                  Start Comparing
+    * NOTE:
+    * 1) '@' marks who sends the message
+    * 2) Every sync-point is synchronized by the two sides with only
+    *    one handshake (single direction) for low latency.
+    *    If stricter synchronization is required, an opposite-direction
+    *    sync-point should be added.
+    * 3) Since sync-points are single direction, the remote side may
+    *    be far ahead by the time this side receives the sync-point.
+    */
+    COLO_CHECKPOINT_NEW,
+    COLO_CHECKPOINT_SUSPENDED,
+    COLO_CHECKPOINT_SEND,
+    COLO_CHECKPOINT_RECEIVED,
+    COLO_CHECKPOINT_LOADED,
+};
+
 static QEMUBH *colo_bh;
 static Coroutine *colo;
 
@@ -29,19 +64,136 @@ bool migrate_in_colo_state(void)
     return (s->state == MIGRATION_STATUS_COLO);
 }
 
+/* colo checkpoint control helper */
+static int colo_ctl_put(QEMUFile *f, uint64_t request)
+{
+    int ret = 0;
+
+    qemu_put_be64(f, request);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+
+    return ret;
+}
+
+static int colo_ctl_get_value(QEMUFile *f, uint64_t *value)
+{
+    int ret = 0;
+    uint64_t temp;
+
+    temp = qemu_get_be64(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        return -1;
+    }
+
+    *value = temp;
+    return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint64_t require)
+{
+    int ret;
+    uint64_t value;
+
+    ret = colo_ctl_get_value(f, &value);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (value != require) {
+        error_report("unexpected state! expected: %"PRIu64
+                     ", received: %"PRIu64, require, value);
+        exit(1);
+    }
+
+    return ret;
+}
+
+static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
+{
+    int ret;
+
+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_SUSPENDED);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* TODO: suspend and save vm state to colo buffer */
+
+    ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* TODO: send vmstate to slave */
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
+    if (ret < 0) {
+        goto out;
+    }
+    trace_colo_receive_message("COLO_CHECKPOINT_RECEIVED");
+
+    ret = colo_ctl_get(control, COLO_CHECKPOINT_LOADED);
+    if (ret < 0) {
+        goto out;
+    }
+    trace_colo_receive_message("COLO_CHECKPOINT_LOADED");
+
+    /* TODO: resume master */
+
+out:
+    return ret;
+}
+
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
+    QEMUFile *colo_control = NULL;
+    int ret;
+
+    colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
+    if (!colo_control) {
+        error_report("Open colo_control failed!");
+        goto out;
+    }
+
+    /*
+     * Wait for the slave to finish loading vm states and enter COLO
+     * restore.
+     */
+    ret = colo_ctl_get(colo_control, COLO_CHECPOINT_READY);
+    if (ret < 0) {
+        goto out;
+    }
+    trace_colo_receive_message("COLO_CHECPOINT_READY");
 
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
 
-    /*TODO: COLO checkpoint savevm loop*/
+    while (s->state == MIGRATION_STATUS_COLO) {
+        /* start a colo checkpoint */
+        if (colo_do_checkpoint_transaction(s, colo_control)) {
+            goto out;
+        }
+    }
 
+out:
     migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
 
+    if (colo_control) {
+        qemu_fclose(colo_control);
+    }
+
     qemu_mutex_lock_iothread();
     qemu_bh_schedule(s->cleanup_bh);
     qemu_mutex_unlock_iothread();
@@ -74,14 +226,95 @@ void colo_init_checkpointer(MigrationState *s)
     qemu_bh_schedule(colo_bh);
 }
 
+/*
+ * return:
+ * 0: start a checkpoint
+ * -1: some error happened, exit colo restore
+ */
+static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
+{
+    int ret;
+    uint64_t cmd;
+
+    ret = colo_ctl_get_value(f, &cmd);
+    if (ret < 0) {
+        return -1;
+    }
+
+    switch (cmd) {
+    case COLO_CHECKPOINT_NEW:
+        *checkpoint_request = 1;
+        return 0;
+    default:
+        return -1;
+    }
+}
+
 void *colo_process_incoming_checkpoints(void *opaque)
 {
+    struct colo_incoming *colo_in = opaque;
+    QEMUFile *f = colo_in->file;
+    int fd = qemu_get_fd(f);
+    QEMUFile *ctl = NULL;
+    int ret;
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
-    /* TODO: COLO checkpoint restore loop */
+    ctl = qemu_fopen_socket(fd, "wb");
+    if (!ctl) {
+        error_report("Can't open incoming channel!");
+        goto out;
+    }
+    ret = colo_ctl_put(ctl, COLO_CHECPOINT_READY);
+    if (ret < 0) {
+        goto out;
+    }
+    /* TODO: in COLO mode, slave is running, so start the vm */
+    while (true) {
+        int request = 0;
+        int ret = colo_wait_handle_cmd(f, &request);
+
+        if (ret < 0) {
+            break;
+        } else {
+            if (!request) {
+                continue;
+            }
+        }
 
+        /* TODO: suspend guest */
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
+        if (ret < 0) {
+            goto out;
+        }
+
+        ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
+        if (ret < 0) {
+            goto out;
+        }
+        trace_colo_receive_message("COLO_CHECKPOINT_SEND");
+
+        /* TODO: read migration data into colo buffer */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
+        if (ret < 0) {
+            goto out;
+        }
+        trace_colo_receive_message("COLO_CHECKPOINT_RECEIVED");
+
+        /* TODO: load vm state */
+
+        ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
+        if (ret < 0) {
+            goto out;
+        }
+}
+
+out:
     colo = NULL;
+    if (ctl) {
+        qemu_fclose(ctl);
+    }
     loadvm_exit_colo();
 
     return NULL;
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 07/29] COLO: Add a new RunState RUN_STATE_COLO
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (6 preceding siblings ...)
  (?)
@ 2015-05-21  8:12 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	david

The guest will enter this state when it is paused to save/restore VM state
during a COLO checkpoint.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 qapi-schema.json | 5 ++++-
 vl.c             | 8 ++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 87f14a7..54eb707 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -148,12 +148,15 @@
 # @watchdog: the watchdog action is configured to pause and has been triggered
 #
 # @guest-panicked: guest has been panicked as a result of guest OS panic
+#
+# @colo: guest is paused to save/restore VM state under colo checkpoint (since
+# 2.4)
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
             'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
             'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
-            'guest-panicked' ] }
+            'guest-panicked', 'colo' ] }
 
 ##
 # @StatusInfo:
diff --git a/vl.c b/vl.c
index c42cef3..822bd08 100644
--- a/vl.c
+++ b/vl.c
@@ -551,6 +551,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
+    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },
 
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -560,6 +561,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
     { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_PAUSED, RUN_STATE_COLO},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -570,9 +572,12 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
 
+    { RUN_STATE_COLO, RUN_STATE_RUNNING },
+
     { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
     { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
     { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
@@ -583,6 +588,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
     { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
     { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
+    { RUN_STATE_RUNNING, RUN_STATE_COLO},
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
@@ -593,9 +599,11 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},
 
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 08/29] QEMUSizedBuffer: Introduce two help functions for qsb
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (7 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Yang Hongyang,
	david

Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer
VM state:
One is qsb_put_buffer(), which puts the content of a given QEMUSizedBuffer
into a QEMUFile; this is used to send buffered VM state to the secondary.
The other is qsb_fill_buffer(), which reads 'size' bytes of data from the
file into the qsb; this is used to receive VM state from the socket into a
buffer.
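
Since a QEMUSizedBuffer is backed by an iovec array, the fill operation has to walk the iovec elements, filling each one completely before moving to the next. A standalone sketch of that loop (hypothetical helper, not QEMU's qsb implementation) using plain POSIX reads:

```c
/* Standalone sketch (not QEMU's implementation) of what a
 * qsb_fill_buffer()-style helper must do: read up to 'size' bytes from
 * a blocking fd into a scatter-gather buffer, filling each iovec
 * element fully before moving on, returning a short count on EOF/error. */
#include <assert.h>
#include <sys/uio.h>
#include <unistd.h>

static size_t sgbuf_fill(struct iovec *iov, int n_iov, int fd, size_t size)
{
    size_t used = 0, pending = size;

    for (int i = 0; i < n_iov && pending > 0; i++) {
        size_t done = 0;
        /* read until this iovec element is full or nothing is pending */
        while (done < iov[i].iov_len && pending > 0) {
            size_t want = iov[i].iov_len - done;
            if (want > pending) {
                want = pending;
            }
            ssize_t r = read(fd, (char *)iov[i].iov_base + done, want);
            if (r <= 0) {
                return used;           /* EOF or error: short count */
            }
            done += (size_t)r;
            pending -= (size_t)r;
            used += (size_t)r;
        }
    }
    return used;
}
```

The short-count-on-error behaviour matches the commit message: with blocking fds a short return can only mean the stream ended early.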

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/qemu-file.h |  3 ++-
 migration/qemu-file-buf.c     | 58 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index a01c5b8..9dfd05e 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -140,7 +140,8 @@ ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count,
                        uint8_t *buf);
 ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf,
                      off_t pos, size_t count);
-
+void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size);
+int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size);
 
 /*
  * For use on files opened with qemu_bufopen
diff --git a/migration/qemu-file-buf.c b/migration/qemu-file-buf.c
index 16a51a1..686f417 100644
--- a/migration/qemu-file-buf.c
+++ b/migration/qemu-file-buf.c
@@ -365,6 +365,64 @@ ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *source,
     return count;
 }
 
+
+/**
+ * Put the content of a given QEMUSizedBuffer into QEMUFile.
+ *
+ * @f: A QEMUFile
+ * @qsb: A QEMUSizedBuffer
+ * @size: size of content to write
+ */
+void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size)
+{
+    int i, l;
+
+    for (i = 0; i < qsb->n_iov && size > 0; i++) {
+        l = MIN(qsb->iov[i].iov_len, size);
+        qemu_put_buffer(f, qsb->iov[i].iov_base, l);
+        size -= l;
+    }
+}
+
+/*
+ * Read 'size' bytes of data from the file into qsb.
+ * It always fills from position 0 and should be used after qsb_create().
+ *
+ * It will return 'size' bytes unless there was an error, in which case it
+ * returns as many bytes as it managed to read (assuming blocking fds,
+ * which all current QEMUFiles are)
+ */
+int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size)
+{
+    ssize_t rc = qsb_grow(qsb, size);
+    int pending = size, i;
+    qsb->used = 0;
+    uint8_t *buf = NULL;
+
+    if (rc < 0) {
+        return rc;
+    }
+
+    for (i = 0; i < qsb->n_iov && pending > 0; i++) {
+        int doneone = 0;
+        /* read until iov full */
+        while (doneone < qsb->iov[i].iov_len && pending > 0) {
+            int readone = 0;
+            buf = qsb->iov[i].iov_base;
+            readone = qemu_get_buffer(f, buf,
+                                MIN(qsb->iov[i].iov_len - doneone, pending));
+            if (readone == 0) {
+                return qsb->used;
+            }
+            buf += readone;
+            doneone += readone;
+            pending -= readone;
+            qsb->used += readone;
+        }
+    }
+    return qsb->used;
+}
+
 typedef struct QEMUBuffer {
     QEMUSizedBuffer *qsb;
     QEMUFile *file;
-- 
1.7.12.4


* [Qemu-devel] [PATCH COLO-Frame v5 09/29] COLO: Save VM state to slave when do checkpoint
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (8 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

We should save the PVM's RAM/device state to the slave when needed.

For the VM state, we cache it on the slave, using a QEMUSizedBuffer to
store the data. We need to know the size of the VM state, so on the
master we use a qsb to store the VM state temporarily and then migrate
the data to the slave.
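
The wire format this implies is simple length-prefixed framing: the master sends the total vmstate size as a big-endian 64-bit value (the patch reuses colo_ctl_put for this), then the buffered payload. A sketch with hypothetical helper names, not QEMU's code:

```c
/* Sketch (illustrative only, not QEMU's code) of the framing used here:
 * size first as a be64, then the buffered vmstate, so the receiver
 * knows exactly how much to buffer before loading it. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void put_be64(int fd, uint64_t v)
{
    uint8_t b[8];
    for (int i = 0; i < 8; i++) {
        b[i] = (uint8_t)(v >> (56 - 8 * i));   /* big-endian encode */
    }
    assert(write(fd, b, 8) == 8);
}

static uint64_t get_be64(int fd)
{
    uint8_t b[8];
    uint64_t v = 0;
    assert(read(fd, b, 8) == 8);
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | b[i];
    }
    return v;
}

/* master side: total size first, then the buffered state */
static void send_vmstate(int fd, const void *state, uint64_t size)
{
    put_be64(fd, size);
    assert(write(fd, state, size) == (ssize_t)size);
}

/* slave side: learn the size, then buffer exactly that much */
static void *recv_vmstate(int fd, uint64_t *size)
{
    *size = get_be64(fd);
    void *buf = malloc(*size);
    assert(buf && read(fd, buf, *size) == (ssize_t)*size);
    return buf;
}
```

Buffering by announced size is what lets the slave hold the complete checkpoint before touching its own state, which matters for failover safety.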

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 arch_init.c      | 48 +++++++++++++++++++++++++++++++++----------
 migration/colo.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++----
 savevm.c         |  2 +-
 3 files changed, 96 insertions(+), 16 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 23d3feb..6fbc82d 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -54,6 +54,7 @@
 #include "hw/acpi/acpi.h"
 #include "qemu/host-utils.h"
 #include "qemu/rcu_queue.h"
+#include "migration/migration-colo.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -1185,16 +1186,8 @@ static void reset_ram_globals(void)
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
 
-
-/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
- * long-running RCU critical section.  When rcu-reclaims in the code
- * start to become numerous it will be necessary to reduce the
- * granularity of these critical sections.
- */
-
-static int ram_save_setup(QEMUFile *f, void *opaque)
+static int ram_save_init_globals(void)
 {
-    RAMBlock *block;
     int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
 
     mig_throttle_on = false;
@@ -1253,6 +1246,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     migration_bitmap_sync();
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
+    rcu_read_unlock();
+
+    return 0;
+}
+
+/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
+ * long-running RCU critical section.  When rcu-reclaims in the code
+ * start to become numerous it will be necessary to reduce the
+ * granularity of these critical sections.
+ */
+
+static int ram_save_setup(QEMUFile *f, void *opaque)
+{
+    RAMBlock *block;
+
+    /*
+     * migration has already setup the bitmap, reuse it.
+     */
+    if (!migrate_in_colo_state()) {
+        if (ram_save_init_globals() < 0) {
+            return -1;
+         }
+    }
+
+    rcu_read_lock();
 
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
@@ -1352,7 +1370,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     while (true) {
         int pages;
 
-        pages = ram_find_and_save_block(f, true, &bytes_transferred);
+        pages = ram_find_and_save_block(f, !migrate_in_colo_state(),
+                                        &bytes_transferred);
         /* no more blocks to sent */
         if (pages == 0) {
             break;
@@ -1361,7 +1380,14 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
     flush_compressed_data(f);
     ram_control_after_iterate(f, RAM_CONTROL_FINISH);
-    migration_end();
+
+    /*
+     * Since we need to reuse dirty bitmap in colo,
+     * don't cleanup the bitmap.
+     */
+    if (!migrate_enable_colo() || migration_has_failed(migrate_get_current())) {
+        migration_end();
+    }
 
     rcu_read_unlock();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
diff --git a/migration/colo.c b/migration/colo.c
index 7663144..8ff03e7 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -52,6 +52,9 @@ enum {
 
 static QEMUBH *colo_bh;
 static Coroutine *colo;
+/* colo buffer */
+#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
+QEMUSizedBuffer *colo_buffer;
 
 bool colo_supported(void)
 {
@@ -115,6 +118,8 @@ static int colo_ctl_get(QEMUFile *f, uint64_t require)
 static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
 {
     int ret;
+    size_t size;
+    QEMUFile *trans = NULL;
 
     ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
     if (ret < 0) {
@@ -125,16 +130,47 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     if (ret < 0) {
         goto out;
     }
+    /* Reset colo buffer and open it for write */
+    qsb_set_length(colo_buffer, 0);
+    trans = qemu_bufopen("w", colo_buffer);
+    if (!trans) {
+        error_report("Open colo buffer for write failed");
+        goto out;
+    }
+
+    /* suspend and save vm state to colo buffer */
+    qemu_mutex_lock_iothread();
+    vm_stop_force_state(RUN_STATE_COLO);
+    qemu_mutex_unlock_iothread();
+    DPRINTF("vm is stopped\n");
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(trans, &s->params);
+    qemu_mutex_lock_iothread();
+    qemu_savevm_state_complete(trans);
+    qemu_mutex_unlock_iothread();
 
-    /* TODO: suspend and save vm state to colo buffer */
+    qemu_fflush(trans);
 
     ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
     if (ret < 0) {
         goto out;
     }
+    /* we send the total size of the vmstate first */
+    size = qsb_get_length(colo_buffer);
+    ret = colo_ctl_put(s->file, size);
+    if (ret < 0) {
+        goto out;
+    }
 
-    /* TODO: send vmstate to slave */
-
+    qsb_put_buffer(s->file, colo_buffer, size);
+    qemu_fflush(s->file);
+    ret = qemu_file_get_error(s->file);
+    if (ret < 0) {
+        goto out;
+    }
     ret = colo_ctl_get(control, COLO_CHECKPOINT_RECEIVED);
     if (ret < 0) {
         goto out;
@@ -147,9 +183,18 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     }
     trace_colo_receive_message("COLO_CHECKPOINT_LOADED");
 
-    /* TODO: resume master */
+    ret = 0;
+    /* resume master */
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    DPRINTF("vm resume to run again\n");
 
 out:
+    if (trans) {
+        qemu_fclose(trans);
+    }
+
     return ret;
 }
 
@@ -175,6 +220,12 @@ static void *colo_thread(void *opaque)
     }
     trace_colo_receive_message("COLO_CHECPOINT_READY");
 
+    colo_buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
+    if (colo_buffer == NULL) {
+        error_report("Failed to allocate colo buffer!");
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
@@ -190,6 +241,9 @@ static void *colo_thread(void *opaque)
 out:
     migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
 
+    qsb_free(colo_buffer);
+    colo_buffer = NULL;
+
     if (colo_control) {
         qemu_fclose(colo_control);
     }
diff --git a/savevm.c b/savevm.c
index 3b0e222..cd7ec27 100644
--- a/savevm.c
+++ b/savevm.c
@@ -42,7 +42,7 @@
 #include "qemu/iov.h"
 #include "block/snapshot.h"
 #include "block/qapi.h"
-
+#include "migration/migration-colo.h"
 
 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 10/29] COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (9 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

The ram cache is initially the same as SVM/PVM's memory.

At each checkpoint, we cache the PVM's dirty RAM into the RAM cache on the
slave side (so that the RAM cache is always the same as the PVM's memory at
every checkpoint), and we flush the cached RAM into the SVM only after we
have received all of the PVM's vmstate (RAM/device).

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 arch_init.c                        | 87 +++++++++++++++++++++++++++++++++++++-
 include/exec/cpu-all.h             |  1 +
 include/migration/migration-colo.h |  3 ++
 migration/colo.c                   | 35 ++++++++++++---
 4 files changed, 118 insertions(+), 8 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 6fbc82d..9adf007 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -316,6 +316,7 @@ static RAMBlock *last_sent_block;
 static ram_addr_t last_offset;
 static unsigned long *migration_bitmap;
 static uint64_t migration_dirty_pages;
+static bool ram_cache_enable;
 static uint32_t last_version;
 static bool ram_bulk_stage;
 
@@ -1447,6 +1448,8 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
     return 0;
 }
 
+static void *memory_region_get_ram_cache_ptr(MemoryRegion *mr, RAMBlock *block);
+
 /* Must be called from within a rcu critical section.
  * Returns a pointer from within the RCU-protected ram_list.
  */
@@ -1464,7 +1467,17 @@ static inline void *host_from_stream_offset(QEMUFile *f,
             return NULL;
         }
 
-        return memory_region_get_ram_ptr(block->mr) + offset;
+        if (ram_cache_enable) {
+            /*
+             * During a colo checkpoint, we need a bitmap of these migrated
+             * pages. It helps us decide which pages in the ram cache should
+             * be flushed into the VM's RAM later.
+             */
+            migration_bitmap_set_dirty(block->mr->ram_addr + offset);
+            return memory_region_get_ram_cache_ptr(block->mr, block) + offset;
+        } else {
+            return memory_region_get_ram_ptr(block->mr) + offset;
+        }
     }
 
     len = qemu_get_byte(f);
@@ -1474,7 +1487,13 @@ static inline void *host_from_stream_offset(QEMUFile *f,
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (!strncmp(id, block->idstr, sizeof(id)) &&
             block->max_length > offset) {
-            return memory_region_get_ram_ptr(block->mr) + offset;
+            if (ram_cache_enable) {
+                migration_bitmap_set_dirty(block->mr->ram_addr + offset);
+                return memory_region_get_ram_cache_ptr(block->mr, block)
+                       + offset;
+            } else {
+                return memory_region_get_ram_ptr(block->mr) + offset;
+            }
         }
     }
 
@@ -1724,6 +1743,70 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
+/*
+ * colo cache: this is for the secondary VM. We cache the whole memory of
+ * the secondary VM; this function is called after the first migration.
+ */
+int create_and_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        block->host_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+        if (!block->host_cache) {
+            goto out_locked;
+        }
+        memcpy(block->host_cache, block->host, block->used_length);
+    }
+    rcu_read_unlock();
+    ram_cache_enable = true;
+    return 0;
+
+out_locked:
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->host_cache) {
+            qemu_anon_ram_free(block->host_cache, block->used_length);
+            block->host_cache = NULL;
+        }
+    }
+
+    rcu_read_unlock();
+    return -1;
+}
+
+void release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    ram_cache_enable = false;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->host_cache) {
+            qemu_anon_ram_free(block->host_cache, block->used_length);
+            block->host_cache = NULL;
+        }
+    }
+    rcu_read_unlock();
+}
+
+static void *memory_region_get_ram_cache_ptr(MemoryRegion *mr, RAMBlock *block)
+{
+    if (mr->alias) {
+        return memory_region_get_ram_cache_ptr(mr->alias, block) +
+               mr->alias_offset;
+    }
+
+    assert(mr->terminates);
+
+    ram_addr_t addr = mr->ram_addr & TARGET_PAGE_MASK;
+
+    assert(addr - block->offset < block->used_length);
+
+    return block->host_cache + (addr - block->offset);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index ac06c67..bcfa3bc 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -272,6 +272,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *host_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index b2798f7..2110182 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -35,4 +35,7 @@ bool loadvm_enable_colo(void);
 void loadvm_exit_colo(void);
 void *colo_process_incoming_checkpoints(void *opaque);
 bool loadvm_in_colo_state(void);
+/* ram cache */
+int create_and_init_ram_cache(void);
+void release_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 8ff03e7..39cd698 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -142,7 +142,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     qemu_mutex_lock_iothread();
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
-    DPRINTF("vm is stoped\n");
+    trace_colo_vm_state_change("run", "stop");
 
     /* Disable block migration */
     s->params.blk = 0;
@@ -188,7 +188,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
-    DPRINTF("vm resume to run again\n");
+    trace_colo_vm_state_change("stop", "run");
 
 out:
     if (trans) {
@@ -319,11 +319,22 @@ void *colo_process_incoming_checkpoints(void *opaque)
         error_report("Can't open incoming channel!");
         goto out;
     }
+
+    if (create_and_init_ram_cache() < 0) {
+        error_report("Failed to initialize ram cache");
+        goto out;
+    }
+
     ret = colo_ctl_put(ctl, COLO_CHECPOINT_READY);
     if (ret < 0) {
         goto out;
     }
-    /* TODO: in COLO mode, slave is runing, so start the vm */
+    qemu_mutex_lock_iothread();
+    /* in COLO mode, slave is runing, so start the vm */
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("stop", "run");
+
     while (true) {
         int request = 0;
         int ret = colo_wait_handle_cmd(f, &request);
@@ -336,7 +347,12 @@ void *colo_process_incoming_checkpoints(void *opaque)
             }
         }
 
-        /* TODO: suspend guest */
+        /* suspend guest */
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        qemu_mutex_unlock_iothread();
+        trace_colo_vm_state_change("run", "stop");
+
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_SUSPENDED);
         if (ret < 0) {
             goto out;
@@ -348,7 +364,7 @@ void *colo_process_incoming_checkpoints(void *opaque)
         }
         trace_colo_receive_message("COLO_CHECKPOINT_SEND");
 
-        /* TODO: read migration data into colo buffer */
+        /*TODO Load VM state */
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
         if (ret < 0) {
@@ -356,16 +372,23 @@ void *colo_process_incoming_checkpoints(void *opaque)
         }
         trace_colo_receive_message("COLO_CHECKPOINT_RECEIVED");
 
-        /* TODO: load vm state */
+        /* TODO: flush vm state */
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
         if (ret < 0) {
             goto out;
         }
+
+        /* resume guest */
+        qemu_mutex_lock_iothread();
+        vm_start();
+        qemu_mutex_unlock_iothread();
+        trace_colo_vm_state_change("stop", "start");
 }
 
 out:
     colo = NULL;
+    release_ram_cache();
     if (ctl) {
         qemu_fclose(ctl);
     }
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 11/29] COLO VMstate: Load VM state into qsb before restore it
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (10 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  2015-06-05 18:02   ` Dr. David Alan Gilbert
  -1 siblings, 1 reply; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Yang Hongyang,
	david

We should cache the device state before restoring it. Besides, we should
call qemu_system_reset() before loading the VM state, which ensures the
loaded data is intact.

Note: If we skip qemu_system_reset(), odd errors can occur. For example,
qemu on the slave side crashes and reports:

KVM: entry failed, hardware error 0x7
EAX=00000000 EBX=0000e000 ECX=00009578 EDX=0000434f
ESI=0000fc10 EDI=0000434f EBP=00000000 ESP=00001fca
EIP=00009594 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0040 00000400 0000ffff 00009300
CS =f000 000f0000 0000ffff 00009b00
SS =434f 000434f0 0000ffff 00009300
DS =434f 000434f0 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     0002dcc8 00000047
IDT=     00000000 0000ffff
CR0=00000010 CR2=ffffffff CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=c0 74 0f 66 b9 78 95 00 00 66 31 d2 66 31 c0 e9 47 e0 fb 90 <f3> 90 fa fc 66 c3 66 53 66 89 c3 66 e8 9d e8 ff ff 66 01 c3 66 89 d8 66 e8 40 e9 ff ff 66
ERROR: invalid runstate transition: 'internal-error' -> 'colo'

The reason is that some parts of the device state are skipped when saving
the device state for the slave, if the corresponding data still holds its
initial value (such as 0). But the device state on the slave side may
already hold a non-initial value, so after a checkpoint loop the device
state values become inconsistent. This happens when the PVM reboots, or
when the SVM runs ahead of the PVM during the startup process.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 migration/colo.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 39cd698..0f61786 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -309,8 +309,10 @@ void *colo_process_incoming_checkpoints(void *opaque)
     struct colo_incoming *colo_in = opaque;
     QEMUFile *f = colo_in->file;
     int fd = qemu_get_fd(f);
-    QEMUFile *ctl = NULL;
+    QEMUFile *ctl = NULL, *fb = NULL;
     int ret;
+    uint64_t total_size;
+
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
@@ -325,10 +327,17 @@ void *colo_process_incoming_checkpoints(void *opaque)
         goto out;
     }
 
+    colo_buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
+    if (colo_buffer == NULL) {
+        error_report("Failed to allocate colo buffer!");
+        goto out;
+    }
+
     ret = colo_ctl_put(ctl, COLO_CHECPOINT_READY);
     if (ret < 0) {
         goto out;
     }
+
     qemu_mutex_lock_iothread();
     /* in COLO mode, slave is runing, so start the vm */
     vm_start();
@@ -364,7 +373,18 @@ void *colo_process_incoming_checkpoints(void *opaque)
         }
         trace_colo_receive_message("COLO_CHECKPOINT_SEND");
 
-        /*TODO Load VM state */
+        /* read the VM state total size first */
+        ret = colo_ctl_get_value(f, &total_size);
+        if (ret < 0) {
+            goto out;
+        }
+
+        /* read vm device state into colo buffer */
+        ret = qsb_fill_buffer(colo_buffer, f, total_size);
+        if (ret != total_size) {
+            error_report("can't get all migration data");
+            goto out;
+        }
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
         if (ret < 0) {
@@ -372,6 +392,22 @@ void *colo_process_incoming_checkpoints(void *opaque)
         }
         trace_colo_receive_message("COLO_CHECKPOINT_RECEIVED");
 
+        /* open colo buffer for read */
+        fb = qemu_bufopen("r", colo_buffer);
+        if (!fb) {
+            error_report("can't open colo buffer for read");
+            goto out;
+        }
+
+        qemu_mutex_lock_iothread();
+        qemu_system_reset(VMRESET_SILENT);
+        if (qemu_loadvm_state(fb) < 0) {
+            error_report("COLO: loadvm failed");
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+        qemu_mutex_unlock_iothread();
+
         /* TODO: flush vm state */
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
@@ -384,14 +420,25 @@ void *colo_process_incoming_checkpoints(void *opaque)
         vm_start();
         qemu_mutex_unlock_iothread();
         trace_colo_vm_state_change("stop", "start");
-}
+
+        qemu_fclose(fb);
+        fb = NULL;
+    }
 
 out:
     colo = NULL;
+
+    if (fb) {
+        qemu_fclose(fb);
+    }
+
     release_ram_cache();
     if (ctl) {
         qemu_fclose(ctl);
     }
+
+    qsb_free(colo_buffer);
+
     loadvm_exit_colo();
 
     return NULL;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 12/29] arch_init: Start to trace dirty pages of SVM
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (11 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

We will use this dirty bitmap together with the VM's RAM-cache dirty bitmap
to decide which pages in the cache should be flushed into the VM's RAM.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 arch_init.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch_init.c b/arch_init.c
index 9adf007..693d00e 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -1750,6 +1750,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 int create_and_init_ram_cache(void)
 {
     RAMBlock *block;
+    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
 
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
@@ -1761,6 +1762,15 @@ int create_and_init_ram_cache(void)
     }
     rcu_read_unlock();
     ram_cache_enable = true;
+    /*
+     * Start dirty logging for the slave VM. We will use this dirty bitmap
+     * together with the VM's RAM-cache dirty bitmap to decide which pages
+     * in the cache should be flushed into the VM's RAM.
+     */
+    migration_bitmap = bitmap_new(ram_cache_pages);
+    migration_dirty_pages = 0;
+    memory_global_dirty_log_start();
+
     return 0;
 
 out_locked:
@@ -1781,6 +1791,12 @@ void release_ram_cache(void)
 
     ram_cache_enable = false;
 
+    if (migration_bitmap) {
+        memory_global_dirty_log_stop();
+        g_free(migration_bitmap);
+        migration_bitmap = NULL;
+    }
+
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (block->host_cache) {
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 13/29] COLO RAM: Flush cached RAM into SVM's memory
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (12 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

While the VMs are running, the PVM/SVM may dirty some pages. We transfer the
PVM's dirty pages to the SVM and store them into the SVM's RAM cache at the
next checkpoint, so the content of the SVM's RAM cache is always the same as
the PVM's memory after a checkpoint.

Instead of flushing the whole content of the SVM's RAM cache into the SVM's
memory, we do this in a more efficient way:
only flush the pages that were dirtied by the PVM or the SVM since the last
checkpoint. In this way, we ensure the SVM's memory is the same as the PVM's.

Besides, we must flush the RAM cache before loading the device state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 arch_init.c                        | 92 ++++++++++++++++++++++++++++++++++++++
 include/migration/migration-colo.h |  1 +
 migration/colo.c                   |  2 -
 3 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 693d00e..2958ab5 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -1610,6 +1610,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     int flags = 0, ret = 0;
     static uint64_t seq_iter;
     int len = 0;
+    bool need_flush = false;
 
     seq_iter++;
 
@@ -1677,6 +1678,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+
+            need_flush = true;
             ch = qemu_get_byte(f);
             ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
             break;
@@ -1687,6 +1690,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+
+            need_flush = true;
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
             break;
         case RAM_SAVE_FLAG_COMPRESS_PAGE:
@@ -1719,6 +1724,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+            need_flush = true;
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
@@ -1738,6 +1744,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     }
 
     rcu_read_unlock();
+
+    if (!ret  && ram_cache_enable && need_flush) {
+        DPRINTF("Flush ram_cache\n");
+        colo_flush_ram_cache();
+    }
     DPRINTF("Completed load of VM with exit code %d seq iteration "
             "%" PRIu64 "\n", ret, seq_iter);
     return ret;
@@ -1823,6 +1834,87 @@ static void *memory_region_get_ram_cache_ptr(MemoryRegion *mr, RAMBlock *block)
     return block->host_cache + (addr - block->offset);
 }
 
+/* fix me: should this helper function be merged with
+ * migration_bitmap_find_and_reset_dirty ?
+ */
+static inline
+ram_addr_t host_bitmap_find_and_reset_dirty(MemoryRegion *mr,
+                                            ram_addr_t start)
+{
+    unsigned long base = mr->ram_addr >> TARGET_PAGE_BITS;
+    unsigned long nr = base + (start >> TARGET_PAGE_BITS);
+    uint64_t mr_size = TARGET_PAGE_ALIGN(memory_region_size(mr));
+    unsigned long size = base + (mr_size >> TARGET_PAGE_BITS);
+
+    unsigned long next;
+
+    next = find_next_bit(ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION],
+                         size, nr);
+    if (next < size) {
+        clear_bit(next, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+    }
+    return (next - base) << TARGET_PAGE_BITS;
+}
+
+/*
+ * Flush the content of the RAM cache into the SVM's memory.
+ * Only flush the pages that were dirtied by the PVM, the SVM, or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    ram_addr_t ca  = 0, ha = 0;
+    bool got_ca = 0, got_ha = 0;
+    int64_t host_dirty = 0, both_dirty = 0;
+
+    address_space_sync_dirty_bitmap(&address_space_memory);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+    while (true) {
+        if (ca < block->used_length && ca <= ha) {
+            ca = migration_bitmap_find_and_reset_dirty(block->mr, ca);
+            if (ca < block->used_length) {
+                got_ca = 1;
+            }
+        }
+        if (ha < block->used_length && ha <= ca) {
+            ha = host_bitmap_find_and_reset_dirty(block->mr, ha);
+            if (ha < block->used_length && ha != ca) {
+                got_ha = 1;
+            }
+            host_dirty += (ha < block->used_length ? 1 : 0);
+            both_dirty += (ha < block->used_length && ha == ca ? 1 : 0);
+        }
+        if (ca >= block->used_length && ha >= block->used_length) {
+            ca = 0;
+            ha = 0;
+            block = QLIST_NEXT_RCU(block, next);
+            if (!block) {
+                break;
+            }
+        } else {
+            if (got_ha) {
+                got_ha = 0;
+                dst_host = memory_region_get_ram_ptr(block->mr) + ha;
+                src_host = memory_region_get_ram_cache_ptr(block->mr, block)
+                           + ha;
+                memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+            }
+            if (got_ca) {
+                got_ca = 0;
+                dst_host = memory_region_get_ram_ptr(block->mr) + ca;
+                src_host = memory_region_get_ram_cache_ptr(block->mr, block)
+                           + ca;
+                memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+            }
+        }
+    }
+    rcu_read_unlock();
+    assert(migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index 2110182..c03c391 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -37,5 +37,6 @@ void *colo_process_incoming_checkpoints(void *opaque);
 bool loadvm_in_colo_state(void);
 /* ram cache */
 int create_and_init_ram_cache(void);
+void colo_flush_ram_cache(void);
 void release_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 0f61786..74590eb 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -408,8 +408,6 @@ void *colo_process_incoming_checkpoints(void *opaque)
         }
         qemu_mutex_unlock_iothread();
 
-        /* TODO: flush vm state */
-
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
         if (ret < 0) {
             goto out;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 14/29] COLO failover: Introduce a new command to trigger a failover
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (13 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	Yang Hongyang, david

We leave users free to use whatever heartbeat solution they want. If the
heartbeat is lost, or they detect other errors, they can use the command
'colo_lost_heartbeat' to tell COLO to fail over, and COLO will act
accordingly.

For example,
if the command is sent to the PVM, the Primary will exit COLO mode and take
over; if it is sent to the Secondary, the Secondary will do the failover
work and finally take over as the server.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 hmp-commands.hx                        | 15 ++++++++++++++
 hmp.c                                  |  7 +++++++
 hmp.h                                  |  1 +
 include/migration/migration-colo.h     |  1 +
 include/migration/migration-failover.h | 20 ++++++++++++++++++
 migration/Makefile.objs                |  2 +-
 migration/colo-failover.c              | 38 ++++++++++++++++++++++++++++++++++
 migration/colo.c                       |  1 +
 qapi-schema.json                       |  9 ++++++++
 qmp-commands.hx                        | 19 +++++++++++++++++
 stubs/migration-colo.c                 |  8 +++++++
 11 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 include/migration/migration-failover.h
 create mode 100644 migration/colo-failover.c

diff --git a/hmp-commands.hx b/hmp-commands.hx
index e864a6c..be3e398 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1008,6 +1008,21 @@ Set the parameter @var{parameter} for migration.
 ETEXI
 
     {
+        .name       = "colo_lost_heartbeat",
+        .args_type  = "",
+        .params     = "",
+        .help       = "Tell COLO that heartbeat is lost,\n\t\t\t"
+                      "a failover or takeover is needed.",
+        .mhandler.cmd = hmp_colo_lost_heartbeat,
+    },
+
+STEXI
+@item colo_lost_heartbeat
+@findex colo_lost_heartbeat
+Tell COLO that heartbeat is lost, a failover or takeover is needed.
+ETEXI
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/hmp.c b/hmp.c
index e17852d..f87fa37 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1250,6 +1250,13 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
     }
 }
 
+void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict)
+{
+    Error *err = NULL;
+    qmp_colo_lost_heartbeat(&err);
+    hmp_handle_error(mon, &err);
+}
+
 void hmp_set_password(Monitor *mon, const QDict *qdict)
 {
     const char *protocol  = qdict_get_str(qdict, "protocol");
diff --git a/hmp.h b/hmp.h
index a158e3f..b6549f8 100644
--- a/hmp.h
+++ b/hmp.h
@@ -67,6 +67,7 @@ void hmp_migrate_set_speed(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_capability(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_cache_size(Monitor *mon, const QDict *qdict);
+void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
 void hmp_set_password(Monitor *mon, const QDict *qdict);
 void hmp_expire_password(Monitor *mon, const QDict *qdict);
 void hmp_eject(Monitor *mon, const QDict *qdict);
diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index c03c391..d6eac07 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -17,6 +17,7 @@
 #include "migration/migration.h"
 #include "block/coroutine.h"
 #include "qemu/thread.h"
+#include "qemu/main-loop.h"
 
 bool colo_supported(void);
 void colo_info_mig_init(void);
diff --git a/include/migration/migration-failover.h b/include/migration/migration-failover.h
new file mode 100644
index 0000000..a8767fc
--- /dev/null
+++ b/include/migration/migration-failover.h
@@ -0,0 +1,20 @@
+/*
+ *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ *  (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef MIGRATION_FAILOVER_H
+#define MIGRATION_FAILOVER_H
+
+#include "qemu-common.h"
+
+void failover_request_set(void);
+
+#endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index cb7bd30..50d8392 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y += migration.o tcp.o
-common-obj-$(CONFIG_COLO) += colo.o
 common-obj-y += colo-comm.o
+common-obj-$(CONFIG_COLO) += colo.o colo-failover.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
 common-obj-y += xbzrle.o
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
new file mode 100644
index 0000000..2bd2e16
--- /dev/null
+++ b/migration/colo-failover.c
@@ -0,0 +1,38 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "migration/migration-colo.h"
+#include "migration/migration-failover.h"
+#include "qmp-commands.h"
+
+static bool failover_request;
+
+static QEMUBH *failover_bh;
+
+static void colo_failover_bh(void *opaque)
+{
+    qemu_bh_delete(failover_bh);
+    failover_bh = NULL;
+    /*TODO: Do failover work */
+}
+
+void failover_request_set(void)
+{
+    failover_request = true;
+    failover_bh = qemu_bh_new(colo_failover_bh, NULL);
+    qemu_bh_schedule(failover_bh);
+}
+
+void qmp_colo_lost_heartbeat(Error **errp)
+{
+    failover_request_set();
+}
diff --git a/migration/colo.c b/migration/colo.c
index 74590eb..ad44569 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -14,6 +14,7 @@
 #include "migration/migration-colo.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "migration/migration-failover.h"
 
 enum {
     COLO_CHECPOINT_READY = 0x46,
diff --git a/qapi-schema.json b/qapi-schema.json
index 54eb707..7562111 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -644,6 +644,15 @@
   'returns': 'MigrationParameters' }
 
 ##
+# @colo-lost-heartbeat
+#
+# Tell COLO that heartbeat is lost
+#
+# Since: 2.4
+##
+{ 'command': 'colo-lost-heartbeat' }
+
+##
 # @MouseInfo:
 #
 # Information about a mouse device.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 14e109e..3813f66 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -781,6 +781,25 @@ Example:
 EQMP
 
     {
+        .name       = "colo-lost-heartbeat",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_colo_lost_heartbeat,
+    },
+
+SQMP
+colo-lost-heartbeat
+--------------------
+
+Tell COLO that the heartbeat is lost; a failover or takeover is needed.
+
+Example:
+
+-> { "execute": "colo-lost-heartbeat" }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 45b992a..7d1fd9f 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -11,6 +11,7 @@
  */
 
 #include "migration/migration-colo.h"
+#include "qmp-commands.h"
 
 bool colo_supported(void)
 {
@@ -30,3 +31,10 @@ void *colo_process_incoming_checkpoints(void *opaque)
 {
     return NULL;
 }
+
+void qmp_colo_lost_heartbeat(Error **errp)
+{
+    error_setg(errp, "COLO is not supported, please rerun configure"
+                     " with --enable-colo option in order to support"
+                     " COLO feature");
+}
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 15/29] COLO failover: Implement COLO master/slave failover work
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (14 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	david

If failover is requested, then after some cleanup work the
PVM or SVM will exit COLO mode and resume normal operation.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/migration/migration-colo.h     |  14 ++++
 include/migration/migration-failover.h |   2 +
 migration/colo-comm.c                  |  11 +++
 migration/colo-failover.c              |  12 +++-
 migration/colo.c                       | 126 ++++++++++++++++++++++++++++++++-
 stubs/migration-colo.c                 |   5 ++
 trace-events                           |   1 +
 7 files changed, 169 insertions(+), 2 deletions(-)

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index d6eac07..63f8b45 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -22,6 +22,13 @@
 bool colo_supported(void);
 void colo_info_mig_init(void);
 
+/* Checkpoint control, called in migration/checkpoint thread */
+enum {
+    COLO_UNPROTECTED_MODE = 0,
+    COLO_PRIMARY_MODE,
+    COLO_SECONDARY_MODE,
+};
+
 struct colo_incoming {
     QEMUFile *file;
     QemuThread thread;
@@ -36,8 +43,15 @@ bool loadvm_enable_colo(void);
 void loadvm_exit_colo(void);
 void *colo_process_incoming_checkpoints(void *opaque);
 bool loadvm_in_colo_state(void);
+
+int get_colo_mode(void);
+
 /* ram cache */
 int create_and_init_ram_cache(void);
 void colo_flush_ram_cache(void);
 void release_ram_cache(void);
+
+/* failover */
+void colo_do_failover(MigrationState *s);
+
 #endif
diff --git a/include/migration/migration-failover.h b/include/migration/migration-failover.h
index a8767fc..5e59b1d 100644
--- a/include/migration/migration-failover.h
+++ b/include/migration/migration-failover.h
@@ -16,5 +16,7 @@
 #include "qemu-common.h"
 
 void failover_request_set(void);
+void failover_request_clear(void);
+bool failover_request_is_set(void);
 
 #endif
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
index f8be027..16bd184 100644
--- a/migration/colo-comm.c
+++ b/migration/colo-comm.c
@@ -16,6 +16,17 @@
 
 static bool colo_requested;
 
+int get_colo_mode(void)
+{
+    if (migrate_in_colo_state()) {
+        return COLO_PRIMARY_MODE;
+    } else if (loadvm_in_colo_state()) {
+        return COLO_SECONDARY_MODE;
+    } else {
+        return COLO_UNPROTECTED_MODE;
+    }
+}
+
 /* save */
 static void colo_info_save(QEMUFile *f, void *opaque)
 {
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index 2bd2e16..97f5d24 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -22,7 +22,7 @@ static void colo_failover_bh(void *opaque)
 {
     qemu_bh_delete(failover_bh);
     failover_bh = NULL;
-    /*TODO: Do failover work */
+    colo_do_failover(NULL);
 }
 
 void failover_request_set(void)
@@ -32,6 +32,16 @@ void failover_request_set(void)
     qemu_bh_schedule(failover_bh);
 }
 
+void failover_request_clear(void)
+{
+    failover_request = false;
+}
+
+bool failover_request_is_set(void)
+{
+    return failover_request;
+}
+
 void qmp_colo_lost_heartbeat(Error **errp)
 {
     failover_request_set();
diff --git a/migration/colo.c b/migration/colo.c
index ad44569..1c8cdfe 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -68,6 +68,67 @@ bool migrate_in_colo_state(void)
     return (s->state == MIGRATION_STATUS_COLO);
 }
 
+static bool colo_runstate_is_stopped(void)
+{
+    return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
+}
+
+/*
+ * There are two ways to enter this function:
+ * 1. From the COLO checkpoint incoming thread; in this case
+ *    it must be protected by the iothread lock.
+ * 2. From a user command; HMP/QMP commands run in the
+ *    main loop, where taking the iothread lock would
+ *    cause a deadlock.
+ */
+static void slave_do_failover(void)
+{
+    colo = NULL;
+
+    if (!autostart) {
+        error_report("The \"-S\" qemu option will be ignored on the COLO slave side");
+        /* recover runstate to normal migration finish state */
+        autostart = true;
+    }
+
+    /* On slave side, jump to incoming co */
+    if (migration_incoming_co) {
+        qemu_coroutine_enter(migration_incoming_co, NULL);
+    }
+}
+
+static void master_do_failover(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    if (!colo_runstate_is_stopped()) {
+        vm_stop_force_state(RUN_STATE_COLO);
+    }
+
+    if (s->state != MIGRATION_STATUS_FAILED) {
+        migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
+    }
+
+    vm_start();
+}
+
+static bool failover_completed;
+void colo_do_failover(MigrationState *s)
+{
+    /* Make sure the VM is stopped during failover */
+    if (!colo_runstate_is_stopped()) {
+        vm_stop_force_state(RUN_STATE_COLO);
+    }
+
+    trace_colo_do_failover();
+    if (get_colo_mode() == COLO_SECONDARY_MODE) {
+        slave_do_failover();
+    } else {
+        master_do_failover();
+    }
+    failover_completed = true;
+}
+
 /* colo checkpoint control helper */
 static int colo_ctl_put(QEMUFile *f, uint64_t request)
 {
@@ -139,11 +200,23 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
         goto out;
     }
 
+    if (failover_request_is_set()) {
+        ret = -1;
+        goto out;
+    }
     /* suspend and save vm state to colo buffer */
     qemu_mutex_lock_iothread();
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("run", "stop");
+    /*
+     * failover request bh could be called after
+     * vm_stop_force_state so we check failover_request_is_set() again.
+     */
+    if (failover_request_is_set()) {
+        ret = -1;
+        goto out;
+    }
 
     /* Disable block migration */
     s->params.blk = 0;
@@ -233,6 +306,11 @@ static void *colo_thread(void *opaque)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        if (failover_request_is_set()) {
+            error_report("failover request");
+            goto out;
+        }
+
         /* start a colo checkpoint */
         if (colo_do_checkpoint_transaction(s, colo_control)) {
             goto out;
@@ -240,7 +318,18 @@ static void *colo_thread(void *opaque)
     }
 
 out:
-    migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
+    error_report("colo: some error happened in colo_thread");
+    qemu_mutex_lock_iothread();
+    if (!failover_request_is_set()) {
+        error_report("master takeover from checkpoint channel");
+        failover_request_set();
+    }
+    qemu_mutex_unlock_iothread();
+
+    while (!failover_completed) {
+        ;
+    }
+    failover_request_clear();
 
     qsb_free(colo_buffer);
     colo_buffer = NULL;
@@ -281,6 +370,11 @@ void colo_init_checkpointer(MigrationState *s)
     qemu_bh_schedule(colo_bh);
 }
 
+bool loadvm_in_colo_state(void)
+{
+    return colo != NULL;
+}
+
 /*
  * return:
  * 0: start a checkpoint
@@ -356,6 +450,10 @@ void *colo_process_incoming_checkpoints(void *opaque)
                 continue;
             }
         }
+        if (failover_request_is_set()) {
+            error_report("failover request");
+            goto out;
+        }
 
         /* suspend guest */
         qemu_mutex_lock_iothread();
@@ -425,6 +523,32 @@ void *colo_process_incoming_checkpoints(void *opaque)
     }
 
 out:
+    error_report("Detect some error or get a failover request");
+    /* determine whether we need to failover */
+    if (!failover_request_is_set()) {
+        /*
+         * TODO: Here we should perhaps raise a QMP event, to help
+         * the user know what happened and decide whether to do
+         * failover.
+         */
+        usleep(2000 * 1000);
+    }
+    /* check the flag again */
+    if (!failover_request_is_set()) {
+        /*
+         * We assume the master is still alive according to the
+         * heartbeat; just kill the slave.
+         */
+        error_report("SVM is going to exit!");
+        exit(1);
+    } else {
+        /* If we reach here, the master may be dead; do failover */
+        while (!failover_completed) {
+            ;
+        }
+        failover_request_clear();
+    }
+
     colo = NULL;
 
     if (fb) {
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 7d1fd9f..9ec0c07 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -32,6 +32,11 @@ void *colo_process_incoming_checkpoints(void *opaque)
     return NULL;
 }
 
+bool loadvm_in_colo_state(void)
+{
+    return false;
+}
+
 void qmp_colo_lost_heartbeat(Error **errp)
 {
     error_setg(errp, "COLO is not supported, please rerun configure"
diff --git a/trace-events b/trace-events
index 2b95743..1ce7bba 100644
--- a/trace-events
+++ b/trace-events
@@ -1449,6 +1449,7 @@ colo_info_load(const char *msg) "%s"
 # migration/colo.c
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_receive_message(const char *msg) "Receive '%s'"
+colo_do_failover(void) ""
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 16/29] COLO failover: Don't do failover during loading VM's state
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (15 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	david

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 migration/colo.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 1c8cdfe..fc30ca5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -52,6 +52,7 @@ enum {
 };
 
 static QEMUBH *colo_bh;
+static bool vmstate_loading;
 static Coroutine *colo;
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
@@ -83,6 +84,11 @@ static bool colo_runstate_is_stopped(void)
  */
 static void slave_do_failover(void)
 {
+    /* Wait for the incoming thread to finish loading the vmstate */
+    while (vmstate_loading) {
+        ;
+    }
+
     colo = NULL;
 
     if (!autostart) {
@@ -500,11 +506,15 @@ void *colo_process_incoming_checkpoints(void *opaque)
 
         qemu_mutex_lock_iothread();
         qemu_system_reset(VMRESET_SILENT);
+        vmstate_loading = true;
         if (qemu_loadvm_state(fb) < 0) {
             error_report("COLO: loadvm failed");
+            vmstate_loading = false;
             qemu_mutex_unlock_iothread();
             goto out;
         }
+
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 17/29] COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (16 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

The 'colo_nicname' parameter should be set to a host network
interface name, for example 'eth2'; it is passed as a parameter
to 'colo_script'. The 'colo_script' parameter should be set to
a script path.

We parse these parameters in the tap code.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/net/net.h |  3 +++
 net/tap.c         | 27 ++++++++++++++++++++++++---
 qapi-schema.json  |  8 +++++++-
 qemu-options.hx   | 10 +++++++++-
 4 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/include/net/net.h b/include/net/net.h
index e66ca03..98877b5 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -84,6 +84,9 @@ struct NetClientState {
     char *model;
     char *name;
     char info_str[256];
+    char colo_script[1024];
+    char colo_nicname[128];
+    char ifname[128];
     unsigned receive_disabled : 1;
     NetClientDestructor *destructor;
     unsigned int queue_index;
diff --git a/net/tap.c b/net/tap.c
index 968df46..823f78e 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -608,6 +608,7 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
     Error *err = NULL;
     TAPState *s;
     int vhostfd;
+    NetClientState *nc = NULL;
 
     s = net_tap_fd_init(peer, model, name, fd, vnet_hdr);
     if (!s) {
@@ -635,6 +636,17 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
         }
     }
 
+    nc = &(s->nc);
+    snprintf(nc->ifname, sizeof(nc->ifname), "%s", ifname);
+    if (tap->has_colo_script) {
+        snprintf(nc->colo_script, sizeof(nc->colo_script), "%s",
+                 tap->colo_script);
+    }
+    if (tap->has_colo_nicname) {
+        snprintf(nc->colo_nicname, sizeof(nc->colo_nicname), "%s",
+                 tap->colo_nicname);
+    }
+
     if (tap->has_vhost ? tap->vhost :
         vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
         VhostNetOptions options;
@@ -754,9 +766,10 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
 
         if (tap->has_ifname || tap->has_script || tap->has_downscript ||
             tap->has_vnet_hdr || tap->has_helper || tap->has_queues ||
-            tap->has_vhostfd) {
+            tap->has_vhostfd || tap->has_colo_script || tap->has_colo_nicname) {
             error_report("ifname=, script=, downscript=, vnet_hdr=, "
                          "helper=, queues=, and vhostfd= "
+                         "colo_script=, and colo_nicname= "
                          "are invalid with fds=");
             return -1;
         }
@@ -796,9 +809,11 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
         }
     } else if (tap->has_helper) {
         if (tap->has_ifname || tap->has_script || tap->has_downscript ||
-            tap->has_vnet_hdr || tap->has_queues || tap->has_vhostfds) {
+            tap->has_vnet_hdr || tap->has_queues || tap->has_vhostfds ||
+            tap->has_colo_script || tap->has_colo_nicname) {
             error_report("ifname=, script=, downscript=, and vnet_hdr= "
-                         "queues=, and vhostfds= are invalid with helper=");
+                         "queues=, vhostfds=, colo_script=, and "
+                         "colo_nicname= are invalid with helper=");
             return -1;
         }
 
@@ -817,6 +832,12 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
             return -1;
         }
     } else {
+        if (queues > 1 && (tap->has_colo_script || tap->has_colo_nicname)) {
+            error_report("queues > 1 is invalid if colo_script or "
+                         "colo_nicname is specified");
+            return -1;
+        }
+
         if (tap->has_vhostfds) {
             error_report("vhostfds= is invalid if fds= wasn't specified");
             return -1;
diff --git a/qapi-schema.json b/qapi-schema.json
index 7562111..dc0ee07 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2258,6 +2258,10 @@
 #
 # @queues: #optional number of queues to be created for multiqueue capable tap
 #
+# @colo_nicname: #optional the host physical nic for QEMU (Since 2.3)
+#
+# @colo_script: #optional the script file which used by COLO (Since 2.3)
+#
 # Since 1.2
 ##
 { 'struct': 'NetdevTapOptions',
@@ -2274,7 +2278,9 @@
     '*vhostfd':    'str',
     '*vhostfds':   'str',
     '*vhostforce': 'bool',
-    '*queues':     'uint32'} }
+    '*queues':     'uint32',
+    '*colo_nicname':  'str',
+    '*colo_script':   'str'} }
 
 ##
 # @NetdevSocketOptions
diff --git a/qemu-options.hx b/qemu-options.hx
index ec356f6..f64e05d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1466,7 +1466,11 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
     "-net tap[,vlan=n][,name=str],ifname=name\n"
     "                connect the host TAP network interface to VLAN 'n'\n"
 #else
-    "-net tap[,vlan=n][,name=str][,fd=h][,fds=x:y:...:z][,ifname=name][,script=file][,downscript=dfile][,helper=helper][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostfds=x:y:...:z][,vhostforce=on|off][,queues=n]\n"
+    "-net tap[,vlan=n][,name=str][,fd=h][,fds=x:y:...:z][,ifname=name][,script=file][,downscript=dfile][,helper=helper][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostfds=x:y:...:z][,vhostforce=on|off][,queues=n]"
+#ifdef CONFIG_COLO
+    "[,colo_nicname=nicname][,colo_script=scriptfile]"
+#endif
+    "\n"
     "                connect the host TAP network interface to VLAN 'n'\n"
     "                use network scripts 'file' (default=" DEFAULT_NETWORK_SCRIPT ")\n"
     "                to configure it and 'dfile' (default=" DEFAULT_NETWORK_DOWN_SCRIPT ")\n"
@@ -1486,6 +1490,10 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
     "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
     "                use 'vhostfds=x:y:...:z to connect to multiple already opened vhost net devices\n"
     "                use 'queues=n' to specify the number of queues to be created for multiqueue TAP\n"
+#ifdef CONFIG_COLO
+    "                use 'colo_nicname=nicname' to specify the host physical nic for QEMU\n"
+    "                use 'colo_script=scriptfile' to specify script file when colo is enabled\n"
+#endif
     "-net bridge[,vlan=n][,name=str][,br=bridge][,helper=helper]\n"
     "                connects a host TAP network interface to a host bridge device 'br'\n"
     "                (default=" DEFAULT_BRIDGE_INTERFACE ") using the program 'helper'\n"
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 18/29] COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (17 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

When in COLO mode, we do some initialization work for the NICs that will be used by COLO.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/net/colo-nic.h | 20 ++++++++++++++
 net/Makefile.objs      |  1 +
 net/colo-nic.c         | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++
 net/tap.c              | 18 +++++++++----
 stubs/migration-colo.c |  9 +++++++
 5 files changed, 114 insertions(+), 5 deletions(-)
 create mode 100644 include/net/colo-nic.h
 create mode 100644 net/colo-nic.c

diff --git a/include/net/colo-nic.h b/include/net/colo-nic.h
new file mode 100644
index 0000000..d35ee17
--- /dev/null
+++ b/include/net/colo-nic.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef COLO_NIC_H
+#define COLO_NIC_H
+
+void colo_add_nic_devices(NetClientState *nc);
+void colo_remove_nic_devices(NetClientState *nc);
+
+#endif
diff --git a/net/Makefile.objs b/net/Makefile.objs
index ec19cb3..73f4a81 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -13,3 +13,4 @@ common-obj-$(CONFIG_HAIKU) += tap-haiku.o
 common-obj-$(CONFIG_SLIRP) += slirp.o
 common-obj-$(CONFIG_VDE) += vde.o
 common-obj-$(CONFIG_NETMAP) += netmap.o
+common-obj-$(CONFIG_COLO) += colo-nic.o
diff --git a/net/colo-nic.c b/net/colo-nic.c
new file mode 100644
index 0000000..d021526
--- /dev/null
+++ b/net/colo-nic.c
@@ -0,0 +1,71 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ *
+ */
+#include "include/migration/migration.h"
+#include "migration/migration-colo.h"
+#include "net/net.h"
+#include "net/colo-nic.h"
+#include "qemu/error-report.h"
+
+
+typedef struct nic_device {
+    NetClientState *nc;
+    bool (*support_colo)(NetClientState *nc);
+    int (*configure)(NetClientState *nc, bool up, int side, int index);
+    QTAILQ_ENTRY(nic_device) next;
+    bool is_up;
+} nic_device;
+
+
+
+QTAILQ_HEAD(, nic_device) nic_devices = QTAILQ_HEAD_INITIALIZER(nic_devices);
+
+/*
+ * colo proxy script usage:
+ * ./colo-proxy-script.sh master/slave install/uninstall phy_if virt_if index
+ */
+static bool colo_nic_support(NetClientState *nc)
+{
+    return nc && nc->colo_script[0] && nc->colo_nicname[0];
+}
+
+void colo_add_nic_devices(NetClientState *nc)
+{
+    struct nic_device *nic = g_malloc0(sizeof(*nic));
+
+    nic->support_colo = colo_nic_support;
+    nic->configure = NULL;
+    /*
+     * TODO
+     * Only "-netdev tap,colo_script=..." options are supported;
+     * "-net nic -net tap..." options are not supported.
+     */
+    nic->nc = nc;
+
+    QTAILQ_INSERT_TAIL(&nic_devices, nic, next);
+}
+
+void colo_remove_nic_devices(NetClientState *nc)
+{
+    struct nic_device *nic, *next_nic;
+
+    if (!nc) {
+        return;
+    }
+
+    QTAILQ_FOREACH_SAFE(nic, &nic_devices, next, next_nic) {
+        if (nic->nc == nc) {
+            QTAILQ_REMOVE(&nic_devices, nic, next);
+            g_free(nic);
+        }
+    }
+}
diff --git a/net/tap.c b/net/tap.c
index 823f78e..d64e046 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -41,6 +41,7 @@
 #include "qemu/error-report.h"
 
 #include "net/tap.h"
+#include "net/colo-nic.h"
 
 #include "net/vhost_net.h"
 
@@ -296,6 +297,8 @@ static void tap_cleanup(NetClientState *nc)
 
     qemu_purge_queued_packets(nc);
 
+    colo_remove_nic_devices(nc);
+
     if (s->down_script[0])
         launch_script(s->down_script, s->down_script_arg, s->fd);
 
@@ -603,7 +606,7 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                             const char *model, const char *name,
                             const char *ifname, const char *script,
                             const char *downscript, const char *vhostfdname,
-                            int vnet_hdr, int fd)
+                            int vnet_hdr, int fd, bool setup_colo)
 {
     Error *err = NULL;
     TAPState *s;
@@ -647,6 +650,10 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
                  tap->colo_nicname);
     }
 
+    if (setup_colo) {
+        colo_add_nic_devices(nc);
+    }
+
     if (tap->has_vhost ? tap->vhost :
         vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
         VhostNetOptions options;
@@ -756,7 +763,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
 
         if (net_init_tap_one(tap, peer, "tap", name, NULL,
                              script, downscript,
-                             vhostfdname, vnet_hdr, fd)) {
+                             vhostfdname, vnet_hdr, fd, true)) {
             return -1;
         }
     } else if (tap->has_fds) {
@@ -803,7 +810,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
             if (net_init_tap_one(tap, peer, "tap", name, ifname,
                                  script, downscript,
                                  tap->has_vhostfds ? vhost_fds[i] : NULL,
-                                 vnet_hdr, fd)) {
+                                 vnet_hdr, fd, false)) {
                 return -1;
             }
         }
@@ -827,7 +834,7 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
 
         if (net_init_tap_one(tap, peer, "bridge", name, ifname,
                              script, downscript, vhostfdname,
-                             vnet_hdr, fd)) {
+                             vnet_hdr, fd, false)) {
             close(fd);
             return -1;
         }
@@ -870,7 +877,8 @@ int net_init_tap(const NetClientOptions *opts, const char *name,
             if (net_init_tap_one(tap, peer, "tap", name, ifname,
                                  i >= 1 ? "no" : script,
                                  i >= 1 ? "no" : downscript,
-                                 vhostfdname, vnet_hdr, fd)) {
+                                 vhostfdname, vnet_hdr, fd,
+                                 i == 0)) {
                 close(fd);
                 return -1;
             }
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 9ec0c07..03a395b 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -12,6 +12,7 @@
 
 #include "migration/migration-colo.h"
 #include "qmp-commands.h"
+#include "net/colo-nic.h"
 
 bool colo_supported(void)
 {
@@ -37,6 +38,14 @@ bool loadvm_in_colo_state(void)
     return false;
 }
 
+void colo_add_nic_devices(NetClientState *nc)
+{
+}
+
+void colo_remove_nic_devices(NetClientState *nc)
+{
+}
+
 void qmp_colo_lost_heartbeat(Error **errp)
 {
     error_setg(errp, "COLO is not supported, please rerun configure"
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 19/29] COLO NIC: Implement colo nic device interface configure()
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (18 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

Implement the COLO NIC device interface configure() and
add a script to configure NIC devices:
${QEMU_SCRIPT_DIR}/colo-proxy-script.sh

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 net/colo-nic.c               | 56 +++++++++++++++++++++++++++-
 scripts/colo-proxy-script.sh | 88 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 143 insertions(+), 1 deletion(-)
 create mode 100755 scripts/colo-proxy-script.sh

diff --git a/net/colo-nic.c b/net/colo-nic.c
index d021526..8b678b1 100644
--- a/net/colo-nic.c
+++ b/net/colo-nic.c
@@ -38,12 +38,66 @@ static bool colo_nic_support(NetClientState *nc)
     return nc && nc->colo_script[0] && nc->colo_nicname[0];
 }
 
+static int launch_colo_script(char *argv[])
+{
+    int pid, status;
+    char *script = argv[0];
+
+    /* try to launch network script */
+    pid = fork();
+    if (pid == 0) {
+        execv(script, argv);
+        _exit(1);
+    } else if (pid > 0) {
+        while (waitpid(pid, &status, 0) != pid) {
+            /* loop */
+        }
+
+        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
+            return 0;
+        }
+    }
+    return -1;
+}
+
+static int colo_nic_configure(NetClientState *nc,
+            bool up, int side, int index)
+{
+    int i, argc = 6;
+    char *argv[7], index_str[32];
+    char **parg;
+
+    if (!nc || index <= 0) {
+        error_report("Can not parse colo_script or colo_nicname");
+        return -1;
+    }
+
+    parg = argv;
+    *parg++ = nc->colo_script;
+    *parg++ = (char *)(side == COLO_SECONDARY_MODE ? "slave" : "master");
+    *parg++ = (char *)(up ? "install" : "uninstall");
+    *parg++ = nc->colo_nicname;
+    *parg++ = nc->ifname;
+    sprintf(index_str, "%d", index);
+    *parg++ = index_str;
+    *parg = NULL;
+
+    for (i = 0; i < argc; i++) {
+        if (!argv[i][0]) {
+            error_report("Can not get colo_script argument");
+            return -1;
+        }
+    }
+
+    return launch_colo_script(argv);
+}
+
 void colo_add_nic_devices(NetClientState *nc)
 {
     struct nic_device *nic = g_malloc0(sizeof(*nic));
 
     nic->support_colo = colo_nic_support;
-    nic->configure = NULL;
+    nic->configure = colo_nic_configure;
     /*
      * TODO
      * only support "-netdev tap,colo_scripte..."  options
diff --git a/scripts/colo-proxy-script.sh b/scripts/colo-proxy-script.sh
new file mode 100755
index 0000000..ed0fcb8
--- /dev/null
+++ b/scripts/colo-proxy-script.sh
@@ -0,0 +1,88 @@
+#!/bin/sh
+# usage: colo-proxy-script.sh master/slave install/uninstall phy_if virt_if index
+# e.g. colo-proxy-script.sh master install eth2 tap0 1
+
+side=$1
+action=$2
+phy_if=$3
+virt_if=$4
+index=$5
+br=br1
+failover_br=br0
+
+script_usage()
+{
+    echo -n "usage: ./colo-proxy-script.sh master/slave "
+    echo -e "install/uninstall phy_if virt_if index\n"
+}
+
+master_install()
+{
+    tc qdisc add dev $virt_if root handle 1: prio
+    tc filter add dev $virt_if parent 1: protocol ip prio 10 u32 match u32 \
+        0 0 flowid 1:2 action mirred egress mirror dev $phy_if
+    tc filter add dev $virt_if parent 1: protocol arp prio 11 u32 match u32 \
+        0 0 flowid 1:2 action mirred egress mirror dev $phy_if
+    tc filter add dev $virt_if parent 1: protocol ipv6 prio 12 u32 match u32 \
+        0 0 flowid 1:2 action mirred egress mirror dev $phy_if
+
+    /usr/local/sbin/iptables -t mangle -I PREROUTING -m physdev --physdev-in \
+        $virt_if -j PMYCOLO --index $index --forward-dev $phy_if
+    /usr/local/sbin/ip6tables -t mangle -I PREROUTING -m physdev --physdev-in \
+        $virt_if -j PMYCOLO --index $index --forward-dev $phy_if
+    /usr/local/sbin/arptables -I INPUT -i $phy_if -j MARK --set-mark $index
+}
+
+master_uninstall()
+{
+    tc filter del dev $virt_if parent 1: protocol ip prio 10 u32 match u32 \
+        0 0 flowid 1:2 action mirred egress mirror dev $phy_if
+    tc filter del dev $virt_if parent 1: protocol arp prio 11 u32 match u32 \
+        0 0 flowid 1:2 action mirred egress mirror dev $phy_if
+    tc filter del dev $virt_if parent 1: protocol ipv6 prio 12 u32 match u32 \
+        0 0 flowid 1:2 action mirred egress mirror dev $phy_if
+    tc qdisc del dev $virt_if root handle 1: prio
+
+    /usr/local/sbin/iptables -t mangle -D PREROUTING -m physdev --physdev-in \
+        $virt_if -j PMYCOLO --index $index --forward-dev $phy_if
+    /usr/local/sbin/ip6tables -t mangle -D PREROUTING -m physdev --physdev-in \
+        $virt_if -j PMYCOLO --index $index --forward-dev $phy_if
+    /usr/local/sbin/arptables -F
+}
+
+slave_install()
+{
+    brctl addif $br $phy_if
+
+    /usr/local/sbin/iptables -t mangle -I PREROUTING -m physdev --physdev-in \
+        $virt_if -j SECCOLO --index $index
+    /usr/local/sbin/ip6tables -t mangle -I PREROUTING -m physdev --physdev-in \
+        $virt_if -j SECCOLO --index $index
+}
+
+slave_uninstall()
+{
+    brctl delif $br $phy_if
+    brctl delif $br $virt_if
+    brctl addif $failover_br $virt_if
+
+    /usr/local/sbin/iptables -t mangle -F
+    /usr/local/sbin/ip6tables -t mangle -F
+}
+
+if [ $# -ne 5 ]; then
+    script_usage
+    exit 1
+fi
+
+if [ "x$side" != "xmaster" ] && [ "x$side" != "xslave" ]; then
+    script_usage
+    exit 2
+fi
+
+if [ "x$action" != "xinstall" ] && [ "x$action" != "xuninstall" ]; then
+    script_usage
+    exit 3
+fi
+
+${side}_${action}
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 20/29] COLO NIC : Implement colo nic init/destroy function
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (19 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

When in COLO mode, call the COLO NIC init/destroy functions.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/migration/migration-colo.h |  2 +-
 include/net/colo-nic.h             |  3 ++
 migration/colo.c                   | 15 ++++++++
 net/colo-nic.c                     | 74 ++++++++++++++++++++++++++++++++++++--
 4 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/include/migration/migration-colo.h b/include/migration/migration-colo.h
index 63f8b45..ebc4651 100644
--- a/include/migration/migration-colo.h
+++ b/include/migration/migration-colo.h
@@ -23,7 +23,7 @@ bool colo_supported(void);
 void colo_info_mig_init(void);
 
 /* Checkpoint control, called in migration/checkpoint thread */
-enum {
+enum colo_mode {
     COLO_UNPROTECTED_MODE = 0,
     COLO_PRIMARY_MODE,
     COLO_SECONDARY_MODE,
diff --git a/include/net/colo-nic.h b/include/net/colo-nic.h
index d35ee17..809726c 100644
--- a/include/net/colo-nic.h
+++ b/include/net/colo-nic.h
@@ -13,7 +13,10 @@
 
 #ifndef COLO_NIC_H
 #define COLO_NIC_H
+#include "migration/migration-colo.h"
 
+int colo_proxy_init(enum colo_mode mode);
+void colo_proxy_destroy(enum colo_mode mode);
 void colo_add_nic_devices(NetClientState *nc);
 void colo_remove_nic_devices(NetClientState *nc);
 
diff --git a/migration/colo.c b/migration/colo.c
index fc30ca5..22d7df1 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -15,6 +15,7 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "migration/migration-failover.h"
+#include "net/colo-nic.h"
 
 enum {
     COLO_CHECPOINT_READY = 0x46,
@@ -284,6 +285,11 @@ static void *colo_thread(void *opaque)
     QEMUFile *colo_control = NULL;
     int ret;
 
+    if (colo_proxy_init(COLO_PRIMARY_MODE) != 0) {
+        error_report("Init colo proxy error");
+        goto out;
+    }
+
     colo_control = qemu_fopen_socket(qemu_get_fd(s->file), "rb");
     if (!colo_control) {
         error_report("Open colo_control failed!");
@@ -348,6 +354,8 @@ out:
     qemu_bh_schedule(s->cleanup_bh);
     qemu_mutex_unlock_iothread();
 
+    colo_proxy_destroy(COLO_PRIMARY_MODE);
+
     return NULL;
 }
 
@@ -417,6 +425,12 @@ void *colo_process_incoming_checkpoints(void *opaque)
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
+    /* configure the network */
+    if (colo_proxy_init(COLO_SECONDARY_MODE) != 0) {
+        error_report("Init colo proxy error");
+        goto out;
+    }
+
     ctl = qemu_fopen_socket(fd, "wb");
     if (!ctl) {
         error_report("Can't open incoming channel!");
@@ -574,5 +588,6 @@ out:
 
     loadvm_exit_colo();
 
+    colo_proxy_destroy(COLO_SECONDARY_MODE);
     return NULL;
 }
diff --git a/net/colo-nic.c b/net/colo-nic.c
index 8b678b1..fee2cfe 100644
--- a/net/colo-nic.c
+++ b/net/colo-nic.c
@@ -25,8 +25,6 @@ typedef struct nic_device {
     bool is_up;
 } nic_device;
 
-
-
 QTAILQ_HEAD(, nic_device) nic_devices = QTAILQ_HEAD_INITIALIZER(nic_devices);
 
 /*
@@ -92,6 +90,60 @@ static int colo_nic_configure(NetClientState *nc,
     return launch_colo_script(argv);
 }
 
+static int configure_one_nic(NetClientState *nc,
+             bool up, int side, int index)
+{
+    struct nic_device *nic;
+
+    assert(nc);
+
+    QTAILQ_FOREACH(nic, &nic_devices, next) {
+        if (nic->nc == nc) {
+            if (!nic->support_colo || !nic->support_colo(nic->nc)
+                || !nic->configure) {
+                return -1;
+            }
+            if (up == nic->is_up) {
+                return 0;
+            }
+
+            if (nic->configure(nic->nc, up, side, index) && up) {
+                return -1;
+            }
+            nic->is_up = up;
+            return 0;
+        }
+    }
+
+    return -1;
+}
+
+static int configure_nic(int side, int index)
+{
+    struct nic_device *nic;
+
+    if (QTAILQ_EMPTY(&nic_devices)) {
+        return -1;
+    }
+
+    QTAILQ_FOREACH(nic, &nic_devices, next) {
+        if (configure_one_nic(nic->nc, 1, side, index)) {
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+static void teardown_nic(int side, int index)
+{
+    struct nic_device *nic;
+
+    QTAILQ_FOREACH(nic, &nic_devices, next) {
+        configure_one_nic(nic->nc, 0, side, index);
+    }
+}
+
 void colo_add_nic_devices(NetClientState *nc)
 {
     struct nic_device *nic = g_malloc0(sizeof(*nic));
@@ -118,8 +170,26 @@ void colo_remove_nic_devices(NetClientState *nc)
 
     QTAILQ_FOREACH_SAFE(nic, &nic_devices, next, next_nic) {
         if (nic->nc == nc) {
+            configure_one_nic(nc, 0, get_colo_mode(), getpid());
             QTAILQ_REMOVE(&nic_devices, nic, next);
             g_free(nic);
         }
     }
 }
+
+int colo_proxy_init(enum colo_mode mode)
+{
+    int ret = -1;
+
+    ret = configure_nic(mode, getpid());
+    if (ret != 0) {
+        error_report("execute colo-proxy-script failed");
+    }
+
+    return ret;
+}
+
+void colo_proxy_destroy(enum colo_mode mode)
+{
+    teardown_nic(mode, getpid());
+}
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 21/29] COLO NIC: Some init work related with proxy module
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (20 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

Implement the communication protocol with the proxy module using
nfnetlink, which requires the libnfnetlink library.

Tell the proxy module to do its initialization work, and ask the
kernel to acknowledge the request. The acknowledgement is necessary
for this first message because Netlink is not a reliable protocol.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 configure      |  22 +++++++-
 net/colo-nic.c | 156 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 176 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 793fd12..6676823 100755
--- a/configure
+++ b/configure
@@ -2303,7 +2303,25 @@ EOF
     rdma="no"
   fi
 fi
-
+##########################################
+# COLO needs libnfnetlink libraries
+if test "$colo" != "no"; then
+  cat > $TMPC <<EOF
+#include <libnfnetlink/libnfnetlink.h>
+int main(void) { return 0; }
+EOF
+  colo_libs="-lnfnetlink"
+  if compile_prog "" "$colo_libs"; then
+    colo="yes"
+    libs_softmmu="$libs_softmmu $colo_libs"
+  else
+    if test "$colo" = "yes" ; then
+        error_exit "libnfnetlink is required for colo feature." \
+            "Make sure to have the libnfnetlink devel and headers installed."
+    fi
+    colo="no"
+  fi
+fi
 ##########################################
 # VNC TLS/WS detection
 if test "$vnc" = "yes" -a \( "$vnc_tls" != "no" -o "$vnc_ws" != "no" \) ; then
@@ -2610,7 +2628,7 @@ EOF
     if compile_prog "$cfl" "$lib" ; then
         :
     else
-        error_exit "$drv check failed" \
+        error_exit "$drv check failed" \
             "Make sure to have the $drv libs and headers installed."
     fi
 }
diff --git a/net/colo-nic.c b/net/colo-nic.c
index fee2cfe..f4e04af 100644
--- a/net/colo-nic.c
+++ b/net/colo-nic.c
@@ -10,12 +10,64 @@
  * later.  See the COPYING file in the top-level directory.
  *
  */
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <linux/netlink.h>
+#include <libnfnetlink/libnfnetlink.h>
+#include <netinet/in.h>
+#include <sys/socket.h>
 #include "include/migration/migration.h"
 #include "migration/migration-colo.h"
 #include "net/net.h"
 #include "net/colo-nic.h"
 #include "qemu/error-report.h"
 
+/* Remove the following define after the proxy is merged into the kernel,
+* using #include <libnfnetlink/libnfnetlink.h> instead.
+*/
+#define NFNL_SUBSYS_COLO 12
+
+/* Message Format
+* <---NLMSG_ALIGN(hlen)-----><-------------- NLMSG_ALIGN(len)----------------->
+* +--------------------+- - -+- - - - - - - - - - - - - - +- - - - - - + - - -+
+* |       Header       | Pad |   Netfilter Netlink Header | Attributes | Pad  |
+* |    struct nlmsghdr |     |     struct nfgenmsg        |            |      |
+* +--------------------+- - -+- - - - - - - - - - - - - - + - - - - - -+ - - -+
+*/
+
+enum nfnl_colo_msg_types {
+    NFCOLO_KERNEL_NOTIFY, /* Used by proxy module to notify qemu */
+
+    NFCOLO_DO_CHECKPOINT,
+    NFCOLO_DO_FAILOVER,
+    NFCOLO_PROXY_INIT,
+    NFCOLO_PROXY_RESET,
+
+    NFCOLO_MSG_MAX
+};
+
+enum nfnl_colo_kernel_notify_attributes {
+    NFNL_COLO_KERNEL_NOTIFY_UNSPEC,
+    NFNL_COLO_COMPARE_RESULT,
+    __NFNL_COLO_KERNEL_NOTIFY_MAX
+};
+
+#define NFNL_COLO_KERNEL_NOTIFY_MAX  (__NFNL_COLO_KERNEL_NOTIFY_MAX - 1)
+
+enum nfnl_colo_attributes {
+    NFNL_COLO_UNSPEC,
+    NFNL_COLO_MODE,
+    __NFNL_COLO_MAX
+};
+#define NFNL_COLO_MAX  (__NFNL_COLO_MAX - 1)
+
+struct nfcolo_msg_mode {
+    u_int8_t mode;
+};
+
+struct nfcolo_packet_compare { /* Unused */
+    int32_t different;
+};
 
 typedef struct nic_device {
     NetClientState *nc;
@@ -25,6 +77,9 @@ typedef struct nic_device {
     bool is_up;
 } nic_device;
 
+static struct nfnl_handle *nfnlh;
+static struct nfnl_subsys_handle *nfnlssh;
+
 QTAILQ_HEAD(, nic_device) nic_devices = QTAILQ_HEAD_INITIALIZER(nic_devices);
 
 /*
@@ -177,19 +232,120 @@ void colo_remove_nic_devices(NetClientState *nc)
     }
 }
 
+static int colo_proxy_send(enum nfnl_colo_msg_types msg_type,
+                           enum colo_mode mode, int flag, void *unused)
+{
+    struct nfcolo_msg_mode params;
+    union {
+        char buf[NFNL_HEADER_LEN
+                 + NFA_LENGTH(sizeof(struct nfcolo_msg_mode))];
+        struct nlmsghdr nmh;
+    } u;
+    int ret;
+
+    if (!nfnlssh || !nfnlh) {
+        error_report("nfnlssh and nfnlh are uninitialized");
+        return -1;
+    }
+    nfnl_fill_hdr(nfnlssh, &u.nmh, 0, AF_UNSPEC, 1,
+                  msg_type, NLM_F_REQUEST | flag);
+    params.mode = mode;
+    u.nmh.nlmsg_pid = nfnl_portid(nfnlh);
+    ret = nfnl_addattr_l(&u.nmh, sizeof(u),  NFNL_COLO_MODE, &params,
+                         sizeof(params));
+    if (ret < 0) {
+        error_report("call nfnl_addattr_l failed");
+        return ret;
+    }
+    ret = nfnl_send(nfnlh, &u.nmh);
+    if (ret < 0) {
+        error_report("call nfnl_send failed");
+    }
+    return ret;
+}
+
+static int check_proxy_ack(void)
+{
+    unsigned char *buf = g_malloc0(2048);
+    struct nlmsghdr *nlmsg;
+    int len;
+    int ret = -1;
+
+    len = nfnl_recv(nfnlh, buf, 2048);
+    if (len <= 0) {
+        error_report("nfnl_recv received nothing");
+        goto err;
+    }
+    nlmsg = (struct nlmsghdr *)buf;
+
+    if (nlmsg->nlmsg_type == NLMSG_ERROR) {
+        struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(nlmsg);
+
+        if (err->error) {
+            error_report("Received error message:%d",  -err->error);
+            goto err;
+        }
+    }
+
+    ret = 0;
+err:
+    g_free(buf);
+    return ret;
+}
+
 int colo_proxy_init(enum colo_mode mode)
 {
     int ret = -1;
 
+    nfnlh = nfnl_open();
+    if (!nfnlh) {
+        error_report("call nfnl_open failed");
+        return -1;
+    }
+    /* Note:
+     *  Here we must ensure that the nl_pid (also nlmsg_pid in nlmsghdr) equals
+     *  the process ID of the VM, because we use it to identify the VM in the
+     *  proxy module.
+     */
+    if (nfnl_portid(nfnlh) != getpid()) {
+        error_report("More than one NETLINK_NETFILTER socket exists");
+        return -1;
+    }
+    /* disable netlink sequence tracking by default */
+    nfnl_unset_sequence_tracking(nfnlh);
+    nfnlssh = nfnl_subsys_open(nfnlh, NFNL_SUBSYS_COLO, NFCOLO_MSG_MAX, 0);
+    if (!nfnlssh) {
+        error_report("call nfnl_subsys_open failed");
+        goto err_out;
+    }
+
+    /* Netlink is not a reliable protocol, so it is necessary to ask the proxy
+     * module for an acknowledgement the first time.
+     */
+    ret = colo_proxy_send(NFCOLO_PROXY_INIT, mode, NLM_F_ACK, NULL);
+    if (ret < 0) {
+        goto err_out;
+    }
+
+    ret = check_proxy_ack();
+    if (ret < 0) {
+        goto err_out;
+    }
+
     ret = configure_nic(mode, getpid());
     if (ret != 0) {
         error_report("excute colo-proxy-script failed");
+        goto err_out;
     }
 
+    return 0;
+err_out:
+    nfnl_close(nfnlh);
     return ret;
 }
 
 void colo_proxy_destroy(enum colo_mode mode)
 {
+    nfnl_close(nfnlh);
     teardown_nic(mode, getpid());
 }
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 22/29] COLO: Handle nfnetlink message from proxy module
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (21 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

The proxy module sends messages to qemu through nfnetlink.
For now, a message only contains the result of the packet comparison.

We use a global variable 'packet_compare_different' to store the result.
This variable must be accessed with atomic helpers such as
'atomic_set' and 'atomic_xchg'.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 net/colo-nic.c | 41 +++++++++++++++++++++++++++++++++++++++++
 trace-events   |  1 +
 2 files changed, 42 insertions(+)

diff --git a/net/colo-nic.c b/net/colo-nic.c
index f4e04af..9e08fe3 100644
--- a/net/colo-nic.c
+++ b/net/colo-nic.c
@@ -21,6 +21,7 @@
 #include "net/net.h"
 #include "net/colo-nic.h"
 #include "qemu/error-report.h"
+#include "trace.h"
 
 /* Remove the following define after the proxy is merged into the kernel,
 * using #include <libnfnetlink/libnfnetlink.h> instead.
@@ -79,6 +80,7 @@ typedef struct nic_device {
 
 static struct nfnl_handle *nfnlh;
 static struct nfnl_subsys_handle *nfnlssh;
+static int32_t packet_compare_different; /* The result of packet comparing */
 
 QTAILQ_HEAD(, nic_device) nic_devices = QTAILQ_HEAD_INITIALIZER(nic_devices);
 
@@ -264,6 +266,38 @@ static int colo_proxy_send(enum nfnl_colo_msg_types msg_type,
     return ret;
 }
 
+static int __colo_rcv_pkt(struct nlmsghdr *nlh, struct nfattr *nfa[],
+                          void *data)
+{
+    /* struct nfgenmsg *nfmsg = NLMSG_DATA(nlh); */
+    int32_t  result = ntohl(nfnl_get_data(nfa, NFNL_COLO_COMPARE_RESULT,
+                                          int32_t));
+
+    atomic_set(&packet_compare_different, result);
+    trace_colo_rcv_pkt(result);
+    return 0;
+}
+
+static struct nfnl_callback colo_nic_cb = {
+    .call   = &__colo_rcv_pkt,
+    .attr_count = NFNL_COLO_KERNEL_NOTIFY_MAX,
+};
+
+static void colo_proxy_recv(void *opaque)
+{
+    unsigned char *buf = g_malloc0(2048);
+    int len;
+    int ret;
+
+    len = nfnl_recv(nfnlh, buf, 2048);
+    ret = nfnl_handle_packet(nfnlh, (char *)buf, len);
+    if (ret < 0) {/* Notify colo thread the error */
+        atomic_set(&packet_compare_different, -1);
+        error_report("call nfnl_handle_packet failed");
+    }
+    g_free(buf);
+}
+
 static int check_proxy_ack(void)
 {
     unsigned char *buf = g_malloc0(2048);
@@ -319,6 +353,11 @@ int colo_proxy_init(enum colo_mode mode)
         goto err_out;
     }
 
+    ret = nfnl_callback_register(nfnlssh, NFCOLO_KERNEL_NOTIFY, &colo_nic_cb);
+    if (ret < 0) {
+        goto err_out;
+    }
+
     /* Netlink is not a reliable protocol, so it is necessary to ask the proxy
      * module for an acknowledgement the first time.
      */
@@ -338,6 +377,8 @@ int colo_proxy_init(enum colo_mode mode)
         goto err_out;
     }
 
+    qemu_set_fd_handler(nfnl_fd(nfnlh), colo_proxy_recv, NULL, NULL);
+
     return 0;
 err_out:
     nfnl_close(nfnlh);
diff --git a/trace-events b/trace-events
index 1ce7bba..b1c263a 100644
--- a/trace-events
+++ b/trace-events
@@ -1450,6 +1450,7 @@ colo_info_load(const char *msg) "%s"
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_receive_message(const char *msg) "Receive '%s'"
 colo_do_failover(void) ""
+colo_rcv_pkt(int result) "Result of net packet comparison is different: %d"
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 23/29] COLO: Do checkpoint according to the result of packets comparation
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (22 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

Only do a checkpoint when the PVM's and SVM's output net packets are
inconsistent. We also limit the minimum time between two consecutive
checkpoint actions, to give the VM a chance to run.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/net/colo-nic.h |  2 ++
 migration/colo.c       | 32 ++++++++++++++++++++++++++++++++
 net/colo-nic.c         |  5 +++++
 3 files changed, 39 insertions(+)

diff --git a/include/net/colo-nic.h b/include/net/colo-nic.h
index 809726c..57c6719 100644
--- a/include/net/colo-nic.h
+++ b/include/net/colo-nic.h
@@ -20,4 +20,6 @@ void colo_proxy_destroy(enum colo_mode mode);
 void colo_add_nic_devices(NetClientState *nc);
 void colo_remove_nic_devices(NetClientState *nc);
 
+int colo_proxy_compare(void);
+
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 22d7df1..a7cead9 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -17,6 +17,13 @@
 #include "migration/migration-failover.h"
 #include "net/colo-nic.h"
 
+/*
+* We should not do checkpoints one after another without any time interval,
+* because this would keep the VM in a continuous 'stop' state.
+* CHECKPOINT_MIN_PERIOD is the minimum time between two checkpoint actions.
+*/
+#define CHECKPOINT_MIN_PERIOD 100  /* unit: ms */
+
 enum {
     COLO_CHECPOINT_READY = 0x46,
 
@@ -283,6 +290,7 @@ static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
     QEMUFile *colo_control = NULL;
+    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int ret;
 
     if (colo_proxy_init(COLO_PRIMARY_MODE) != 0) {
@@ -318,15 +326,39 @@ static void *colo_thread(void *opaque)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        int proxy_checkpoint_req;
+
         if (failover_request_is_set()) {
             error_report("failover request");
             goto out;
         }
+        /* wait for a colo checkpoint */
+        proxy_checkpoint_req = colo_proxy_compare();
+        if (proxy_checkpoint_req < 0) {
+            goto out;
+        } else if (!proxy_checkpoint_req) {
+            /*
+             * No checkpoint is needed, wait for 1ms and then
+             * check if we need checkpoint again
+             */
+            g_usleep(1000);
+            continue;
+        } else {
+            int64_t interval;
+
+            current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+            interval = current_time - checkpoint_time;
+            if (interval < CHECKPOINT_MIN_PERIOD) {
+                /* Limit the min time between two checkpoint */
+                g_usleep((1000*(CHECKPOINT_MIN_PERIOD - interval)));
+            }
+        }
 
         /* start a colo checkpoint */
         if (colo_do_checkpoint_transaction(s, colo_control)) {
             goto out;
         }
+        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     }
 
 out:
diff --git a/net/colo-nic.c b/net/colo-nic.c
index 9e08fe3..a004e08 100644
--- a/net/colo-nic.c
+++ b/net/colo-nic.c
@@ -390,3 +390,8 @@ void colo_proxy_destroy(enum colo_mode mode)
     nfnl_close(nfnlh);
     teardown_nic(mode, getpid());
 }
+
+int colo_proxy_compare(void)
+{
+    return atomic_xchg(&packet_compare_different, 0);
+}
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 24/29] COLO: Improve checkpoint efficiency by do additional periodic checkpoint
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (23 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Yang Hongyang,
	david

Besides the normal checkpoints triggered by the result of net packet
comparison, we also do an additional checkpoint periodically. If we have not
done a checkpoint for a long time (a special case where the net packets are
always consistent), this reduces the number of dirty pages handled by any
single checkpoint.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration/colo.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index a7cead9..195973a 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,6 +10,7 @@
  * later.  See the COPYING file in the top-level directory.
  */
 
+#include "qemu/timer.h"
 #include "sysemu/sysemu.h"
 #include "migration/migration-colo.h"
 #include "trace.h"
@@ -24,6 +25,13 @@
 */
 #define CHECKPOINT_MIN_PERIOD 100  /* unit: ms */
 
+/*
+ * Force a checkpoint after this many ms. The value is large because
+ * COLO checkpoints will mostly be driven by the COLO compare module.
+ */
+#define CHECKPOINT_MAX_PEROID 10000
+
 enum {
     COLO_CHECPOINT_READY = 0x46,
 
@@ -336,14 +344,7 @@ static void *colo_thread(void *opaque)
         proxy_checkpoint_req = colo_proxy_compare();
         if (proxy_checkpoint_req < 0) {
             goto out;
-        } else if (!proxy_checkpoint_req) {
-            /*
-             * No checkpoint is needed, wait for 1ms and then
-             * check if we need checkpoint again
-             */
-            g_usleep(1000);
-            continue;
-        } else {
+        } else if (proxy_checkpoint_req) {
             int64_t interval;
 
             current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
@@ -352,8 +353,20 @@ static void *colo_thread(void *opaque)
                 /* Limit the min time between two checkpoint */
                 g_usleep((1000*(CHECKPOINT_MIN_PERIOD - interval)));
             }
+            goto do_checkpoint;
+        }
+
+        /*
+         * No proxy checkpoint is requested; wait for 100ms
+         * and then check if we need a checkpoint again.
+         */
+        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
+            g_usleep(100000);
+            continue;
         }
 
+do_checkpoint:
         /* start a colo checkpoint */
         if (colo_do_checkpoint_transaction(s, colo_control)) {
             goto out;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (24 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  2015-06-05 18:45   ` Dr. David Alan Gilbert
  -1 siblings, 1 reply; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

With this command, we can control the interval of the periodic checkpoints
that are forced when net packet comparison does not request any.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 hmp-commands.hx        | 15 +++++++++++++++
 hmp.c                  |  7 +++++++
 hmp.h                  |  1 +
 migration/colo.c       | 11 ++++++++++-
 qapi-schema.json       | 13 +++++++++++++
 qmp-commands.hx        | 22 ++++++++++++++++++++++
 stubs/migration-colo.c |  4 ++++
 7 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index be3e398..32cd548 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1023,6 +1023,21 @@ Tell COLO that heartbeat is lost, a failover or takeover is needed.
 ETEXI
 
     {
+        .name       = "colo_set_checkpoint_period",
+        .args_type  = "value:i",
+        .params     = "value",
+        .help       = "set checkpoint period (in ms) for colo. "
+        "Defaults to 10000ms",
+        .mhandler.cmd = hmp_colo_set_checkpoint_period,
+    },
+
+STEXI
+@item colo_set_checkpoint_period @var{value}
+@findex colo_set_checkpoint_period
+Set checkpoint period to @var{value} (in ms) for colo.
+ETEXI
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/hmp.c b/hmp.c
index f87fa37..f727686 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1257,6 +1257,13 @@ void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict)
     hmp_handle_error(mon, &err);
 }
 
+void hmp_colo_set_checkpoint_period(Monitor *mon, const QDict *qdict)
+{
+    int64_t value = qdict_get_int(qdict, "value");
+
+    qmp_colo_set_checkpoint_period(value, NULL);
+}
+
 void hmp_set_password(Monitor *mon, const QDict *qdict)
 {
     const char *protocol  = qdict_get_str(qdict, "protocol");
diff --git a/hmp.h b/hmp.h
index b6549f8..9570345 100644
--- a/hmp.h
+++ b/hmp.h
@@ -68,6 +68,7 @@ void hmp_migrate_set_capability(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_cache_size(Monitor *mon, const QDict *qdict);
 void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
+void hmp_colo_set_checkpoint_period(Monitor *mon, const QDict *qdict);
 void hmp_set_password(Monitor *mon, const QDict *qdict);
 void hmp_expire_password(Monitor *mon, const QDict *qdict);
 void hmp_eject(Monitor *mon, const QDict *qdict);
diff --git a/migration/colo.c b/migration/colo.c
index 195973a..f5fc79c 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -17,6 +17,7 @@
 #include "qemu/error-report.h"
 #include "migration/migration-failover.h"
 #include "net/colo-nic.h"
+#include "qmp-commands.h"
 
 /*
 * We should not do checkpoints one after another without any time interval,
@@ -70,6 +71,9 @@ enum {
 static QEMUBH *colo_bh;
 static bool vmstate_loading;
 static Coroutine *colo;
+
+int64_t colo_checkpoint_period = CHECKPOINT_MAX_PEROID;
+
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 QEMUSizedBuffer *colo_buffer;
@@ -85,6 +89,11 @@ bool migrate_in_colo_state(void)
     return (s->state == MIGRATION_STATUS_COLO);
 }
 
+void qmp_colo_set_checkpoint_period(int64_t value, Error **errp)
+{
+    colo_checkpoint_period = value;
+}
+
 static bool colo_runstate_is_stopped(void)
 {
     return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
@@ -361,7 +370,7 @@ static void *colo_thread(void *opaque)
          * and then check if we need checkpoint again.
          */
         current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
+        if (current_time - checkpoint_time < colo_checkpoint_period) {
             g_usleep(100000);
             continue;
         }
diff --git a/qapi-schema.json b/qapi-schema.json
index dc0ee07..62b5cfd 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -653,6 +653,19 @@
 { 'command': 'colo-lost-heartbeat' }
 
 ##
+# @colo-set-checkpoint-period
+#
+# Set colo checkpoint period
+#
+# @value: period of colo checkpoint in ms
+#
+# Returns: nothing on success
+#
+# Since: 2.4
+##
+{ 'command': 'colo-set-checkpoint-period', 'data': {'value': 'int'} }
+
+##
 # @MouseInfo:
 #
 # Information about a mouse device.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 3813f66..4b16044 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -800,6 +800,28 @@ Example:
 EQMP
 
     {
+         .name       = "colo-set-checkpoint-period",
+         .args_type  = "value:i",
+         .mhandler.cmd_new = qmp_marshal_input_colo_set_checkpoint_period,
+    },
+
+SQMP
+colo-set-checkpoint-period
+--------------------------
+
+set checkpoint period
+
+Arguments:
+- "value": checkpoint period
+
+Example:
+
+-> { "execute": "colo-set-checkpoint-period", "arguments": { "value": 1000 } }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 03a395b..d3c9dc4 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -52,3 +52,7 @@ void qmp_colo_lost_heartbeat(Error **errp)
                      " with --enable-colo option in order to support"
                      " COLO feature");
 }
+
+void qmp_colo_set_checkpoint_period(int64_t value, Error **errp)
+{
+}
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread
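
As a rough illustration of the wire format this command uses (QMP is JSON over a socket), the following Python sketch builds the request shown in the documentation above. The helper name is made up for illustration; note that "value" must be a JSON integer, since the command is declared with args_type "value:i":

```python
import json

def build_qmp_command(name, **arguments):
    """Build a QMP command as a JSON string, as sent over the QMP socket."""
    cmd = {"execute": name}
    if arguments:
        # QMP puts all command parameters under an "arguments" object.
        cmd["arguments"] = arguments
    return json.dumps(cmd)

# The checkpoint period is an integer number of milliseconds.
wire = build_qmp_command("colo-set-checkpoint-period", value=1000)
print(wire)
```

On success the server replies with `{ "return": {} }`, matching the example in qmp-commands.hx.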

* [Qemu-devel] [PATCH COLO-Frame v5 26/29] COLO NIC: Implement NIC checkpoint and failover
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (25 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, david

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/net/colo-nic.h |  2 ++
 migration/colo.c       | 21 ++++++++++++++++++---
 net/colo-nic.c         | 23 +++++++++++++++++++++++
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/include/net/colo-nic.h b/include/net/colo-nic.h
index 57c6719..9670512 100644
--- a/include/net/colo-nic.h
+++ b/include/net/colo-nic.h
@@ -21,5 +21,7 @@ void colo_add_nic_devices(NetClientState *nc);
 void colo_remove_nic_devices(NetClientState *nc);
 
 int colo_proxy_compare(void);
+int colo_proxy_failover(void);
+int colo_proxy_checkpoint(enum colo_mode mode);
 
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index f5fc79c..ab9dcfe 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -114,6 +114,11 @@ static void slave_do_failover(void)
         ;
     }
 
+    if (colo_proxy_failover() != 0) {
+        error_report("colo proxy failed to do failover");
+    }
+    colo_proxy_destroy(COLO_SECONDARY_MODE);
+
     colo = NULL;
 
     if (!autostart) {
@@ -136,6 +141,8 @@ static void master_do_failover(void)
         vm_stop_force_state(RUN_STATE_COLO);
     }
 
+    colo_proxy_destroy(COLO_PRIMARY_MODE);
+
     if (s->state != MIGRATION_STATUS_FAILED) {
         migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
     }
@@ -259,6 +266,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
 
     qemu_fflush(trans);
 
+    ret = colo_proxy_checkpoint(COLO_PRIMARY_MODE);
+    if (ret < 0) {
+        goto out;
+    }
+
     ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
     if (ret < 0) {
         goto out;
@@ -408,8 +420,6 @@ out:
     qemu_bh_schedule(s->cleanup_bh);
     qemu_mutex_unlock_iothread();
 
-    colo_proxy_destroy(COLO_PRIMARY_MODE);
-
     return NULL;
 }
 
@@ -540,6 +550,11 @@ void *colo_process_incoming_checkpoints(void *opaque)
             goto out;
         }
 
+        ret = colo_proxy_checkpoint(COLO_SECONDARY_MODE);
+        if (ret < 0) {
+            goto out;
+        }
+
         ret = colo_ctl_get(f, COLO_CHECKPOINT_SEND);
         if (ret < 0) {
             goto out;
@@ -618,6 +633,7 @@ out:
         * just kill slave
         */
         error_report("SVM is going to exit!");
+        colo_proxy_destroy(COLO_SECONDARY_MODE);
         exit(1);
     } else {
         /* if we went here, means master may dead, we are doing failover */
@@ -642,6 +658,5 @@ out:
 
     loadvm_exit_colo();
 
-    colo_proxy_destroy(COLO_SECONDARY_MODE);
     return NULL;
 }
diff --git a/net/colo-nic.c b/net/colo-nic.c
index a004e08..a6800b5 100644
--- a/net/colo-nic.c
+++ b/net/colo-nic.c
@@ -391,6 +391,29 @@ void colo_proxy_destroy(enum colo_mode mode)
     teardown_nic(mode, getpid());
 }
 
+/*
+* Note: only the VM on the slave side needs to do the failover work.
+*/
+int colo_proxy_failover(void)
+{
+    if (colo_proxy_send(NFCOLO_DO_FAILOVER, COLO_SECONDARY_MODE, 0, NULL) < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+
+/*
+* Note: only the VM on the master side needs to trigger a checkpoint
+*/
+int colo_proxy_checkpoint(enum colo_mode mode)
+{
+    if (colo_proxy_send(NFCOLO_DO_CHECKPOINT, mode, 0, NULL) < 0) {
+        return -1;
+    }
+    return 0;
+}
+
 int colo_proxy_compare(void)
 {
     return atomic_xchg(&packet_compare_different, 0);
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread
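
The control messages this patch adds can be modeled compactly: only the secondary side sends NFCOLO_DO_FAILOVER, while each side announces a checkpoint by passing its own mode. A Python sketch with the kernel proxy stubbed out (the identifiers mirror the C names; nothing here is a real QEMU or netlink API):

```python
NFCOLO_DO_CHECKPOINT = "NFCOLO_DO_CHECKPOINT"
NFCOLO_DO_FAILOVER = "NFCOLO_DO_FAILOVER"
COLO_PRIMARY_MODE, COLO_SECONDARY_MODE = "primary", "secondary"

sent = []  # records (message, mode) pairs handed to the stubbed proxy

def proxy_send(msg, mode):
    """Stand-in for colo_proxy_send(); just records what would be sent."""
    sent.append((msg, mode))
    return 0

def proxy_failover():
    # Mirrors colo_proxy_failover(): only the secondary side fails over.
    return proxy_send(NFCOLO_DO_FAILOVER, COLO_SECONDARY_MODE)

def proxy_checkpoint(mode):
    # Mirrors colo_proxy_checkpoint(): the caller passes its own mode.
    return proxy_send(NFCOLO_DO_CHECKPOINT, mode)
```

In the patch, the master thread calls the checkpoint notification with COLO_PRIMARY_MODE and the incoming thread with COLO_SECONDARY_MODE, exactly as this sketch's `proxy_checkpoint(mode)` suggests.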

* [Qemu-devel] [PATCH COLO-Frame v5 27/29] COLO: Disable qdev hotplug when VM is in COLO mode
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (26 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Yang Hongyang,
	david

COLO does not support qdev hotplug during the COLO process, so disable it.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 migration/colo.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index ab9dcfe..8740fc2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,6 +10,7 @@
  * later.  See the COPYING file in the top-level directory.
  */
 
+#include "hw/qdev-core.h"
 #include "qemu/timer.h"
 #include "sysemu/sysemu.h"
 #include "migration/migration-colo.h"
@@ -318,6 +319,7 @@ out:
 static void *colo_thread(void *opaque)
 {
     MigrationState *s = opaque;
+    int dev_hotplug = qdev_hotplug;
     QEMUFile *colo_control = NULL;
     int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int ret;
@@ -333,6 +335,8 @@ static void *colo_thread(void *opaque)
         goto out;
     }
 
+    qdev_hotplug = 0;
+
     /*
      * Wait for slave finish loading vm states and enter COLO
      * restore.
@@ -420,6 +424,8 @@ out:
     qemu_bh_schedule(s->cleanup_bh);
     qemu_mutex_unlock_iothread();
 
+    qdev_hotplug = dev_hotplug;
+
     return NULL;
 }
 
@@ -482,10 +488,13 @@ void *colo_process_incoming_checkpoints(void *opaque)
     struct colo_incoming *colo_in = opaque;
     QEMUFile *f = colo_in->file;
     int fd = qemu_get_fd(f);
+    int dev_hotplug = qdev_hotplug;
     QEMUFile *ctl = NULL, *fb = NULL;
     int ret;
     uint64_t total_size;
 
+    qdev_hotplug = 0;
+
     colo = qemu_coroutine_self();
     assert(colo != NULL);
 
@@ -658,5 +667,7 @@ out:
 
     loadvm_exit_colo();
 
+    qdev_hotplug = dev_hotplug;
+
     return NULL;
 }
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v5 28/29] COLO: Implement shutdown checkpoint
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (27 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Lai Jiangshan,
	david

For the SVM, we forbid direct shutdown while it is in COLO mode.
For the PVM's shutdown, we must do some extra work to ensure consistent
behaviour between the PVM and the SVM.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/sysemu/sysemu.h |  3 +++
 migration/colo.c        | 31 ++++++++++++++++++++++++++++++-
 vl.c                    | 26 ++++++++++++++++++++++++--
 3 files changed, 57 insertions(+), 3 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 8a52934..8b37bd2 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -51,6 +51,8 @@ typedef enum WakeupReason {
     QEMU_WAKEUP_REASON_OTHER,
 } WakeupReason;
 
+extern int colo_shutdown_requested;
+
 void qemu_system_reset_request(void);
 void qemu_system_suspend_request(void);
 void qemu_register_suspend_notifier(Notifier *notifier);
@@ -58,6 +60,7 @@ void qemu_system_wakeup_request(WakeupReason reason);
 void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
 void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
+void qemu_system_shutdown_request_core(void);
 void qemu_system_powerdown_request(void);
 void qemu_register_powerdown_notifier(Notifier *notifier);
 void qemu_system_debug_request(void);
diff --git a/migration/colo.c b/migration/colo.c
index 8740fc2..111062f 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -67,6 +67,8 @@ enum {
     COLO_CHECKPOINT_SEND,
     COLO_CHECKPOINT_RECEIVED,
     COLO_CHECKPOINT_LOADED,
+
+    COLO_GUEST_SHUTDOWN
 };
 
 static QEMUBH *colo_bh;
@@ -218,7 +220,7 @@ static int colo_ctl_get(QEMUFile *f, uint64_t require)
 
 static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
 {
-    int ret;
+    int colo_shutdown, ret;
     size_t size;
     QEMUFile *trans = NULL;
 
@@ -245,6 +247,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     }
     /* suspend and save vm state to colo buffer */
     qemu_mutex_lock_iothread();
+    colo_shutdown = colo_shutdown_requested;
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("run", "stop");
@@ -301,6 +304,16 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     }
     trace_colo_receive_message("COLO_CHECKPOINT_LOADED");
 
+    if (colo_shutdown) {
+        colo_ctl_put(s->file, COLO_GUEST_SHUTDOWN);
+        qemu_fflush(s->file);
+        colo_shutdown_requested = 0;
+        qemu_system_shutdown_request_core();
+        while (1) {
+            ;
+        }
+    }
+
     ret = 0;
     /* resume master */
     qemu_mutex_lock_iothread();
@@ -365,6 +378,10 @@ static void *colo_thread(void *opaque)
             error_report("failover request");
             goto out;
         }
+
+        if (colo_shutdown_requested) {
+            goto do_checkpoint;
+        }
         /* wait for a colo checkpoint */
         proxy_checkpoint_req = colo_proxy_compare();
         if (proxy_checkpoint_req < 0) {
@@ -478,6 +495,18 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
     case COLO_CHECKPOINT_NEW:
         *checkpoint_request = 1;
         return 0;
+    case COLO_GUEST_SHUTDOWN:
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        qemu_system_shutdown_request_core();
+        qemu_mutex_unlock_iothread();
+        trace_colo_receive_message("COLO_GUEST_SHUTDOWN");
+        /* the main thread will exit and terminate the whole
+        * process; do we need some cleanup?
+        */
+        for (;;) {
+            ;
+        }
     default:
         return -1;
     }
diff --git a/vl.c b/vl.c
index 822bd08..26e3ae5 100644
--- a/vl.c
+++ b/vl.c
@@ -1533,6 +1533,8 @@ static NotifierList wakeup_notifiers =
     NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
 static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE);
 
+int colo_shutdown_requested;
+
 int qemu_shutdown_requested_get(void)
 {
     return shutdown_requested;
@@ -1649,6 +1651,10 @@ void qemu_system_reset(bool report)
 void qemu_system_reset_request(void)
 {
     if (no_reboot) {
+        qemu_system_shutdown_request();
+        if (!shutdown_requested) {/* colo handle it ? */
+            return;
+        }
         shutdown_requested = 1;
     } else {
         reset_requested = 1;
@@ -1717,13 +1723,29 @@ void qemu_system_killed(int signal, pid_t pid)
     qemu_system_shutdown_request();
 }
 
-void qemu_system_shutdown_request(void)
+void qemu_system_shutdown_request_core(void)
 {
-    trace_qemu_system_shutdown_request();
     shutdown_requested = 1;
     qemu_notify_event();
 }
 
+void qemu_system_shutdown_request(void)
+{
+    trace_qemu_system_shutdown_request();
+    /*
+    * If in COLO mode, we need to do some significant work before responding
+    * to the shutdown request.
+    */
+    if (loadvm_in_colo_state()) {
+        return; /* primary's responsibility */
+    }
+    if (migrate_in_colo_state()) {
+        colo_shutdown_requested = 1;
+        return;
+    }
+    qemu_system_shutdown_request_core();
+}
+
 static void qemu_system_powerdown(void)
 {
     qapi_event_send_powerdown(&error_abort);
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread
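
The routing that qemu_system_shutdown_request() implements above has three cases: the secondary (loadvm in COLO state) ignores the request and leaves it to the primary, the primary defers it to the next checkpoint via colo_shutdown_requested, and a non-COLO VM shuts down immediately. A sketch of that decision (function and state names are illustrative):

```python
def route_shutdown_request(in_secondary_colo, in_primary_colo, state):
    """Decide how a guest shutdown request is handled under COLO.

    state is a dict standing in for the global flags
    colo_shutdown_requested / shutdown_requested.
    """
    if in_secondary_colo:
        return "ignored"          # primary's responsibility
    if in_primary_colo:
        state["colo_shutdown_requested"] = 1
        return "deferred"         # handled at the next checkpoint
    state["shutdown_requested"] = 1  # qemu_system_shutdown_request_core()
    return "shutdown"
```

Deferring the primary's shutdown to a checkpoint is what lets the PVM and SVM act consistently: the COLO_GUEST_SHUTDOWN control message makes both sides stop together.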

* [Qemu-devel] [PATCH COLO-Frame v5 29/29] COLO: Add block replication into colo process
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
                   ` (28 preceding siblings ...)
  (?)
@ 2015-05-21  8:13 ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-21  8:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, zhanghailiang, arei.gonglei, amit.shah, Yang Hongyang,
	david

From: Wen Congyang <wency@cn.fujitsu.com>

Make sure the master starts block replication only after the slave's block replication has started.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 trace-events     |   2 +
 2 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 111062f..5b600b1 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -19,6 +19,8 @@
 #include "migration/migration-failover.h"
 #include "net/colo-nic.h"
 #include "qmp-commands.h"
+#include "block/block.h"
+#include "sysemu/block-backend.h"
 
 /*
 * We should not do checkpoint one after another without any time interval,
@@ -102,6 +104,76 @@ static bool colo_runstate_is_stopped(void)
     return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
+static void blk_start_replication(bool primary, Error **errp)
+{
+    ReplicationMode mode = primary ? REPLICATION_MODE_PRIMARY :
+                                     REPLICATION_MODE_SECONDARY;
+    BlockBackend *blk, *temp;
+    Error *local_err = NULL;
+
+    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
+        if (blk_is_read_only(blk) || !blk_is_inserted(blk)) {
+            continue;
+        }
+
+        bdrv_start_replication(blk_bs(blk), mode, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            goto fail;
+        }
+    }
+
+    return;
+
+fail:
+    for (temp = blk_next(NULL); temp != blk; temp = blk_next(temp)) {
+        bdrv_stop_replication(blk_bs(temp), false, NULL);
+    }
+}
+
+static void blk_do_checkpoint(Error **errp)
+{
+    BlockBackend *blk;
+    Error *local_err = NULL;
+
+    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
+        if (blk_is_read_only(blk) || !blk_is_inserted(blk)) {
+            continue;
+        }
+
+        bdrv_do_checkpoint(blk_bs(blk), &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
+
+static void blk_stop_replication(bool failover, Error **errp)
+{
+    BlockBackend *blk;
+    Error *local_err = NULL;
+
+    for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
+        if (blk_is_read_only(blk) || !blk_is_inserted(blk)) {
+            continue;
+        }
+
+        bdrv_stop_replication(blk_bs(blk), failover, &local_err);
+        if (!errp) {
+            /*
+             * The caller doesn't care about the result; it just
+             * wants to stop replication on all blocks.
+             */
+            continue;
+        }
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
+
 /*
 * there are two ways to enter this function
  * 1. From colo checkpoint incoming thread, in this case
@@ -112,6 +184,8 @@ static bool colo_runstate_is_stopped(void)
  */
 static void slave_do_failover(void)
 {
+    Error *local_err = NULL;
+
     /* Wait for incoming thread loading vmstate */
     while (vmstate_loading) {
         ;
@@ -121,6 +195,11 @@ static void slave_do_failover(void)
         error_report("colo proxy failed to do failover");
     }
     colo_proxy_destroy(COLO_SECONDARY_MODE);
+    blk_stop_replication(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+    trace_colo_stop_block_replication("failover");
 
     colo = NULL;
 
@@ -139,6 +218,7 @@ static void slave_do_failover(void)
 static void master_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
+    Error *local_err = NULL;
 
     if (!colo_runstate_is_stopped()) {
         vm_stop_force_state(RUN_STATE_COLO);
@@ -150,6 +230,12 @@ static void master_do_failover(void)
         migrate_set_state(s, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED);
     }
 
+    blk_stop_replication(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+    trace_colo_stop_block_replication("failover");
+
     vm_start();
 }
 
@@ -223,6 +309,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     int colo_shutdown, ret;
     size_t size;
     QEMUFile *trans = NULL;
+    Error *local_err = NULL;
 
     ret = colo_ctl_put(s->file, COLO_CHECKPOINT_NEW);
     if (ret < 0) {
@@ -275,6 +362,16 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
         goto out;
     }
 
+    /* We call this API even though it may do nothing on the primary side. */
+    qemu_mutex_lock_iothread();
+    blk_do_checkpoint(&local_err);
+    qemu_mutex_unlock_iothread();
+    if (local_err) {
+        error_report_err(local_err);
+        ret = -1;
+        goto out;
+    }
+
     ret = colo_ctl_put(s->file, COLO_CHECKPOINT_SEND);
     if (ret < 0) {
         goto out;
@@ -305,6 +402,10 @@ static int colo_do_checkpoint_transaction(MigrationState *s, QEMUFile *control)
     trace_colo_receive_message("COLO_CHECKPOINT_LOADED");
 
     if (colo_shutdown) {
+        qemu_mutex_lock_iothread();
+        blk_stop_replication(false, NULL);
+        trace_colo_stop_block_replication("shutdown");
+        qemu_mutex_unlock_iothread();
         colo_ctl_put(s->file, COLO_GUEST_SHUTDOWN);
         qemu_fflush(s->file);
         colo_shutdown_requested = 0;
@@ -336,6 +437,7 @@ static void *colo_thread(void *opaque)
     QEMUFile *colo_control = NULL;
     int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int ret;
+    Error *local_err = NULL;
 
     if (colo_proxy_init(COLO_PRIMARY_MODE) != 0) {
         error_report("Init colo proxy error");
@@ -367,6 +469,12 @@ static void *colo_thread(void *opaque)
     }
 
     qemu_mutex_lock_iothread();
+    /* start block replication */
+    blk_start_replication(true, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    trace_colo_start_block_replication();
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
@@ -417,7 +525,11 @@ do_checkpoint:
     }
 
 out:
-    error_report("colo: some error happens in colo_thread");
+    if (local_err) {
+        error_report_err(local_err);
+    } else {
+        error_report("colo: some error happened in colo_thread");
+    }
     qemu_mutex_lock_iothread();
     if (!failover_request_is_set()) {
         error_report("master takeover from checkpoint channel");
@@ -498,6 +610,8 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
     case COLO_GUEST_SHUTDOWN:
         qemu_mutex_lock_iothread();
         vm_stop_force_state(RUN_STATE_COLO);
+        blk_stop_replication(false, NULL);
+        trace_colo_stop_block_replication("shutdown");
         qemu_system_shutdown_request_core();
         qemu_mutex_unlock_iothread();
         trace_colo_receive_message("COLO_GUEST_SHUTDOWN");
@@ -521,6 +635,7 @@ void *colo_process_incoming_checkpoints(void *opaque)
     QEMUFile *ctl = NULL, *fb = NULL;
     int ret;
     uint64_t total_size;
+    Error *local_err = NULL;
 
     qdev_hotplug = 0;
 
@@ -550,6 +665,15 @@ void *colo_process_incoming_checkpoints(void *opaque)
         goto out;
     }
 
+    qemu_mutex_lock_iothread();
+    /* start block replication */
+    blk_start_replication(false, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    qemu_mutex_unlock_iothread();
+    trace_colo_start_block_replication();
+
     ret = colo_ctl_put(ctl, COLO_CHECPOINT_READY);
     if (ret < 0) {
         goto out;
@@ -636,7 +760,13 @@ void *colo_process_incoming_checkpoints(void *opaque)
         }
 
         vmstate_loading = false;
+
+        /* discard colo disk buffer */
+        blk_do_checkpoint(&local_err);
         qemu_mutex_unlock_iothread();
+        if (local_err) {
+            goto out;
+        }
 
         ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
         if (ret < 0) {
@@ -654,7 +784,11 @@ void *colo_process_incoming_checkpoints(void *opaque)
     }
 
 out:
-    error_report("Detect some error or get a failover request");
+    if (local_err) {
+        error_report_err(local_err);
+    } else {
+        error_report("Detected an error or got a failover request");
+    }
     /* determine whether we need to failover */
     if (!failover_request_is_set()) {
         /*
diff --git a/trace-events b/trace-events
index b1c263a..d0ffade 100644
--- a/trace-events
+++ b/trace-events
@@ -1451,6 +1451,8 @@ colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_receive_message(const char *msg) "Receive '%s'"
 colo_do_failover(void) ""
 colo_rcv_pkt(int result) "Result of net packets comparing is different: %d"
+colo_start_block_replication(void) "Block replication is started"
+colo_stop_block_replication(const char *reason) "Block replication is stopped(reason: '%s')"
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 62+ messages in thread
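
blk_start_replication()'s shape — iterate over all writable, inserted disks, and on the first failure undo the ones already started (the `fail:` path) — can be sketched in Python. Disks are plain dicts and `start_one` stands in for bdrv_start_replication(), so everything here is illustrative:

```python
def start_replication(disks, start_one):
    """Start replication on every writable, inserted disk.

    If any disk fails to start, roll back the disks already started,
    mirroring the fail: path of blk_start_replication().
    """
    started = []
    for disk in disks:
        if disk.get("read_only") or not disk.get("inserted", True):
            continue  # skip disks that cannot participate
        if not start_one(disk):
            for d in started:
                d["replicating"] = False  # undo the partial setup
            return False
        disk["replicating"] = True
        started.append(disk)
    return True
```

The all-or-nothing rollback matters because a checkpoint later calls bdrv_do_checkpoint() on every disk; leaving a subset replicating after a failed start would desynchronise the primary and secondary images.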

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
@ 2015-05-21 11:30   ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-21 11:30 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, quintela, amit.shah, eblake, berrange,
	peter.huangpeng, eddie.dong, yunhong.jiang, wency, lizhijian,
	arei.gonglei, david, netfilter-devel

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
> failover, proxy API, block replication API, not include block replication.
> The block part has been sent by wencongyang:
> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> 
> we have finished some new features and optimizations on COLO (as a development branch on github),
> but for ease of review, it is better to keep it simple for now, so we will not add too much new
> code to this frame patch set before it has been fully reviewed.
> 
> You can get the latest integrated qemu colo patches from github (Include Block part):
> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> 
> Please NOTE the difference between these two branches.
> colo-v1.2-basic is exactly the same as this patch series, and has the basic features of COLO.
> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimizations in the
> checkpoint process, including:
>    1) separate ram and device save/load process to reduce size of extra memory
>       used during checkpoint
>    2) live migrate part of dirty pages to slave during sleep time.
> Besides, we added some statistics in colo-v1.2-developing, which you can
> retrieve with the command 'info migrate'.
> 
> You can test any of the above branches;
> for how to test COLO, please refer to the following link:
> http://wiki.qemu.org/Features/COLO.

Thanks, I'll try these out over the next few days.
It would be good to use some tags on the colo-proxy git;
it would be a bit confusing for people trying to use your previous
qemu patches if they fetched today's colo-proxy tree.

Dave

> 
> COLO is still in early stage, 
> your comments and feedback are warmly welcomed.
> 
> Cc: netfilter-devel@vger.kernel.org
> 
> TODO:
> 1. Strengthen failover
> 2. COLO function switch on/off
> 2. Optimize proxy part, include proxy script.
>   1) Remove the limitation of forward network link.
>   2) Reuse the nfqueue_entry and NF_STOLEN to enqueue skb
> 3. The capability of continuous FT
> 
> v5:
> - Replace the previous communication way between proxy and qemu with nfnetlink
> - Remove the 'forward device'parameter of xt_PMYCOLO, and now we use iptables command
> to set the 'forward device'
> - Turn DPRINTF into trace_ calls as Dave's suggestion
> 
> v4:
> - New block replication scheme (use image-fleecing for the secondary side)
> - Address some comments from Eric Blake and Dave
> - Add command colo-set-checkpoint-period to set the time of periodic checkpoint
> - Add a delay (100ms) between continuous checkpoint requests to ensure VM
>   run 100ms at least since last pause.
> v3:
> - use proxy instead of colo agent to compare network packets
> - add block replication
> - Optimize failover disposal
> - handle shutdown
> 
> v2:
> - use QEMUSizedBuffer/QEMUFile as COLO buffer
> - colo support is enabled by default
> - add nic replication support
> - addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> - implement the frame of colo
> 
> Wen Congyang (1):
>   COLO: Add block replication into colo process
> 
> zhanghailiang (28):
>   configure: Add parameter for configure to enable/disable COLO support
>   migration: Introduce capability 'colo' to migration
>   COLO: migrate colo related info to slave
>   migration: Integrate COLO checkpoint process into migration
>   migration: Integrate COLO checkpoint process into loadvm
>   COLO: Implement colo checkpoint protocol
>   COLO: Add a new RunState RUN_STATE_COLO
>   QEMUSizedBuffer: Introduce two help functions for qsb
>   COLO: Save VM state to slave when do checkpoint
>   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
>   COLO VMstate: Load VM state into qsb before restore it
>   arch_init: Start to trace dirty pages of SVM
>   COLO RAM: Flush cached RAM into SVM's memory
>   COLO failover: Introduce a new command to trigger a failover
>   COLO failover: Implement COLO master/slave failover work
>   COLO failover: Don't do failover during loading VM's state
>   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
>   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
>   COLO NIC: Implement colo nic device interface configure()
>   COLO NIC : Implement colo nic init/destroy function
>   COLO NIC: Some init work related with proxy module
>   COLO: Handle nfnetlink message from proxy module
>   COLO: Do checkpoint according to the result of packets comparation
>   COLO: Improve checkpoint efficiency by do additional periodic
>     checkpoint
>   COLO: Add colo-set-checkpoint-period command
>   COLO NIC: Implement NIC checkpoint and failover
>   COLO: Disable qdev hotplug when VM is in COLO mode
>   COLO: Implement shutdown checkpoint
> 
>  arch_init.c                            | 243 +++++++++-
>  configure                              |  36 +-
>  hmp-commands.hx                        |  30 ++
>  hmp.c                                  |  14 +
>  hmp.h                                  |   2 +
>  include/exec/cpu-all.h                 |   1 +
>  include/migration/migration-colo.h     |  57 +++
>  include/migration/migration-failover.h |  22 +
>  include/migration/migration.h          |   3 +
>  include/migration/qemu-file.h          |   3 +-
>  include/net/colo-nic.h                 |  27 ++
>  include/net/net.h                      |   3 +
>  include/sysemu/sysemu.h                |   3 +
>  migration/Makefile.objs                |   2 +
>  migration/colo-comm.c                  |  68 +++
>  migration/colo-failover.c              |  48 ++
>  migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
>  migration/migration.c                  |  60 ++-
>  migration/qemu-file-buf.c              |  58 +++
>  net/Makefile.objs                      |   1 +
>  net/colo-nic.c                         | 420 +++++++++++++++++
>  net/tap.c                              |  45 +-
>  qapi-schema.json                       |  42 +-
>  qemu-options.hx                        |  10 +-
>  qmp-commands.hx                        |  41 ++
>  savevm.c                               |   2 +-
>  scripts/colo-proxy-script.sh           |  88 ++++
>  stubs/Makefile.objs                    |   1 +
>  stubs/migration-colo.c                 |  58 +++
>  trace-events                           |  11 +
>  vl.c                                   |  39 +-
>  31 files changed, 2235 insertions(+), 39 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 include/migration/migration-failover.h
>  create mode 100644 include/net/colo-nic.h
>  create mode 100644 migration/colo-comm.c
>  create mode 100644 migration/colo-failover.c
>  create mode 100644 migration/colo.c
>  create mode 100644 net/colo-nic.c
>  create mode 100755 scripts/colo-proxy-script.sh
>  create mode 100644 stubs/migration-colo.c
> 
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
@ 2015-05-21 11:30   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-21 11:30 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, netfilter-devel, amit.shah, david

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> This is the 5th version of COLO. This series contains only the COLO frame part, including VM checkpoint,
> failover, the proxy API, and the block replication API; it does not include block replication itself.
> The block part has been sent by wencongyang:
> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> 
> We have finished some new features and optimizations on COLO (as a development branch on GitHub),
> but for ease of review it is better to keep it simple for now, so we will not add too much new
> code into this frame patch set before it has been fully reviewed.
> 
> You can get the latest integrated qemu colo patches from github (including the block part):
> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> 
> Please NOTE the difference between these two branches.
> colo-v1.2-basic is exactly the same as this patch series and has the basic features of COLO.
> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimizations in the 
> process of checkpointing, including: 
>    1) separate ram and device save/load processes to reduce the amount of extra memory
>       used during a checkpoint
>    2) live migrate part of the dirty pages to the slave during sleep time.
> Besides, we have added some statistics in colo-v1.2-developing, which you can view
> by using the command 'info migrate'.
> 
> You can test any of the above branches;
> for how to test COLO, please refer to the following link:
> http://wiki.qemu.org/Features/COLO.

Thanks, I'll try these out over the next few days.
It would be good to use some tags on the colo-proxy git;
it would be a bit confusing for people trying to use your previous
qemu patches if they fetched today's colo-proxy tree.

Dave

> 
> COLO is still in early stage, 
> your comments and feedback are warmly welcomed.
> 
> Cc: netfilter-devel@vger.kernel.org
> 
> TODO:
> 1. Strengthen failover
> 2. COLO function switch on/off
> 3. Optimize the proxy part, including the proxy script.
>   1) Remove the limitation on the forward network link.
>   2) Reuse the nfqueue_entry and NF_STOLEN to enqueue skbs
> 4. The capability of continuous FT
> 
> v5:
> - Replace the previous communication channel between proxy and qemu with nfnetlink
> - Remove the 'forward device' parameter of xt_PMYCOLO; the 'forward device' is now set
> with an iptables command
> - Turn DPRINTF into trace_ calls, per Dave's suggestion
> 
> v4:
> - New block replication scheme (use image-fleecing for the secondary side)
> - Address some comments from Eric Blake and Dave
> - Add command colo-set-checkpoint-period to set the period of periodic checkpoints
> - Add a delay (100ms) between continuous checkpoint requests to ensure the VM
>   runs for at least 100ms since the last pause.
> v3:
> - use proxy instead of colo agent to compare network packets
> - add block replication
> - Optimize failover disposal
> - handle shutdown
> 
> v2:
> - use QEMUSizedBuffer/QEMUFile as COLO buffer
> - colo support is enabled by default
> - add nic replication support
> - addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> - implement the frame of colo
> 
> Wen Congyang (1):
>   COLO: Add block replication into colo process
> 
> zhanghailiang (28):
>   configure: Add parameter for configure to enable/disable COLO support
>   migration: Introduce capability 'colo' to migration
>   COLO: migrate colo related info to slave
>   migration: Integrate COLO checkpoint process into migration
>   migration: Integrate COLO checkpoint process into loadvm
>   COLO: Implement colo checkpoint protocol
>   COLO: Add a new RunState RUN_STATE_COLO
>   QEMUSizedBuffer: Introduce two help functions for qsb
>   COLO: Save VM state to slave when do checkpoint
>   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
>   COLO VMstate: Load VM state into qsb before restore it
>   arch_init: Start to trace dirty pages of SVM
>   COLO RAM: Flush cached RAM into SVM's memory
>   COLO failover: Introduce a new command to trigger a failover
>   COLO failover: Implement COLO master/slave failover work
>   COLO failover: Don't do failover during loading VM's state
>   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
>   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
>   COLO NIC: Implement colo nic device interface configure()
>   COLO NIC : Implement colo nic init/destroy function
>   COLO NIC: Some init work related with proxy module
>   COLO: Handle nfnetlink message from proxy module
>   COLO: Do checkpoint according to the result of packets comparation
>   COLO: Improve checkpoint efficiency by do additional periodic
>     checkpoint
>   COLO: Add colo-set-checkpoint-period command
>   COLO NIC: Implement NIC checkpoint and failover
>   COLO: Disable qdev hotplug when VM is in COLO mode
>   COLO: Implement shutdown checkpoint
> 
>  arch_init.c                            | 243 +++++++++-
>  configure                              |  36 +-
>  hmp-commands.hx                        |  30 ++
>  hmp.c                                  |  14 +
>  hmp.h                                  |   2 +
>  include/exec/cpu-all.h                 |   1 +
>  include/migration/migration-colo.h     |  57 +++
>  include/migration/migration-failover.h |  22 +
>  include/migration/migration.h          |   3 +
>  include/migration/qemu-file.h          |   3 +-
>  include/net/colo-nic.h                 |  27 ++
>  include/net/net.h                      |   3 +
>  include/sysemu/sysemu.h                |   3 +
>  migration/Makefile.objs                |   2 +
>  migration/colo-comm.c                  |  68 +++
>  migration/colo-failover.c              |  48 ++
>  migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
>  migration/migration.c                  |  60 ++-
>  migration/qemu-file-buf.c              |  58 +++
>  net/Makefile.objs                      |   1 +
>  net/colo-nic.c                         | 420 +++++++++++++++++
>  net/tap.c                              |  45 +-
>  qapi-schema.json                       |  42 +-
>  qemu-options.hx                        |  10 +-
>  qmp-commands.hx                        |  41 ++
>  savevm.c                               |   2 +-
>  scripts/colo-proxy-script.sh           |  88 ++++
>  stubs/Makefile.objs                    |   1 +
>  stubs/migration-colo.c                 |  58 +++
>  trace-events                           |  11 +
>  vl.c                                   |  39 +-
>  31 files changed, 2235 insertions(+), 39 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 include/migration/migration-failover.h
>  create mode 100644 include/net/colo-nic.h
>  create mode 100644 migration/colo-comm.c
>  create mode 100644 migration/colo-failover.c
>  create mode 100644 migration/colo.c
>  create mode 100644 net/colo-nic.c
>  create mode 100755 scripts/colo-proxy-script.sh
>  create mode 100644 stubs/migration-colo.c
> 
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-21 11:30   ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2015-05-22  6:26     ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-22  6:26 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: peter.huangpeng, qemu-devel, quintela, amit.shah, eblake,
	berrange, eddie.dong, yunhong.jiang, wency, lizhijian,
	arei.gonglei, david, netfilter-devel

On 2015/5/21 19:30, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> This is the 5th version of COLO. This series contains only the COLO frame part, including VM checkpoint,
>> failover, the proxy API, and the block replication API; it does not include block replication itself.
>> The block part has been sent by wencongyang:
>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
>>
>> We have finished some new features and optimizations on COLO (as a development branch on GitHub),
>> but for ease of review it is better to keep it simple for now, so we will not add too much new
>> code into this frame patch set before it has been fully reviewed.
>>
>> You can get the latest integrated qemu colo patches from github (including the block part):
>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
>>
>> Please NOTE the difference between these two branches.
>> colo-v1.2-basic is exactly the same as this patch series and has the basic features of COLO.
>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimizations in the
>> process of checkpointing, including:
>>     1) separate ram and device save/load processes to reduce the amount of extra memory
>>        used during a checkpoint
>>     2) live migrate part of the dirty pages to the slave during sleep time.
>> Besides, we have added some statistics in colo-v1.2-developing, which you can view
>> by using the command 'info migrate'.
>>
>> You can test any of the above branches;
>> for how to test COLO, please refer to the following link:
>> http://wiki.qemu.org/Features/COLO.
>
> Thanks, I'll try these out over the next few days.
> It would be good to use some tags on the colo-proxy git;
> it would be a bit confusing for people trying to use your previous
> qemu patches if they fetched today's colo-proxy tree.
>

Hi Dave,

I have put tags on the colo-proxy on GitHub; thanks for your suggestion.
And one more thing: to test this new version, you have to rebuild your kernel and iptables,
since there are some modifications :)

Thanks,
zhanghailiang

>
>>
>> COLO is still in early stage,
>> your comments and feedback are warmly welcomed.
>>
>> Cc: netfilter-devel@vger.kernel.org
>>
>> TODO:
>> 1. Strengthen failover
>> 2. COLO function switch on/off
>> 3. Optimize the proxy part, including the proxy script.
>>    1) Remove the limitation on the forward network link.
>>    2) Reuse the nfqueue_entry and NF_STOLEN to enqueue skbs
>> 4. The capability of continuous FT
>>
>> v5:
>> - Replace the previous communication channel between proxy and qemu with nfnetlink
>> - Remove the 'forward device' parameter of xt_PMYCOLO; the 'forward device' is now set
>> with an iptables command
>> - Turn DPRINTF into trace_ calls, per Dave's suggestion
>>
>> v4:
>> - New block replication scheme (use image-fleecing for the secondary side)
>> - Address some comments from Eric Blake and Dave
>> - Add command colo-set-checkpoint-period to set the period of periodic checkpoints
>> - Add a delay (100ms) between continuous checkpoint requests to ensure the VM
>>    runs for at least 100ms since the last pause.
>> v3:
>> - use proxy instead of colo agent to compare network packets
>> - add block replication
>> - Optimize failover disposal
>> - handle shutdown
>>
>> v2:
>> - use QEMUSizedBuffer/QEMUFile as COLO buffer
>> - colo support is enabled by default
>> - add nic replication support
>> - addressed comments from Eric Blake and Dr. David Alan Gilbert
>>
>> v1:
>> - implement the frame of colo
>>
>> Wen Congyang (1):
>>    COLO: Add block replication into colo process
>>
>> zhanghailiang (28):
>>    configure: Add parameter for configure to enable/disable COLO support
>>    migration: Introduce capability 'colo' to migration
>>    COLO: migrate colo related info to slave
>>    migration: Integrate COLO checkpoint process into migration
>>    migration: Integrate COLO checkpoint process into loadvm
>>    COLO: Implement colo checkpoint protocol
>>    COLO: Add a new RunState RUN_STATE_COLO
>>    QEMUSizedBuffer: Introduce two help functions for qsb
>>    COLO: Save VM state to slave when do checkpoint
>>    COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
>>    COLO VMstate: Load VM state into qsb before restore it
>>    arch_init: Start to trace dirty pages of SVM
>>    COLO RAM: Flush cached RAM into SVM's memory
>>    COLO failover: Introduce a new command to trigger a failover
>>    COLO failover: Implement COLO master/slave failover work
>>    COLO failover: Don't do failover during loading VM's state
>>    COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
>>    COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
>>    COLO NIC: Implement colo nic device interface configure()
>>    COLO NIC : Implement colo nic init/destroy function
>>    COLO NIC: Some init work related with proxy module
>>    COLO: Handle nfnetlink message from proxy module
>>    COLO: Do checkpoint according to the result of packets comparation
>>    COLO: Improve checkpoint efficiency by do additional periodic
>>      checkpoint
>>    COLO: Add colo-set-checkpoint-period command
>>    COLO NIC: Implement NIC checkpoint and failover
>>    COLO: Disable qdev hotplug when VM is in COLO mode
>>    COLO: Implement shutdown checkpoint
>>
>>   arch_init.c                            | 243 +++++++++-
>>   configure                              |  36 +-
>>   hmp-commands.hx                        |  30 ++
>>   hmp.c                                  |  14 +
>>   hmp.h                                  |   2 +
>>   include/exec/cpu-all.h                 |   1 +
>>   include/migration/migration-colo.h     |  57 +++
>>   include/migration/migration-failover.h |  22 +
>>   include/migration/migration.h          |   3 +
>>   include/migration/qemu-file.h          |   3 +-
>>   include/net/colo-nic.h                 |  27 ++
>>   include/net/net.h                      |   3 +
>>   include/sysemu/sysemu.h                |   3 +
>>   migration/Makefile.objs                |   2 +
>>   migration/colo-comm.c                  |  68 +++
>>   migration/colo-failover.c              |  48 ++
>>   migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
>>   migration/migration.c                  |  60 ++-
>>   migration/qemu-file-buf.c              |  58 +++
>>   net/Makefile.objs                      |   1 +
>>   net/colo-nic.c                         | 420 +++++++++++++++++
>>   net/tap.c                              |  45 +-
>>   qapi-schema.json                       |  42 +-
>>   qemu-options.hx                        |  10 +-
>>   qmp-commands.hx                        |  41 ++
>>   savevm.c                               |   2 +-
>>   scripts/colo-proxy-script.sh           |  88 ++++
>>   stubs/Makefile.objs                    |   1 +
>>   stubs/migration-colo.c                 |  58 +++
>>   trace-events                           |  11 +
>>   vl.c                                   |  39 +-
>>   31 files changed, 2235 insertions(+), 39 deletions(-)
>>   create mode 100644 include/migration/migration-colo.h
>>   create mode 100644 include/migration/migration-failover.h
>>   create mode 100644 include/net/colo-nic.h
>>   create mode 100644 migration/colo-comm.c
>>   create mode 100644 migration/colo-failover.c
>>   create mode 100644 migration/colo.c
>>   create mode 100644 net/colo-nic.c
>>   create mode 100755 scripts/colo-proxy-script.sh
>>   create mode 100644 stubs/migration-colo.c
>>
>> --
>> 1.7.12.4
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>



^ permalink raw reply	[flat|nested] 62+ messages in thread


* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
@ 2015-05-28 16:24   ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-28 16:24 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, quintela, amit.shah, eblake, berrange,
	peter.huangpeng, eddie.dong, yunhong.jiang, wency, lizhijian,
	arei.gonglei, david, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 9643 bytes --]

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> This is the 5th version of COLO. This series contains only the COLO frame part, including VM checkpoint,
> failover, the proxy API, and the block replication API; it does not include block replication itself.
> The block part has been sent by wencongyang:
> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> 
> We have finished some new features and optimizations on COLO (as a development branch on GitHub),
> but for ease of review it is better to keep it simple for now, so we will not add too much new
> code into this frame patch set before it has been fully reviewed.
> 
> You can get the latest integrated qemu colo patches from github (including the block part):
> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> 
> Please NOTE the difference between these two branches.
> colo-v1.2-basic is exactly the same as this patch series and has the basic features of COLO.
> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimizations in the 
> process of checkpointing, including: 
>    1) separate ram and device save/load processes to reduce the amount of extra memory
>       used during a checkpoint
>    2) live migrate part of the dirty pages to the slave during sleep time.
> Besides, we have added some statistics in colo-v1.2-developing, which you can view
> by using the command 'info migrate'.


Hi,
  I have that running now.

Some notes:
  1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
  2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
     they're very minor changes and I don't think related to (1).
  3) I've also included some minor fixups I needed to get the -developing world
     to build;  my compiler is fussy about unused variables etc - but I think the code
     in ram_save_complete in your -developing patch is wrong because there are two
     'pages' variables and the one in the inner loop is the only one changed.
  4) I've started trying simple benchmarks and tests now:
    a) With a simple web server most requests have very little overhead, and the comparison
       matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
       correspond to when a checkpoint happens, but I'm not sure why the spike is so big,
       since the downtime isn't that big.
    b) I tried something with more dynamic pages - the front page of a simple bugzilla
       install;  it failed the comparison every time; it took me a while to figure out
       why, but it generates a unique token in its javascript each time (for a password reset
       link), and I guess the randomness used by that doesn't match on the two hosts.
       It surprised me, because I didn't expect this page to have much randomness
       in it.

  4a is really nice - it shows the benefit of COLO over the simple checkpointing;
checkpoints happen very rarely.

The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
after the qemu quits; the backtrace of the qemu stack is:

[<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
[<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
[<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
[<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
[<ffffffff81090c96>] notifier_call_chain+0x66/0x90
[<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
[<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
[<ffffffff815878bf>] sock_release+0x1f/0x90
[<ffffffff81587942>] sock_close+0x12/0x20
[<ffffffff812193c3>] __fput+0xd3/0x210
[<ffffffff8121954e>] ____fput+0xe/0x10
[<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
[<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
[<ffffffff81722b66>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff

that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and 
older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
I'm not sure of the right fix; perhaps it would be possible to replace the
synchronize_rcu in colo_node_release with a call_rcu that does the kfree later?
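[Editor's note: a minimal sketch of that suggestion, assuming the node object can embed an rcu_head. The structure layout and field names below are guesses for illustration, not the actual colo-proxy code.]

```c
/* Sketch: convert a blocking synchronize_rcu()+kfree() release into an
 * asynchronous call_rcu(), so the notifier path never sleeps waiting for
 * a grace period.  Fields other than the rcu_head are hypothetical. */
struct colo_node {
	u32 vm_pid;
	struct rcu_head rcu;	/* added so call_rcu() has a handle to defer on */
	/* ... */
};

static void colo_node_free_rcu(struct rcu_head *head)
{
	struct colo_node *node = container_of(head, struct colo_node, rcu);

	kfree(node);
}

static void colo_node_release(struct colo_node *node)
{
	/* Instead of:
	 *	synchronize_rcu();	// blocks; the stall seen in colonl_close_event
	 *	kfree(node);
	 * defer the free until after a grace period, without sleeping: */
	call_rcu(&node->rcu, colo_node_free_rcu);
}
```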

Thanks,

Dave

> 
> You can test any branch of the above, 
> about how to test COLO, Please reference to the follow link.
> http://wiki.qemu.org/Features/COLO.
> 
> COLO is still in early stage, 
> your comments and feedback are warmly welcomed.
> 
> Cc: netfilter-devel@vger.kernel.org
> 
> TODO:
> 1. Strengthen failover
> 2. COLO function switch on/off
> 2. Optimize proxy part, include proxy script.
>   1) Remove the limitation of forward network link.
>   2) Reuse the nfqueue_entry and NF_STOLEN to enqueue skb
> 3. The capability of continuous FT
> 
> v5:
> - Replace the previous communication way between proxy and qemu with nfnetlink
> - Remove the 'forward device'parameter of xt_PMYCOLO, and now we use iptables command
> to set the 'forward device'
> - Turn DPRINTF into trace_ calls as Dave's suggestion
> 
> v4:
> - New block replication scheme (use image-fleecing for sencondary side)
> - Adress some comments from Eric Blake and Dave
> - Add commmand colo-set-checkpoint-period to set the time of periodic checkpoint
> - Add a delay (100ms) between continuous checkpoint requests to ensure VM
>   run 100ms at least since last pause.
> v3:
> - use proxy instead of colo agent to compare network packets
> - add block replication
> - Optimize failover disposal
> - handle shutdown
> 
> v2:
> - use QEMUSizedBuffer/QEMUFile as COLO buffer
> - colo support is enabled by default
> - add nic replication support
> - addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> - implement the frame of colo
> 
> Wen Congyang (1):
>   COLO: Add block replication into colo process
> 
> zhanghailiang (28):
>   configure: Add parameter for configure to enable/disable COLO support
>   migration: Introduce capability 'colo' to migration
>   COLO: migrate colo related info to slave
>   migration: Integrate COLO checkpoint process into migration
>   migration: Integrate COLO checkpoint process into loadvm
>   COLO: Implement colo checkpoint protocol
>   COLO: Add a new RunState RUN_STATE_COLO
>   QEMUSizedBuffer: Introduce two help functions for qsb
>   COLO: Save VM state to slave when do checkpoint
>   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
>   COLO VMstate: Load VM state into qsb before restore it
>   arch_init: Start to trace dirty pages of SVM
>   COLO RAM: Flush cached RAM into SVM's memory
>   COLO failover: Introduce a new command to trigger a failover
>   COLO failover: Implement COLO master/slave failover work
>   COLO failover: Don't do failover during loading VM's state
>   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
>   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
>   COLO NIC: Implement colo nic device interface configure()
>   COLO NIC : Implement colo nic init/destroy function
>   COLO NIC: Some init work related with proxy module
>   COLO: Handle nfnetlink message from proxy module
>   COLO: Do checkpoint according to the result of packets comparation
>   COLO: Improve checkpoint efficiency by do additional periodic
>     checkpoint
>   COLO: Add colo-set-checkpoint-period command
>   COLO NIC: Implement NIC checkpoint and failover
>   COLO: Disable qdev hotplug when VM is in COLO mode
>   COLO: Implement shutdown checkpoint
> 
>  arch_init.c                            | 243 +++++++++-
>  configure                              |  36 +-
>  hmp-commands.hx                        |  30 ++
>  hmp.c                                  |  14 +
>  hmp.h                                  |   2 +
>  include/exec/cpu-all.h                 |   1 +
>  include/migration/migration-colo.h     |  57 +++
>  include/migration/migration-failover.h |  22 +
>  include/migration/migration.h          |   3 +
>  include/migration/qemu-file.h          |   3 +-
>  include/net/colo-nic.h                 |  27 ++
>  include/net/net.h                      |   3 +
>  include/sysemu/sysemu.h                |   3 +
>  migration/Makefile.objs                |   2 +
>  migration/colo-comm.c                  |  68 +++
>  migration/colo-failover.c              |  48 ++
>  migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
>  migration/migration.c                  |  60 ++-
>  migration/qemu-file-buf.c              |  58 +++
>  net/Makefile.objs                      |   1 +
>  net/colo-nic.c                         | 420 +++++++++++++++++
>  net/tap.c                              |  45 +-
>  qapi-schema.json                       |  42 +-
>  qemu-options.hx                        |  10 +-
>  qmp-commands.hx                        |  41 ++
>  savevm.c                               |   2 +-
>  scripts/colo-proxy-script.sh           |  88 ++++
>  stubs/Makefile.objs                    |   1 +
>  stubs/migration-colo.c                 |  58 +++
>  trace-events                           |  11 +
>  vl.c                                   |  39 +-
>  31 files changed, 2235 insertions(+), 39 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 include/migration/migration-failover.h
>  create mode 100644 include/net/colo-nic.h
>  create mode 100644 migration/colo-comm.c
>  create mode 100644 migration/colo-failover.c
>  create mode 100644 migration/colo.c
>  create mode 100644 net/colo-nic.c
>  create mode 100755 scripts/colo-proxy-script.sh
>  create mode 100644 stubs/migration-colo.c
> 
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

[-- Attachment #2: colo-proxy-4.1build.diff --]
[-- Type: text/plain, Size: 2380 bytes --]

commit 06f74102be1aa0e5c4a8ca523ac23dad3aa3e282
Author: Dr. David Alan Gilbert (git/414) <dgilbert@redhat.com>
Date:   Wed May 27 14:53:55 2015 -0400

    Hacks to build with 4.1
    
    Changes needed due to:
    238e54c9   David S. Miller      2015-04-03   Make nf_hookfn use nf_hook_state.
    1d1de89b   David S. Miller      2015-04-03   Use nf_hook_state in nf_queue_entry.
    
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

diff --git a/xt_PMYCOLO.c b/xt_PMYCOLO.c
index f5d7cda..626e170 100644
--- a/xt_PMYCOLO.c
+++ b/xt_PMYCOLO.c
@@ -829,7 +829,7 @@ static int colo_enqueue_packet(struct nf_queue_entry *entry, unsigned int ptr)
 		pr_dbg("master: gso again???!!!\n");
 	}
 
-	if (entry->hook != NF_INET_PRE_ROUTING) {
+	if (entry->state.hook != NF_INET_PRE_ROUTING) {
 		pr_dbg("packet is not on pre routing chain\n");
 		return -1;
 	}
@@ -839,7 +839,7 @@ static int colo_enqueue_packet(struct nf_queue_entry *entry, unsigned int ptr)
 		pr_dbg("%s: Could not find node: %d\n",__func__, conn->vm_pid);
 		return -1;
 	}
-	switch (entry->pf) {
+	switch (entry->state.pf) {
 	case NFPROTO_IPV4:
 		skb->protocol = htons(ETH_P_IP);
 		break;
@@ -1133,8 +1133,7 @@ out:
 
 static unsigned int
 colo_slaver_queue_hook(const struct nf_hook_ops *ops, struct sk_buff *skb,
-		       const struct net_device *in, const struct net_device *out,
-		       int (*okfn)(struct sk_buff *))
+                       const struct nf_hook_state *state)
 {
 	struct nf_conn *ct;
 	struct nf_conn_colo *conn;
@@ -1193,8 +1192,7 @@ out_unlock:
 
 static unsigned int
 colo_slaver_arp_hook(const struct nf_hook_ops *ops, struct sk_buff *skb,
-		     const struct net_device *in, const struct net_device *out,
-		     int (*okfn)(struct sk_buff *))
+                     const struct nf_hook_state *state)
 {
 	unsigned int ret = NF_ACCEPT;
 	const struct arphdr *arp;
diff --git a/xt_SECCOLO.c b/xt_SECCOLO.c
index fe8b4da..8bdef15 100644
--- a/xt_SECCOLO.c
+++ b/xt_SECCOLO.c
@@ -28,8 +28,7 @@ MODULE_DESCRIPTION("Xtables: secondary proxy module for colo.");
 
 static unsigned int
 colo_secondary_hook(const struct nf_hook_ops *ops, struct sk_buff *skb,
-		    const struct net_device *in, const struct net_device *out,
-		    int (*okfn)(struct sk_buff *))
+		    const struct nf_hook_state *hook_state)
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn_colo *conn;

[-- Attachment #3: colo-may-build.patches --]
[-- Type: text/plain, Size: 1114 bytes --]

diff --git a/arch_init.c b/arch_init.c
index b7ce63a..564d87c 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -1359,7 +1359,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 /* Called with iothread lock */
 static int ram_save_complete(QEMUFile *f, void *opaque)
 {
-    int pages;
+    int pages = 0;
 
     rcu_read_lock();
 
diff --git a/migration/colo.c b/migration/colo.c
index dd6aef1..1ce9793 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -725,7 +725,7 @@ void *colo_process_incoming_checkpoints(void *opaque)
     int ret;
     uint64_t total_size;
     Error *local_err = NULL;
-    static int init_once;
+    //static int init_once;
 
     qdev_hotplug = 0;
 
diff --git a/savevm.c b/savevm.c
index 0c45387..873169d 100644
--- a/savevm.c
+++ b/savevm.c
@@ -877,7 +877,7 @@ int qemu_save_ram_state(QEMUFile *f, bool complete)
     SaveStateEntry *se;
     int section = complete ? QEMU_VM_SECTION_END : QEMU_VM_SECTION_PART;
     int (*save_state)(QEMUFile *f, void *opaque);
-    int ret;
+    int ret = 0;
 
     QTAILQ_FOREACH(se, &savevm_handlers, entry) {
         if (!se->ops) {

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-28 16:24   ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2015-05-29  1:29     ` Wen Congyang
  -1 siblings, 0 replies; 62+ messages in thread
From: Wen Congyang @ 2015-05-29  1:29 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, zhanghailiang
  Cc: qemu-devel, quintela, amit.shah, eblake, berrange,
	peter.huangpeng, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
>> failover, proxy API, block replication API, not include block replication.
>> The block part has been sent by wencongyang:
>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
>>
>> we have finished some new features and optimization on COLO (As a development branch in github),
>> but for easy of review, it is better to keep it simple now, so we will not add too much new 
>> codes into this frame patch set before it been totally reviewed. 
>>
>> You can get the latest integrated qemu colo patches from github (Include Block part):
>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
>>
>> Please NOTE the difference between these two branch.
>> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the 
>> process of checkpoint, including: 
>>    1) separate ram and device save/load process to reduce size of extra memory
>>       used during checkpoint
>>    2) live migrate part of dirty pages to slave during sleep time.
>> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
>> info by using command 'info migrate'.
> 
> 
> Hi,
>   I have that running now.
> 
> Some notes:
>   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
>   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
>      they're very minor changes and I don't think related to (1).
>   3) I've also included some minor fixups I needed to get the -developing world
>      to build;  my compiler is fussy about unused variables etc - but I think the code
>      in ram_save_complete in your -developing patch is wrong because there are two
>      'pages' variables and the one in the inner loop is the only one changed.
>   4) I've started trying simple benchmarks and tests now:
>     a) With a simple web server most requests have very little overhead, the comparison
>        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
>        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
>        since the downtime isn't that big.
>     b) I tried something with more dynamic pages - the front page of a simple bugzilla
>        install;  it failed the comparison every time; it took me a while to figure out
>        why, but it generates a unique token in its javascript each time (for a password reset
>        link), and I guess the randomness used by that doesn't match on the two hosts.
>        It surprised me, because I didn't expect this page to have much randomness
>        in.
> 
>   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> checkpoints happen very rarely.
> 
> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> after the qemu quits; the backtrace of the qemu stack is:

How do you reproduce it? By using the monitor command 'quit' to exit qemu, or by killing the qemu process?

> 
> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> [<ffffffff815878bf>] sock_release+0x1f/0x90
> [<ffffffff81587942>] sock_close+0x12/0x20
> [<ffffffff812193c3>] __fput+0xd3/0x210
> [<ffffffff8121954e>] ____fput+0xe/0x10
> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> [<ffffffff81722b66>] int_signal+0x12/0x17
> [<ffffffffffffffff>] 0xffffffffffffffff

Thanks for your test. The backtrace is very useful, and we will fix it soon.

> 
> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and 
> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> I'm not sure of the right fix; perhaps it might be possible to replace the 
> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?

I agree with it.
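
For reference, the suggested change could be sketched roughly as below. This is only an illustration of the call_rcu() pattern, not the actual colo-proxy code: the struct layout and the names colo_node, rcu and colo_node_free_rcu are assumptions, and the real colo_node_release() may do more than unlink and free.

    /* Sketch: defer the kfree to an RCU callback instead of blocking in
     * synchronize_rcu().  Assumes the node structure can carry an
     * rcu_head; all names here are illustrative. */
    struct colo_node {
            /* ... existing fields ... */
            struct rcu_head rcu;    /* added so call_rcu() can be used */
    };

    static void colo_node_free_rcu(struct rcu_head *head)
    {
            struct colo_node *node = container_of(head, struct colo_node, rcu);

            kfree(node);
    }

    static void colo_node_release(struct colo_node *node)
    {
            /* Unlink the node from any RCU-protected list first, then
             * let RCU reclaim it after the grace period; nothing blocks
             * in the notifier path any more. */
            call_rcu(&node->rcu, colo_node_free_rcu);
    }

Since call_rcu() returns immediately, the netlink release notifier would no longer wait for a grace period, which should avoid the stall shown in the backtrace.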

Thanks
Wen Congyang

> 
> Thanks,
> 
> Dave
> 
>>



* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-29  1:29     ` [Qemu-devel] " Wen Congyang
@ 2015-05-29  8:01       ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-29  8:01 UTC (permalink / raw)
  To: Wen Congyang
  Cc: zhanghailiang, qemu-devel, quintela, amit.shah, eblake, berrange,
	peter.huangpeng, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
> >> failover, proxy API, block replication API, not include block replication.
> >> The block part has been sent by wencongyang:
> >> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> >>
> >> we have finished some new features and optimization on COLO (As a development branch in github),
> >> but for easy of review, it is better to keep it simple now, so we will not add too much new 
> >> codes into this frame patch set before it been totally reviewed. 
> >>
> >> You can get the latest integrated qemu colo patches from github (Include Block part):
> >> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> >> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> >>
> >> Please NOTE the difference between these two branch.
> >> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
> >> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the 
> >> process of checkpoint, including: 
> >>    1) separate ram and device save/load process to reduce size of extra memory
> >>       used during checkpoint
> >>    2) live migrate part of dirty pages to slave during sleep time.
> >> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
> >> info by using command 'info migrate'.
> > 
> > 
> > Hi,
> >   I have that running now.
> > 
> > Some notes:
> >   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
> >   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
> >      they're very minor changes and I don't think related to (1).
> >   3) I've also included some minor fixups I needed to get the -developing world
> >      to build;  my compiler is fussy about unused variables etc - but I think the code
> >      in ram_save_complete in your -developing patch is wrong because there are two
> >      'pages' variables and the one in the inner loop is the only one changed.
> >   4) I've started trying simple benchmarks and tests now:
> >     a) With a simple web server most requests have very little overhead, the comparison
> >        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
> >        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
> >        since the downtime isn't that big.
> >     b) I tried something with more dynamic pages - the front page of a simple bugzilla
> >        install;  it failed the comparison every time; it took me a while to figure out
> >        why, but it generates a unique token in its javascript each time (for a password reset
> >        link), and I guess the randomness used by that doesn't match on the two hosts.
> >        It surprised me, because I didn't expect this page to have much randomness
> >        in.
> > 
> >   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> > checkpoints happen very rarely.
> > 
> > The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> > after the qemu quits; the backtrace of the qemu stack is:
> 
> How to reproduce it? Use monitor command quit to quit qemu? Or kill the qemu?

I've seen two ways:
   1) Shutdown the guest - when the guest exits and qemu exits, then I see this problem
   2) If there is a problem with the colo-proxy-script (I got the path wrong) so qemu
      quit.

> > [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> > [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> > [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> > [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> > [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> > [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> > [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> > [<ffffffff815878bf>] sock_release+0x1f/0x90
> > [<ffffffff81587942>] sock_close+0x12/0x20
> > [<ffffffff812193c3>] __fput+0xd3/0x210
> > [<ffffffff8121954e>] ____fput+0xe/0x10
> > [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> > [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> > [<ffffffff81722b66>] int_signal+0x12/0x17
> > [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Thanks for your test. The backtrace is very useful, and we will fix it soon.

Thank you,

Dave

> > 
> > that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and 
> > older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> > I'm not sure of the right fix; perhaps it might be possible to replace the 
> > synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> 
> I agree with it.
> 
> Thanks
> Wen Congyang
> 
> > 
> > Thanks,
> > 
> > Dave
> > 
> >>
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
@ 2015-05-29  8:01       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-29  8:01 UTC (permalink / raw)
  To: Wen Congyang
  Cc: zhanghailiang, lizhijian, quintela, yunhong.jiang, eddie.dong,
	qemu-devel, peter.huangpeng, arei.gonglei, netfilter-devel,
	amit.shah, david

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
> >> failover, proxy API, block replication API, not include block replication.
> >> The block part has been sent by wencongyang:
> >> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> >>
> >> we have finished some new features and optimization on COLO (As a development branch in github),
> >> but for easy of review, it is better to keep it simple now, so we will not add too much new 
> >> codes into this frame patch set before it been totally reviewed. 
> >>
> >> You can get the latest integrated qemu colo patches from github (Include Block part):
> >> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> >> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> >>
> >> Please NOTE the difference between these two branch.
> >> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
> >> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the 
> >> process of checkpoint, including: 
> >>    1) separate ram and device save/load process to reduce size of extra memory
> >>       used during checkpoint
> >>    2) live migrate part of dirty pages to slave during sleep time.
> >> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
> >> info by using command 'info migrate'.
> > 
> > 
> > Hi,
> >   I have that running now.
> > 
> > Some notes:
> >   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
> >   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
> >      they're very minor changes and I don't think related to (1).
> >   3) I've also included some minor fixups I needed to get the -developing world
> >      to build;  my compiler is fussy about unused variables etc - but I think the code
> >      in ram_save_complete in your -developing patch is wrong because there are two
> >      'pages' variables and the one in the inner loop is the only one changed.
> >   4) I've started trying simple benchmarks and tests now:
> >     a) With a simple web server most requests have very little overhead, the comparison
> >        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
> >        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
> >        since the downtime isn't that big.
> >     b) I tried something with more dynamic pages - the front page of a simple bugzilla
> >        install;  it failed the comparison every time; it took me a while to figure out
> >        why, but it generates a unique token in it's javascript each time (for a password reset
> >        link), and I guess the randomness used by that doesn't match on the two hosts.
> >        It surprised me, because I didn't expect this page to have much randomness
> >        in.
> > 
> >   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> > checkpoints happen very rarely.
> > 
> > The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> > after the qemu quits; the backtrace of the qemu stack is:
> 
> How to reproduce it? Use monitor command quit to quit qemu? Or kill the qemu?

I've seen two ways:
   1) Shutting down the guest: when the guest exits and qemu exits, I see this problem
   2) A problem with the colo-proxy-script (I got the path wrong) that makes qemu
      quit.

> > [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> > [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> > [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> > [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> > [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> > [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> > [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> > [<ffffffff815878bf>] sock_release+0x1f/0x90
> > [<ffffffff81587942>] sock_close+0x12/0x20
> > [<ffffffff812193c3>] __fput+0xd3/0x210
> > [<ffffffff8121954e>] ____fput+0xe/0x10
> > [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> > [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> > [<ffffffff81722b66>] int_signal+0x12/0x17
> > [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Thanks for your test. The backtrace is very useful, and we will fix it soon.

Thank you,

Dave

> > 
> > that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and 
> > older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> > I'm not sure of the right fix; perhaps it might be possible to replace the 
> > synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> 
> I agree with it.
> 
> Thanks
> Wen Congyang
> 
> > 
> > Thanks,
> > 
> > Dave
> > 
> >>
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-29  1:29     ` [Qemu-devel] " Wen Congyang
@ 2015-05-29  8:06       ` zhanghailiang
  -1 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-05-29  8:06 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, netfilter-devel, amit.shah, david

On 2015/5/29 9:29, Wen Congyang wrote:
> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
>>> failover, proxy API, block replication API, not include block replication.
>>> The block part has been sent by wencongyang:
>>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
>>>
>>> we have finished some new features and optimization on COLO (As a development branch in github),
>>> but for easy of review, it is better to keep it simple now, so we will not add too much new
>>> codes into this frame patch set before it been totally reviewed.
>>>
>>> You can get the latest integrated qemu colo patches from github (Include Block part):
>>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
>>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
>>>
>>> Please NOTE the difference between these two branch.
>>> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
>>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the
>>> process of checkpoint, including:
>>>     1) separate ram and device save/load process to reduce size of extra memory
>>>        used during checkpoint
>>>     2) live migrate part of dirty pages to slave during sleep time.
>>> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
>>> info by using command 'info migrate'.
>>
>>
>> Hi,
>>    I have that running now.
>>
>> Some notes:
>>    1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
>>    2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
>>       they're very minor changes and I don't think related to (1).
>>    3) I've also included some minor fixups I needed to get the -developing world
>>       to build;  my compiler is fussy about unused variables etc - but I think the code
>>       in ram_save_complete in your -developing patch is wrong because there are two
>>       'pages' variables and the one in the inner loop is the only one changed.

Oops, I will fix them. Thank you for pointing out this low-grade mistake. :)

>>    4) I've started trying simple benchmarks and tests now:
>>      a) With a simple web server most requests have very little overhead, the comparison
>>         matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
>>         corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
>>         since the downtime isn't that big.

Have you disabled DEBUG for the colo proxy? I turned it on by default; is this related?

>>      b) I tried something with more dynamic pages - the front page of a simple bugzilla
>>         install;  it failed the comparison every time; it took me a while to figure out

Failed comparison? Do you mean the net packets on the two sides are always inconsistent?

>>         why, but it generates a unique token in it's javascript each time (for a password reset
>>         link), and I guess the randomness used by that doesn't match on the two hosts.
>>         It surprised me, because I didn't expect this page to have much randomness
>>         in.
>>
>>    4a is really nice - it shows the benefit of COLO over the simple checkpointing;
>> checkpoints happen very rarely.
>>
>> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
>> after the qemu quits; the backtrace of the qemu stack is:
>
> How to reproduce it? Use monitor command quit to quit qemu? Or kill the qemu?
>
>>
>> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
>> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
>> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
>> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
>> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
>> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
>> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
>> [<ffffffff815878bf>] sock_release+0x1f/0x90
>> [<ffffffff81587942>] sock_close+0x12/0x20
>> [<ffffffff812193c3>] __fput+0xd3/0x210
>> [<ffffffff8121954e>] ____fput+0xe/0x10
>> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
>> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
>> [<ffffffff81722b66>] int_signal+0x12/0x17
>> [<ffffffffffffffff>] 0xffffffffffffffff
>
> Thanks for your test. The backtrace is very useful, and we will fix it soon.
>

Yes, it is a bug: the callback function colonl_close_event() is called while holding
the RCU read lock:
netlink_release
     ->atomic_notifier_call_chain
          ->rcu_read_lock();
          ->notifier_call_chain
             ->ret = nb->notifier_call(nb, val, v);
so it is wrong to call synchronize_rcu() there, because it sleeps.
Besides, there is another function that might sleep: kthread_stop(), which is called
in destroy_notify_cb.

>>
>> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and
>> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
>> I'm not sure of the right fix; perhaps it might be possible to replace the
>> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
>
> I agree with it.

That is a good solution; I will fix both of the above problems.
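[Editorial aside: for what it's worth, the two deferrals could look roughly like the sketch below: call_rcu() to defer the kfree past the grace period, and a workqueue to move kthread_stop() into process context. This only illustrates the pattern; the struct layouts and the names colo_node, notify_cb, and their fields are assumptions, not the actual colo-proxy code.]

```c
/* Sketch only: struct layouts and field names are assumed, not taken
 * from the real colo-proxy source. */
#include <linux/rcupdate.h>
#include <linux/workqueue.h>
#include <linux/kthread.h>
#include <linux/slab.h>

struct colo_node {
	u32 index;
	struct rcu_head rcu;	/* lets call_rcu() defer the kfree */
};

static void colo_node_free_rcu(struct rcu_head *head)
{
	kfree(container_of(head, struct colo_node, rcu));
}

static void colo_node_release(struct colo_node *node)
{
	/* was: synchronize_rcu(); kfree(node);
	 * synchronize_rcu() sleeps, which is illegal under the
	 * rcu_read_lock() taken by atomic_notifier_call_chain(). */
	call_rcu(&node->rcu, colo_node_free_rcu);
}

/* kthread_stop() also sleeps, so defer it to process context too: */
struct notify_cb {
	struct task_struct *kthread;
	struct work_struct destroy_work;
};

static void destroy_notify_cb_work(struct work_struct *work)
{
	struct notify_cb *cb =
		container_of(work, struct notify_cb, destroy_work);

	kthread_stop(cb->kthread);	/* safe here: may sleep */
	kfree(cb);
}

/* called from the (atomic) netlink notifier callback */
static void destroy_notify_cb(struct notify_cb *cb)
{
	INIT_WORK(&cb->destroy_work, destroy_notify_cb_work);
	schedule_work(&cb->destroy_work);
}
```

The key point in both halves is the same: nothing that can sleep may run in the notifier callback itself, so the sleeping work is handed off to a context where sleeping is allowed.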

Thanks,
zhanghailiang

>
>>
>> Thanks,
>>
>> Dave
>>
>>>
>
>
> .
>



* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-29  8:06       ` [Qemu-devel] " zhanghailiang
@ 2015-05-29  8:42         ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-29  8:42 UTC (permalink / raw)
  To: zhanghailiang
  Cc: Wen Congyang, peter.huangpeng, qemu-devel, quintela, amit.shah,
	eblake, berrange, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> On 2015/5/29 9:29, Wen Congyang wrote:
> >On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> >>* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>>This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
> >>>failover, proxy API, block replication API, not include block replication.
> >>>The block part has been sent by wencongyang:
> >>>"[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> >>>
> >>>we have finished some new features and optimization on COLO (As a development branch in github),
> >>>but for easy of review, it is better to keep it simple now, so we will not add too much new
> >>>codes into this frame patch set before it been totally reviewed.
> >>>
> >>>You can get the latest integrated qemu colo patches from github (Include Block part):
> >>>https://github.com/coloft/qemu/commits/colo-v1.2-basic
> >>>https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> >>>
> >>>Please NOTE the difference between these two branch.
> >>>colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
> >>>Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the
> >>>process of checkpoint, including:
> >>>    1) separate ram and device save/load process to reduce size of extra memory
> >>>       used during checkpoint
> >>>    2) live migrate part of dirty pages to slave during sleep time.
> >>>Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
> >>>info by using command 'info migrate'.
> >>
> >>
> >>Hi,
> >>   I have that running now.
> >>
> >>Some notes:
> >>   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
> >>   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
> >>      they're very minor changes and I don't think related to (1).
> >>   3) I've also included some minor fixups I needed to get the -developing world
> >>      to build;  my compiler is fussy about unused variables etc - but I think the code
> >>      in ram_save_complete in your -developing patch is wrong because there are two
> >>      'pages' variables and the one in the inner loop is the only one changed.
> 
> Oops, I will fix them. Thank you for pointing out this low-grade mistake. :)

No problem; we all make them.

> >>   4) I've started trying simple benchmarks and tests now:
> >>     a) With a simple web server most requests have very little overhead, the comparison
> >>        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
> >>        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
> >>        since the downtime isn't that big.
> 
> Have you disabled DEBUG for the colo proxy? I turned it on by default; is this related?

Yes, I've turned that off; I still get the big spikes, but I haven't looked at why yet.

> >>     b) I tried something with more dynamic pages - the front page of a simple bugzilla
> >>        install;  it failed the comparison every time; it took me a while to figure out
> 
> Failed comparison? Do you mean the net packets on the two sides are always inconsistent?

Yes.

> >>        why, but it generates a unique token in it's javascript each time (for a password reset
> >>        link), and I guess the randomness used by that doesn't match on the two hosts.
> >>        It surprised me, because I didn't expect this page to have much randomness
> >>        in.
> >>
> >>   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> >>checkpoints happen very rarely.
> >>
> >>The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> >>after the qemu quits; the backtrace of the qemu stack is:
> >
> >How to reproduce it? Use monitor command quit to quit qemu? Or kill the qemu?
> >
> >>
> >>[<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> >>[<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> >>[<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> >>[<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> >>[<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> >>[<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> >>[<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> >>[<ffffffff815878bf>] sock_release+0x1f/0x90
> >>[<ffffffff81587942>] sock_close+0x12/0x20
> >>[<ffffffff812193c3>] __fput+0xd3/0x210
> >>[<ffffffff8121954e>] ____fput+0xe/0x10
> >>[<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> >>[<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> >>[<ffffffff81722b66>] int_signal+0x12/0x17
> >>[<ffffffffffffffff>] 0xffffffffffffffff
> >
> >Thanks for your test. The backtrace is very useful, and we will fix it soon.
> >
> 
> Yes, it is a bug: the callback function colonl_close_event() is called while holding
> the RCU read lock:
> netlink_release
>     ->atomic_notifier_call_chain
>          ->rcu_read_lock();
>          ->notifier_call_chain
>             ->ret = nb->notifier_call(nb, val, v);
> so it is wrong to call synchronize_rcu() there, because it sleeps.
> Besides, there is another function that might sleep: kthread_stop(), which is called
> in destroy_notify_cb.
> 
> >>
> >>that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and
> >>older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> >>I'm not sure of the right fix; perhaps it might be possible to replace the
> >>synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> >
> >I agree with it.
> 
> That is a good solution; I will fix both of the above problems.

Thanks,

Dave

> 
> Thanks,
> zhanghailiang
> 
> >
> >>
> >>Thanks,
> >>
> >>Dave
> >>
> >>>
> >
> >
> >.
> >
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-29  8:42         ` [Qemu-devel] " Dr. David Alan Gilbert
  (?)
@ 2015-05-29 12:34         ` Wen Congyang
  2015-05-29 15:12             ` [Qemu-devel] " Dr. David Alan Gilbert
  -1 siblings, 1 reply; 62+ messages in thread
From: Wen Congyang @ 2015-05-29 12:34 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, zhanghailiang
  Cc: peter.huangpeng, qemu-devel, quintela, amit.shah, eblake,
	berrange, eddie.dong, yunhong.jiang, lizhijian, arei.gonglei,
	david, netfilter-devel

On 05/29/2015 04:42 PM, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/5/29 9:29, Wen Congyang wrote:
>>> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
>>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>>> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
>>>>> failover, proxy API, block replication API, not include block replication.
>>>>> The block part has been sent by wencongyang:
>>>>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
>>>>>
>>>>> we have finished some new features and optimization on COLO (As a development branch in github),
>>>>> but for easy of review, it is better to keep it simple now, so we will not add too much new
>>>>> codes into this frame patch set before it been totally reviewed.
>>>>>
>>>>> You can get the latest integrated qemu colo patches from github (Include Block part):
>>>>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
>>>>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
>>>>>
>>>>> Please NOTE the difference between these two branch.
>>>>> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
>>>>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the
>>>>> process of checkpoint, including:
>>>>>    1) separate ram and device save/load process to reduce size of extra memory
>>>>>       used during checkpoint
>>>>>    2) live migrate part of dirty pages to slave during sleep time.
>>>>> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
>>>>> info by using command 'info migrate'.
>>>>
>>>>
>>>> Hi,
>>>>   I have that running now.
>>>>
>>>> Some notes:
>>>>   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
>>>>   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
>>>>      they're very minor changes and I don't think related to (1).
>>>>   3) I've also included some minor fixups I needed to get the -developing world
>>>>      to build;  my compiler is fussy about unused variables etc - but I think the code
>>>>      in ram_save_complete in your -developing patch is wrong because there are two
>>>>      'pages' variables and the one in the inner loop is the only one changed.
>>
>> Oops, I will fix them. Thank you for pointing out this low-grade mistake. :)
> 
> No problem; we all make them.
> 
>>>>   4) I've started trying simple benchmarks and tests now:
>>>>     a) With a simple web server most requests have very little overhead, the comparison
>>>>        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
>>>>        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
>>>>        since the downtime isn't that big.
>>
>> Have you disabled DEBUG for the colo proxy? I turned it on by default; is this related?
> 
> Yes, I've turned that off, I still get the big spikes; not looked why yet.
> 
>>>>     b) I tried something with more dynamic pages - the front page of a simple bugzilla
>>>>        install;  it failed the comparison every time; it took me a while to figure out
>>
>> Failed comparison? Do you mean the net packets on the two sides are always inconsistent?
> 
> Yes.
> 
>>>>        why, but it generates a unique token in it's javascript each time (for a password reset
>>>>        link), and I guess the randomness used by that doesn't match on the two hosts.
>>>>        It surprised me, because I didn't expect this page to have much randomness
>>>>        in.
>>>>
>>>>   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
>>>> checkpoints happen very rarely.
>>>>
>>>> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
>>>> after the qemu quits; the backtrace of the qemu stack is:
>>>
>>> How do you reproduce it? With the monitor command 'quit', or by killing the qemu?
>>>
>>>>
>>>> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
>>>> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
>>>> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
>>>> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
>>>> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
>>>> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
>>>> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
>>>> [<ffffffff815878bf>] sock_release+0x1f/0x90
>>>> [<ffffffff81587942>] sock_close+0x12/0x20
>>>> [<ffffffff812193c3>] __fput+0xd3/0x210
>>>> [<ffffffff8121954e>] ____fput+0xe/0x10
>>>> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
>>>> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
>>>> [<ffffffff81722b66>] int_signal+0x12/0x17
>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>
>>> Thanks for your test. The backtrace is very useful, and we will fix it soon.
>>>
>>
>> Yes, it is a bug: the callback function colonl_close_event() is called while
>> holding the RCU read lock:
>> netlink_release
>>     ->atomic_notifier_call_chain
>>          ->rcu_read_lock();
>>          ->notifier_call_chain
>>             ->ret = nb->notifier_call(nb, val, v);
>> so calling synchronize_rcu() here is wrong: it may sleep.
>> Besides, another function that may sleep, kthread_stop(), is called
>> in destroy_notify_cb().
>>
>>>>
>>>> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and
>>>> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
>>>> I'm not sure of the right fix; perhaps it might be possible to replace the
>>>> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
>>>
>>> I agree with it.
>>
>> That is a good solution; I will fix both of the above problems.
> 
> Thanks,

We have fixed this problem and tested it. The patch has been pushed to GitHub; please try it.

Thanks
Wen Congyang

> 
> Dave
> 
>>
>> Thanks,
>> zhanghailiang
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Dave
>>>>
>>>>>
>>>
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> .
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-29 12:34         ` Wen Congyang
@ 2015-05-29 15:12             ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-05-29 15:12 UTC (permalink / raw)
  To: Wen Congyang
  Cc: zhanghailiang, peter.huangpeng, qemu-devel, quintela, amit.shah,
	eblake, berrange, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 05/29/2015 04:42 PM, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >> On 2015/5/29 9:29, Wen Congyang wrote:
> >>> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> >>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:

<snip>

> >>>> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> >>>> after the qemu quits; the backtrace of the qemu stack is:
> >>>
> >>> How do you reproduce it? With the monitor command 'quit', or by killing the qemu?
> >>>
> >>>>
> >>>> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> >>>> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> >>>> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> >>>> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> >>>> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> >>>> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> >>>> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> >>>> [<ffffffff815878bf>] sock_release+0x1f/0x90
> >>>> [<ffffffff81587942>] sock_close+0x12/0x20
> >>>> [<ffffffff812193c3>] __fput+0xd3/0x210
> >>>> [<ffffffff8121954e>] ____fput+0xe/0x10
> >>>> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> >>>> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> >>>> [<ffffffff81722b66>] int_signal+0x12/0x17
> >>>> [<ffffffffffffffff>] 0xffffffffffffffff
> >>>
> >>> Thanks for your test. The backtrace is very useful, and we will fix it soon.
> >>>
> >>
> >> Yes, it is a bug: the callback function colonl_close_event() is called while
> >> holding the RCU read lock:
> >> netlink_release
> >>     ->atomic_notifier_call_chain
> >>          ->rcu_read_lock();
> >>          ->notifier_call_chain
> >>             ->ret = nb->notifier_call(nb, val, v);
> >> so calling synchronize_rcu() here is wrong: it may sleep.
> >> Besides, another function that may sleep, kthread_stop(), is called
> >> in destroy_notify_cb().
> >>
> >>>>
> >>>> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and
> >>>> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> >>>> I'm not sure of the right fix; perhaps it might be possible to replace the
> >>>> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> >>>
> >>> I agree with it.
> >>
> >> That is a good solution; I will fix both of the above problems.
> > 
> > Thanks,
> 
> We have fixed this problem and tested it. The patch has been pushed to GitHub; please try it.

Yes, that works.  Thank you very much for the quick fix.

Dave

> 
> Thanks
> Wen Congyang
> 
> > 
> > Dave
> > 
> >>
> >> Thanks,
> >> zhanghailiang
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Dave
> >>>>
> >>>>>
> >>>
> >>>
> >>> .
> >>>
> >>
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > --
> > To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > .
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-29  8:42         ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2015-06-01  1:41           ` Wen Congyang
  -1 siblings, 0 replies; 62+ messages in thread
From: Wen Congyang @ 2015-06-01  1:41 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, zhanghailiang
  Cc: peter.huangpeng, qemu-devel, quintela, amit.shah, eblake,
	berrange, eddie.dong, yunhong.jiang, lizhijian, arei.gonglei,
	david, netfilter-devel

On 05/29/2015 04:42 PM, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/5/29 9:29, Wen Congyang wrote:
>>> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
>>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>>> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
>>>>> failover, proxy API, block replication API, not include block replication.
>>>>> The block part has been sent by wencongyang:
>>>>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
>>>>>
>>>>> we have finished some new features and optimization on COLO (As a development branch in github),
>>>>> but for easy of review, it is better to keep it simple now, so we will not add too much new
>>>>> codes into this frame patch set before it been totally reviewed.
>>>>>
>>>>> You can get the latest integrated qemu colo patches from github (Include Block part):
>>>>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
>>>>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
>>>>>
>>>>> Please NOTE the difference between these two branch.
>>>>> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
>>>>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the
>>>>> process of checkpoint, including:
>>>>>    1) separate ram and device save/load process to reduce size of extra memory
>>>>>       used during checkpoint
>>>>>    2) live migrate part of dirty pages to slave during sleep time.
>>>>> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
>>>>> info by using command 'info migrate'.
>>>>
>>>>
>>>> Hi,
>>>>   I have that running now.
>>>>
>>>> Some notes:
>>>>   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
>>>>   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
>>>>      they're very minor changes and I don't think related to (1).
>>>>   3) I've also included some minor fixups I needed to get the -developing world
>>>>      to build;  my compiler is fussy about unused variables etc - but I think the code
>>>>      in ram_save_complete in your -developing patch is wrong because there are two
>>>>      'pages' variables and the one in the inner loop is the only one changed.
>>
>> Oops, I will fix them. Thank you for pointing out this low-grade mistake. :)
> 
> No problem; we all make them.
> 
>>>>   4) I've started trying simple benchmarks and tests now:
>>>>     a) With a simple web server most requests have very little overhead, the comparison
>>>>        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
>>>>        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
>>>>        since the downtime isn't that big.
>>
>> Have you disabled DEBUG for the colo proxy? I turned it on by default; is this related?
> 
> Yes, I've turned that off, I still get the big spikes; not looked why yet.

How do you reproduce it? With webbench, or some other benchmark?

Thanks
Wen Congyang

> 
>>>>     b) I tried something with more dynamic pages - the front page of a simple bugzilla
>>>>        install;  it failed the comparison every time; it took me a while to figure out
>>
>> Failed comparison? Do you mean the net packets on the two sides are always inconsistent?
> 
> Yes.
> 
>>>>        why, but it generates a unique token in it's javascript each time (for a password reset
>>>>        link), and I guess the randomness used by that doesn't match on the two hosts.
>>>>        It surprised me, because I didn't expect this page to have much randomness
>>>>        in.
>>>>
>>>>   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
>>>> checkpoints happen very rarely.
>>>>
>>>> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
>>>> after the qemu quits; the backtrace of the qemu stack is:
>>>
>>> How do you reproduce it? With the monitor command 'quit', or by killing the qemu?
>>>
>>>>
>>>> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
>>>> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
>>>> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
>>>> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
>>>> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
>>>> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
>>>> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
>>>> [<ffffffff815878bf>] sock_release+0x1f/0x90
>>>> [<ffffffff81587942>] sock_close+0x12/0x20
>>>> [<ffffffff812193c3>] __fput+0xd3/0x210
>>>> [<ffffffff8121954e>] ____fput+0xe/0x10
>>>> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
>>>> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
>>>> [<ffffffff81722b66>] int_signal+0x12/0x17
>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>
>>> Thanks for your test. The backtrace is very useful, and we will fix it soon.
>>>
>>
>> Yes, it is a bug: the callback function colonl_close_event() is called while
>> holding the RCU read lock:
>> netlink_release
>>     ->atomic_notifier_call_chain
>>          ->rcu_read_lock();
>>          ->notifier_call_chain
>>             ->ret = nb->notifier_call(nb, val, v);
>> so calling synchronize_rcu() here is wrong: it may sleep.
>> Besides, another function that may sleep, kthread_stop(), is called
>> in destroy_notify_cb().
>>
>>>>
>>>> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and
>>>> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
>>>> I'm not sure of the right fix; perhaps it might be possible to replace the
>>>> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
>>>
>>> I agree with it.
>>
>> That is a good solution; I will fix both of the above problems.
> 
> Thanks,
> 
> Dave
> 
>>
>> Thanks,
>> zhanghailiang
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Dave
>>>>
>>>>>
>>>
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> .
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-06-01  1:41           ` [Qemu-devel] " Wen Congyang
@ 2015-06-01  9:16             ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-06-01  9:16 UTC (permalink / raw)
  To: Wen Congyang
  Cc: zhanghailiang, peter.huangpeng, qemu-devel, quintela, amit.shah,
	eblake, berrange, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 05/29/2015 04:42 PM, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >> On 2015/5/29 9:29, Wen Congyang wrote:
> >>> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> >>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>>>> This is the 5th version of COLO, here is only COLO frame part, include: VM checkpoint,
> >>>>> failover, proxy API, block replication API, not include block replication.
> >>>>> The block part has been sent by wencongyang:
> >>>>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> >>>>>
> >>>>> we have finished some new features and optimization on COLO (As a development branch in github),
> >>>>> but for easy of review, it is better to keep it simple now, so we will not add too much new
> >>>>> codes into this frame patch set before it been totally reviewed.
> >>>>>
> >>>>> You can get the latest integrated qemu colo patches from github (Include Block part):
> >>>>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> >>>>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> >>>>>
> >>>>> Please NOTE the difference between these two branch.
> >>>>> colo-v1.2-basic is exactly same with this patch series, which has basic features of COLO.
> >>>>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the
> >>>>> process of checkpoint, including:
> >>>>>    1) separate ram and device save/load process to reduce size of extra memory
> >>>>>       used during checkpoint
> >>>>>    2) live migrate part of dirty pages to slave during sleep time.
> >>>>> Besides, we add some statistic info in colo-v1.2-developing, which you can get these stat
> >>>>> info by using command 'info migrate'.
> >>>>
> >>>>
> >>>> Hi,
> >>>>   I have that running now.
> >>>>
> >>>> Some notes:
> >>>>   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
> >>>>   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
> >>>>      they're very minor changes and I don't think related to (1).
> >>>>   3) I've also included some minor fixups I needed to get the -developing world
> >>>>      to build;  my compiler is fussy about unused variables etc - but I think the code
> >>>>      in ram_save_complete in your -developing patch is wrong because there are two
> >>>>      'pages' variables and the one in the inner loop is the only one changed.
> >>
> >> Oops, I will fix them. Thank you for pointing out this low-grade mistake. :)
> > 
> > No problem; we all make them.
> > 
> >>>>   4) I've started trying simple benchmarks and tests now:
> >>>>     a) With a simple web server most requests have very little overhead, the comparison
> >>>>        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
> >>>>        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
> >>>>        since the downtime isn't that big.
> >>
> >> Have you disabled DEBUG for the colo proxy? I turned it on by default; is this related?
> > 
> > Yes, I've turned that off, I still get the big spikes; not looked why yet.
> 
> How do you reproduce it? With webbench or some other benchmark?

Much simpler;
while true; do (time curl -o /dev/null http://myguestaddress/bugzilla/abigfile.txt ) 2>&1 | grep ^real; done

where 'abigfile.txt' is a simple 750K text file that I put in the
directory on the guest's webserver.

The times I'm seeing in COLO mode are:

real	0m0.043s   <--- (a) normal very quick
real	0m0.045s
real	0m0.053s
real	0m0.044s
real	0m0.264s
real	0m0.053s
real	0m1.193s   <--- (b) occasional very long
real	0m0.152s   <--- (c) sometimes gets slower repeat - miscompare each time?
real	0m0.142s            
real	0m0.148s
real	0m0.145s
real	0m0.148s


If I force a failover to secondary I get times like (a).
(b) is the case I mentioned in the last mail.
(c) I've only seen in the latest version - but sometimes it does (c)
   repeatedly for a while.  Info migrate shows the 'proxy discompare count'
   going up.


The host doing the curl is a separate host from the two machines running COLO;
all three machines are connected via gigabit Ethernet.   A separate 10Gbit link
runs between the two COLO machines for the COLO traffic.

Dave


> 
> Thanks
> Wen Congyang
> 
> > 
> >>>>     b) I tried something with more dynamic pages - the front page of a simple bugzilla
> >>>>        install;  it failed the comparison every time; it took me a while to figure out
> >>
> >> Failed comparison? Do you mean the net packets on the two sides are always inconsistent?
> > 
> > Yes.
> > 
> >>>>        why, but it generates a unique token in its javascript each time (for a password reset
> >>>>        link), and I guess the randomness used by that doesn't match on the two hosts.
> >>>>        It surprised me, because I didn't expect this page to have much randomness
> >>>>        in.
> >>>>
> >>>>   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> >>>> checkpoints happen very rarely.
> >>>>
> >>>> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> >>>> after the qemu quits; the backtrace of the qemu stack is:
> >>>
> >>> How do you reproduce it? By using the monitor command 'quit', or by killing qemu?
> >>>
> >>>>
> >>>> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> >>>> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> >>>> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> >>>> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> >>>> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> >>>> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> >>>> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> >>>> [<ffffffff815878bf>] sock_release+0x1f/0x90
> >>>> [<ffffffff81587942>] sock_close+0x12/0x20
> >>>> [<ffffffff812193c3>] __fput+0xd3/0x210
> >>>> [<ffffffff8121954e>] ____fput+0xe/0x10
> >>>> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> >>>> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> >>>> [<ffffffff81722b66>] int_signal+0x12/0x17
> >>>> [<ffffffffffffffff>] 0xffffffffffffffff
> >>>
> >>> Thanks for your test. The backtrace is very useful, and we will fix it soon.
> >>>
> >>
> >> Yes, it is a bug: the callback function colonl_close_event() is called while holding the
> >> rcu read lock:
> >> netlink_release
> >>     ->atomic_notifier_call_chain
> >>          ->rcu_read_lock();
> >>          ->notifier_call_chain
> >>             ->ret = nb->notifier_call(nb, val, v);
> >> So it is wrong to call synchronize_rcu() here, since it may sleep.
> >> Besides, there is another function that might sleep: kthread_stop(), which is called
> >> in destroy_notify_cb().
> >>
> >>>>
> >>>> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and
> >>>> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> >>>> I'm not sure of the right fix; perhaps it might be possible to replace the
> >>>> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> >>>
> >>> I agree with it.
> >>
> >> That is a good solution; I will fix both of the above problems.
> > 
> > Thanks,
> > 
> > Dave
> > 
> >>
> >> Thanks,
> >> zhanghailiang
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Dave
> >>>>
> >>>>>
> >>>
> >>>
> >>> .
> >>>
> >>
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > --
> > To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > .
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-05-28 16:24   ` [Qemu-devel] " Dr. David Alan Gilbert
@ 2015-06-02  3:51     ` Wen Congyang
  -1 siblings, 0 replies; 62+ messages in thread
From: Wen Congyang @ 2015-06-02  3:51 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, zhanghailiang
  Cc: qemu-devel, quintela, amit.shah, eblake, berrange,
	peter.huangpeng, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> This is the 5th version of COLO; this series includes only the COLO frame part: VM checkpoint,
>> failover, proxy API, and block replication API, but not block replication itself.
>> The block part has been sent by wencongyang:
>> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
>>
>> We have finished some new features and optimizations on COLO (as a development branch on github),
>> but for ease of review it is better to keep this series simple, so we will not add much new 
>> code to this frame patch set before it has been fully reviewed. 
>>
>> You can get the latest integrated qemu colo patches from github (Include Block part):
>> https://github.com/coloft/qemu/commits/colo-v1.2-basic
>> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
>>
>> Please NOTE the difference between these two branches.
>> colo-v1.2-basic is exactly the same as this patch series, and has the basic features of COLO.
>> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimization in the 
>> process of checkpoint, including: 
>>    1) separate ram and device save/load process to reduce size of extra memory
>>       used during checkpoint
>>    2) live migrate part of dirty pages to slave during sleep time.
>> Besides, we added some statistics to colo-v1.2-developing, which you can view
>> by using the command 'info migrate'.
> 
> 
> Hi,
>   I have that running now.
> 
> Some notes:
>   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
>   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
>      they're very minor changes and I don't think related to (1).
>   3) I've also included some minor fixups I needed to get the -developing world
>      to build;  my compiler is fussy about unused variables etc - but I think the code
>      in ram_save_complete in your -developing patch is wrong because there are two
>      'pages' variables and the one in the inner loop is the only one changed.
>   4) I've started trying simple benchmarks and tests now:
>     a) With a simple web server most requests have very little overhead, the comparison
>        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
>        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
>        since the downtime isn't that big.

I reproduced it, and we are investigating it now.

>     b) I tried something with more dynamic pages - the front page of a simple bugzilla

What does 'dynamic pages' mean?

>        install;  it failed the comparison every time; it took me a while to figure out
>        why, but it generates a unique token in it's javascript each time (for a password reset
>        link), and I guess the randomness used by that doesn't match on the two hosts.

Yes.

Thanks
Wen Congyang

>        It surprised me, because I didn't expect this page to have much randomness
>        in.
> 
>   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> checkpoints happen very rarely.
> 
> The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> after the qemu quits; the backtrace of the qemu stack is:
> 
> [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> [<ffffffff815878bf>] sock_release+0x1f/0x90
> [<ffffffff81587942>] sock_close+0x12/0x20
> [<ffffffff812193c3>] __fput+0xd3/0x210
> [<ffffffff8121954e>] ____fput+0xe/0x10
> [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> [<ffffffff81722b66>] int_signal+0x12/0x17
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and 
> older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> I'm not sure of the right fix; perhaps it might be possible to replace the 
> synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
> 
> Thanks,
> 
> Dave
> 
>>
>> You can test either branch of the above;
>> for how to test COLO, please refer to the following link:
>> http://wiki.qemu.org/Features/COLO.
>>
>> COLO is still at an early stage;
>> your comments and feedback are warmly welcomed.
>>
>> Cc: netfilter-devel@vger.kernel.org
>>
>> TODO:
>> 1. Strengthen failover
>> 2. COLO function switch on/off
>> 3. Optimize proxy part, including the proxy script.
>>   1) Remove the limitation of the forward network link.
>>   2) Reuse nfqueue_entry and NF_STOLEN to enqueue skbs
>> 4. The capability of continuous FT
>>
>> v5:
>> - Replace the previous communication channel between the proxy and qemu with nfnetlink
>> - Remove the 'forward device' parameter of xt_PMYCOLO; now we use an iptables command
>>   to set the 'forward device'
>> - Turn DPRINTF into trace_ calls as Dave's suggestion
>>
>> v4:
>> - New block replication scheme (use image-fleecing for the secondary side)
>> - Address some comments from Eric Blake and Dave
>> - Add command colo-set-checkpoint-period to set the period of the periodic checkpoint
>> - Add a delay (100ms) between continuous checkpoint requests to ensure the VM
>>   runs at least 100ms since the last pause.
>> v3:
>> - use proxy instead of colo agent to compare network packets
>> - add block replication
>> - Optimize failover disposal
>> - handle shutdown
>>
>> v2:
>> - use QEMUSizedBuffer/QEMUFile as COLO buffer
>> - colo support is enabled by default
>> - add nic replication support
>> - addressed comments from Eric Blake and Dr. David Alan Gilbert
>>
>> v1:
>> - implement the frame of colo
>>
>> Wen Congyang (1):
>>   COLO: Add block replication into colo process
>>
>> zhanghailiang (28):
>>   configure: Add parameter for configure to enable/disable COLO support
>>   migration: Introduce capability 'colo' to migration
>>   COLO: migrate colo related info to slave
>>   migration: Integrate COLO checkpoint process into migration
>>   migration: Integrate COLO checkpoint process into loadvm
>>   COLO: Implement colo checkpoint protocol
>>   COLO: Add a new RunState RUN_STATE_COLO
>>   QEMUSizedBuffer: Introduce two help functions for qsb
>>   COLO: Save VM state to slave when do checkpoint
>>   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
>>   COLO VMstate: Load VM state into qsb before restore it
>>   arch_init: Start to trace dirty pages of SVM
>>   COLO RAM: Flush cached RAM into SVM's memory
>>   COLO failover: Introduce a new command to trigger a failover
>>   COLO failover: Implement COLO master/slave failover work
>>   COLO failover: Don't do failover during loading VM's state
>>   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
>>   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
>>   COLO NIC: Implement colo nic device interface configure()
>>   COLO NIC : Implement colo nic init/destroy function
>>   COLO NIC: Some init work related with proxy module
>>   COLO: Handle nfnetlink message from proxy module
>>   COLO: Do checkpoint according to the result of packets comparation
>>   COLO: Improve checkpoint efficiency by do additional periodic
>>     checkpoint
>>   COLO: Add colo-set-checkpoint-period command
>>   COLO NIC: Implement NIC checkpoint and failover
>>   COLO: Disable qdev hotplug when VM is in COLO mode
>>   COLO: Implement shutdown checkpoint
>>
>>  arch_init.c                            | 243 +++++++++-
>>  configure                              |  36 +-
>>  hmp-commands.hx                        |  30 ++
>>  hmp.c                                  |  14 +
>>  hmp.h                                  |   2 +
>>  include/exec/cpu-all.h                 |   1 +
>>  include/migration/migration-colo.h     |  57 +++
>>  include/migration/migration-failover.h |  22 +
>>  include/migration/migration.h          |   3 +
>>  include/migration/qemu-file.h          |   3 +-
>>  include/net/colo-nic.h                 |  27 ++
>>  include/net/net.h                      |   3 +
>>  include/sysemu/sysemu.h                |   3 +
>>  migration/Makefile.objs                |   2 +
>>  migration/colo-comm.c                  |  68 +++
>>  migration/colo-failover.c              |  48 ++
>>  migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
>>  migration/migration.c                  |  60 ++-
>>  migration/qemu-file-buf.c              |  58 +++
>>  net/Makefile.objs                      |   1 +
>>  net/colo-nic.c                         | 420 +++++++++++++++++
>>  net/tap.c                              |  45 +-
>>  qapi-schema.json                       |  42 +-
>>  qemu-options.hx                        |  10 +-
>>  qmp-commands.hx                        |  41 ++
>>  savevm.c                               |   2 +-
>>  scripts/colo-proxy-script.sh           |  88 ++++
>>  stubs/Makefile.objs                    |   1 +
>>  stubs/migration-colo.c                 |  58 +++
>>  trace-events                           |  11 +
>>  vl.c                                   |  39 +-
>>  31 files changed, 2235 insertions(+), 39 deletions(-)
>>  create mode 100644 include/migration/migration-colo.h
>>  create mode 100644 include/migration/migration-failover.h
>>  create mode 100644 include/net/colo-nic.h
>>  create mode 100644 migration/colo-comm.c
>>  create mode 100644 migration/colo-failover.c
>>  create mode 100644 migration/colo.c
>>  create mode 100644 net/colo-nic.c
>>  create mode 100755 scripts/colo-proxy-script.sh
>>  create mode 100644 stubs/migration-colo.c
>>
>> -- 
>> 1.7.12.4
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 


^ permalink raw reply	[flat|nested] 62+ messages in thread

>>
>> -- 
>> 1.7.12.4
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
  2015-06-02  3:51     ` [Qemu-devel] " Wen Congyang
@ 2015-06-02  8:02       ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-06-02  8:02 UTC (permalink / raw)
  To: Wen Congyang
  Cc: zhanghailiang, qemu-devel, quintela, amit.shah, eblake, berrange,
	peter.huangpeng, eddie.dong, yunhong.jiang, lizhijian,
	arei.gonglei, david, netfilter-devel

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 05/29/2015 12:24 AM, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >> This is the 5th version of COLO; this series contains only the COLO frame part, including: VM checkpoint,
> >> failover, the proxy API, and the block replication API, but not block replication itself.
> >> The block part has been sent by wencongyang:
> >> "[Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints"
> >>
> >> We have finished some new features and optimizations on COLO (on a development branch in github),
> >> but for ease of review it is better to keep things simple for now, so we will not add much new
> >> code to this frame patch set before it has been fully reviewed.
> >>
> >> You can get the latest integrated qemu colo patches from github (Include Block part):
> >> https://github.com/coloft/qemu/commits/colo-v1.2-basic
> >> https://github.com/coloft/qemu/commits/colo-v1.2-developing (more features)
> >>
> >> Please NOTE the difference between these two branches.
> >> colo-v1.2-basic is exactly the same as this patch series and has the basic features of COLO.
> >> Compared with colo-v1.2-basic, colo-v1.2-developing has some optimizations in the
> >> checkpoint process, including:
> >>    1) separate RAM and device save/load processes to reduce the amount of extra memory
> >>       used during a checkpoint
> >>    2) live migrate part of the dirty pages to the slave during sleep time.
> >> Besides, we added some statistics in colo-v1.2-developing, which you can view
> >> with the command 'info migrate'.
> > 
> > 
> > Hi,
> >   I have that running now.
> > 
> > Some notes:
> >   1) The colo-proxy is working OK until qemu quits, and then it gets an RCU problem; see below
> >   2) I've attached some minor tweaks that were needed to build with the 4.1rc kernel I'm using;
> >      they're very minor changes and I don't think they're related to (1).
> >   3) I've also included some minor fixups I needed to get the -developing world
> >      to build;  my compiler is fussy about unused variables etc - but I think the code
> >      in ram_save_complete in your -developing patch is wrong because there are two
> >      'pages' variables and the one in the inner loop is the only one changed.
> >   4) I've started trying simple benchmarks and tests now:
> >     a) With a simple web server most requests have very little overhead, the comparison
> >        matches most of the time;  I do get quite large spikes (0.04s->1.05s) which I guess
> >        corresponds to when a checkpoint happens, but I'm not sure why the spike is so big,
> >        since the downtime isn't that big.
> 
> I reproduced it, and we are investigating it now.
> 
> >     b) I tried something with more dynamic pages - the front page of a simple bugzilla
> 
> What does 'dynamic pages' mean?

dynamic is the opposite of static; so pages that are created at runtime.
Bugzilla generates most pages using cgi-bin from templates, even for simple things,
although most of the output from the cgi-bin script is the same each time.

> >>        install;  it failed the comparison every time; it took me a while to figure out
> >>        why, but it generates a unique token in its javascript each time (for a password reset
> >>        link), and I guess the randomness used by that doesn't match on the two hosts.
> 
> Yes.
> 

Dave

> Thanks
> Wen Congyang
> 
> >        It surprised me, because I didn't expect this page to have much randomness
> >        in.
> > 
> >   4a is really nice - it shows the benefit of COLO over the simple checkpointing;
> > checkpoints happen very rarely.
> > 
> > The colo-proxy rcu problem I hit shows as rcu-stalls in both primary and secondary
> > after the qemu quits; the backtrace of the qemu stack is:
> > 
> > [<ffffffff810d8c0c>] wait_rcu_gp+0x5c/0x80
> > [<ffffffff810ddb05>] synchronize_rcu+0x45/0xd0
> > [<ffffffffa0a251e5>] colo_node_release+0x35/0x50 [nfnetlink_colo]
> > [<ffffffffa0a25795>] colonl_close_event+0xe5/0x160 [nfnetlink_colo]
> > [<ffffffff81090c96>] notifier_call_chain+0x66/0x90
> > [<ffffffff8109154c>] atomic_notifier_call_chain+0x6c/0x110
> > [<ffffffff815eee07>] netlink_release+0x5b7/0x7f0
> > [<ffffffff815878bf>] sock_release+0x1f/0x90
> > [<ffffffff81587942>] sock_close+0x12/0x20
> > [<ffffffff812193c3>] __fput+0xd3/0x210
> > [<ffffffff8121954e>] ____fput+0xe/0x10
> > [<ffffffff8108d9f7>] task_work_run+0xb7/0xf0
> > [<ffffffff81002d4d>] do_notify_resume+0x8d/0xa0
> > [<ffffffff81722b66>] int_signal+0x12/0x17
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > 
> > that's with both the 423a8e268acbe3e644a16c15bc79603cfe9eb084 from yesterday and 
> > older e58e5152b74945871b00a88164901c0d46e6365e tags on colo-proxy.
> > I'm not sure of the right fix; perhaps it might be possible to replace the 
> > synchronize_rcu in colo_node_release by a call_rcu that does the kfree later?
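[Editor's note: a non-compilable sketch of the kind of change suggested above. The colo_node layout is hypothetical (inferred only from the function names in the backtrace); call_rcu(), struct rcu_head, and container_of() are the standard kernel RCU APIs.]

```c
#include <linux/rcupdate.h>
#include <linux/slab.h>

/* hypothetical node layout; only the embedded rcu_head matters here */
struct colo_node {
    u32 index;
    struct rcu_head rcu;    /* storage for the deferred-free callback */
};

static void colo_node_free_rcu(struct rcu_head *head)
{
    struct colo_node *node = container_of(head, struct colo_node, rcu);

    kfree(node);            /* runs after a grace period has elapsed */
}

static void colo_node_release(struct colo_node *node)
{
    /* before: synchronize_rcu(); kfree(node);
     * synchronize_rcu() sleeps until a grace period elapses, which is
     * what stalls the netlink-release notifier chain when qemu exits. */
    call_rcu(&node->rcu, colo_node_free_rcu);   /* returns immediately */
}
```

Whether call_rcu() is safe here depends on whether anything else still touches the node after release; the callback only defers the kfree, it does not change the lifetime rules.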
> > 
> > Thanks,
> > 
> > Dave
> > 
> >>
> >> You can test either of the branches above;
> >> for how to test COLO, please refer to the following link:
> >> http://wiki.qemu.org/Features/COLO.
> >>
> >> COLO is still at an early stage;
> >> your comments and feedback are warmly welcomed.
> >>
> >> Cc: netfilter-devel@vger.kernel.org
> >>
> >> TODO:
> >> 1. Strengthen failover
> >> 2. COLO function switch on/off
> >> 3. Optimize the proxy part, including the proxy script:
> >>   1) Remove the limitation on the forward network link.
> >>   2) Reuse nfqueue_entry and NF_STOLEN to enqueue skbs
> >> 4. The capability of continuous FT
> >>
> >> v5:
> >> - Replace the previous communication mechanism between proxy and qemu with nfnetlink
> >> - Remove the 'forward device' parameter of xt_PMYCOLO; we now use an iptables command
> >> to set the 'forward device'
> >> - Turn DPRINTF into trace_ calls, per Dave's suggestion
> >>
> >> v4:
> >> - New block replication scheme (use image-fleecing for the secondary side)
> >> - Address some comments from Eric Blake and Dave
> >> - Add command colo-set-checkpoint-period to set the period of the periodic checkpoint
> >> - Add a delay (100ms) between continuous checkpoint requests to ensure the VM
> >>   runs at least 100ms since the last pause.
> >> v3:
> >> - use proxy instead of colo agent to compare network packets
> >> - add block replication
> >> - Optimize failover handling
> >> - handle shutdown
> >>
> >> v2:
> >> - use QEMUSizedBuffer/QEMUFile as COLO buffer
> >> - colo support is enabled by default
> >> - add nic replication support
> >> - addressed comments from Eric Blake and Dr. David Alan Gilbert
> >>
> >> v1:
> >> - implement the frame of colo
> >>
> >> Wen Congyang (1):
> >>   COLO: Add block replication into colo process
> >>
> >> zhanghailiang (28):
> >>   configure: Add parameter for configure to enable/disable COLO support
> >>   migration: Introduce capability 'colo' to migration
> >>   COLO: migrate colo related info to slave
> >>   migration: Integrate COLO checkpoint process into migration
> >>   migration: Integrate COLO checkpoint process into loadvm
> >>   COLO: Implement colo checkpoint protocol
> >>   COLO: Add a new RunState RUN_STATE_COLO
> >>   QEMUSizedBuffer: Introduce two help functions for qsb
> >>   COLO: Save VM state to slave when do checkpoint
> >>   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
> >>   COLO VMstate: Load VM state into qsb before restore it
> >>   arch_init: Start to trace dirty pages of SVM
> >>   COLO RAM: Flush cached RAM into SVM's memory
> >>   COLO failover: Introduce a new command to trigger a failover
> >>   COLO failover: Implement COLO master/slave failover work
> >>   COLO failover: Don't do failover during loading VM's state
> >>   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
> >>   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
> >>   COLO NIC: Implement colo nic device interface configure()
> >>   COLO NIC : Implement colo nic init/destroy function
> >>   COLO NIC: Some init work related with proxy module
> >>   COLO: Handle nfnetlink message from proxy module
> >>   COLO: Do checkpoint according to the result of packets comparation
> >>   COLO: Improve checkpoint efficiency by do additional periodic
> >>     checkpoint
> >>   COLO: Add colo-set-checkpoint-period command
> >>   COLO NIC: Implement NIC checkpoint and failover
> >>   COLO: Disable qdev hotplug when VM is in COLO mode
> >>   COLO: Implement shutdown checkpoint
> >>
> >>  arch_init.c                            | 243 +++++++++-
> >>  configure                              |  36 +-
> >>  hmp-commands.hx                        |  30 ++
> >>  hmp.c                                  |  14 +
> >>  hmp.h                                  |   2 +
> >>  include/exec/cpu-all.h                 |   1 +
> >>  include/migration/migration-colo.h     |  57 +++
> >>  include/migration/migration-failover.h |  22 +
> >>  include/migration/migration.h          |   3 +
> >>  include/migration/qemu-file.h          |   3 +-
> >>  include/net/colo-nic.h                 |  27 ++
> >>  include/net/net.h                      |   3 +
> >>  include/sysemu/sysemu.h                |   3 +
> >>  migration/Makefile.objs                |   2 +
> >>  migration/colo-comm.c                  |  68 +++
> >>  migration/colo-failover.c              |  48 ++
> >>  migration/colo.c                       | 836 +++++++++++++++++++++++++++++++++
> >>  migration/migration.c                  |  60 ++-
> >>  migration/qemu-file-buf.c              |  58 +++
> >>  net/Makefile.objs                      |   1 +
> >>  net/colo-nic.c                         | 420 +++++++++++++++++
> >>  net/tap.c                              |  45 +-
> >>  qapi-schema.json                       |  42 +-
> >>  qemu-options.hx                        |  10 +-
> >>  qmp-commands.hx                        |  41 ++
> >>  savevm.c                               |   2 +-
> >>  scripts/colo-proxy-script.sh           |  88 ++++
> >>  stubs/Makefile.objs                    |   1 +
> >>  stubs/migration-colo.c                 |  58 +++
> >>  trace-events                           |  11 +
> >>  vl.c                                   |  39 +-
> >>  31 files changed, 2235 insertions(+), 39 deletions(-)
> >>  create mode 100644 include/migration/migration-colo.h
> >>  create mode 100644 include/migration/migration-failover.h
> >>  create mode 100644 include/net/colo-nic.h
> >>  create mode 100644 migration/colo-comm.c
> >>  create mode 100644 migration/colo-failover.c
> >>  create mode 100644 migration/colo.c
> >>  create mode 100644 net/colo-nic.c
> >>  create mode 100755 scripts/colo-proxy-script.sh
> >>  create mode 100644 stubs/migration-colo.c
> >>
> >> -- 
> >> 1.7.12.4
> >>
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 11/29] COLO VMstate: Load VM state into qsb before restore it
  2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 11/29] COLO VMstate: Load VM state into qsb before restore it zhanghailiang
@ 2015-06-05 18:02   ` Dr. David Alan Gilbert
  2015-06-09  2:19     ` zhanghailiang
  0 siblings, 1 reply; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-06-05 18:02 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, amit.shah, Yang Hongyang, david

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We should cache the device state before restoring it;
> besides, we should call qemu_system_reset() before loading the VM state,
> which ensures the data is intact.

I think the description could be better; to me the important
point is not that it's a 'cache', but the important point is that you
don't destroy the state of the secondary until you are sure that you can
read the whole state from the primary, just in case the primary fails
in the middle of sending the state.

However, other than that:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

(I suspect you'll need updates as the qemu migration code updates)

Dave

> Note: If we discard qemu_system_reset(), there will be some odd errors.
> For example, qemu on the slave side crashes and reports:
> 
> KVM: entry failed, hardware error 0x7
> EAX=00000000 EBX=0000e000 ECX=00009578 EDX=0000434f
> ESI=0000fc10 EDI=0000434f EBP=00000000 ESP=00001fca
> EIP=00009594 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0040 00000400 0000ffff 00009300
> CS =f000 000f0000 0000ffff 00009b00
> SS =434f 000434f0 0000ffff 00009300
> DS =434f 000434f0 0000ffff 00009300
> FS =0000 00000000 0000ffff 00009300
> GS =0000 00000000 0000ffff 00009300
> LDT=0000 00000000 0000ffff 00008200
> TR =0000 00000000 0000ffff 00008b00
> GDT=     0002dcc8 00000047
> IDT=     00000000 0000ffff
> CR0=00000010 CR2=ffffffff CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=c0 74 0f 66 b9 78 95 00 00 66 31 d2 66 31 c0 e9 47 e0 fb 90 <f3> 90 fa fc 66 c3 66 53 66 89 c3 66 e8 9d e8 ff ff 66 01 c3 66 89 d8 66 e8 40 e9 ff ff 66
> ERROR: invalid runstate transition: 'internal-error' -> 'colo'
> 
> The reason is that some device state fields are skipped when saving the device state
> to the slave if the corresponding data still holds its initial value, such as 0.
> But the matching fields on the slave may no longer hold their initial values, so after
> a checkpoint cycle the device state values become inconsistent.
> This can happen when the PVM reboots or the SVM runs ahead of the PVM during startup.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  migration/colo.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 50 insertions(+), 3 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 39cd698..0f61786 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -309,8 +309,10 @@ void *colo_process_incoming_checkpoints(void *opaque)
>      struct colo_incoming *colo_in = opaque;
>      QEMUFile *f = colo_in->file;
>      int fd = qemu_get_fd(f);
> -    QEMUFile *ctl = NULL;
> +    QEMUFile *ctl = NULL, *fb = NULL;
>      int ret;
> +    uint64_t total_size;
> +
>      colo = qemu_coroutine_self();
>      assert(colo != NULL);
>  
> @@ -325,10 +327,17 @@ void *colo_process_incoming_checkpoints(void *opaque)
>          goto out;
>      }
>  
> +    colo_buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
> +    if (colo_buffer == NULL) {
> +        error_report("Failed to allocate colo buffer!");
> +        goto out;
> +    }
> +
>      ret = colo_ctl_put(ctl, COLO_CHECPOINT_READY);
>      if (ret < 0) {
>          goto out;
>      }
> +
>      qemu_mutex_lock_iothread();
>      /* in COLO mode, slave is runing, so start the vm */
>      vm_start();
> @@ -364,7 +373,18 @@ void *colo_process_incoming_checkpoints(void *opaque)
>          }
>          trace_colo_receive_message("COLO_CHECKPOINT_SEND");
>  
> -        /*TODO Load VM state */
> +        /* read the VM state total size first */
> +        ret = colo_ctl_get_value(f, &total_size);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +
> +        /* read vm device state into colo buffer */
> +        ret = qsb_fill_buffer(colo_buffer, f, total_size);
> +        if (ret != total_size) {
> +            error_report("can't get all migration data");
> +            goto out;
> +        }
>  
>          ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
>          if (ret < 0) {
> @@ -372,6 +392,22 @@ void *colo_process_incoming_checkpoints(void *opaque)
>          }
>          trace_colo_receive_message("COLO_CHECKPOINT_RECEIVED");
>  
> +        /* open colo buffer for read */
> +        fb = qemu_bufopen("r", colo_buffer);
> +        if (!fb) {
> +            error_report("can't open colo buffer for read");
> +            goto out;
> +        }
> +
> +        qemu_mutex_lock_iothread();
> +        qemu_system_reset(VMRESET_SILENT);
> +        if (qemu_loadvm_state(fb) < 0) {
> +            error_report("COLO: loadvm failed");
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +        qemu_mutex_unlock_iothread();
> +
>          /* TODO: flush vm state */
>  
>          ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
> @@ -384,14 +420,25 @@ void *colo_process_incoming_checkpoints(void *opaque)
>          vm_start();
>          qemu_mutex_unlock_iothread();
>          trace_colo_vm_state_change("stop", "start");
> -}
> +
> +        qemu_fclose(fb);
> +        fb = NULL;
> +    }
>  
>  out:
>      colo = NULL;
> +
> +    if (fb) {
> +        qemu_fclose(fb);
> +    }
> +
>      release_ram_cache();
>      if (ctl) {
>          qemu_fclose(ctl);
>      }
> +
> +    qsb_free(colo_buffer);
> +
>      loadvm_exit_colo();
>  
>      return NULL;
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command
  2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command zhanghailiang
@ 2015-06-05 18:45   ` Dr. David Alan Gilbert
  2015-06-09  3:28     ` zhanghailiang
  0 siblings, 1 reply; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-06-05 18:45 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, amit.shah, david

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> With this command, we can control the period of the periodic checkpoint
> that is taken when no network packet miscomparison triggers one.

This should use the MigrationParameter stuff that's gone into qemu recently;
in my local copy of your code I've got this, and your COLO_MIN period, and the 
delay after a miscompare and your live-ram size threshold all wired in as
MigrationParameters; makes it a lot easier to play with the values.

Dave

> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
>  hmp-commands.hx        | 15 +++++++++++++++
>  hmp.c                  |  7 +++++++
>  hmp.h                  |  1 +
>  migration/colo.c       | 11 ++++++++++-
>  qapi-schema.json       | 13 +++++++++++++
>  qmp-commands.hx        | 22 ++++++++++++++++++++++
>  stubs/migration-colo.c |  4 ++++
>  7 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index be3e398..32cd548 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1023,6 +1023,21 @@ Tell COLO that heartbeat is lost, a failover or takeover is needed.
>  ETEXI
>  
>      {
> +        .name       = "colo_set_checkpoint_period",
> +        .args_type  = "value:i",
> +        .params     = "value",
> +        .help       = "set checkpoint period (in ms) for colo. "
> +        "Defaults to 100ms",
> +        .mhandler.cmd = hmp_colo_set_checkpoint_period,
> +    },
> +
> +STEXI
> +@item migrate_set_checkpoint_period @var{value}
> +@findex migrate_set_checkpoint_period
> +Set checkpoint period to @var{value} (in ms) for colo.
> +ETEXI
> +
> +    {
>          .name       = "client_migrate_info",
>          .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
>          .params     = "protocol hostname port tls-port cert-subject",
> diff --git a/hmp.c b/hmp.c
> index f87fa37..f727686 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1257,6 +1257,13 @@ void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict)
>      hmp_handle_error(mon, &err);
>  }
>  
> +void hmp_colo_set_checkpoint_period(Monitor *mon, const QDict *qdict)
> +{
> +    int64_t value = qdict_get_int(qdict, "value");
> +
> +    qmp_colo_set_checkpoint_period(value, NULL);
> +}
> +
>  void hmp_set_password(Monitor *mon, const QDict *qdict)
>  {
>      const char *protocol  = qdict_get_str(qdict, "protocol");
> diff --git a/hmp.h b/hmp.h
> index b6549f8..9570345 100644
> --- a/hmp.h
> +++ b/hmp.h
> @@ -68,6 +68,7 @@ void hmp_migrate_set_capability(Monitor *mon, const QDict *qdict);
>  void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
>  void hmp_migrate_set_cache_size(Monitor *mon, const QDict *qdict);
>  void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
> +void hmp_colo_set_checkpoint_period(Monitor *mon, const QDict *qdict);
>  void hmp_set_password(Monitor *mon, const QDict *qdict);
>  void hmp_expire_password(Monitor *mon, const QDict *qdict);
>  void hmp_eject(Monitor *mon, const QDict *qdict);
> diff --git a/migration/colo.c b/migration/colo.c
> index 195973a..f5fc79c 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -17,6 +17,7 @@
>  #include "qemu/error-report.h"
>  #include "migration/migration-failover.h"
>  #include "net/colo-nic.h"
> +#include "qmp-commands.h"
>  
>  /*
>  * We should not do checkpoint one after another without any time interval,
> @@ -70,6 +71,9 @@ enum {
>  static QEMUBH *colo_bh;
>  static bool vmstate_loading;
>  static Coroutine *colo;
> +
> +int64_t colo_checkpoint_period = CHECKPOINT_MAX_PEROID;
> +
>  /* colo buffer */
>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>  QEMUSizedBuffer *colo_buffer;
> @@ -85,6 +89,11 @@ bool migrate_in_colo_state(void)
>      return (s->state == MIGRATION_STATUS_COLO);
>  }
>  
> +void qmp_colo_set_checkpoint_period(int64_t value, Error **errp)
> +{
> +    colo_checkpoint_period = value;
> +}
> +
>  static bool colo_runstate_is_stopped(void)
>  {
>      return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
> @@ -361,7 +370,7 @@ static void *colo_thread(void *opaque)
>           * and then check if we need checkpoint again.
>           */
>          current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> -        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
> +        if (current_time - checkpoint_time < colo_checkpoint_period) {
>              g_usleep(100000);
>              continue;
>          }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index dc0ee07..62b5cfd 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -653,6 +653,19 @@
>  { 'command': 'colo-lost-heartbeat' }
>  
>  ##
> +# @colo-set-checkpoint-period
> +#
> +# Set colo checkpoint period
> +#
> +# @value: period of colo checkpoint in ms
> +#
> +# Returns: nothing on success
> +#
> +# Since: 2.4
> +##
> +{ 'command': 'colo-set-checkpoint-period', 'data': {'value': 'int'} }
> +
> +##
>  # @MouseInfo:
>  #
>  # Information about a mouse device.
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 3813f66..4b16044 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -800,6 +800,28 @@ Example:
>  EQMP
>  
>      {
> +         .name       = "colo-set-checkpoint-period",
> +         .args_type  = "value:i",
> +         .mhandler.cmd_new = qmp_marshal_input_colo_set_checkpoint_period,
> +    },
> +
> +SQMP
> +colo-set-checkpoint-period
> +--------------------------
> +
> +set checkpoint period
> +
> +Arguments:
> +- "value": checkpoint period
> +
> +Example:
> +
> +-> { "execute": "colo-set-checkpoint-period", "arguments": { "value": 1000 } }
> +<- { "return": {} }
> +
> +EQMP
> +
> +    {
>          .name       = "client_migrate_info",
>          .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
>          .params     = "protocol hostname port tls-port cert-subject",
> diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
> index 03a395b..d3c9dc4 100644
> --- a/stubs/migration-colo.c
> +++ b/stubs/migration-colo.c
> @@ -52,3 +52,7 @@ void qmp_colo_lost_heartbeat(Error **errp)
>                       " with --enable-colo option in order to support"
>                       " COLO feature");
>  }
> +
> +void qmp_colo_set_checkpoint_period(int64_t value, Error **errp)
> +{
> +}
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 11/29] COLO VMstate: Load VM state into qsb before restore it
  2015-06-05 18:02   ` Dr. David Alan Gilbert
@ 2015-06-09  2:19     ` zhanghailiang
  0 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-06-09  2:19 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, amit.shah, Yang Hongyang, david

On 2015/6/6 2:02, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We should cache the device state before restoring it;
>> besides, we should call qemu_system_reset() before loading the VM state,
>> which ensures the data is intact.
>
> I think the description could be better; to me the important
> point is not that it's a 'cache', but the important point is that you
> don't destroy the state of the secondary until you are sure that you can
> read the whole state from the primary, just in case the primary fails
> in the middle of sending the state.
>

OK, I will fix this description.

> However, other than that:
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> (I suspect you'll need updates as the qemu migration code updates)
>

Yes, thanks.

> Dave
>
>> Note: If we discard qemu_system_reset(), there will be some odd errors.
>> For example, QEMU on the slave side crashes and reports:
>>
>> KVM: entry failed, hardware error 0x7
>> EAX=00000000 EBX=0000e000 ECX=00009578 EDX=0000434f
>> ESI=0000fc10 EDI=0000434f EBP=00000000 ESP=00001fca
>> EIP=00009594 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0040 00000400 0000ffff 00009300
>> CS =f000 000f0000 0000ffff 00009b00
>> SS =434f 000434f0 0000ffff 00009300
>> DS =434f 000434f0 0000ffff 00009300
>> FS =0000 00000000 0000ffff 00009300
>> GS =0000 00000000 0000ffff 00009300
>> LDT=0000 00000000 0000ffff 00008200
>> TR =0000 00000000 0000ffff 00008b00
>> GDT=     0002dcc8 00000047
>> IDT=     00000000 0000ffff
>> CR0=00000010 CR2=ffffffff CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=c0 74 0f 66 b9 78 95 00 00 66 31 d2 66 31 c0 e9 47 e0 fb 90 <f3> 90 fa fc 66 c3 66 53 66 89 c3 66 e8 9d e8 ff ff 66 01 c3 66 89 d8 66 e8 40 e9 ff ff 66
>> ERROR: invalid runstate transition: 'internal-error' -> 'colo'
>>
>> The reason is that some of the device state will be ignored when saving device state to the slave
>> if the corresponding data is at its initial value, such as 0.
>> But the device state on the slave may no longer be at its initial value, so after a loop of
>> checkpoints the device state values will become inconsistent.
>> This happens when the PVM reboots or the SVM runs ahead of the PVM during the startup process.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>>   migration/colo.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++---
>>   1 file changed, 50 insertions(+), 3 deletions(-)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 39cd698..0f61786 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -309,8 +309,10 @@ void *colo_process_incoming_checkpoints(void *opaque)
>>       struct colo_incoming *colo_in = opaque;
>>       QEMUFile *f = colo_in->file;
>>       int fd = qemu_get_fd(f);
>> -    QEMUFile *ctl = NULL;
>> +    QEMUFile *ctl = NULL, *fb = NULL;
>>       int ret;
>> +    uint64_t total_size;
>> +
>>       colo = qemu_coroutine_self();
>>       assert(colo != NULL);
>>
>> @@ -325,10 +327,17 @@ void *colo_process_incoming_checkpoints(void *opaque)
>>           goto out;
>>       }
>>
>> +    colo_buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
>> +    if (colo_buffer == NULL) {
>> +        error_report("Failed to allocate colo buffer!");
>> +        goto out;
>> +    }
>> +
>>       ret = colo_ctl_put(ctl, COLO_CHECPOINT_READY);
>>       if (ret < 0) {
>>           goto out;
>>       }
>> +
>>       qemu_mutex_lock_iothread();
>>       /* in COLO mode, slave is runing, so start the vm */
>>       vm_start();
>> @@ -364,7 +373,18 @@ void *colo_process_incoming_checkpoints(void *opaque)
>>           }
>>           trace_colo_receive_message("COLO_CHECKPOINT_SEND");
>>
>> -        /*TODO Load VM state */
>> +        /* read the VM state total size first */
>> +        ret = colo_ctl_get_value(f, &total_size);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +
>> +        /* read vm device state into colo buffer */
>> +        ret = qsb_fill_buffer(colo_buffer, f, total_size);
>> +        if (ret != total_size) {
>> +            error_report("can't get all migration data");
>> +            goto out;
>> +        }
>>
>>           ret = colo_ctl_put(ctl, COLO_CHECKPOINT_RECEIVED);
>>           if (ret < 0) {
>> @@ -372,6 +392,22 @@ void *colo_process_incoming_checkpoints(void *opaque)
>>           }
>>           trace_colo_receive_message("COLO_CHECKPOINT_RECEIVED");
>>
>> +        /* open colo buffer for read */
>> +        fb = qemu_bufopen("r", colo_buffer);
>> +        if (!fb) {
>> +            error_report("can't open colo buffer for read");
>> +            goto out;
>> +        }
>> +
>> +        qemu_mutex_lock_iothread();
>> +        qemu_system_reset(VMRESET_SILENT);
>> +        if (qemu_loadvm_state(fb) < 0) {
>> +            error_report("COLO: loadvm failed");
>> +            qemu_mutex_unlock_iothread();
>> +            goto out;
>> +        }
>> +        qemu_mutex_unlock_iothread();
>> +
>>           /* TODO: flush vm state */
>>
>>           ret = colo_ctl_put(ctl, COLO_CHECKPOINT_LOADED);
>> @@ -384,14 +420,25 @@ void *colo_process_incoming_checkpoints(void *opaque)
>>           vm_start();
>>           qemu_mutex_unlock_iothread();
>>           trace_colo_vm_state_change("stop", "start");
>> -}
>> +
>> +        qemu_fclose(fb);
>> +        fb = NULL;
>> +    }
>>
>>   out:
>>       colo = NULL;
>> +
>> +    if (fb) {
>> +        qemu_fclose(fb);
>> +    }
>> +
>>       release_ram_cache();
>>       if (ctl) {
>>           qemu_fclose(ctl);
>>       }
>> +
>> +    qsb_free(colo_buffer);
>> +
>>       loadvm_exit_colo();
>>
>>       return NULL;
>> --
>> 1.7.12.4
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command
  2015-06-05 18:45   ` Dr. David Alan Gilbert
@ 2015-06-09  3:28     ` zhanghailiang
  2015-06-09  8:01       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 62+ messages in thread
From: zhanghailiang @ 2015-06-09  3:28 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: liang.z.li, lizhijian, quintela, yunhong.jiang, eddie.dong,
	peter.huangpeng, qemu-devel, arei.gonglei, amit.shah, david

On 2015/6/6 2:45, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> With this command, we can control the checkpoint period when
>> there is no comparison of network packets.
>
> This should use the MigrationParameter stuff that's gone into qemu recently;
> in my local copy of your code I've got this, and your COLO_MIN period, and the
> delay after a miscompare and your live-ram size threshold all wired in as
> MigrationParameters; makes it a lot easier to play with the values.

Yes, it is a good idea to use the new command 'migrate-set-parameters' to set all the COLO-related
parameters, but I noticed that this new command was custom-built for compression, and is not well designed for extension.
The QMP API is defined as below:
void qmp_migrate_set_parameters(bool has_compress_level,
                                 int64_t compress_level,
                                 bool has_compress_threads,
                                 int64_t compress_threads,
                                 bool has_decompress_threads,
                                 int64_t decompress_threads, Error **errp)
Maybe we should change it to something like:

void qmp_migrate_set_parameters(bool has_compress_info,
                                 struct compress_info compress,
				Error **errp)

I will try to fix it like that, and then use it in the COLO.

Thanks,
zhanghailiang


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command
  2015-06-09  3:28     ` zhanghailiang
@ 2015-06-09  8:01       ` Dr. David Alan Gilbert
  2015-06-09 10:14         ` zhanghailiang
  0 siblings, 1 reply; 62+ messages in thread
From: Dr. David Alan Gilbert @ 2015-06-09  8:01 UTC (permalink / raw)
  To: zhanghailiang
  Cc: liang.z.li, lizhijian, quintela, yunhong.jiang, eddie.dong,
	peter.huangpeng, qemu-devel, arei.gonglei, amit.shah, david

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> On 2015/6/6 2:45, Dr. David Alan Gilbert wrote:
> >* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>With this command, we can control the checkpoint period when
> >>there is no comparison of network packets.
> >
> >This should use the MigrationParameter stuff that's gone into qemu recently;
> >in my local copy of your code I've got this, and your COLO_MIN period, and the
> >delay after a miscompare and your live-ram size threshold all wired in as
> >MigrationParameters; makes it a lot easier to play with the values.
> 
> Yes, it is a good idea to use the new command 'migrate-set-parameters' to set all the COLO-related
> parameters, but I noticed that this new command was custom-built for compression, and is not well designed for extension.
> The QMP API is defined as below:
> void qmp_migrate_set_parameters(bool has_compress_level,
>                                 int64_t compress_level,
>                                 bool has_compress_threads,
>                                 int64_t compress_threads,
>                                 bool has_decompress_threads,
>                                 int64_t decompress_threads, Error **errp)

Yes, I don't like it.

> Maybe we should change it like:
> 
> void qmp_migrate_set_parameters(bool has_compress_info,
>                                 struct compress_info compress,
> 				Error **errp)
> 
> I will try to fix it like that, and then use it in the COLO.

See my thread with Markus and Eric here:
https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg01709.html

Dave

> 
> Thanks,
> zhanghailiang
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command
  2015-06-09  8:01       ` Dr. David Alan Gilbert
@ 2015-06-09 10:14         ` zhanghailiang
  0 siblings, 0 replies; 62+ messages in thread
From: zhanghailiang @ 2015-06-09 10:14 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: liang.z.li, lizhijian, quintela, yunhong.jiang, eddie.dong,
	peter.huangpeng, qemu-devel, arei.gonglei, amit.shah, david

On 2015/6/9 16:01, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/6/6 2:45, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> With this command, we can control the checkpoint period when
>>>> there is no comparison of network packets.
>>>
>>> This should use the MigrationParameter stuff that's gone into qemu recently;
>>> in my local copy of your code I've got this, and your COLO_MIN period, and the
>>> delay after a miscompare and your live-ram size threshold all wired in as
>>> MigrationParameters; makes it a lot easier to play with the values.
>>
>> Yes, it is a good idea to use the new command 'migrate-set-parameters' to set all the COLO-related
>> parameters, but I noticed that this new command was custom-built for compression, and is not well designed for extension.
>> The QMP API is defined as below:
>> void qmp_migrate_set_parameters(bool has_compress_level,
>>                                  int64_t compress_level,
>>                                  bool has_compress_threads,
>>                                  int64_t compress_threads,
>>                                  bool has_decompress_threads,
>>                                  int64_t decompress_threads, Error **errp)
>
> Yes, I don't like it.
>
>> Maybe we should change it like:
>>
>> void qmp_migrate_set_parameters(bool has_compress_info,
>>                                  struct compress_info compress,
>> 				Error **errp)
>>
>> I will try to fix it like that, and then use it in the COLO.
>
> See my thread with Markus and Eric here:
> https://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg01709.html

Er, I didn't notice this discussion, and I have already sent an RFC patch to the community;
please ignore it. Thanks, :)

>>
>> Thanks,
>> zhanghailiang
>>
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> ---
>>>>   hmp-commands.hx        | 15 +++++++++++++++
>>>>   hmp.c                  |  7 +++++++
>>>>   hmp.h                  |  1 +
>>>>   migration/colo.c       | 11 ++++++++++-
>>>>   qapi-schema.json       | 13 +++++++++++++
>>>>   qmp-commands.hx        | 22 ++++++++++++++++++++++
>>>>   stubs/migration-colo.c |  4 ++++
>>>>   7 files changed, 72 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hmp-commands.hx b/hmp-commands.hx
>>>> index be3e398..32cd548 100644
>>>> --- a/hmp-commands.hx
>>>> +++ b/hmp-commands.hx
>>>> @@ -1023,6 +1023,21 @@ Tell COLO that heartbeat is lost, a failover or takeover is needed.
>>>>   ETEXI
>>>>
>>>>       {
>>>> +        .name       = "colo_set_checkpoint_period",
>>>> +        .args_type  = "value:i",
>>>> +        .params     = "value",
>>>> +        .help       = "set checkpoint period (in ms) for colo. "
>>>> +        "Defaults to 100ms",
>>>> +        .mhandler.cmd = hmp_colo_set_checkpoint_period,
>>>> +    },
>>>> +
>>>> +STEXI
>>>> +@item colo_set_checkpoint_period @var{value}
>>>> +@findex colo_set_checkpoint_period
>>>> +Set checkpoint period to @var{value} (in ms) for colo.
>>>> +ETEXI
>>>> +
>>>> +    {
>>>>           .name       = "client_migrate_info",
>>>>           .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
>>>>           .params     = "protocol hostname port tls-port cert-subject",
>>>> diff --git a/hmp.c b/hmp.c
>>>> index f87fa37..f727686 100644
>>>> --- a/hmp.c
>>>> +++ b/hmp.c
>>>> @@ -1257,6 +1257,13 @@ void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict)
>>>>       hmp_handle_error(mon, &err);
>>>>   }
>>>>
>>>> +void hmp_colo_set_checkpoint_period(Monitor *mon, const QDict *qdict)
>>>> +{
>>>> +    int64_t value = qdict_get_int(qdict, "value");
>>>> +
>>>> +    qmp_colo_set_checkpoint_period(value, NULL);
>>>> +}
>>>> +
>>>>   void hmp_set_password(Monitor *mon, const QDict *qdict)
>>>>   {
>>>>       const char *protocol  = qdict_get_str(qdict, "protocol");
>>>> diff --git a/hmp.h b/hmp.h
>>>> index b6549f8..9570345 100644
>>>> --- a/hmp.h
>>>> +++ b/hmp.h
>>>> @@ -68,6 +68,7 @@ void hmp_migrate_set_capability(Monitor *mon, const QDict *qdict);
>>>>   void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
>>>>   void hmp_migrate_set_cache_size(Monitor *mon, const QDict *qdict);
>>>>   void hmp_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
>>>> +void hmp_colo_set_checkpoint_period(Monitor *mon, const QDict *qdict);
>>>>   void hmp_set_password(Monitor *mon, const QDict *qdict);
>>>>   void hmp_expire_password(Monitor *mon, const QDict *qdict);
>>>>   void hmp_eject(Monitor *mon, const QDict *qdict);
>>>> diff --git a/migration/colo.c b/migration/colo.c
>>>> index 195973a..f5fc79c 100644
>>>> --- a/migration/colo.c
>>>> +++ b/migration/colo.c
>>>> @@ -17,6 +17,7 @@
>>>>   #include "qemu/error-report.h"
>>>>   #include "migration/migration-failover.h"
>>>>   #include "net/colo-nic.h"
>>>> +#include "qmp-commands.h"
>>>>
>>>>   /*
>>>>   * We should not do checkpoint one after another without any time interval,
>>>> @@ -70,6 +71,9 @@ enum {
>>>>   static QEMUBH *colo_bh;
>>>>   static bool vmstate_loading;
>>>>   static Coroutine *colo;
>>>> +
>>>> +int64_t colo_checkpoint_period = CHECKPOINT_MAX_PEROID;
>>>> +
>>>>   /* colo buffer */
>>>>   #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>>>>   QEMUSizedBuffer *colo_buffer;
>>>> @@ -85,6 +89,11 @@ bool migrate_in_colo_state(void)
>>>>       return (s->state == MIGRATION_STATUS_COLO);
>>>>   }
>>>>
>>>> +void qmp_colo_set_checkpoint_period(int64_t value, Error **errp)
>>>> +{
>>>> +    colo_checkpoint_period = value;
>>>> +}
>>>> +
>>>>   static bool colo_runstate_is_stopped(void)
>>>>   {
>>>>       return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
>>>> @@ -361,7 +370,7 @@ static void *colo_thread(void *opaque)
>>>>            * and then check if we need checkpoint again.
>>>>            */
>>>>           current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>>> -        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
>>>> +        if (current_time - checkpoint_time < colo_checkpoint_period) {
>>>>               g_usleep(100000);
>>>>               continue;
>>>>           }
>>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>>> index dc0ee07..62b5cfd 100644
>>>> --- a/qapi-schema.json
>>>> +++ b/qapi-schema.json
>>>> @@ -653,6 +653,19 @@
>>>>   { 'command': 'colo-lost-heartbeat' }
>>>>
>>>>   ##
>>>> +# @colo-set-checkpoint-period
>>>> +#
>>>> +# Set colo checkpoint period
>>>> +#
>>>> +# @value: period of colo checkpoint in ms
>>>> +#
>>>> +# Returns: nothing on success
>>>> +#
>>>> +# Since: 2.4
>>>> +##
>>>> +{ 'command': 'colo-set-checkpoint-period', 'data': {'value': 'int'} }
>>>> +
>>>> +##
>>>>   # @MouseInfo:
>>>>   #
>>>>   # Information about a mouse device.
>>>> diff --git a/qmp-commands.hx b/qmp-commands.hx
>>>> index 3813f66..4b16044 100644
>>>> --- a/qmp-commands.hx
>>>> +++ b/qmp-commands.hx
>>>> @@ -800,6 +800,28 @@ Example:
>>>>   EQMP
>>>>
>>>>       {
>>>> +         .name       = "colo-set-checkpoint-period",
>>>> +         .args_type  = "value:i",
>>>> +         .mhandler.cmd_new = qmp_marshal_input_colo_set_checkpoint_period,
>>>> +    },
>>>> +
>>>> +SQMP
>>>> +colo-set-checkpoint-period
>>>> +--------------------------
>>>> +
>>>> +set checkpoint period
>>>> +
>>>> +Arguments:
>>>> +- "value": checkpoint period
>>>> +
>>>> +Example:
>>>> +
>>>> +-> { "execute": "colo-set-checkpoint-period", "arguments": { "value": 1000 } }
>>>> +<- { "return": {} }
>>>> +
>>>> +EQMP
>>>> +
>>>> +    {
>>>>           .name       = "client_migrate_info",
>>>>           .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
>>>>           .params     = "protocol hostname port tls-port cert-subject",
>>>> diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
>>>> index 03a395b..d3c9dc4 100644
>>>> --- a/stubs/migration-colo.c
>>>> +++ b/stubs/migration-colo.c
>>>> @@ -52,3 +52,7 @@ void qmp_colo_lost_heartbeat(Error **errp)
>>>>                        " with --enable-colo option in order to support"
>>>>                        " COLO feature");
>>>>   }
>>>> +
>>>> +void qmp_colo_set_checkpoint_period(int64_t value, Error **errp)
>>>> +{
>>>> +}
>>>> --
>>>> 1.7.12.4
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

Thread overview: 62+ messages
2015-05-21  8:12 [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] " zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 01/29] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 02/29] migration: Introduce capability 'colo' to migration zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 03/29] COLO: migrate colo related info to slave zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 04/29] migration: Integrate COLO checkpoint process into migration zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 05/29] migration: Integrate COLO checkpoint process into loadvm zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 06/29] COLO: Implement colo checkpoint protocol zhanghailiang
2015-05-21  8:12 ` [Qemu-devel] [PATCH COLO-Frame v5 07/29] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 08/29] QEMUSizedBuffer: Introduce two help functions for qsb zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 09/29] COLO: Save VM state to slave when do checkpoint zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 10/29] COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 11/29] COLO VMstate: Load VM state into qsb before restore it zhanghailiang
2015-06-05 18:02   ` Dr. David Alan Gilbert
2015-06-09  2:19     ` zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 12/29] arch_init: Start to trace dirty pages of SVM zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 13/29] COLO RAM: Flush cached RAM into SVM's memory zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 14/29] COLO failover: Introduce a new command to trigger a failover zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 15/29] COLO failover: Implement COLO master/slave failover work zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 16/29] COLO failover: Don't do failover during loading VM's state zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 17/29] COLO: Add new command parameter 'colo_nicname' 'colo_script' for net zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 18/29] COLO NIC: Init/remove colo nic devices when add/cleanup tap devices zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 19/29] COLO NIC: Implement colo nic device interface configure() zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 20/29] COLO NIC : Implement colo nic init/destroy function zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 21/29] COLO NIC: Some init work related with proxy module zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 22/29] COLO: Handle nfnetlink message from " zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 23/29] COLO: Do checkpoint according to the result of packets comparation zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 24/29] COLO: Improve checkpoint efficiency by do additional periodic checkpoint zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 25/29] COLO: Add colo-set-checkpoint-period command zhanghailiang
2015-06-05 18:45   ` Dr. David Alan Gilbert
2015-06-09  3:28     ` zhanghailiang
2015-06-09  8:01       ` Dr. David Alan Gilbert
2015-06-09 10:14         ` zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 26/29] COLO NIC: Implement NIC checkpoint and failover zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 27/29] COLO: Disable qdev hotplug when VM is in COLO mode zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 28/29] COLO: Implement shutdown checkpoint zhanghailiang
2015-05-21  8:13 ` [Qemu-devel] [PATCH COLO-Frame v5 29/29] COLO: Add block replication into colo process zhanghailiang
2015-05-21 11:30 ` [PATCH COLO-Frame v5 00/29] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service Dr. David Alan Gilbert
2015-05-21 11:30   ` [Qemu-devel] " Dr. David Alan Gilbert
2015-05-22  6:26   ` zhanghailiang
2015-05-22  6:26     ` [Qemu-devel] " zhanghailiang
2015-05-28 16:24 ` Dr. David Alan Gilbert
2015-05-28 16:24   ` [Qemu-devel] " Dr. David Alan Gilbert
2015-05-29  1:29   ` Wen Congyang
2015-05-29  1:29     ` [Qemu-devel] " Wen Congyang
2015-05-29  8:01     ` Dr. David Alan Gilbert
2015-05-29  8:01       ` [Qemu-devel] " Dr. David Alan Gilbert
2015-05-29  8:06     ` zhanghailiang
2015-05-29  8:06       ` [Qemu-devel] " zhanghailiang
2015-05-29  8:42       ` Dr. David Alan Gilbert
2015-05-29  8:42         ` [Qemu-devel] " Dr. David Alan Gilbert
2015-05-29 12:34         ` Wen Congyang
2015-05-29 15:12           ` Dr. David Alan Gilbert
2015-05-29 15:12             ` [Qemu-devel] " Dr. David Alan Gilbert
2015-06-01  1:41         ` Wen Congyang
2015-06-01  1:41           ` [Qemu-devel] " Wen Congyang
2015-06-01  9:16           ` Dr. David Alan Gilbert
2015-06-01  9:16             ` [Qemu-devel] " Dr. David Alan Gilbert
2015-06-02  3:51   ` Wen Congyang
2015-06-02  3:51     ` [Qemu-devel] " Wen Congyang
2015-06-02  8:02     ` Dr. David Alan Gilbert
2015-06-02  8:02       ` [Qemu-devel] " Dr. David Alan Gilbert
