* [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
@ 2015-11-03 11:56 zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
                   ` (37 more replies)
  0 siblings, 38 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

This is the 10th version of COLO.

This version of COLO still supports only periodic checkpointing,
just as MicroCheckpointing and Remus do. We call this 'periodic' mode.
The normal 'colo' mode is based on a packet compare module, which
is not supported for now. The compare module, 'proxy', is still under
development.

The 'periodic' mode is based on netfilter, which has been merged.
It uses 'filter-buffer' to buffer and release packets.
Patches 32 to 36 export several APIs for netfilter to support
this capability.
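
As a rough illustration of the buffer-and-release idea, here is a minimal
sketch in C. It is not the filter-buffer code from this series: the
SketchBuffer type, the sketch_* function names and the fixed queue size are
all hypothetical, and it assumes that held packets are released once a
checkpoint has completed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_QUEUE_CAP 8   /* hypothetical fixed capacity */

/* Hypothetical model of a per-netdev packet buffer. */
typedef struct {
    int packets[SKETCH_QUEUE_CAP]; /* ids of packets held back so far */
    size_t held;                   /* number of packets currently held */
    size_t released_total;         /* packets released over all checkpoints */
} SketchBuffer;

/* Hold an outgoing packet; returns 0 on success, -1 if the buffer is full. */
int sketch_buffer_packet(SketchBuffer *b, int packet_id)
{
    assert(b != NULL);
    if (b->held == SKETCH_QUEUE_CAP) {
        return -1;
    }
    b->packets[b->held++] = packet_id;
    return 0;
}

/* Assumed to run after a checkpoint: let everything held so far go out. */
size_t sketch_release_all(SketchBuffer *b)
{
    size_t n;

    assert(b != NULL);
    n = b->held;
    b->released_total += n;
    b->held = 0;
    return n;
}
```

A caller would buffer every packet the guest sends between two checkpoints
and call sketch_release_all() once per checkpoint.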

As usual, this series contains only the COLO frame part; you can get the complete code from GitHub:
https://github.com/coloft/qemu/commits/colo-v2.1-periodic-mode

Test procedure:
1. Startup qemu
Primary side:
#x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device virtio-net-pci,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:66 -boot c -drive if=virtio,id=colo-disk1,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,children.0.driver=raw -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -S

Secondary side:
#x86_64-softmmu/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device virtio-net-pci,id=net-pci0,netdev=hn0,mac=52:54:00:12:34:66 -boot c -drive if=none,id=colo-disk1,file.filename=/dev/null,driver=raw -drive if=virtio,id=active-disk1,throttling.bps-total=70000000,driver=replication,mode=secondary,file.driver=qcow2,file.file.filename=/mnt/ramfs/active_disk.img,file.backing.allow-write-backing-file=on,file.backing.driver=qcow2,file.backing.file.filename=/mnt/ramfs/hidden_disk.img,file.backing.backing.allow-write-backing-file=on,file.backing.backing.file.filename=/mnt/sdd/pure_IMG/linux/redhat/rhel_6.5_64_2U_ide,file.backing.backing.driver=raw,file.backing.backing.node-name=sdisk -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -incoming tcp:0:8881

2. On Secondary VM's QEMU monitor, issue command
(qemu) blockdev_remove_medium colo-disk1
(qemu) blockdev_insert_medium colo-disk1 sdisk
(qemu) nbd_server_start 192.168.2.88:8880
(qemu) nbd_server_add -w colo-disk1

3. On Primary VM's QEMU monitor, issue command:
(qemu) drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=192.168.2.88,file.port=8880,file.export=colo-disk1,node-name=test3,if=none
(qemu) blockdev_change add colo-disk1 buddy test3
(qemu) migrate_set_capability x-colo on
(qemu) migrate tcp:192.168.2.88:8881

4. After the above steps, whenever you make changes to the PVM, the SVM will be synced.
You can issue the command "migrate_set_parameter checkpoint-delay 2000"
to change the checkpoint period.

5. Failover test
You can kill the Primary VM and run 'x_colo_lost_heartbeat' in the Secondary VM's
monitor at the same time; the SVM will then fail over and the client will not
notice the change.

COLO is a totally new feature which is still at an early stage;
your comments and feedback are warmly welcomed.

TODO:
1. implement packets compare module (proxy) in qemu
2. checkpoint based on proxy in qemu
3. The capability of continuous FT

v10:
 - Rename 'colo_lost_heartbeat' command to experimental 'x_colo_lost_heartbeat'
 - Rename migration capability 'colo' to 'x-colo' (Eric's suggestion)
 - Simplify the process of primary side by dropping colo thread and reusing
   migration thread. (Dave's suggestion)
 - Add several netfilter related APIs to support buffer/release packets
   for COLO (patch 32 ~ patch 36)

zhanghailiang (38):
  configure: Add parameter for configure to enable/disable COLO support
  migration: Introduce capability 'x-colo' to migration
  COLO: migrate colo related info to secondary node
  migration: Add state records for migration incoming
  migration: Integrate COLO checkpoint process into migration
  migration: Integrate COLO checkpoint process into loadvm
  migration: Rename the'file' member of MigrationState and
    MigrationIncomingState
  COLO/migration: establish a new communication path from destination to
    source
  COLO: Implement colo checkpoint protocol
  COLO: Add a new RunState RUN_STATE_COLO
  QEMUSizedBuffer: Introduce two help functions for qsb
  COLO: Save PVM state to secondary side when do checkpoint
  COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  COLO: Load VMState into qsb before restore it
  ram/COLO: Record pages received from PVM by re-using migration dirty
    bitmap
  COLO: Flush PVM's cached RAM into SVM's memory
  COLO: synchronize PVM's state to SVM periodically
  COLO failover: Introduce a new command to trigger a failover
  COLO failover: Introduce state to record failover process
  COLO: Implement failover work for Primary VM
  COLO: Implement failover work for Secondary VM
  COLO: implement default failover treatment
  qmp event: Add event notification for COLO error
  COLO failover: Shutdown related socket fd when do failover
  COLO failover: Don't do failover during loading VM's state
  COLO: Control the checkpoint delay time by migrate-set-parameters
    command
  COLO: Process shutdown command for VM in COLO state
  COLO: Update the global runstate after going into colo state
  savevm: Split load vm state function qemu_loadvm_state
  COLO: Separate the process of saving/loading ram and device state
  COLO: Split qemu_savevm_state_begin out of checkpoint process
  netfilter: Add a public API to release all the buffered packets
  netfilter: Introduce an API to delete the timer of all buffer-filters
  filter-buffer: Accept zero interval
  netfilter: Introduce a API to automatically add filter-buffer for each
    netdev
  netfilter: Introduce an API to delete all the automatically added
    netfilters
  colo: Use the netfilter to buffer and release packets
  COLO: Add block replication into colo process

 configure                     |  11 +
 docs/qmp-events.txt           |  17 +
 hmp-commands.hx               |  15 +
 hmp.c                         |  15 +
 hmp.h                         |   1 +
 include/exec/ram_addr.h       |   1 +
 include/migration/colo.h      |  44 +++
 include/migration/failover.h  |  33 ++
 include/migration/migration.h |  17 +-
 include/migration/qemu-file.h |   3 +-
 include/net/filter.h          |   5 +
 include/net/net.h             |   7 +
 include/sysemu/sysemu.h       |   8 +
 migration/Makefile.objs       |   2 +
 migration/colo-comm.c         |  71 ++++
 migration/colo-failover.c     |  83 +++++
 migration/colo.c              | 788 ++++++++++++++++++++++++++++++++++++++++++
 migration/exec.c              |   4 +-
 migration/fd.c                |   4 +-
 migration/migration.c         | 178 +++++++---
 migration/qemu-file-buf.c     |  58 ++++
 migration/ram.c               | 177 +++++++++-
 migration/savevm.c            | 309 +++++++++++++----
 migration/tcp.c               |   4 +-
 migration/unix.c              |   4 +-
 net/filter-buffer.c           | 143 +++++++-
 net/filter.c                  |  15 +
 net/net.c                     |  44 +++
 qapi-schema.json              | 105 +++++-
 qapi/event.json               |  17 +
 qmp-commands.hx               |  22 +-
 stubs/Makefile.objs           |   1 +
 stubs/migration-colo.c        |  45 +++
 trace-events                  |   9 +
 vl.c                          |  37 +-
 35 files changed, 2130 insertions(+), 167 deletions(-)
 create mode 100644 include/migration/colo.h
 create mode 100644 include/migration/failover.h
 create mode 100644 migration/colo-comm.c
 create mode 100644 migration/colo-failover.c
 create mode 100644 migration/colo.c
 create mode 100644 stubs/migration-colo.c

-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-05 14:52   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration zhanghailiang
                   ` (36 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

configure --enable-colo/--disable-colo to switch COLO
support on/off.
COLO support is off by default.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 configure | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/configure b/configure
index 7a1d08d..e339c60 100755
--- a/configure
+++ b/configure
@@ -257,6 +257,7 @@ xfs=""
 vhost_net="no"
 vhost_scsi="no"
 kvm="no"
+colo="no"
 rdma=""
 gprof="no"
 debug_tcg="no"
@@ -929,6 +930,10 @@ for opt do
   ;;
   --enable-kvm) kvm="yes"
   ;;
+  --disable-colo) colo="no"
+  ;;
+  --enable-colo) colo="yes"
+  ;;
   --disable-tcg-interpreter) tcg_interpreter="no"
   ;;
   --enable-tcg-interpreter) tcg_interpreter="yes"
@@ -1353,6 +1358,7 @@ disabled with --disable-FEATURE, default is enabled if available:
   fdt             fdt device tree
   bluez           bluez stack connectivity
   kvm             KVM acceleration support
+  colo            COarse-grain LOck-stepping VM for Non-stop Service
   rdma            RDMA-based migration support
   uuid            uuid support
   vde             support for vde network
@@ -4732,6 +4738,7 @@ echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
 echo "Install blobs     $blobs"
 echo "KVM support       $kvm"
+echo "COLO support      $colo"
 echo "RDMA support      $rdma"
 echo "TCG interpreter   $tcg_interpreter"
 echo "fdt support       $fdt"
@@ -5321,6 +5328,10 @@ if have_backend "ftrace"; then
 fi
 echo "CONFIG_TRACE_FILE=$trace_file" >> $config_host_mak
 
+if test "$colo" = "yes"; then
+  echo "CONFIG_COLO=y" >> $config_host_mak
+fi
+
 if test "$rdma" = "yes" ; then
   echo "CONFIG_RDMA=y" >> $config_host_mak
 fi
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 16:01   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node zhanghailiang
                   ` (35 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, zhanghailiang

We add the helper function colo_supported() to indicate whether
COLO is supported. It controls whether the 'x-colo' string is shown
to users, who can use the QMP command 'query-migrate-capabilities'
or the HMP command 'info migrate_capabilities' to learn whether
COLO is supported.
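
To make the filtering concrete, here is a minimal stand-alone sketch in C.
It is not the QEMU code: the sketch_cap enum, the name table and
sketch_visible_caps() are hypothetical, and the colo_built_in argument plays
the role of colo_supported().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability indices, standing in for MIGRATION_CAPABILITY_*. */
enum sketch_cap {
    SKETCH_CAP_XBZRLE,
    SKETCH_CAP_EVENTS,
    SKETCH_CAP_X_COLO,
    SKETCH_CAP_MAX,
};

static const char *const sketch_cap_names[SKETCH_CAP_MAX] = {
    "xbzrle", "events", "x-colo",
};

/*
 * Copy the names of the capabilities a user should see into out[] and
 * return how many were written. colo_built_in stands in for the result
 * of colo_supported().
 */
size_t sketch_visible_caps(bool colo_built_in, const char *out[],
                           size_t out_len)
{
    size_t n = 0;

    assert(out != NULL);
    for (int i = 0; i < SKETCH_CAP_MAX && n < out_len; i++) {
        if (i == SKETCH_CAP_X_COLO && !colo_built_in) {
            continue; /* hide the capability when COLO is compiled out */
        }
        out[n++] = sketch_cap_names[i];
    }
    return n;
}
```

With colo_built_in false the list has two entries and 'x-colo' is absent;
with it true, 'x-colo' is listed last.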

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
v10:
- Rename capability 'colo' to experimental 'x-colo' (Eric's suggestion).
- Rename migrate_enable_colo() to migrate_colo_enabled() (Eric's suggestion).
---
 include/migration/colo.h      | 20 ++++++++++++++++++++
 include/migration/migration.h |  1 +
 migration/Makefile.objs       |  1 +
 migration/colo.c              | 18 ++++++++++++++++++
 migration/migration.c         | 17 +++++++++++++++++
 qapi-schema.json              |  6 +++++-
 qmp-commands.hx               |  1 +
 stubs/Makefile.objs           |  1 +
 stubs/migration-colo.c        | 18 ++++++++++++++++++
 9 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 include/migration/colo.h
 create mode 100644 migration/colo.c
 create mode 100644 stubs/migration-colo.c

diff --git a/include/migration/colo.h b/include/migration/colo.h
new file mode 100644
index 0000000..c60a590
--- /dev/null
+++ b/include/migration/colo.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_H
+#define QEMU_COLO_H
+
+#include "qemu-common.h"
+
+bool colo_supported(void);
+
+#endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 8334621..8643a74 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -170,6 +170,7 @@ int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
+bool migrate_colo_enabled(void);
 
 int64_t xbzrle_cache_resize(int64_t new_size);
 
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index d929e96..5a25d39 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,4 +1,5 @@
 common-obj-y += migration.o tcp.o
+common-obj-$(CONFIG_COLO) += colo.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
 common-obj-y += xbzrle.o
diff --git a/migration/colo.c b/migration/colo.c
new file mode 100644
index 0000000..2c40d2e
--- /dev/null
+++ b/migration/colo.c
@@ -0,0 +1,18 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "migration/colo.h"
+
+bool colo_supported(void)
+{
+    return true;
+}
diff --git a/migration/migration.c b/migration/migration.c
index b092f38..0d7068f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -30,6 +30,7 @@
 #include "qapi/util.h"
 #include "qapi-event.h"
 #include "qom/cpu.h"
+#include "migration/colo.h"
 
 #define MAX_THROTTLE  (32 << 20)      /* Migration transfer speed throttling */
 
@@ -364,6 +365,9 @@ MigrationCapabilityStatusList *qmp_query_migrate_capabilities(Error **errp)
 
     caps = NULL; /* silence compiler warning */
     for (i = 0; i < MIGRATION_CAPABILITY_MAX; i++) {
+        if (i == MIGRATION_CAPABILITY_X_COLO && !colo_supported()) {
+            continue;
+        }
         if (head == NULL) {
             head = g_malloc0(sizeof(*caps));
             caps = head;
@@ -513,6 +517,13 @@ void qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
     }
 
     for (cap = params; cap; cap = cap->next) {
+        if (cap->value->capability == MIGRATION_CAPABILITY_X_COLO &&
+            !colo_supported()) {
+            error_setg(errp, "COLO is not currently supported, please"
+                             " configure with --enable-colo option in order to"
+                             " support COLO feature");
+            continue;
+        }
         s->enabled_capabilities[cap->value->capability] = cap->value->state;
     }
 }
@@ -1018,6 +1029,12 @@ fail:
     migrate_set_state(s, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_FAILED);
 }
 
+bool migrate_colo_enabled(void)
+{
+    MigrationState *s = migrate_get_current();
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_COLO];
+}
+
 /* migration thread support */
 
 static void *migration_thread(void *opaque)
diff --git a/qapi-schema.json b/qapi-schema.json
index 7f0c4c5..cb5e5fd 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -540,11 +540,15 @@
 # @auto-converge: If enabled, QEMU will automatically throttle down the guest
 #          to speed up convergence of RAM migration. (since 1.6)
 #
+# @x-colo: If enabled, migration will never end, and the state of the VM on the
+#        primary side will be migrated continuously to the VM on secondary
+#        side. (since 2.5)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
-           'compress', 'events'] }
+           'compress', 'events', 'x-colo'] }
 
 ##
 # @MigrationCapabilityStatus
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 0eb92d7..2f885b3 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3498,6 +3498,7 @@ Query current migration capabilities
          - "rdma-pin-all" : RDMA Pin Page state (json-bool)
          - "auto-converge" : Auto Converge state (json-bool)
          - "zero-blocks" : Zero Blocks state (json-bool)
+         - "x-colo" : COarse-Grain LOck Stepping for Non-stop Service (json-bool)
 
 Arguments:
 
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 251443b..d3306c5 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -35,3 +35,4 @@ stub-obj-y += kvm.o
 stub-obj-y += qmp_pc_dimm_device_list.o
 stub-obj-y += target-monitor-defs.o
 stub-obj-y += vhost.o
+stub-obj-y += migration-colo.o
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
new file mode 100644
index 0000000..3d817df
--- /dev/null
+++ b/stubs/migration-colo.c
@@ -0,0 +1,18 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "migration/colo.h"
+
+bool colo_supported(void)
+{
+    return false;
+}
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 16:36   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 04/38] migration: Add state records for migration incoming zhanghailiang
                   ` (34 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

We can tell whether the VM on the destination should go into COLO mode by
referring to the info migrated from the PVM.

We skip this section if COLO is not enabled (i.e.
migrate_set_capability x-colo off), so that it does not break migration
compatibility, regardless of whether --enable-colo or --disable-colo was
used on the source/destination.
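
The effect of a 'needed' callback can be sketched with a much-reduced model
in C. This is only an illustration under assumptions: SketchSection,
SketchSource and the sketch_* functions are hypothetical and are not the
VMState API; the model merely counts which sections would be sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced model of a migration section description. */
typedef struct {
    const char *name;
    bool (*needed)(void *opaque);   /* NULL means "always send" */
} SketchSection;

/* Stand-in for the capability flag on the source side. */
typedef struct {
    bool colo_enabled;
} SketchSource;

/* Mirrors the idea of colo_info_need(): send only when COLO is enabled. */
bool sketch_colo_needed(void *opaque)
{
    const SketchSource *src = opaque;

    return src->colo_enabled;
}

/* Return how many of the n sections would be written to the stream. */
size_t sketch_count_sent(const SketchSection *secs, size_t n, void *opaque)
{
    size_t sent = 0;

    assert(secs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (secs[i].needed == NULL || secs[i].needed(opaque)) {
            sent++;
        }
    }
    return sent;
}
```

When the capability is off, the COLO section never reaches the stream, so
in this model a destination that does not know the section never sees it.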

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
v10:
- Use VMSTATE_BOOL instead of VMSTATE_UINT32 for 'colo_requested' (Dave's suggestion).
---
 include/migration/colo.h |  2 ++
 migration/Makefile.objs  |  1 +
 migration/colo-comm.c    | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
 vl.c                     |  3 ++-
 4 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 migration/colo-comm.c

diff --git a/include/migration/colo.h b/include/migration/colo.h
index c60a590..9b6662d 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -14,7 +14,9 @@
 #define QEMU_COLO_H
 
 #include "qemu-common.h"
+#include "migration/migration.h"
 
 bool colo_supported(void);
+void colo_info_mig_init(void);
 
 #endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 5a25d39..cb7bd30 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,5 +1,6 @@
 common-obj-y += migration.o tcp.o
 common-obj-$(CONFIG_COLO) += colo.o
+common-obj-y += colo-comm.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
 common-obj-y += xbzrle.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
new file mode 100644
index 0000000..fb407e0
--- /dev/null
+++ b/migration/colo-comm.c
@@ -0,0 +1,50 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <migration/colo.h>
+#include "trace.h"
+
+typedef struct {
+     bool colo_requested;
+} COLOInfo;
+
+static COLOInfo colo_info;
+
+static void colo_info_pre_save(void *opaque)
+{
+    COLOInfo *s = opaque;
+
+    s->colo_requested = migrate_colo_enabled();
+}
+
+static bool colo_info_need(void *opaque)
+{
+   return migrate_colo_enabled();
+}
+
+static const VMStateDescription colo_state = {
+     .name = "COLOState",
+     .version_id = 1,
+     .minimum_version_id = 1,
+     .pre_save = colo_info_pre_save,
+     .needed = colo_info_need,
+     .fields = (VMStateField[]) {
+         VMSTATE_BOOL(colo_requested, COLOInfo),
+         VMSTATE_END_OF_LIST()
+        },
+};
+
+void colo_info_mig_init(void)
+{
+    vmstate_register(NULL, 0, &colo_state, &colo_info);
+}
diff --git a/vl.c b/vl.c
index f5f7c3f..10e6cbe 100644
--- a/vl.c
+++ b/vl.c
@@ -91,6 +91,7 @@ int main(int argc, char **argv)
 #include "sysemu/dma.h"
 #include "audio/audio.h"
 #include "migration/migration.h"
+#include "migration/colo.h"
 #include "sysemu/kvm.h"
 #include "qapi/qmp/qjson.h"
 #include "qemu/option.h"
@@ -4421,7 +4422,7 @@ int main(int argc, char **argv, char **envp)
 
     blk_mig_init();
     ram_mig_init();
-
+    colo_info_mig_init();
     /* If the currently selected machine wishes to override the units-per-bus
      * property of its default HBA interface type, do so now. */
     if (machine_class->units_per_default_bus) {
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 04/38] migration: Add state records for migration incoming
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (2 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration zhanghailiang
                   ` (33 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

On the migration destination we also need to know the migration state;
COLO will use it.

Here we add a new member 'state' to MigrationIncomingState,
and use migrate_set_state() to modify its value.
To allow that, we change the first parameter of migrate_set_state()
from a MigrationState pointer to a pointer to the state field itself,
and make the function public.
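
The reason for passing a pointer to the state field can be illustrated with
a small stand-alone sketch in C. It uses C11 <stdatomic.h> instead of QEMU's
atomic_cmpxchg(), and the SKETCH_STATUS_* values, the event counter and
sketch_set_state() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's MIGRATION_STATUS_* values. */
enum {
    SKETCH_STATUS_NONE,
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_COMPLETED,
    SKETCH_STATUS_FAILED,
};

/* Counts transitions that took effect; stands in for event generation. */
int sketch_events_emitted;

/*
 * Move *state from old_state to new_state only if it still holds
 * old_state. Because the function takes a pointer to the field, the same
 * helper can serve both an outgoing and an incoming state object.
 * Returns true when the transition happened.
 */
bool sketch_set_state(atomic_int *state, int old_state, int new_state)
{
    int expected = old_state;

    assert(state != NULL);
    if (atomic_compare_exchange_strong(state, &expected, new_state)) {
        sketch_events_emitted++;
        return true;
    }
    return false;
}
```

A transition requested from a stale old_state is ignored, so a late failure
path cannot overwrite a state that has already moved on.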

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/migration.h |  3 +++
 migration/migration.c         | 43 +++++++++++++++++++++++++++----------------
 2 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 8643a74..9bfd8ff 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -50,6 +50,7 @@ typedef QLIST_HEAD(, LoadStateEntry) LoadStateEntry_Head;
 struct MigrationIncomingState {
     QEMUFile *file;
 
+    int state;
     /* See savevm.c */
     LoadStateEntry_Head loadvm_handlers;
 };
@@ -82,6 +83,8 @@ struct MigrationState
     int64_t dirty_sync_count;
 };
 
+void migrate_set_state(int *state, int old_state, int new_state);
+
 void process_incoming_migration(QEMUFile *f);
 
 void qemu_start_incoming_migration(const char *uri, Error **errp);
diff --git a/migration/migration.c b/migration/migration.c
index 0d7068f..b179464 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -97,6 +97,7 @@ MigrationIncomingState *migration_incoming_state_new(QEMUFile* f)
 {
     mis_current = g_new0(MigrationIncomingState, 1);
     mis_current->file = f;
+    mis_current->state = MIGRATION_STATUS_NONE;
     QLIST_INIT(&mis_current->loadvm_handlers);
 
     return mis_current;
@@ -278,11 +279,13 @@ void qemu_start_incoming_migration(const char *uri, Error **errp)
 static void process_incoming_migration_co(void *opaque)
 {
     QEMUFile *f = opaque;
+    MigrationIncomingState *mis;
     Error *local_err = NULL;
     int ret;
 
-    migration_incoming_state_new(f);
-    migrate_generate_event(MIGRATION_STATUS_ACTIVE);
+    mis = migration_incoming_state_new(f);
+    migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
+                      MIGRATION_STATUS_ACTIVE);
     ret = qemu_loadvm_state(f);
 
     qemu_fclose(f);
@@ -290,7 +293,8 @@ static void process_incoming_migration_co(void *opaque)
     migration_incoming_state_destroy();
 
     if (ret < 0) {
-        migrate_generate_event(MIGRATION_STATUS_FAILED);
+        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                          MIGRATION_STATUS_FAILED);
         error_report("load of migration failed: %s", strerror(-ret));
         migrate_decompress_threads_join();
         exit(EXIT_FAILURE);
@@ -299,7 +303,8 @@ static void process_incoming_migration_co(void *opaque)
     /* Make sure all file formats flush their mutable metadata */
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
-        migrate_generate_event(MIGRATION_STATUS_FAILED);
+        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                          MIGRATION_STATUS_FAILED);
         error_report_err(local_err);
         migrate_decompress_threads_join();
         exit(EXIT_FAILURE);
@@ -331,7 +336,8 @@ static void process_incoming_migration_co(void *opaque)
      * observer sees this event they might start to prod at the VM assuming
      * it's ready to use.
      */
-    migrate_generate_event(MIGRATION_STATUS_COMPLETED);
+    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COMPLETED);
 }
 
 void process_incoming_migration(QEMUFile *f)
@@ -596,9 +602,9 @@ void qmp_migrate_set_parameters(bool has_compress_level,
 
 /* shared migration helpers */
 
-static void migrate_set_state(MigrationState *s, int old_state, int new_state)
+void migrate_set_state(int *state, int old_state, int new_state)
 {
-    if (atomic_cmpxchg(&s->state, old_state, new_state) == old_state) {
+    if (atomic_cmpxchg(state, old_state, new_state) == old_state) {
         trace_migrate_set_state(new_state);
         migrate_generate_event(new_state);
     }
@@ -627,7 +633,7 @@ static void migrate_fd_cleanup(void *opaque)
     if (s->state != MIGRATION_STATUS_COMPLETED) {
         qemu_savevm_state_cancel();
         if (s->state == MIGRATION_STATUS_CANCELLING) {
-            migrate_set_state(s, MIGRATION_STATUS_CANCELLING,
+            migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
                               MIGRATION_STATUS_CANCELLED);
         }
     }
@@ -639,7 +645,8 @@ void migrate_fd_error(MigrationState *s)
 {
     trace_migrate_fd_error();
     assert(s->file == NULL);
-    migrate_set_state(s, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_FAILED);
+    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                      MIGRATION_STATUS_FAILED);
     notifier_list_notify(&migration_state_notifiers, s);
 }
 
@@ -655,7 +662,7 @@ static void migrate_fd_cancel(MigrationState *s)
             old_state != MIGRATION_STATUS_ACTIVE) {
             break;
         }
-        migrate_set_state(s, old_state, MIGRATION_STATUS_CANCELLING);
+        migrate_set_state(&s->state, old_state, MIGRATION_STATUS_CANCELLING);
     } while (s->state != MIGRATION_STATUS_CANCELLING);
 
     /*
@@ -731,7 +738,7 @@ static MigrationState *migrate_init(const MigrationParams *params)
     s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
                 x_cpu_throttle_increment;
     s->bandwidth_limit = bandwidth_limit;
-    migrate_set_state(s, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
+    migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
 
     s->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     return s;
@@ -829,7 +836,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
     } else {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "uri",
                    "a valid migration protocol");
-        migrate_set_state(s, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_FAILED);
+        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                          MIGRATION_STATUS_FAILED);
         return;
     }
 
@@ -1022,11 +1030,13 @@ static void migration_completion(MigrationState *s, bool *old_vm_running,
         goto fail;
     }
 
-    migrate_set_state(s, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_COMPLETED);
+    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COMPLETED);
     return;
 
 fail:
-    migrate_set_state(s, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_FAILED);
+    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_FAILED);
 }
 
 bool migrate_colo_enabled(void)
@@ -1053,7 +1063,8 @@ static void *migration_thread(void *opaque)
     qemu_savevm_state_begin(s->file, &s->params);
 
     s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
-    migrate_set_state(s, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_ACTIVE);
+    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                      MIGRATION_STATUS_ACTIVE);
 
     while (s->state == MIGRATION_STATUS_ACTIVE) {
         int64_t current_time;
@@ -1072,7 +1083,7 @@ static void *migration_thread(void *opaque)
         }
 
         if (qemu_file_get_error(s->file)) {
-            migrate_set_state(s, MIGRATION_STATUS_ACTIVE,
+            migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
                               MIGRATION_STATUS_FAILED);
             break;
         }
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (3 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 04/38] migration: Add state records for migration incoming zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 16:48   ` Dr. David Alan Gilbert
  2015-11-13 16:42   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm zhanghailiang
                   ` (32 subsequent siblings)
  37 siblings, 2 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Add a migration state, MIGRATION_STATUS_COLO, which is entered after the
first live migration has finished successfully.

We reuse the migration thread, so if COLO is enabled by the user, the
migration thread goes on to run the COLO process.
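
The state path described here (ACTIVE, then COLO, then COMPLETED) can be
sketched in standalone C. This is only an illustration under assumptions: the
sketch_* names are hypothetical, and the compare-and-swap transition stands in
for migrate_set_state(), which is called in this series with an old and a new
state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MigrationStatus values used by this patch. */
enum sketch_status {
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_COLO,
    SKETCH_STATUS_COMPLETED,
};

/*
 * Move *state from old_state to new_state only if it still holds old_state.
 * Returns true if the transition happened.
 */
static bool sketch_set_state(atomic_int *state, int old_state, int new_state)
{
    return atomic_compare_exchange_strong(state, &old_state, new_state);
}

/* Walk the path described above: ACTIVE -> COLO -> COMPLETED. */
static int sketch_run_colo(atomic_int *state)
{
    if (!sketch_set_state(state, SKETCH_STATUS_ACTIVE, SKETCH_STATUS_COLO)) {
        return -1; /* migration was not active, e.g. cancelled or failed */
    }
    /* The checkpoint loop would run here while the state is COLO. */
    sketch_set_state(state, SKETCH_STATUS_COLO, SKETCH_STATUS_COMPLETED);
    return 0;
}
```

Because each transition names the state it expects to leave, a migration that
was cancelled or has failed in the meantime does not enter the COLO state.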

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
v10: Simplify process by dropping colo thread and reusing migration thread.
     (Dave's suggestion)
---
 include/migration/colo.h |  3 +++
 migration/colo.c         | 31 +++++++++++++++++++++++++++++++
 migration/migration.c    | 19 +++++++++++++++----
 qapi-schema.json         |  2 +-
 stubs/migration-colo.c   |  9 +++++++++
 trace-events             |  3 +++
 6 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 9b6662d..f462f06 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -19,4 +19,7 @@
 bool colo_supported(void);
 void colo_info_mig_init(void);
 
+void migrate_start_colo_process(MigrationState *s);
+bool migration_in_colo_state(void);
+
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 2c40d2e..cf0ccb8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,9 +10,40 @@
  * later.  See the COPYING file in the top-level directory.
  */
 
+#include "sysemu/sysemu.h"
 #include "migration/colo.h"
+#include "trace.h"
 
 bool colo_supported(void)
 {
     return true;
 }
+
+bool migration_in_colo_state(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return (s->state == MIGRATION_STATUS_COLO);
+}
+
+static void colo_process_checkpoint(MigrationState *s)
+{
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("stop", "run");
+
+    /*TODO: COLO checkpoint savevm loop*/
+
+    migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
+                      MIGRATION_STATUS_COMPLETED);
+}
+
+void migrate_start_colo_process(MigrationState *s)
+{
+    qemu_mutex_unlock_iothread();
+    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COLO);
+    colo_process_checkpoint(s);
+    qemu_mutex_lock_iothread();
+}
diff --git a/migration/migration.c b/migration/migration.c
index b179464..cf83531 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -475,6 +475,10 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 
         get_xbzrle_cache_stats(info);
         break;
+    case MIGRATION_STATUS_COLO:
+        info->has_status = true;
+        /* TODO: display COLO specific information (checkpoint info etc.) */
+        break;
     case MIGRATION_STATUS_COMPLETED:
         get_xbzrle_cache_stats(info);
 
@@ -793,7 +797,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 
     if (s->state == MIGRATION_STATUS_ACTIVE ||
         s->state == MIGRATION_STATUS_SETUP ||
-        s->state == MIGRATION_STATUS_CANCELLING) {
+        s->state == MIGRATION_STATUS_CANCELLING ||
+        s->state == MIGRATION_STATUS_COLO) {
         error_setg(errp, QERR_MIGRATION_ACTIVE);
         return;
     }
@@ -1030,8 +1035,11 @@ static void migration_completion(MigrationState *s, bool *old_vm_running,
         goto fail;
     }
 
-    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                      MIGRATION_STATUS_COMPLETED);
+    if (!migrate_colo_enabled()) {
+        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                          MIGRATION_STATUS_COMPLETED);
+    }
+
     return;
 
 fail:
@@ -1056,6 +1064,7 @@ static void *migration_thread(void *opaque)
     int64_t max_size = 0;
     int64_t start_time = initial_time;
     bool old_vm_running = false;
+    bool enable_colo = migrate_colo_enabled();
 
     rcu_register_thread();
 
@@ -1130,7 +1139,9 @@ static void *migration_thread(void *opaque)
         }
         runstate_set(RUN_STATE_POSTMIGRATE);
     } else {
-        if (old_vm_running) {
+        if (s->state == MIGRATION_STATUS_ACTIVE && enable_colo) {
+            migrate_start_colo_process(s);
+        } else if (old_vm_running) {
             vm_start();
         }
     }
diff --git a/qapi-schema.json b/qapi-schema.json
index cb5e5fd..22251ec 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -439,7 +439,7 @@
 ##
 { 'enum': 'MigrationStatus',
   'data': [ 'none', 'setup', 'cancelling', 'cancelled',
-            'active', 'completed', 'failed' ] }
+            'active', 'completed', 'failed', 'colo' ] }
 
 ##
 # @MigrationInfo
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index 3d817df..acddca6 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -16,3 +16,12 @@ bool colo_supported(void)
 {
     return false;
 }
+
+bool migration_in_colo_state(void)
+{
+    return false;
+}
+
+void migrate_start_colo_process(MigrationState *s)
+{
+}
diff --git a/trace-events b/trace-events
index 72136b9..9cd6391 100644
--- a/trace-events
+++ b/trace-events
@@ -1497,6 +1497,9 @@ rdma_start_incoming_migration_after_rdma_listen(void) ""
 rdma_start_outgoing_migration_after_rdma_connect(void) ""
 rdma_start_outgoing_migration_after_rdma_source_init(void) ""
 
+# migration/colo.c
+colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
+
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
 kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (4 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 17:29   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 07/38] migration: Rename the'file' member of MigrationState and MigrationIncomingState zhanghailiang
                   ` (31 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Switch from the normal migration loadvm process into the COLO checkpoint
process if COLO mode is enabled.
We add three new members to struct MigrationIncomingState:
'have_colo_incoming_thread' and 'colo_incoming_thread' record the
COLO-related thread for the secondary VM, and 'migration_incoming_co'
records the original migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
v10: Fix an fd leak found by Dave.
---
 include/migration/colo.h      |  7 +++++++
 include/migration/migration.h |  7 +++++++
 migration/colo-comm.c         | 10 ++++++++++
 migration/colo.c              | 22 ++++++++++++++++++++++
 migration/migration.c         | 21 +++++++++++++++++++++
 stubs/migration-colo.c        | 10 ++++++++++
 6 files changed, 77 insertions(+)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index f462f06..2676c4a 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -15,6 +15,8 @@
 
 #include "qemu-common.h"
 #include "migration/migration.h"
+#include "qemu/coroutine_int.h"
+#include "qemu/thread.h"
 
 bool colo_supported(void);
 void colo_info_mig_init(void);
@@ -22,4 +24,9 @@ void colo_info_mig_init(void);
 void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
+/* loadvm */
+bool migration_incoming_enable_colo(void);
+void migration_incoming_exit_colo(void);
+void *colo_process_incoming_thread(void *opaque);
+bool migration_incoming_in_colo_state(void);
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 9bfd8ff..3bc83fb 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -22,6 +22,7 @@
 #include "migration/vmstate.h"
 #include "qapi-types.h"
 #include "exec/cpu-common.h"
+#include "qemu/coroutine_int.h"
 
 #define QEMU_VM_FILE_MAGIC           0x5145564d
 #define QEMU_VM_FILE_VERSION_COMPAT  0x00000002
@@ -51,6 +52,12 @@ struct MigrationIncomingState {
     QEMUFile *file;
 
     int state;
+
+    bool have_colo_incoming_thread;
+    QemuThread colo_incoming_thread;
+    /* The coroutine we should enter (back) after failover */
+    Coroutine *migration_incoming_co;
+
     /* See savevm.c */
     LoadStateEntry_Head loadvm_handlers;
 };
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
index fb407e0..30df3d3 100644
--- a/migration/colo-comm.c
+++ b/migration/colo-comm.c
@@ -48,3 +48,13 @@ void colo_info_mig_init(void)
 {
     vmstate_register(NULL, 0, &colo_state, &colo_info);
 }
+
+bool migration_incoming_enable_colo(void)
+{
+    return colo_info.colo_requested;
+}
+
+void migration_incoming_exit_colo(void)
+{
+    colo_info.colo_requested = 0;
+}
diff --git a/migration/colo.c b/migration/colo.c
index cf0ccb8..6880aa0 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -13,6 +13,7 @@
 #include "sysemu/sysemu.h"
 #include "migration/colo.h"
 #include "trace.h"
+#include "qemu/error-report.h"
 
 bool colo_supported(void)
 {
@@ -26,6 +27,13 @@ bool migration_in_colo_state(void)
     return (s->state == MIGRATION_STATUS_COLO);
 }
 
+bool migration_incoming_in_colo_state(void)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    return mis && (mis->state == MIGRATION_STATUS_COLO);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     qemu_mutex_lock_iothread();
@@ -47,3 +55,17 @@ void migrate_start_colo_process(MigrationState *s)
     colo_process_checkpoint(s);
     qemu_mutex_lock_iothread();
 }
+
+void *colo_process_incoming_thread(void *opaque)
+{
+    MigrationIncomingState *mis = opaque;
+
+    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_COLO);
+
+    /* TODO: COLO checkpoint restore loop */
+
+    migration_incoming_exit_colo();
+
+    return NULL;
+}
diff --git a/migration/migration.c b/migration/migration.c
index cf83531..7d8cd38 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -288,6 +288,27 @@ static void process_incoming_migration_co(void *opaque)
                       MIGRATION_STATUS_ACTIVE);
     ret = qemu_loadvm_state(f);
 
+    if (!ret) {
+        /* Make sure all file formats flush their mutable metadata */
+        bdrv_invalidate_cache_all(&local_err);
+        if (local_err) {
+            error_report_err(local_err);
+            migrate_decompress_threads_join();
+            exit(EXIT_FAILURE);
+        }
+    }
+    /* we get colo info, and know if we are in colo mode */
+    if (!ret && migration_incoming_enable_colo()) {
+        mis->migration_incoming_co = qemu_coroutine_self();
+        qemu_thread_create(&mis->colo_incoming_thread, "colo incoming",
+             colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
+        mis->have_colo_incoming_thread = true;
+        qemu_coroutine_yield();
+
+        /* Wait checkpoint incoming thread exit before free resource */
+        qemu_thread_join(&mis->colo_incoming_thread);
+    }
+
     qemu_fclose(f);
     free_xbzrle_decoded_buf();
     migration_incoming_state_destroy();
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index acddca6..c12516e 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -22,6 +22,16 @@ bool migration_in_colo_state(void)
     return false;
 }
 
+bool migration_incoming_in_colo_state(void)
+{
+    return false;
+}
+
 void migrate_start_colo_process(MigrationState *s)
 {
 }
+
+void *colo_process_incoming_thread(void *opaque)
+{
+    return NULL;
+}
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 07/38] migration: Rename the'file' member of MigrationState and MigrationIncomingState
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (5 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 08/38] COLO/migration: establish a new communication path from destination to source zhanghailiang
                   ` (30 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Rename the 'file' member of MigrationState to 'to_dst_file' and
the 'file' member of MigrationIncomingState to 'from_src_file'.

For now, migration has only one path direction, from the source side to
the destination side, but COLO and post-copy need communication in both
directions, so rename the file members to indicate the direction of
this path.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
Will be dropped if post-copy is merged.
---
 include/migration/migration.h |  4 ++--
 migration/exec.c              |  4 ++--
 migration/fd.c                |  4 ++--
 migration/migration.c         | 48 ++++++++++++++++++++++---------------------
 migration/tcp.c               |  4 ++--
 migration/unix.c              |  4 ++--
 6 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3bc83fb..a874da1 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -49,7 +49,7 @@ typedef QLIST_HEAD(, LoadStateEntry) LoadStateEntry_Head;
 
 /* State for the incoming migration */
 struct MigrationIncomingState {
-    QEMUFile *file;
+    QEMUFile *from_src_file;
 
     int state;
 
@@ -73,7 +73,7 @@ struct MigrationState
     size_t xfer_limit;
     QemuThread thread;
     QEMUBH *cleanup_bh;
-    QEMUFile *file;
+    QEMUFile *to_dst_file;
     int parameters[MIGRATION_PARAMETER_MAX];
 
     int state;
diff --git a/migration/exec.c b/migration/exec.c
index 8406d2b..9037109 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -36,8 +36,8 @@
 
 void exec_start_outgoing_migration(MigrationState *s, const char *command, Error **errp)
 {
-    s->file = qemu_popen_cmd(command, "w");
-    if (s->file == NULL) {
+    s->to_dst_file = qemu_popen_cmd(command, "w");
+    if (s->to_dst_file == NULL) {
         error_setg_errno(errp, errno, "failed to popen the migration target");
         return;
     }
diff --git a/migration/fd.c b/migration/fd.c
index 3e4bed0..9a9d6c5 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -50,9 +50,9 @@ void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **
     }
 
     if (fd_is_socket(fd)) {
-        s->file = qemu_fopen_socket(fd, "wb");
+        s->to_dst_file = qemu_fopen_socket(fd, "wb");
     } else {
-        s->file = qemu_fdopen(fd, "wb");
+        s->to_dst_file = qemu_fdopen(fd, "wb");
     }
 
     migrate_fd_connect(s);
diff --git a/migration/migration.c b/migration/migration.c
index 7d8cd38..227243e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -96,7 +96,7 @@ MigrationIncomingState *migration_incoming_get_current(void)
 MigrationIncomingState *migration_incoming_state_new(QEMUFile* f)
 {
     mis_current = g_new0(MigrationIncomingState, 1);
-    mis_current->file = f;
+    mis_current->from_src_file = f;
     mis_current->state = MIGRATION_STATUS_NONE;
     QLIST_INIT(&mis_current->loadvm_handlers);
 
@@ -642,15 +642,15 @@ static void migrate_fd_cleanup(void *opaque)
     qemu_bh_delete(s->cleanup_bh);
     s->cleanup_bh = NULL;
 
-    if (s->file) {
+    if (s->to_dst_file) {
         trace_migrate_fd_cleanup();
         qemu_mutex_unlock_iothread();
         qemu_thread_join(&s->thread);
         qemu_mutex_lock_iothread();
 
         migrate_compress_threads_join();
-        qemu_fclose(s->file);
-        s->file = NULL;
+        qemu_fclose(s->to_dst_file);
+        s->to_dst_file = NULL;
     }
 
     assert(s->state != MIGRATION_STATUS_ACTIVE);
@@ -669,7 +669,7 @@ static void migrate_fd_cleanup(void *opaque)
 void migrate_fd_error(MigrationState *s)
 {
     trace_migrate_fd_error();
-    assert(s->file == NULL);
+    assert(s->to_dst_file == NULL);
     migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
                       MIGRATION_STATUS_FAILED);
     notifier_list_notify(&migration_state_notifiers, s);
@@ -678,7 +678,7 @@ void migrate_fd_error(MigrationState *s)
 static void migrate_fd_cancel(MigrationState *s)
 {
     int old_state ;
-    QEMUFile *f = migrate_get_current()->file;
+    QEMUFile *f = migrate_get_current()->to_dst_file;
     trace_migrate_fd_cancel();
 
     do {
@@ -926,8 +926,9 @@ void qmp_migrate_set_speed(int64_t value, Error **errp)
 
     s = migrate_get_current();
     s->bandwidth_limit = value;
-    if (s->file) {
-        qemu_file_set_rate_limit(s->file, s->bandwidth_limit / XFER_LIMIT_RATIO);
+    if (s->to_dst_file) {
+        qemu_file_set_rate_limit(s->to_dst_file,
+                                 s->bandwidth_limit / XFER_LIMIT_RATIO);
     }
 }
 
@@ -1041,8 +1042,8 @@ static void migration_completion(MigrationState *s, bool *old_vm_running,
     if (!ret) {
         ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
         if (ret >= 0) {
-            qemu_file_set_rate_limit(s->file, INT64_MAX);
-            qemu_savevm_state_complete(s->file);
+            qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
+            qemu_savevm_state_complete(s->to_dst_file);
         }
     }
     qemu_mutex_unlock_iothread();
@@ -1051,7 +1052,7 @@ static void migration_completion(MigrationState *s, bool *old_vm_running,
         goto fail;
     }
 
-    if (qemu_file_get_error(s->file)) {
+    if (qemu_file_get_error(s->to_dst_file)) {
         trace_migration_completion_file_err();
         goto fail;
     }
@@ -1089,8 +1090,8 @@ static void *migration_thread(void *opaque)
 
     rcu_register_thread();
 
-    qemu_savevm_state_header(s->file);
-    qemu_savevm_state_begin(s->file, &s->params);
+    qemu_savevm_state_header(s->to_dst_file);
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
 
     s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
     migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
@@ -1100,11 +1101,11 @@ static void *migration_thread(void *opaque)
         int64_t current_time;
         uint64_t pending_size;
 
-        if (!qemu_file_rate_limit(s->file)) {
-            pending_size = qemu_savevm_state_pending(s->file, max_size);
+        if (!qemu_file_rate_limit(s->to_dst_file)) {
+            pending_size = qemu_savevm_state_pending(s->to_dst_file, max_size);
             trace_migrate_pending(pending_size, max_size);
             if (pending_size && pending_size >= max_size) {
-                qemu_savevm_state_iterate(s->file);
+                qemu_savevm_state_iterate(s->to_dst_file);
             } else {
                 trace_migration_thread_low_pending(pending_size);
                 migration_completion(s, &old_vm_running, &start_time);
@@ -1112,14 +1113,15 @@ static void *migration_thread(void *opaque)
             }
         }
 
-        if (qemu_file_get_error(s->file)) {
+        if (qemu_file_get_error(s->to_dst_file)) {
             migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
                               MIGRATION_STATUS_FAILED);
             break;
         }
         current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         if (current_time >= initial_time + BUFFER_DELAY) {
-            uint64_t transferred_bytes = qemu_ftell(s->file) - initial_bytes;
+            uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
+                                         initial_bytes;
             uint64_t time_spent = current_time - initial_time;
             double bandwidth = transferred_bytes / time_spent;
             max_size = bandwidth * migrate_max_downtime() / 1000000;
@@ -1135,11 +1137,11 @@ static void *migration_thread(void *opaque)
                 s->expected_downtime = s->dirty_bytes_rate / bandwidth;
             }
 
-            qemu_file_reset_rate_limit(s->file);
+            qemu_file_reset_rate_limit(s->to_dst_file);
             initial_time = current_time;
-            initial_bytes = qemu_ftell(s->file);
+            initial_bytes = qemu_ftell(s->to_dst_file);
         }
-        if (qemu_file_rate_limit(s->file)) {
+        if (qemu_file_rate_limit(s->to_dst_file)) {
             /* usleep expects microseconds */
             g_usleep((initial_time + BUFFER_DELAY - current_time)*1000);
         }
@@ -1151,7 +1153,7 @@ static void *migration_thread(void *opaque)
     qemu_mutex_lock_iothread();
     if (s->state == MIGRATION_STATUS_COMPLETED) {
         int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-        uint64_t transferred_bytes = qemu_ftell(s->file);
+        uint64_t transferred_bytes = qemu_ftell(s->to_dst_file);
         s->total_time = end_time - s->total_time;
         s->downtime = end_time - start_time;
         if (s->total_time) {
@@ -1179,7 +1181,7 @@ void migrate_fd_connect(MigrationState *s)
     s->expected_downtime = max_downtime/1000000;
     s->cleanup_bh = qemu_bh_new(migrate_fd_cleanup, s);
 
-    qemu_file_set_rate_limit(s->file,
+    qemu_file_set_rate_limit(s->to_dst_file,
                              s->bandwidth_limit / XFER_LIMIT_RATIO);
 
     /* Notify before starting migration thread */
diff --git a/migration/tcp.c b/migration/tcp.c
index ae89172..e083d68 100644
--- a/migration/tcp.c
+++ b/migration/tcp.c
@@ -39,11 +39,11 @@ static void tcp_wait_for_connect(int fd, Error *err, void *opaque)
 
     if (fd < 0) {
         DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
-        s->file = NULL;
+        s->to_dst_file = NULL;
         migrate_fd_error(s);
     } else {
         DPRINTF("migrate connect success\n");
-        s->file = qemu_fopen_socket(fd, "wb");
+        s->to_dst_file = qemu_fopen_socket(fd, "wb");
         migrate_fd_connect(s);
     }
 }
diff --git a/migration/unix.c b/migration/unix.c
index b591813..5492dd6 100644
--- a/migration/unix.c
+++ b/migration/unix.c
@@ -39,11 +39,11 @@ static void unix_wait_for_connect(int fd, Error *err, void *opaque)
 
     if (fd < 0) {
         DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
-        s->file = NULL;
+        s->to_dst_file = NULL;
         migrate_fd_error(s);
     } else {
         DPRINTF("migrate connect success\n");
-        s->file = qemu_fopen_socket(fd, "wb");
+        s->to_dst_file = qemu_fopen_socket(fd, "wb");
         migrate_fd_connect(s);
     }
 }
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 08/38] COLO/migration: establish a new communication path from destination to source
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (6 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 07/38] migration: Rename the'file' member of MigrationState and MigrationIncomingState zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol zhanghailiang
                   ` (29 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Add a new member 'to_src_file' to MigrationIncomingState and a
new member 'from_dst_file' to MigrationState.
They will be used for returning messages from destination to source,
and will also be used by post-copy migration.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
v10: fix the error log (Dave's suggestion).
---
 include/migration/migration.h |  3 ++-
 migration/colo.c              | 44 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index a874da1..0c0309d 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -50,7 +50,7 @@ typedef QLIST_HEAD(, LoadStateEntry) LoadStateEntry_Head;
 /* State for the incoming migration */
 struct MigrationIncomingState {
     QEMUFile *from_src_file;
-
+    QEMUFile *to_src_file;
     int state;
 
     bool have_colo_incoming_thread;
@@ -74,6 +74,7 @@ struct MigrationState
     QemuThread thread;
     QEMUBH *cleanup_bh;
     QEMUFile *to_dst_file;
+    QEMUFile  *from_dst_file;
     int parameters[MIGRATION_PARAMETER_MAX];
 
     int state;
diff --git a/migration/colo.c b/migration/colo.c
index 6880aa0..4fdf3a9 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -36,6 +36,21 @@ bool migration_incoming_in_colo_state(void)
 
 static void colo_process_checkpoint(MigrationState *s)
 {
+    int fd, ret = 0;
+
+    /* Dup the fd of to_dst_file */
+    fd = dup(qemu_get_fd(s->to_dst_file));
+    if (fd == -1) {
+        ret = -errno;
+        goto out;
+    }
+    s->from_dst_file = qemu_fopen_socket(fd, "rb");
+    if (!s->from_dst_file) {
+        ret = -EINVAL;
+        error_report("Open QEMUFile from_dst_file failed");
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
@@ -43,8 +58,16 @@ static void colo_process_checkpoint(MigrationState *s)
 
     /*TODO: COLO checkpoint savevm loop*/
 
+out:
+    if (ret < 0) {
+        error_report("%s: %s", __func__, strerror(-ret));
+    }
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
+
+    if (s->from_dst_file) {
+        qemu_fclose(s->from_dst_file);
+    }
 }
 
 void migrate_start_colo_process(MigrationState *s)
@@ -59,12 +82,33 @@ void migrate_start_colo_process(MigrationState *s)
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
+    int fd, ret = 0;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
 
+    fd = dup(qemu_get_fd(mis->from_src_file));
+    if (fd < 0) {
+        ret = -errno;
+        goto out;
+    }
+    mis->to_src_file = qemu_fopen_socket(fd, "wb");
+    if (!mis->to_src_file) {
+        ret = -EINVAL;
+        error_report("colo incoming thread: Open QEMUFile to_src_file failed");
+        goto out;
+    }
     /* TODO: COLO checkpoint restore loop */
 
+out:
+    if (ret < 0) {
+        error_report("colo incoming thread will exit, detect error: %s",
+                     strerror(-ret));
+    }
+
+    if (mis->to_src_file) {
+        qemu_fclose(mis->to_src_file);
+    }
     migration_incoming_exit_colo();
 
     return NULL;
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (7 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 08/38] COLO/migration: establish a new communication path from destination to source zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 18:26   ` Dr. David Alan Gilbert
  2015-11-13 16:46   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
                   ` (28 subsequent siblings)
  37 siblings, 2 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

We need a user-defined communication protocol to control the checkpoint
process.

A new checkpoint request is started by the Primary VM, and the interaction
proceeds as shown below.
Checkpoint synchronizing points:

                       Primary                         Secondary
'checkpoint-request'   @ ----------------------------->
                                                       Suspend (In hybrid mode)
'checkpoint-reply'     <------------------------------ @
                       Suspend&Save state
'vmstate-send'         @ ----------------------------->
                       Send state                      Receive state
'vmstate-received'     <------------------------------ @
                       Release packets                 Load state
'vmstate-load'         <------------------------------ @
                       Resume                          Resume (In hybrid mode)

                       Start Comparing (In hybrid mode)
NOTE:
 1) '@' marks the side that sends the message.
 2) Every sync-point is synchronized by the two sides with only
    one handshake (single direction) for low latency.
    If stricter synchronization is required, an opposite-direction
    sync-point should be added.
 3) Since sync-points are single direction, the remote side may
    have gone forward a lot by the time this side receives the sync-point.
 4) For now, we only support 'periodic' checkpoint, in which
    the Secondary VM is not running; later we will support 'hybrid' mode.
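
Each sync-point above travels as one control message: a 32-bit command
followed by a 64-bit value, both big-endian, matching the qemu_put_be32() and
qemu_put_be64() calls in colo_ctl_put() below. The following standalone sketch
encodes and checks such a message in a byte buffer. The enum numbering and all
sketch_* names are assumptions made for illustration; only the message names
are taken from the diagram.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical numbering; the message names come from the diagram above. */
enum sketch_colo_cmd {
    SKETCH_CHECKPOINT_REQUEST,
    SKETCH_CHECKPOINT_REPLY,
    SKETCH_VMSTATE_SEND,
    SKETCH_VMSTATE_RECEIVED,
    SKETCH_VMSTATE_LOAD,
    SKETCH_CMD_MAX,
};

#define SKETCH_MSG_LEN 12 /* 4-byte command + 8-byte value */

/* Encode one control message in big-endian byte order. */
static void sketch_msg_put(uint8_t out[SKETCH_MSG_LEN],
                           uint32_t cmd, uint64_t value)
{
    int i;

    for (i = 0; i < 4; i++) {
        out[i] = (uint8_t)(cmd >> (24 - 8 * i));
    }
    for (i = 0; i < 8; i++) {
        out[4 + i] = (uint8_t)(value >> (56 - 8 * i));
    }
}

/*
 * Decode one control message and check that it is the command the caller
 * is waiting for. Returns 0 and stores the value on success, -EINVAL for
 * an unknown or unexpected command.
 */
static int sketch_msg_get(const uint8_t in[SKETCH_MSG_LEN],
                          uint32_t require, uint64_t *value)
{
    uint32_t cmd = 0;
    uint64_t v = 0;
    int i;

    for (i = 0; i < 4; i++) {
        cmd = (cmd << 8) | in[i];
    }
    if (cmd >= SKETCH_CMD_MAX || cmd != require) {
        return -EINVAL;
    }
    for (i = 0; i < 8; i++) {
        v = (v << 8) | in[4 + i];
    }
    *value = v;
    return 0;
}
```

The receiver names the command it expects at each step, so a message that
arrives out of order is rejected rather than silently accepted.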

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
---
v10:
- Rename enum COLOCmd to COLOCommand (Eric's suggestion).
- Remove unused 'ram-steal'
---
 migration/colo.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json |  27 ++++++++
 trace-events     |   2 +
 3 files changed, 219 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4fdf3a9..2510762 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,10 +10,12 @@
  * later.  See the COPYING file in the top-level directory.
  */
 
+#include <unistd.h>
 #include "sysemu/sysemu.h"
 #include "migration/colo.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "qemu/sockets.h"
 
 bool colo_supported(void)
 {
@@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void)
     return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+/* colo checkpoint control helper */
+static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
+{
+    int ret = 0;
+
+    qemu_put_be32(f, cmd);
+    qemu_put_be64(f, value);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    trace_colo_ctl_put(COLOCommand_lookup[cmd], value);
+
+    return ret;
+}
+
+static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd)
+{
+    int ret = 0;
+
+    *cmd = qemu_get_be32(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        return ret;
+    }
+    if (*cmd >= COLO_COMMAND_MAX) {
+        error_report("Invalid colo command, got cmd: %u", *cmd);
+        return -EINVAL;
+    }
+    trace_colo_ctl_get(COLOCommand_lookup[*cmd]);
+
+    return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint32_t require)
+{
+    int ret;
+    uint32_t cmd;
+    uint64_t value;
+
+    ret = colo_ctl_get_cmd(f, &cmd);
+    if (ret < 0) {
+        return ret;
+    }
+    if (cmd != require) {
+        error_report("Unexpected colo command, expected: %u, but got cmd: %u",
+                     require, cmd);
+        return -EINVAL;
+    }
+
+    value = qemu_get_be64(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return value;
+}
+
+static int colo_do_checkpoint_transaction(MigrationState *s)
+{
+    int ret;
+
+    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_REPLY);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* TODO: suspend and save vm state to colo buffer */
+
+    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* TODO: send vmstate to Secondary */
+
+    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_LOADED);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* TODO: resume Primary */
+
+out:
+    return ret;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     int fd, ret = 0;
@@ -51,12 +150,27 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    /*
+     * Wait for the Secondary to finish loading VM state and enter COLO
+     * restore.
+     */
+    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_READY);
+    if (ret < 0) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
 
-    /*TODO: COLO checkpoint savevm loop*/
+    while (s->state == MIGRATION_STATUS_COLO) {
+        /* start a colo checkpoint */
+        ret = colo_do_checkpoint_transaction(s);
+        if (ret < 0) {
+            goto out;
+        }
+    }
 
 out:
     if (ret < 0) {
@@ -79,6 +193,39 @@ void migrate_start_colo_process(MigrationState *s)
     qemu_mutex_lock_iothread();
 }
 
+/*
+ * return:
+ * 0: start a checkpoint
+ * -1: some error happened, exit colo restore
+ */
+static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
+{
+    int ret;
+    uint32_t cmd;
+    uint64_t value;
+
+    ret = colo_ctl_get_cmd(f, &cmd);
+    if (ret < 0) {
+        /* do failover ? */
+        return ret;
+    }
+    /* FIXME: this value is expected to be 0, which is not ideal;
+     * should it be used for checking?
+     */
+    value = qemu_get_be64(f);
+    if (value != 0) {
+        return -EINVAL;
+    }
+
+    switch (cmd) {
+    case COLO_COMMAND_CHECKPOINT_REQUEST:
+        *checkpoint_request = 1;
+        return 0;
+    default:
+        return -EINVAL;
+    }
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -98,7 +245,48 @@ void *colo_process_incoming_thread(void *opaque)
         error_report("colo incoming thread: Open QEMUFile to_src_file failed");
         goto out;
     }
-    /* TODO: COLO checkpoint restore loop */
+
+    ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
+    if (ret < 0) {
+        goto out;
+    }
+
+    while (mis->state == MIGRATION_STATUS_COLO) {
+        int request = 0;
+        ret = colo_wait_handle_cmd(mis->from_src_file, &request);
+
+        if (ret < 0) {
+            break;
+        } else {
+            if (!request) {
+                continue;
+            }
+        }
+
+        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0);
+        if (ret < 0) {
+            goto out;
+        }
+
+        ret = colo_ctl_get(mis->from_src_file, COLO_COMMAND_VMSTATE_SEND);
+        if (ret < 0) {
+            goto out;
+        }
+
+        /* TODO: read migration data into colo buffer */
+
+        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
+        if (ret < 0) {
+            goto out;
+        }
+
+        /* TODO: load vm state */
+
+        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
+        if (ret < 0) {
+            goto out;
+        }
+    }
 
 out:
     if (ret < 0) {
diff --git a/qapi-schema.json b/qapi-schema.json
index 22251ec..5c4fe6d 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -702,6 +702,33 @@
             '*tls-port': 'int', '*cert-subject': 'str' } }
 
 ##
+# @COLOCommand
+#
+# The colo command
+#
+# @invalid: unknown command
+#
+# @checkpoint-ready: SVM is ready for checkpointing
+#
+# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
+#
+# @checkpoint-reply: SVM gets PVM's checkpoint request
+#
+# @vmstate-send: VM's state will be sent by PVM.
+#
+# @vmstate-size: The total size of VMstate.
+#
+# @vmstate-received: VM's state has been received by SVM
+#
+# @vmstate-loaded: VM's state has been loaded by SVM
+#
+# Since: 2.5
+##
+{ 'enum': 'COLOCommand',
+  'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
+            'checkpoint-reply', 'vmstate-send', 'vmstate-size',
+            'vmstate-received', 'vmstate-loaded' ] }
+
 # @MouseInfo:
 #
 # Information about a mouse device.
diff --git a/trace-events b/trace-events
index 9cd6391..ee4679c 100644
--- a/trace-events
+++ b/trace-events
@@ -1499,6 +1499,8 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
 
 # migration/colo.c
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
+colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
+colo_ctl_get(const char *msg) "Receive '%s' cmd"
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (8 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 18:28   ` Dr. David Alan Gilbert
  2015-11-13 16:47   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb zhanghailiang
                   ` (27 subsequent siblings)
  37 siblings, 2 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, zhanghailiang

The guest enters this state when it is paused to save/restore VM state
during a COLO checkpoint.
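
As an illustration of how a list of (from, to) pairs like the one extended in
vl.c can be turned into a lookup, here is a small standalone sketch. It is
not QEMU code: enum run_state, transitions_def, transitions_init() and
transition_valid() are hypothetical names, and only a handful of the pairs
from this patch are reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of run states for the sketch. */
enum run_state { RS_PAUSED, RS_RUNNING, RS_FINISH_MIGRATE, RS_COLO, RS_MAX };

struct transition {
    enum run_state from;
    enum run_state to;
};

/* A few of the pairs touched by this patch, plus one pre-existing pair. */
static const struct transition transitions_def[] = {
    { RS_PAUSED, RS_RUNNING },
    { RS_PAUSED, RS_COLO },
    { RS_FINISH_MIGRATE, RS_COLO },
    { RS_RUNNING, RS_COLO },
    { RS_COLO, RS_RUNNING },
};

static bool valid[RS_MAX][RS_MAX];

/* Expand the pair list into a matrix so each check is a single lookup. */
static void transitions_init(void)
{
    size_t i;

    for (i = 0; i < sizeof(transitions_def) / sizeof(transitions_def[0]); i++) {
        valid[transitions_def[i].from][transitions_def[i].to] = true;
    }
}

static bool transition_valid(enum run_state from, enum run_state to)
{
    assert(from < RS_MAX && to < RS_MAX);
    return valid[from][to];
}
```

In this sketch the only way out of the COLO state is back to running, which
matches the single { RUN_STATE_COLO, RUN_STATE_RUNNING } entry added here.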

Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 qapi-schema.json | 7 ++++++-
 vl.c             | 8 ++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 5c4fe6d..49f2a90 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -154,12 +154,15 @@
 # @watchdog: the watchdog action is configured to pause and has been triggered
 #
 # @guest-panicked: guest has been panicked as a result of guest OS panic
+#
+# @colo: guest is paused to save/restore VM state under colo checkpoint (since
+# 2.5)
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
             'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
             'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
-            'guest-panicked' ] }
+            'guest-panicked', 'colo' ] }
 
 ##
 # @StatusInfo:
@@ -434,6 +437,8 @@
 #
 # @failed: some error occurred during migration process.
 #
+# @colo: VM is in the process of fault tolerance. (since 2.5)
+#
 # Since: 2.3
 #
 ##
diff --git a/vl.c b/vl.c
index 10e6cbe..c459a3e 100644
--- a/vl.c
+++ b/vl.c
@@ -586,6 +586,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_INMIGRATE, RUN_STATE_WATCHDOG },
     { RUN_STATE_INMIGRATE, RUN_STATE_GUEST_PANICKED },
     { RUN_STATE_INMIGRATE, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },
 
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
     { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -595,6 +596,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
     { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_PAUSED, RUN_STATE_COLO},
 
     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -605,9 +607,12 @@ static const RunStateTransition runstate_transitions_def[] = {
 
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
 
+    { RUN_STATE_COLO, RUN_STATE_RUNNING },
+
     { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
     { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
     { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
@@ -618,6 +623,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
     { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
     { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
+    { RUN_STATE_RUNNING, RUN_STATE_COLO},
 
     { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
@@ -628,9 +634,11 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
     { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
     { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
 
     { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
     { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},
 
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
     { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (9 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 18:30   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint zhanghailiang
                   ` (26 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer
VM state:
One is qsb_put_buffer(), which puts the content of a given QEMUSizedBuffer
into a QEMUFile; it is used to send the buffered VM state to the secondary.
The other is qsb_fill_buffer(), which reads 'size' bytes of data from the
file into a qsb; it is used to read the VM state from the socket into a buffer.
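
The chunk-by-chunk walk behind both helpers can be sketched outside QEMU with
plain struct iovec arrays. This is only an illustration under assumptions:
iov_gather() and iov_scatter() are hypothetical names, memory copies stand in
for the QEMUFile reads and writes, and max_read simulates a source that may
return fewer bytes than requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Copy up to 'size' bytes out of the chunks into dst; returns bytes copied. */
static size_t iov_gather(const struct iovec *iov, int n_iov,
                         uint8_t *dst, size_t size)
{
    size_t done = 0;
    int i;

    for (i = 0; i < n_iov && done < size; i++) {
        size_t l = iov[i].iov_len;

        if (l > size - done) {
            l = size - done;
        }
        memcpy(dst + done, iov[i].iov_base, l);
        done += l;
    }
    return done;
}

/*
 * Fill the chunks from src, at most max_read bytes per step, so a chunk may
 * need several steps. The write position inside the current chunk is
 * iov_base + doneone, so a short step never overwrites earlier data.
 */
static size_t iov_scatter(const struct iovec *iov, int n_iov,
                          const uint8_t *src, size_t size, size_t max_read)
{
    size_t used = 0;
    int i;

    assert(max_read > 0);
    for (i = 0; i < n_iov && used < size; i++) {
        size_t doneone = 0;

        while (doneone < iov[i].iov_len && used < size) {
            size_t want = iov[i].iov_len - doneone;

            if (want > size - used) {
                want = size - used;
            }
            if (want > max_read) {
                want = max_read;
            }
            memcpy((uint8_t *)iov[i].iov_base + doneone, src + used, want);
            doneone += want;
            used += want;
        }
    }
    return used;
}
```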

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/qemu-file.h |  3 ++-
 migration/qemu-file-buf.c     | 58 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index 29a338d..de42d5b 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -144,7 +144,8 @@ ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count,
                        uint8_t *buf);
 ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf,
                      off_t pos, size_t count);
-
+void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size);
+int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size);
 
 /*
  * For use on files opened with qemu_bufopen
diff --git a/migration/qemu-file-buf.c b/migration/qemu-file-buf.c
index 49516b8..e58004d 100644
--- a/migration/qemu-file-buf.c
+++ b/migration/qemu-file-buf.c
@@ -366,6 +366,64 @@ ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *source,
     return count;
 }
 
+
+/**
+ * Put the content of a given QEMUSizedBuffer into QEMUFile.
+ *
+ * @f: A QEMUFile
+ * @qsb: A QEMUSizedBuffer
+ * @size: size of content to write
+ */
+void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size)
+{
+    int i, l;
+
+    for (i = 0; i < qsb->n_iov && size > 0; i++) {
+        l = MIN(qsb->iov[i].iov_len, size);
+        qemu_put_buffer(f, qsb->iov[i].iov_base, l);
+        size -= l;
+    }
+}
+
+/*
+ * Read 'size' bytes of data from the file into qsb.
+ * Always fills from position 0, so use it right after qsb_create().
+ *
+ * It will return size bytes unless there was an error, in which case it will
+ * return as many as it managed to read (assuming blocking fd's which
+ * all current QEMUFile are)
+ */
+int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size)
+{
+    ssize_t rc = qsb_grow(qsb, size);
+    int pending = size, i;
+    qsb->used = 0;
+    uint8_t *buf = NULL;
+
+    if (rc < 0) {
+        return rc;
+    }
+
+    for (i = 0; i < qsb->n_iov && pending > 0; i++) {
+        int doneone = 0;
+        /* read until iov full */
+        while (doneone < qsb->iov[i].iov_len && pending > 0) {
+            int readone = 0;
+            buf = (uint8_t *)qsb->iov[i].iov_base + doneone;
+            readone = qemu_get_buffer(f, buf,
+                                MIN(qsb->iov[i].iov_len - doneone, pending));
+            if (readone == 0) {
+                return qsb->used;
+            }
+            buf += readone;
+            doneone += readone;
+            pending -= readone;
+            qsb->used += readone;
+        }
+    }
+    return qsb->used;
+}
+
 typedef struct QEMUBuffer {
     QEMUSizedBuffer *qsb;
     QEMUFile *file;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (10 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-06 18:59   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
                   ` (25 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

The main purpose of a checkpoint is to synchronize the SVM with the PVM.
A VM's state includes RAM and device state, so we migrate the PVM's
state to the SVM at each checkpoint, just as migration does.

The PVM's state is cached on the secondary side in a QEMUSizedBuffer (qsb).
The secondary needs to know the size of the VM state in advance, so on the
primary side we first store the VM state in a qsb temporarily, get the data
size by calling qsb_get_length(), and then send the data to the qsb on the
secondary side.
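
The "stage first, then send size and payload" idea can be sketched in
isolation as follows. This is an assumption-laden illustration, not the QEMU
implementation: struct stage_buf, stage_append() and stage_frame() are
hypothetical names, and a flat byte array stands in for the socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable staging buffer; its length is only known at the end. */
struct stage_buf {
    uint8_t *data;
    size_t len;
    size_t cap;
};

/* Append n bytes, growing the buffer as needed. Returns 0, or -1 on OOM. */
static int stage_append(struct stage_buf *b, const void *p, size_t n)
{
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        uint8_t *d;

        while (cap < b->len + n) {
            cap *= 2;
        }
        d = realloc(b->data, cap);
        if (!d) {
            return -1;
        }
        b->data = d;
        b->cap = cap;
    }
    memcpy(b->data + b->len, p, n);
    b->len += n;
    return 0;
}

/*
 * Emit a big-endian 64-bit length followed by the payload, so the receiver
 * knows how many bytes to wait for before it touches its own state.
 * Returns the number of bytes written, or 0 if 'wire' is too small.
 */
static size_t stage_frame(const struct stage_buf *b, uint8_t *wire,
                          size_t wire_cap)
{
    int i;

    assert(wire != NULL);
    if (wire_cap < 8 || wire_cap - 8 < b->len) {
        return 0;
    }
    for (i = 0; i < 8; i++) {
        wire[i] = (uint8_t)((uint64_t)b->len >> (56 - 8 * i));
    }
    if (b->len > 0) {
        memcpy(wire + 8, b->data, b->len);
    }
    return 8 + b->len;
}
```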

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c   | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 migration/ram.c    | 47 +++++++++++++++++++++++++++++--------
 migration/savevm.c |  2 +-
 3 files changed, 101 insertions(+), 16 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 2510762..b865513 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -17,6 +17,9 @@
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 
+/* colo buffer */
+#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
+
 bool colo_supported(void)
 {
     return true;
@@ -94,9 +97,12 @@ static int colo_ctl_get(QEMUFile *f, uint32_t require)
     return value;
 }
 
-static int colo_do_checkpoint_transaction(MigrationState *s)
+static int colo_do_checkpoint_transaction(MigrationState *s,
+                                          QEMUSizedBuffer *buffer)
 {
     int ret;
+    size_t size;
+    QEMUFile *trans = NULL;
 
     ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
     if (ret < 0) {
@@ -107,15 +113,47 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
     if (ret < 0) {
         goto out;
     }
+    /* Reset colo buffer and open it for write */
+    qsb_set_length(buffer, 0);
+    trans = qemu_bufopen("w", buffer);
+    if (!trans) {
+        error_report("Open colo buffer for write failed");
+        goto out;
+    }
 
-    /* TODO: suspend and save vm state to colo buffer */
+    qemu_mutex_lock_iothread();
+    vm_stop_force_state(RUN_STATE_COLO);
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("run", "stop");
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_header(trans);
+    qemu_savevm_state_begin(trans, &s->params);
+    qemu_mutex_lock_iothread();
+    qemu_savevm_state_complete(trans);
+    qemu_mutex_unlock_iothread();
+
+    qemu_fflush(trans);
 
     ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
     if (ret < 0) {
         goto out;
     }
+    /* we send the total size of the vmstate first */
+    size = qsb_get_length(buffer);
+    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SIZE, size);
+    if (ret < 0) {
+        goto out;
+    }
 
-    /* TODO: send vmstate to Secondary */
+    qsb_put_buffer(s->to_dst_file, buffer, size);
+    qemu_fflush(s->to_dst_file);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        goto out;
+    }
 
     ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
     if (ret < 0) {
@@ -127,14 +165,24 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
         goto out;
     }
 
-    /* TODO: resume Primary */
+    ret = 0;
+    /* resume master */
+    qemu_mutex_lock_iothread();
+    vm_start();
+    qemu_mutex_unlock_iothread();
+    trace_colo_vm_state_change("stop", "run");
 
 out:
+    if (trans) {
+        qemu_fclose(trans);
+    }
+
     return ret;
 }
 
 static void colo_process_checkpoint(MigrationState *s)
 {
+    QEMUSizedBuffer *buffer = NULL;
     int fd, ret = 0;
 
     /* Dup the fd of to_dst_file */
@@ -159,6 +207,13 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
+    if (buffer == NULL) {
+        ret = -ENOMEM;
+        error_report("Failed to allocate buffer!");
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
@@ -166,7 +221,7 @@ static void colo_process_checkpoint(MigrationState *s)
 
     while (s->state == MIGRATION_STATUS_COLO) {
         /* start a colo checkpoint */
-        ret = colo_do_checkpoint_transaction(s);
+        ret = colo_do_checkpoint_transaction(s, buffer);
         if (ret < 0) {
             goto out;
         }
@@ -179,6 +234,9 @@ out:
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
+    qsb_free(buffer);
+    buffer = NULL;
+
     if (s->from_dst_file) {
         qemu_fclose(s->from_dst_file);
     }
diff --git a/migration/ram.c b/migration/ram.c
index a25bcc7..5784c15 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -38,6 +38,7 @@
 #include "trace.h"
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
+#include "migration/colo.h"
 
 #ifdef DEBUG_MIGRATION_RAM
 #define DPRINTF(fmt, ...) \
@@ -1165,15 +1166,8 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
     }
 }
 
-/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
- * long-running RCU critical section.  When rcu-reclaims in the code
- * start to become numerous it will be necessary to reduce the
- * granularity of these critical sections.
- */
-
-static int ram_save_setup(QEMUFile *f, void *opaque)
+static int ram_save_init_globals(void)
 {
-    RAMBlock *block;
     int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
 
     dirty_rate_high_cnt = 0;
@@ -1233,6 +1227,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     migration_bitmap_sync();
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
+    rcu_read_unlock();
+
+    return 0;
+}
+
+/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
+ * long-running RCU critical section.  When rcu-reclaims in the code
+ * start to become numerous it will be necessary to reduce the
+ * granularity of these critical sections.
+ */
+
+static int ram_save_setup(QEMUFile *f, void *opaque)
+{
+    RAMBlock *block;
+
+    /*
+     * migration has already setup the bitmap, reuse it.
+     */
+    if (!migration_in_colo_state()) {
+        if (ram_save_init_globals() < 0) {
+            return -1;
+        }
+    }
+
+    rcu_read_lock();
 
     qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
@@ -1332,7 +1351,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     while (true) {
         int pages;
 
-        pages = ram_find_and_save_block(f, true, &bytes_transferred);
+        pages = ram_find_and_save_block(f, !migration_in_colo_state(),
+                                        &bytes_transferred);
         /* no more blocks to sent */
         if (pages == 0) {
             break;
@@ -1343,8 +1363,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     ram_control_after_iterate(f, RAM_CONTROL_FINISH);
 
     rcu_read_unlock();
+    /*
+     * Since we need to reuse dirty bitmap in colo,
+     * don't cleanup the bitmap.
+     */
+    if (!migrate_colo_enabled() ||
+        migration_has_failed(migrate_get_current())) {
+        migration_end();
+    }
 
-    migration_end();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
 
     return 0;
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..0faf12b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -48,7 +48,7 @@
 #include "qemu/iov.h"
 #include "block/snapshot.h"
 #include "block/qapi.h"
-
+#include "migration/colo.h"
 
 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (11 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 15:39   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it zhanghailiang
                   ` (24 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

We should not load the PVM's state directly into the SVM, because errors may
occur while the SVM is receiving data, which would break the SVM.

We need to ensure that all data has been received before loading the state
into the SVM. We use extra memory to cache this data (the PVM's RAM). The RAM
cache on the secondary side is initially identical to the SVM's/PVM's memory.
During a checkpoint, we first write the PVM's dirty pages into this RAM cache,
so the cache always matches the PVM's memory at every checkpoint; then we
flush the cached RAM to the SVM after we have received all of the PVM's state.
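
The cache-then-flush idea can be shown with a toy model. This is only a
sketch under assumptions: struct ram_block and the cache_*() helpers are
hypothetical, malloc() stands in for qemu_anon_ram_alloc(), and the flush
copies the whole block rather than only the pages that were received.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for a RAM block with a shadow copy. */
struct ram_block {
    uint8_t *host;       /* memory the secondary VM actually runs on */
    uint8_t *host_cache; /* shadow copy that receives incoming pages */
    size_t used_length;
};

/* Allocate the cache and make it identical to the current VM memory. */
static int cache_init(struct ram_block *b)
{
    b->host_cache = malloc(b->used_length);
    if (!b->host_cache) {
        return -1;
    }
    memcpy(b->host_cache, b->host, b->used_length);
    return 0;
}

/* An incoming dirty page lands in the cache only; host memory is untouched. */
static void cache_load_page(struct ram_block *b, size_t offset,
                            const uint8_t *page, size_t len)
{
    assert(offset <= b->used_length && len <= b->used_length - offset);
    memcpy(b->host_cache + offset, page, len);
}

/* Call only once the whole checkpoint has arrived. */
static void cache_flush(struct ram_block *b)
{
    memcpy(b->host, b->host_cache, b->used_length);
}

static void cache_release(struct ram_block *b)
{
    free(b->host_cache);
    b->host_cache = NULL;
}
```

If the transfer fails halfway, the caller in this model simply skips
cache_flush(), and the memory the secondary runs on stays at the previous
checkpoint.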

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
v10: Split the process of dirty pages recording into a new patch
---
 include/exec/ram_addr.h  |  1 +
 include/migration/colo.h |  3 +++
 migration/colo.c         | 14 +++++++++--
 migration/ram.c          | 61 ++++++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 3360ac5..e7c4310 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -28,6 +28,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *host_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2676c4a..8edd5f1 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -29,4 +29,7 @@ bool migration_incoming_enable_colo(void);
 void migration_incoming_exit_colo(void);
 void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index b865513..25f85b2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -304,6 +304,12 @@ void *colo_process_incoming_thread(void *opaque)
         goto out;
     }
 
+    ret = colo_init_ram_cache();
+    if (ret < 0) {
+        error_report("Failed to initialize ram cache");
+        goto out;
+    }
+
     ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
     if (ret < 0) {
         goto out;
@@ -331,14 +337,14 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        /* TODO: read migration data into colo buffer */
+        /* TODO Load VM state */
 
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
         if (ret < 0) {
             goto out;
         }
 
-        /* TODO: load vm state */
+        /* TODO: flush vm state */
 
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
         if (ret < 0) {
@@ -352,6 +358,10 @@ out:
                      strerror(-ret));
     }
 
+    qemu_mutex_lock_iothread();
+    colo_release_ram_cache();
+    qemu_mutex_unlock_iothread();
+
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
diff --git a/migration/ram.c b/migration/ram.c
index 5784c15..b094dc3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -222,6 +222,7 @@ static RAMBlock *last_sent_block;
 static ram_addr_t last_offset;
 static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
+static bool ram_cache_enable;
 static uint32_t last_version;
 static bool ram_bulk_stage;
 
@@ -1446,7 +1447,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
             return NULL;
         }
 
-        return block->host + offset;
+        if (ram_cache_enable) {
+            return block->host_cache + offset;
+        } else {
+            return block->host + offset;
+        }
     }
 
     len = qemu_get_byte(f);
@@ -1456,7 +1461,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (!strncmp(id, block->idstr, sizeof(id)) &&
             block->max_length > offset) {
-            return block->host + offset;
+            if (ram_cache_enable) {
+                return block->host_cache + offset;
+            } else {
+                return block->host + offset;
+            }
         }
     }
 
@@ -1707,6 +1716,54 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
+/*
+ * colo cache: this is for secondary VM, we cache the whole
+ * memory of the secondary VM, it will be called after first migration.
+ */
+int colo_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        block->host_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+        if (!block->host_cache) {
+            goto out_locked;
+        }
+        memcpy(block->host_cache, block->host, block->used_length);
+    }
+    rcu_read_unlock();
+    ram_cache_enable = true;
+    return 0;
+
+out_locked:
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->host_cache) {
+            qemu_anon_ram_free(block->host_cache, block->used_length);
+            block->host_cache = NULL;
+        }
+    }
+
+    rcu_read_unlock();
+    return -errno;
+}
+
+void colo_release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    ram_cache_enable = false;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->host_cache) {
+            qemu_anon_ram_free(block->host_cache, block->used_length);
+            block->host_cache = NULL;
+        }
+    }
+    rcu_read_unlock();
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (12 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 16:02   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap zhanghailiang
                   ` (23 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

We should not destroy the state of the SVM (Secondary VM) until we have received
the whole state from the PVM (Primary VM), in case the primary fails in the
middle of sending it. So we cache the device state on the Secondary before
restoring it.

Besides, we should call qemu_system_reset() before loading the VM state,
which ensures the data is intact.
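
The idea of staging the incoming state before touching the live state can be
sketched in isolation. This is a minimal illustration, not the code in this
patch: LiveState, StagingBuffer, staging_append() and staging_commit() are
hypothetical names, and a fixed 64-byte array stands in for the
QEMUSizedBuffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the live device state of the secondary. */
typedef struct {
    unsigned char data[64];
    size_t len;
} LiveState;

/* Hypothetical staging buffer, playing the role of the colo buffer. */
typedef struct {
    unsigned char data[64];
    size_t filled;
} StagingBuffer;

/* Append one received chunk to the staging buffer; fails on overflow. */
bool staging_append(StagingBuffer *sb, const void *chunk, size_t n)
{
    if (n > sizeof(sb->data) - sb->filled) {
        return false;
    }
    memcpy(sb->data + sb->filled, chunk, n);
    sb->filled += n;
    return true;
}

/*
 * Apply the staged state only if all announced bytes have arrived.
 * On a short transfer the live state is left untouched.
 */
bool staging_commit(const StagingBuffer *sb, size_t total_size,
                    LiveState *live)
{
    if (sb->filled != total_size) {
        return false;               /* sender stopped mid-transfer */
    }
    memset(live, 0, sizeof(*live)); /* analogue of resetting before load */
    memcpy(live->data, sb->data, total_size);
    live->len = total_size;
    return true;
}
```

The point of the split is that a failure at any time before staging_commit()
succeeds leaves the old state usable for failover.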

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index 25f85b2..1339774 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -287,6 +287,9 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
+    QEMUFile *fb = NULL;
+    QEMUSizedBuffer *buffer = NULL; /* Cache incoming device state */
+    int  total_size;
     int fd, ret = 0;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -310,6 +313,12 @@ void *colo_process_incoming_thread(void *opaque)
         goto out;
     }
 
+    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
+    if (buffer == NULL) {
+        error_report("Failed to allocate colo buffer!");
+        goto out;
+    }
+
     ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
     if (ret < 0) {
         goto out;
@@ -337,19 +346,50 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        /* TODO Load VM state */
+        /* read the VM state total size first */
+        total_size = colo_ctl_get(mis->from_src_file,
+                                  COLO_COMMAND_VMSTATE_SIZE);
+        if (total_size <= 0) {
+            goto out;
+        }
+
+        /* read vm device state into colo buffer */
+        ret = qsb_fill_buffer(buffer, mis->from_src_file, total_size);
+        if (ret != total_size) {
+            error_report("can't get all migration data");
+            goto out;
+        }
 
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
         if (ret < 0) {
             goto out;
         }
 
+        /* open colo buffer for read */
+        fb = qemu_bufopen("r", buffer);
+        if (!fb) {
+            error_report("can't open colo buffer for read");
+            goto out;
+        }
+
+        qemu_mutex_lock_iothread();
+        qemu_system_reset(VMRESET_SILENT);
+        if (qemu_loadvm_state(fb) < 0) {
+            error_report("COLO: loadvm failed");
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+        qemu_mutex_unlock_iothread();
+
         /* TODO: flush vm state */
 
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
         if (ret < 0) {
             goto out;
         }
+
+        qemu_fclose(fb);
+        fb = NULL;
     }
 
 out:
@@ -358,6 +398,11 @@ out:
                      strerror(-ret));
     }
 
+    if (fb) {
+        qemu_fclose(fb);
+    }
+    qsb_free(buffer);
+
     qemu_mutex_lock_iothread();
     colo_release_ram_cache();
     qemu_mutex_unlock_iothread();
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (13 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 16:19   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory zhanghailiang
                   ` (22 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

We need to record the addresses of the dirty pages received from the PVM.
This helps when flushing the cached pages into the SVM's RAM.
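
The bookkeeping described here amounts to a test-and-set on a per-page bitmap
plus a counter that is bumped only on the first hit. A minimal standalone
sketch follows, assuming 4 KiB pages and a non-atomic bitmap;
DEMO_PAGE_BITS, bitmap_test_and_set() and record_received_page() are
illustrative names, not QEMU APIs.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_BITS 12   /* assumed 4 KiB pages */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Set bit nr and return its previous value (non-atomic, for brevity). */
bool bitmap_test_and_set(unsigned long *bitmap, size_t nr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long *word = &bitmap[nr / BITS_PER_WORD];
    bool was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/* Record that the page containing addr was received; count it only once. */
void record_received_page(unsigned long *bitmap, size_t *dirty_pages,
                          unsigned long addr)
{
    size_t page = addr >> DEMO_PAGE_BITS;

    if (!bitmap_test_and_set(bitmap, page)) {
        (*dirty_pages)++;
    }
}
```

Counting only on the first set keeps the counter equal to the number of set
bits, which is what a later flush pass can rely on.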

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
v10:
- New patch split from v9's patch 13
- Rebase to master to use 'migration_bitmap_rcu'
---
 migration/ram.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index b094dc3..70879bd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1448,6 +1448,18 @@ static inline void *host_from_stream_offset(QEMUFile *f,
         }
 
         if (ram_cache_enable) {
+            unsigned long *bitmap;
+            long k = (block->mr->ram_addr + offset) >> TARGET_PAGE_BITS;
+
+            bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
+            /*
+             * During a COLO checkpoint, we need a bitmap of the migrated
+             * pages. It helps us decide which pages in the RAM cache should
+             * be flushed into the VM's RAM later.
+             */
+            if (!test_and_set_bit(k, bitmap)) {
+                migration_dirty_pages++;
+            }
             return block->host_cache + offset;
         } else {
             return block->host + offset;
@@ -1462,6 +1474,13 @@ static inline void *host_from_stream_offset(QEMUFile *f,
         if (!strncmp(id, block->idstr, sizeof(id)) &&
             block->max_length > offset) {
             if (ram_cache_enable) {
+                unsigned long *bitmap;
+                long k = (block->mr->ram_addr + offset) >> TARGET_PAGE_BITS;
+
+                bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
+                if (!test_and_set_bit(k, bitmap)) {
+                    migration_dirty_pages++;
+                }
                 return block->host_cache + offset;
             } else {
                 return block->host + offset;
@@ -1723,6 +1742,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 int colo_init_ram_cache(void)
 {
     RAMBlock *block;
+    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
 
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
@@ -1734,6 +1754,15 @@ int colo_init_ram_cache(void)
     }
     rcu_read_unlock();
     ram_cache_enable = true;
+    /*
+     * Record the dirty pages sent by the PVM. We use this dirty bitmap to
+     * decide which pages in the cache should be flushed into the SVM's RAM.
+     * Here we use the same name 'migration_bitmap_rcu' as for migration.
+     */
+    migration_bitmap_rcu = g_new(struct BitmapRcu, 1);
+    migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
+    migration_dirty_pages = 0;
+
     return 0;
 
 out_locked:
@@ -1751,9 +1780,15 @@ out_locked:
 void colo_release_ram_cache(void)
 {
     RAMBlock *block;
+    struct BitmapRcu *bitmap = migration_bitmap_rcu;
 
     ram_cache_enable = false;
 
+    atomic_rcu_set(&migration_bitmap_rcu, NULL);
+    if (bitmap) {
+        call_rcu(bitmap, migration_bitmap_free, rcu);
+    }
+
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (block->host_cache) {
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (14 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 16:38   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically zhanghailiang
                   ` (21 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

While the VMs are running, the PVM may dirty some pages. We transfer the
PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the next
checkpoint. So the content of the SVM's RAM cache is always the same as the
PVM's memory after a checkpoint.

Instead of flushing the whole content of the RAM cache into the SVM's memory,
we do this in a more efficient way: only flush the pages that were dirtied by
the PVM since the last checkpoint. In this way, we can ensure that the SVM's
memory is the same as the PVM's.

Besides, we must make sure to flush the RAM cache before loading the device
state.
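
The selective flush can be illustrated with a small standalone sketch. It is
an assumption-laden simplification: a single contiguous region, a tiny
DEMO_PAGE_SIZE, and a plain linear scan of the bitmap rather than the
find-and-reset helper used in the patch. flush_dirty_pages() is a
hypothetical name.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 16   /* tiny page size, for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Copy every page whose bit is set from cache to ram, clearing the bit.
 * Returns the number of pages copied.
 */
size_t flush_dirty_pages(unsigned char *ram, const unsigned char *cache,
                         unsigned long *bitmap, size_t npages)
{
    size_t flushed = 0;

    for (size_t page = 0; page < npages; page++) {
        unsigned long mask = 1UL << (page % BITS_PER_WORD);
        unsigned long *word = &bitmap[page / BITS_PER_WORD];

        if (*word & mask) {
            memcpy(ram + page * DEMO_PAGE_SIZE,
                   cache + page * DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);
            *word &= ~mask;     /* page is clean again */
            flushed++;
        }
    }
    return flushed;
}
```

Pages whose bit is clear are never touched, so the cost of a flush scales with
the number of dirtied pages rather than with the size of guest RAM.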

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
v10: trace the number of dirty pages received.
---
 include/migration/colo.h |  1 +
 migration/colo.c         |  2 --
 migration/ram.c          | 40 ++++++++++++++++++++++++++++++++++++++++
 trace-events             |  1 +
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 8edd5f1..be2890b 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -32,4 +32,5 @@ bool migration_incoming_in_colo_state(void);
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
+void colo_flush_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 1339774..0efab21 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -381,8 +381,6 @@ void *colo_process_incoming_thread(void *opaque)
         }
         qemu_mutex_unlock_iothread();
 
-        /* TODO: flush vm state */
-
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
         if (ret < 0) {
             goto out;
diff --git a/migration/ram.c b/migration/ram.c
index 70879bd..d7e0e37 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1601,6 +1601,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     int flags = 0, ret = 0;
     static uint64_t seq_iter;
     int len = 0;
+    bool need_flush = false;
 
     seq_iter++;
 
@@ -1669,6 +1670,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+
+            need_flush = true;
             ch = qemu_get_byte(f);
             ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
             break;
@@ -1679,6 +1682,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+
+            need_flush = true;
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
             break;
         case RAM_SAVE_FLAG_COMPRESS_PAGE:
@@ -1711,6 +1716,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
+            need_flush = true;
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
@@ -1730,6 +1736,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     }
 
     rcu_read_unlock();
+
+    if (!ret  && ram_cache_enable && need_flush) {
+        DPRINTF("Flush ram_cache\n");
+        colo_flush_ram_cache();
+    }
     DPRINTF("Completed load of VM with exit code %d seq iteration "
             "%" PRIu64 "\n", ret, seq_iter);
     return ret;
@@ -1799,6 +1810,35 @@ void colo_release_ram_cache(void)
     rcu_read_unlock();
 }
 
+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that were dirtied by PVM or SVM or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    ram_addr_t  offset = 0;
+
+    trace_colo_flush_ram_cache(migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+    while (block) {
+        offset = migration_bitmap_find_and_reset_dirty(block, offset);
+        if (offset >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            dst_host = block->host + offset;
+            src_host = block->host_cache + offset;
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+    rcu_read_unlock();
+    assert(migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/trace-events b/trace-events
index ee4679c..c98bc13 100644
--- a/trace-events
+++ b/trace-events
@@ -1232,6 +1232,7 @@ qemu_file_fclose(void) ""
 migration_bitmap_sync_start(void) ""
 migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64""
 migration_throttle(void) ""
+colo_flush_ram_cache(uint64_t dirty_pages) "dirty_pages %" PRIu64""
 
 # hw/display/qxl.c
 disable qxl_interface_set_mm_time(int qid, uint32_t mm_time) "%d %d"
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (15 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 18:34   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover zhanghailiang
                   ` (20 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Do a checkpoint periodically; the default interval is 200ms.
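
The timing rule can be sketched with a simulated clock. This is only an
illustration under stated assumptions: checkpoint_due() and
count_checkpoints() are hypothetical helpers, and the "last checkpoint" time
is taken when the checkpoint starts, whereas the loop in this patch samples
the clock again after the transaction completes.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHECKPOINT_PERIOD_MS 200   /* default interval named in the text */

/* True once at least CHECKPOINT_PERIOD_MS elapsed since the last checkpoint. */
bool checkpoint_due(int64_t now_ms, int64_t last_checkpoint_ms)
{
    return now_ms - last_checkpoint_ms >= CHECKPOINT_PERIOD_MS;
}

/*
 * Drive a simulated clock from start_ms to end_ms in poll_ms steps and
 * count how many checkpoints would be taken.
 */
int count_checkpoints(int64_t start_ms, int64_t end_ms, int64_t poll_ms)
{
    int64_t last = start_ms;
    int taken = 0;

    for (int64_t now = start_ms; now <= end_ms; now += poll_ms) {
        if (!checkpoint_due(now, last)) {
            continue;           /* a real loop would sleep here */
        }
        taken++;                /* stand-in for the checkpoint transaction */
        last = now;
    }
    return taken;
}
```

Because the loop only looks at the clock once per poll step, the effective
interval is rounded up to a multiple of that step.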

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 0efab21..a6791f4 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -11,12 +11,19 @@
  */
 
 #include <unistd.h>
+#include "qemu/timer.h"
 #include "sysemu/sysemu.h"
 #include "migration/colo.h"
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 
+/*
+ * checkpoint interval: unit ms
+ * Note: Please change this default value to 10000 when we support hybrid mode.
+ */
+#define CHECKPOINT_MAX_PEROID 200
+
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -183,6 +190,7 @@ out:
 static void colo_process_checkpoint(MigrationState *s)
 {
     QEMUSizedBuffer *buffer = NULL;
+    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int fd, ret = 0;
 
     /* Dup the fd of to_dst_file */
@@ -220,11 +228,17 @@ static void colo_process_checkpoint(MigrationState *s)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
+            g_usleep(100000);
+            continue;
+        }
         /* start a colo checkpoint */
         ret = colo_do_checkpoint_transaction(s, buffer);
         if (ret < 0) {
             goto out;
         }
+        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     }
 
 out:
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (16 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-13 16:59   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process zhanghailiang
                   ` (19 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Luiz Capitulino, zhanghailiang

We leave users free to choose whatever heartbeat solution they want. If the
heartbeat is lost, or they detect other errors, they can use the experimental
command 'x_colo_lost_heartbeat' to tell COLO to do failover, and COLO will act
accordingly.

For example, if the command is sent to the PVM, the Primary side will
exit COLO mode and take over operation. If sent to the Secondary, the
secondary will run failover work, then take over server operation to
become the new Primary.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
v10: Rename command colo_lost_hearbeat to experimental 'x_colo_lost_heartbeat'
---
 hmp-commands.hx              | 15 +++++++++++++++
 hmp.c                        |  8 ++++++++
 hmp.h                        |  1 +
 include/migration/colo.h     |  4 ++++
 include/migration/failover.h | 20 ++++++++++++++++++++
 migration/Makefile.objs      |  2 +-
 migration/colo-comm.c        | 11 +++++++++++
 migration/colo-failover.c    | 41 +++++++++++++++++++++++++++++++++++++++++
 migration/colo.c             |  1 +
 qapi-schema.json             | 26 ++++++++++++++++++++++++++
 qmp-commands.hx              | 19 +++++++++++++++++++
 stubs/migration-colo.c       |  8 ++++++++
 12 files changed, 155 insertions(+), 1 deletion(-)
 create mode 100644 include/migration/failover.h
 create mode 100644 migration/colo-failover.c

diff --git a/hmp-commands.hx b/hmp-commands.hx
index a493101..9d12ba8 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1071,6 +1071,21 @@ Set the parameter @var{parameter} for migration.
 ETEXI
 
     {
+        .name       = "x_colo_lost_heartbeat",
+        .args_type  = "",
+        .params     = "",
+        .help       = "Tell COLO that heartbeat is lost,\n\t\t\t"
+                      "a failover or takeover is needed.",
+        .mhandler.cmd = hmp_x_colo_lost_heartbeat,
+    },
+
+STEXI
+@item x_colo_lost_heartbeat
+@findex x_colo_lost_heartbeat
+Tell COLO that heartbeat is lost, a failover or takeover is needed.
+ETEXI
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/hmp.c b/hmp.c
index b46ccc2..5e416d0 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1294,6 +1294,14 @@ void hmp_client_migrate_info(Monitor *mon, const QDict *qdict)
     hmp_handle_error(mon, &err);
 }
 
+void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict)
+{
+    Error *err = NULL;
+
+    qmp_x_colo_lost_heartbeat(&err);
+    hmp_handle_error(mon, &err);
+}
+
 void hmp_set_password(Monitor *mon, const QDict *qdict)
 {
     const char *protocol  = qdict_get_str(qdict, "protocol");
diff --git a/hmp.h b/hmp.h
index 1783bdf..9996c3b 100644
--- a/hmp.h
+++ b/hmp.h
@@ -69,6 +69,7 @@ void hmp_migrate_set_capability(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict);
 void hmp_migrate_set_cache_size(Monitor *mon, const QDict *qdict);
 void hmp_client_migrate_info(Monitor *mon, const QDict *qdict);
+void hmp_x_colo_lost_heartbeat(Monitor *mon, const QDict *qdict);
 void hmp_set_password(Monitor *mon, const QDict *qdict);
 void hmp_expire_password(Monitor *mon, const QDict *qdict);
 void hmp_eject(Monitor *mon, const QDict *qdict);
diff --git a/include/migration/colo.h b/include/migration/colo.h
index be2890b..34e8127 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -17,6 +17,7 @@
 #include "migration/migration.h"
 #include "qemu/coroutine_int.h"
 #include "qemu/thread.h"
+#include "qemu/main-loop.h"
 
 bool colo_supported(void);
 void colo_info_mig_init(void);
@@ -29,6 +30,9 @@ bool migration_incoming_enable_colo(void);
 void migration_incoming_exit_colo(void);
 void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
+
+int get_colo_mode(void);
+
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
diff --git a/include/migration/failover.h b/include/migration/failover.h
new file mode 100644
index 0000000..1785b52
--- /dev/null
+++ b/include/migration/failover.h
@@ -0,0 +1,20 @@
+/*
+ *  COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ *  (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO.,LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_FAILOVER_H
+#define QEMU_FAILOVER_H
+
+#include "qemu-common.h"
+
+void failover_request_active(Error **errp);
+
+#endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index cb7bd30..50d8392 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y += migration.o tcp.o
-common-obj-$(CONFIG_COLO) += colo.o
 common-obj-y += colo-comm.o
+common-obj-$(CONFIG_COLO) += colo.o colo-failover.o
 common-obj-y += vmstate.o
 common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
 common-obj-y += xbzrle.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
index 30df3d3..61a5fb3 100644
--- a/migration/colo-comm.c
+++ b/migration/colo-comm.c
@@ -20,6 +20,17 @@ typedef struct {
 
 static COLOInfo colo_info;
 
+int get_colo_mode(void)
+{
+    if (migration_in_colo_state()) {
+        return COLO_MODE_PRIMARY;
+    } else if (migration_incoming_in_colo_state()) {
+        return COLO_MODE_SECONDARY;
+    } else {
+        return COLO_MODE_UNKNOWN;
+    }
+}
+
 static void colo_info_pre_save(void *opaque)
 {
     COLOInfo *s = opaque;
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
new file mode 100644
index 0000000..e3897c6
--- /dev/null
+++ b/migration/colo-failover.c
@@ -0,0 +1,41 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2015 FUJITSU LIMITED
+ * Copyright (c) 2015 Intel Corporation
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "migration/colo.h"
+#include "migration/failover.h"
+#include "qmp-commands.h"
+#include "qapi/qmp/qerror.h"
+
+static QEMUBH *failover_bh;
+
+static void colo_failover_bh(void *opaque)
+{
+    qemu_bh_delete(failover_bh);
+    failover_bh = NULL;
+    /*TODO: Do failover work */
+}
+
+void failover_request_active(Error **errp)
+{
+    failover_bh = qemu_bh_new(colo_failover_bh, NULL);
+    qemu_bh_schedule(failover_bh);
+}
+
+void qmp_x_colo_lost_heartbeat(Error **errp)
+{
+    if (get_colo_mode() == COLO_MODE_UNKNOWN) {
+        error_setg(errp, QERR_FEATURE_DISABLED, "colo");
+        return;
+    }
+
+    failover_request_active(errp);
+}
diff --git a/migration/colo.c b/migration/colo.c
index a6791f4..64daee9 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -17,6 +17,7 @@
 #include "trace.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
+#include "migration/failover.h"
 
 /*
  * checkpoint interval: unit ms
diff --git a/qapi-schema.json b/qapi-schema.json
index 49f2a90..ff0e941 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -734,6 +734,32 @@
             'checkpoint-reply', 'vmstate-send', 'vmstate-size',
             'vmstate-received', 'vmstate-loaded' ] }
 
+##
+# @COLOMode
+#
+# The colo mode
+#
+# @unknown: unknown mode
+#
+# @primary: master side
+#
+# @secondary: slave side
+#
+# Since: 2.5
+##
+{ 'enum': 'COLOMode',
+  'data': [ 'unknown', 'primary', 'secondary'] }
+
+##
+# @x-colo-lost-heartbeat
+#
+# Tell qemu that heartbeat is lost, request it to do takeover procedures.
+#
+# Since: 2.5
+##
+{ 'command': 'x-colo-lost-heartbeat' }
+
+##
 # @MouseInfo:
 #
 # Information about a mouse device.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 2f885b3..eccdd17 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -786,6 +786,25 @@ Example:
 EQMP
 
     {
+        .name       = "x-colo-lost-heartbeat",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_x_colo_lost_heartbeat,
+    },
+
+SQMP
+x-colo-lost-heartbeat
+--------------------
+
+Tell COLO that heartbeat is lost, a failover or takeover is needed.
+
+Example:
+
+-> { "execute": "x-colo-lost-heartbeat" }
+<- { "return": {} }
+
+EQMP
+
+    {
         .name       = "client_migrate_info",
         .args_type  = "protocol:s,hostname:s,port:i?,tls-port:i?,cert-subject:s?",
         .params     = "protocol hostname port tls-port cert-subject",
diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index c12516e..5028f63 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -11,6 +11,7 @@
  */
 
 #include "migration/colo.h"
+#include "qmp-commands.h"
 
 bool colo_supported(void)
 {
@@ -35,3 +36,10 @@ void *colo_process_incoming_thread(void *opaque)
 {
     return NULL;
 }
+
+void qmp_x_colo_lost_heartbeat(Error **errp)
+{
+    error_setg(errp, "COLO is not supported, please rerun configure"
+                     " with --enable-colo option in order to support"
+                     " COLO feature");
+}
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (17 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-20 15:51   ` Dr. David Alan Gilbert
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 20/38] COLO: Implement failover work for Primary VM zhanghailiang
                   ` (18 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

When handling failover, we do different things depending on the stage of the
failover process. Here we introduce a global atomic variable to record the
failover status.

We add four failover statuses to indicate the different stages of the failover
process. You should use the helpers to get and set the value.
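
The compare-and-swap transition behind such helpers can be sketched with C11
atomics. This is a standalone illustration, not the QEMU implementation: it
uses <stdatomic.h> instead of QEMU's atomic_cmpxchg(), and the demo_ names
and FAILOVER_ constants are hypothetical.

```c
#include <stdatomic.h>

enum {
    FAILOVER_NONE = 0,
    FAILOVER_REQUEST,       /* requested but not yet handled */
    FAILOVER_HANDLING,      /* failover work in progress */
    FAILOVER_COMPLETED,     /* failover finished */
};

static atomic_int demo_failover_state = FAILOVER_NONE;

/*
 * Move from old_state to new_state only if the current state is old_state.
 * Returns the state observed before the attempt, so the caller can compare
 * it with old_state to learn whether the transition happened.
 */
int demo_failover_set_state(int old_state, int new_state)
{
    int expected = old_state;

    /* On failure, expected is overwritten with the current state. */
    atomic_compare_exchange_strong(&demo_failover_state, &expected,
                                   new_state);
    return expected;
}

int demo_failover_get_state(void)
{
    return atomic_load(&demo_failover_state);
}
```

With this shape, a second request arriving while one is pending sees a return
value other than FAILOVER_NONE and can be rejected without taking a lock.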

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 include/migration/failover.h | 10 ++++++++++
 migration/colo-failover.c    | 37 +++++++++++++++++++++++++++++++++++++
 migration/colo.c             |  4 ++++
 trace-events                 |  1 +
 4 files changed, 52 insertions(+)

diff --git a/include/migration/failover.h b/include/migration/failover.h
index 1785b52..882c625 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -15,6 +15,16 @@
 
 #include "qemu-common.h"
 
+typedef enum COLOFailoverStatus {
+    FAILOVER_STATUS_NONE = 0,
+    FAILOVER_STATUS_REQUEST = 1, /* Request but not handled */
+    FAILOVER_STATUS_HANDLING = 2, /* In the process of handling failover */
+    FAILOVER_STATUS_COMPLETED = 3, /* Finish the failover process */
+} COLOFailoverStatus;
+
+void failover_init_state(void);
+int failover_set_state(int old_state, int new_state);
+int failover_get_state(void);
 void failover_request_active(Error **errp);
 
 #endif
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index e3897c6..ae06c16 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -14,22 +14,59 @@
 #include "migration/failover.h"
 #include "qmp-commands.h"
 #include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+#include "trace.h"
 
 static QEMUBH *failover_bh;
+static COLOFailoverStatus failover_state;
 
 static void colo_failover_bh(void *opaque)
 {
+    int old_state;
+
     qemu_bh_delete(failover_bh);
     failover_bh = NULL;
+    old_state = failover_set_state(FAILOVER_STATUS_REQUEST,
+                                   FAILOVER_STATUS_HANDLING);
+    if (old_state != FAILOVER_STATUS_REQUEST) {
+        error_report("Unknown error for failover, old_state=%d", old_state);
+        return;
+    }
     /*TODO: Do failover work */
 }
 
 void failover_request_active(Error **errp)
 {
+    if (failover_set_state(FAILOVER_STATUS_NONE, FAILOVER_STATUS_REQUEST)
+        != FAILOVER_STATUS_NONE) {
+        error_setg(errp, "COLO failover is already active");
+        return;
+    }
     failover_bh = qemu_bh_new(colo_failover_bh, NULL);
     qemu_bh_schedule(failover_bh);
 }
 
+void failover_init_state(void)
+{
+    failover_state = FAILOVER_STATUS_NONE;
+}
+
+int failover_set_state(int old_state, int new_state)
+{
+    int old;
+
+    old = atomic_cmpxchg(&failover_state, old_state, new_state);
+    if (old == old_state) {
+        trace_colo_failover_set_state(new_state);
+    }
+    return old;
+}
+
+int failover_get_state(void)
+{
+    return atomic_read(&failover_state);
+}
+
 void qmp_x_colo_lost_heartbeat(Error **errp)
 {
     if (get_colo_mode() == COLO_MODE_UNKNOWN) {
diff --git a/migration/colo.c b/migration/colo.c
index 64daee9..7732f60 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -194,6 +194,8 @@ static void colo_process_checkpoint(MigrationState *s)
     int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int fd, ret = 0;
 
+    failover_init_state();
+
     /* Dup the fd of to_dst_file */
     fd = dup(qemu_get_fd(s->to_dst_file));
     if (fd == -1) {
@@ -310,6 +312,8 @@ void *colo_process_incoming_thread(void *opaque)
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
 
+    failover_init_state();
+
     fd = dup(qemu_get_fd(mis->from_src_file));
     if (fd < 0) {
         ret = -errno;
diff --git a/trace-events b/trace-events
index c98bc13..61e89c7 100644
--- a/trace-events
+++ b/trace-events
@@ -1502,6 +1502,7 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
 colo_ctl_get(const char *msg) "Receive '%s' cmd"
+colo_failover_set_state(int new_state) "new state %d"
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 20/38] COLO: Implement failover work for Primary VM
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (18 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 21/38] COLO: Implement failover work for Secondary VM zhanghailiang
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

For the Primary VM (PVM), when users request failover, the colo thread
exits its checkpoint loop while the failover BH does the cleanup work and
resumes the VM.
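
The handshake between the request, the colo thread and the BH can be
sketched in isolation as below. This is only an illustration using C11
atomics instead of QEMU's atomic_cmpxchg(); the fo_* names are
hypothetical, and FAILOVER_STATUS_NONE is assumed to be 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Same numbering as COLOFailoverStatus (NONE == 0 is an assumption). */
enum { FO_NONE = 0, FO_REQUEST = 1, FO_HANDLING = 2, FO_COMPLETED = 3 };

static atomic_int fo_state = FO_NONE;

/* Counterpart of failover_set_state(): move to 'to' only if the current
 * state is 'from', and return the state seen before the attempt. */
static int fo_set_state(int from, int to)
{
    int seen = from;

    atomic_compare_exchange_strong(&fo_state, &seen, to);
    return seen; /* == from on success, the conflicting state otherwise */
}

/* Counterpart of failover_request_is_active(); the colo thread polls this
 * at the top of its checkpoint loop and leaves the loop when it is true. */
static bool fo_request_is_active(void)
{
    return atomic_load(&fo_state) != FO_NONE;
}

/* Hypothetical stand-in for the failover BH on the primary side. */
static bool fo_run_bh(void)
{
    if (fo_set_state(FO_REQUEST, FO_HANDLING) != FO_REQUEST) {
        return false; /* nothing was requested */
    }
    /* migration_end() and vm_start() would run here in the real code. */
    return fo_set_state(FO_HANDLING, FO_COMPLETED) == FO_HANDLING;
}

/* Walk through one user-requested failover. */
static void fo_walk_through(void)
{
    assert(!fo_request_is_active());
    assert(fo_set_state(FO_NONE, FO_REQUEST) == FO_NONE);    /* user request */
    assert(fo_request_is_active());                          /* loop exits */
    assert(fo_set_state(FO_NONE, FO_REQUEST) == FO_REQUEST); /* duplicate */
    assert(fo_run_bh());
}
```

Because every transition is a compare-and-swap, a second request or a BH
that runs without a pending request is detected instead of silently
overwriting the state.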

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
v10: Call migration_end() in primary_vm_do_failover()
---
 include/migration/colo.h      |  4 +++
 include/migration/failover.h  |  1 +
 include/migration/migration.h |  1 +
 migration/colo-failover.c     |  7 ++++-
 migration/colo.c              | 59 +++++++++++++++++++++++++++++++++++++++++--
 migration/ram.c               |  2 +-
 6 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 34e8127..3e375c1 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,8 @@ int get_colo_mode(void);
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
 void colo_flush_ram_cache(void);
+
+/* failover */
+void colo_do_failover(MigrationState *s);
+
 #endif
diff --git a/include/migration/failover.h b/include/migration/failover.h
index 882c625..fba3931 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -26,5 +26,6 @@ void failover_init_state(void);
 int failover_set_state(int old_state, int new_state);
 int failover_get_state(void);
 void failover_request_active(Error **errp);
+bool failover_request_is_active(void);
 
 #endif
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 0c0309d..406874f 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -210,6 +210,7 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                              uint64_t *bytes_sent);
 
 void ram_mig_init(void);
+void migration_end(void);
 void savevm_skip_section_footers(void);
 void register_global_state(void);
 void global_state_set_optional(void);
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index ae06c16..33c82c1 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -32,7 +32,7 @@ static void colo_failover_bh(void *opaque)
         error_report(" Unkown error for failover, old_state=%d", old_state);
         return;
     }
-    /*TODO: Do failover work */
+    colo_do_failover(NULL);
 }
 
 void failover_request_active(Error **errp)
@@ -67,6 +67,11 @@ int failover_get_state(void)
     return atomic_read(&failover_state);
 }
 
+bool failover_request_is_active(void)
+{
+    return ((failover_get_state() != FAILOVER_STATUS_NONE));
+}
+
 void qmp_x_colo_lost_heartbeat(Error **errp)
 {
     if (get_colo_mode() == COLO_MODE_UNKNOWN) {
diff --git a/migration/colo.c b/migration/colo.c
index 7732f60..95f1405 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -47,6 +47,45 @@ bool migration_incoming_in_colo_state(void)
     return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+static bool colo_runstate_is_stopped(void)
+{
+    return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
+}
+
+static void primary_vm_do_failover(void)
+{
+    MigrationState *s = migrate_get_current();
+    int old_state;
+
+    if (s->state != MIGRATION_STATUS_FAILED) {
+        migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
+                          MIGRATION_STATUS_COMPLETED);
+    }
+    migration_end();
+
+    vm_start();
+
+    old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+                                   FAILOVER_STATUS_COMPLETED);
+    if (old_state != FAILOVER_STATUS_HANDLING) {
+        error_report("Serious error while do failover for Primary VM,"
+                     "old_state: %d", old_state);
+        return;
+    }
+}
+
+void colo_do_failover(MigrationState *s)
+{
+    /* Make sure vm stopped while failover */
+    if (!colo_runstate_is_stopped()) {
+        vm_stop_force_state(RUN_STATE_COLO);
+    }
+
+    if (get_colo_mode() == COLO_MODE_PRIMARY) {
+        primary_vm_do_failover();
+    }
+}
+
 /* colo checkpoint control helper */
 static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
 {
@@ -130,9 +169,22 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     }
 
     qemu_mutex_lock_iothread();
+    if (failover_request_is_active()) {
+        qemu_mutex_unlock_iothread();
+        ret = -1;
+        goto out;
+    }
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("run", "stop");
+    /*
+     * failover request bh could be called after
+     * vm_stop_force_state so we check failover_request_is_active() again.
+     */
+    if (failover_request_is_active()) {
+        ret = -1;
+        goto out;
+    }
 
     /* Disable block migration */
     s->params.blk = 0;
@@ -231,6 +283,11 @@ static void colo_process_checkpoint(MigrationState *s)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        if (failover_request_is_active()) {
+            error_report("failover request");
+            goto out;
+        }
+
         current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
         if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
             g_usleep(100000);
@@ -248,8 +305,6 @@ out:
     if (ret < 0) {
         error_report("%s: %s", __func__, strerror(-ret));
     }
-    migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
-                      MIGRATION_STATUS_COMPLETED);
 
     qsb_free(buffer);
     buffer = NULL;
diff --git a/migration/ram.c b/migration/ram.c
index d7e0e37..8de5a5f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1102,7 +1102,7 @@ static void migration_bitmap_free(struct BitmapRcu *bmap)
     g_free(bmap);
 }
 
-static void migration_end(void)
+void migration_end(void)
 {
     /* caller have hold iothread lock or is in a bh, so there is
      * no writing race against this migration_bitmap
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 21/38] COLO: Implement failover work for Secondary VM
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (19 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 20/38] COLO: Implement failover work for Primary VM zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 22/38] COLO: implement default failover treatment zhanghailiang
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

If users require the Secondary VM (SVM) to take over, the colo incoming
thread should exit its loop while the failover BH switches back to the
migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c | 41 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 95f1405..925a694 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -52,6 +52,33 @@ static bool colo_runstate_is_stopped(void)
     return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
+static void secondary_vm_do_failover(void)
+{
+    int old_state;
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
+                      MIGRATION_STATUS_COMPLETED);
+
+    if (!autostart) {
+        error_report("\"-S\" qemu option will be ignored in secondary side");
+        /* recover runstate to normal migration finish state */
+        autostart = true;
+    }
+
+    old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+                                   FAILOVER_STATUS_COMPLETED);
+    if (old_state != FAILOVER_STATUS_HANDLING) {
+        error_report("Serious error while do failover for secondary VM,"
+                     "old_state: %d", old_state);
+        return;
+    }
+    /* For Secondary VM, jump to incoming co */
+    if (mis->migration_incoming_co) {
+        qemu_coroutine_enter(mis->migration_incoming_co, NULL);
+    }
+}
+
 static void primary_vm_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
@@ -83,6 +110,8 @@ void colo_do_failover(MigrationState *s)
 
     if (get_colo_mode() == COLO_MODE_PRIMARY) {
         primary_vm_do_failover();
+    } else {
+        secondary_vm_do_failover();
     }
 }
 
@@ -410,6 +439,11 @@ void *colo_process_incoming_thread(void *opaque)
             }
         }
 
+        if (failover_request_is_active()) {
+            error_report("failover request");
+            goto out;
+        }
+
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0);
         if (ret < 0) {
             goto out;
@@ -474,10 +508,11 @@ out:
         qemu_fclose(fb);
     }
     qsb_free(buffer);
-
-    qemu_mutex_lock_iothread();
+    /* Here, we can ensure BH is hold the global lock, and will join colo
+    * incoming thread, so here it is not necessary to lock here again,
+    * or there will be a deadlock error.
+    */
     colo_release_ram_cache();
-    qemu_mutex_unlock_iothread();
 
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 22/38] COLO: implement default failover treatment
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (20 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 21/38] COLO: Implement failover work for Secondary VM zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error zhanghailiang
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

If we detect an error in COLO, we wait for some time, hoping that users
also detect it. If users don't issue a failover command within that time,
we go into the default failover procedure, in which the PVM takes over
the work while the SVM exits.
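
The grace period can be sketched as a small polling helper. This is an
illustration only, not the code in this patch: wait_for_user_failover()
and the two callbacks are hypothetical names, and it reads
CLOCK_MONOTONIC where the patch uses QEMU_CLOCK_HOST.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define POLL_INTERVAL_MS 100 /* same step as the usleep() in the patch */

typedef bool (*request_pending_fn)(void); /* hypothetical callback type */

static int64_t now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

    nanosleep(&ts, NULL);
}

/* Poll 'pending' for up to delay_ms. Returns true if a failover request
 * showed up in time; false means the caller applies the default policy
 * (PVM takes over, SVM exits). */
static bool wait_for_user_failover(request_pending_fn pending,
                                   int64_t delay_ms)
{
    int64_t start = now_ms();

    while (now_ms() - start <= delay_ms) {
        if (pending()) {
            return true;
        }
        sleep_ms(POLL_INTERVAL_MS);
    }
    return pending(); /* check the flag once more after the deadline */
}

/* Trivial callbacks for trying the helper out. */
static bool always_pending(void) { return true; }
static bool never_pending(void) { return false; }

static void grace_period_demo(void)
{
    assert(wait_for_user_failover(always_pending, 200));
    assert(!wait_for_user_failover(never_pending, 200));
}
```

A monotonic clock keeps the delay unaffected by wall-clock adjustments;
whether that matters here is a design choice for the real code.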

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 925a694..de6265e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -25,6 +25,14 @@
  */
 #define CHECKPOINT_MAX_PEROID 200
 
+/*
+ * The delay time before qemu begin the procedure of default failover treatment.
+ * Unit: ms
+ * Fix me: This value should be able to change by command
+ * 'migrate-set-parameters'
+ */
+#define DEFAULT_FAILOVER_DELAY 2000
+
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -273,6 +281,7 @@ static void colo_process_checkpoint(MigrationState *s)
 {
     QEMUSizedBuffer *buffer = NULL;
     int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    int64_t error_time;
     int fd, ret = 0;
 
     failover_init_state();
@@ -331,8 +340,25 @@ static void colo_process_checkpoint(MigrationState *s)
     }
 
 out:
+    current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     if (ret < 0) {
         error_report("%s: %s", __func__, strerror(-ret));
+        /* Give users time to get involved in this verdict */
+        while (current_time - error_time <= DEFAULT_FAILOVER_DELAY) {
+            if (failover_request_is_active()) {
+                error_report("Primary VM will take over work");
+                break;
+            }
+            usleep(100 * 1000);
+            current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+        }
+
+        qemu_mutex_lock_iothread();
+        if (!failover_request_is_active()) {
+            error_report("Primary VM will take over work in default");
+            failover_request_active(NULL);
+        }
+        qemu_mutex_unlock_iothread();
     }
 
     qsb_free(buffer);
@@ -391,6 +417,7 @@ void *colo_process_incoming_thread(void *opaque)
     QEMUFile *fb = NULL;
     QEMUSizedBuffer *buffer = NULL; /* Cache incoming device state */
     int  total_size;
+    int64_t error_time, current_time;
     int fd, ret = 0;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -499,9 +526,28 @@ void *colo_process_incoming_thread(void *opaque)
     }
 
 out:
+    current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     if (ret < 0) {
         error_report("colo incoming thread will exit, detect error: %s",
                      strerror(-ret));
+        /* Give users time to get involved in this verdict */
+        while (current_time - error_time <= DEFAULT_FAILOVER_DELAY) {
+            if (failover_request_is_active()) {
+                error_report("Secondary VM will take over work");
+                break;
+            }
+            usleep(100 * 1000);
+            current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+        }
+        /* check flag again*/
+        if (!failover_request_is_active()) {
+            /*
+            * We assume that Primary VM is still alive according to
+            * heartbeat, just kill Secondary VM
+            */
+            error_report("SVM is going to exit in default!");
+            exit(1);
+        }
     }
 
     if (fb) {
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (21 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 22/38] COLO: implement default failover treatment zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-20 21:50   ` Eric Blake
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 24/38] COLO failover: Shutdown related socket fd when do failover zhanghailiang
                   ` (14 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Michael Roth, zhanghailiang

If an error happens during the VM's COLO FT stage, it's important to
notify the users of this event. Together with 'colo_lost_heartbeat',
users can then intervene in COLO's failover work immediately.
Even if users don't want to get involved in COLO's failover verdict,
it is still necessary to notify them that we have exited COLO mode.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 docs/qmp-events.txt | 17 +++++++++++++++++
 migration/colo.c    | 13 +++++++++++++
 qapi-schema.json    | 16 ++++++++++++++++
 qapi/event.json     | 17 +++++++++++++++++
 4 files changed, 63 insertions(+)

diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
index d2f1ce4..165dd76 100644
--- a/docs/qmp-events.txt
+++ b/docs/qmp-events.txt
@@ -184,6 +184,23 @@ Example:
 Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
 event.
 
+COLO_EXIT
+---------
+
+Emitted when VM finishes COLO mode due to some errors happening or
+the request of users.
+
+Data:
+
+ - "mode": COLO mode, primary or secondary side (json-string)
+ - "reason":  the exit reason, internal error or external request. (json-string)
+ - "error": error message (json-string, optional)
+
+Example:
+
+{"timestamp": {"seconds": 2032141960, "microseconds": 417172},
+ "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+
 DEVICE_DELETED
 --------------
 
diff --git a/migration/colo.c b/migration/colo.c
index de6265e..247b40f 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -18,6 +18,7 @@
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 #include "migration/failover.h"
+#include "qapi-event.h"
 
 /*
  * checkpoint interval: unit ms
@@ -343,6 +344,9 @@ out:
     current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     if (ret < 0) {
         error_report("%s: %s", __func__, strerror(-ret));
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_ERROR,
+                                  true, strerror(-ret), NULL);
+
         /* Give users time to get involved in this verdict */
         while (current_time - error_time <= DEFAULT_FAILOVER_DELAY) {
             if (failover_request_is_active()) {
@@ -359,6 +363,9 @@ out:
             failover_request_active(NULL);
         }
         qemu_mutex_unlock_iothread();
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_REQUEST,
+                                  false, NULL, NULL);
     }
 
     qsb_free(buffer);
@@ -530,6 +537,9 @@ out:
     if (ret < 0) {
         error_report("colo incoming thread will exit, detect error: %s",
                      strerror(-ret));
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_ERROR,
+                                  true, strerror(-ret), NULL);
+
         /* Give users time to get involved in this verdict */
         while (current_time - error_time <= DEFAULT_FAILOVER_DELAY) {
             if (failover_request_is_active()) {
@@ -548,6 +558,9 @@ out:
             error_report("SVM is going to exit in default!");
             exit(1);
         }
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_REQUEST,
+                                  false, NULL, NULL);
     }
 
     if (fb) {
diff --git a/qapi-schema.json b/qapi-schema.json
index ff0e941..8cc1f60 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -751,6 +751,22 @@
   'data': [ 'unknown', 'primary', 'secondary'] }
 
 ##
+# @COLOExitReason
+#
+# The reason of COLO exit
+#
+# @unknown: unknown reason
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.5
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'unknown', 'request', 'error'] }
+
+##
 # @x-colo-lost-heartbeat
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
diff --git a/qapi/event.json b/qapi/event.json
index f0cef01..6158ab5 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -255,6 +255,23 @@
   'data': {'status': 'MigrationStatus'}}
 
 ##
+# @COLO_EXIT
+#
+# Emitted when VM finishes COLO mode due to some errors happening or
+# the request of users.
+#
+# @mode: @COLOMode describing which side of the VM is exiting.
+#
+# @reason: @COLOExitReason describing the reason of colo exit.
+#
+# @error: #optional, error message. Only present on error happening.
+#
+# Since: 2.5
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason', '*error': 'str' } }
+
+##
 # @ACPI_DEVICE_OST
 #
 # Emitted when guest executes ACPI _OST method.
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 24/38] COLO failover: Shutdown related socket fd when do failover
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (22 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 25/38] COLO failover: Don't do failover during loading VM's state zhanghailiang
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

If the network connection between COLO's two sides is broken while the
colo/colo incoming thread is blocked in a 'read'/'write' on the socket fd,
the thread will not detect the error until the connection times out,
which can take a long time.

Here we shut down all the related socket file descriptors in the failover
BH to wake up the blocked operation. Besides, we should close the
corresponding file descriptors only after the failover BH has shut them
down, or there will be an error.
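
The effect relied on here can be seen with plain POSIX sockets. The
sketch below is single-threaded for brevity and uses a hypothetical
helper name; it only shows that a read() which would otherwise block
returns 0 (EOF) once shutdown() has been called on the socket, and that
the fd stays valid until it is closed afterwards.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns what read() reports on a socket after shutdown():
 * 0 means EOF, -1 means an error (including setup failure). */
static ssize_t read_after_shutdown(void)
{
    int sv[2];
    char byte;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    /* The peer never writes, so without this call read() would block. */
    if (shutdown(sv[0], SHUT_RDWR) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    n = read(sv[0], &byte, 1); /* returns immediately */

    /* Close only after the shutdown has taken effect. */
    close(sv[0]);
    close(sv[1]);
    return n;
}

static void shutdown_demo(void)
{
    assert(read_after_shutdown() == 0);
}
```

Closing the descriptor first would instead risk the blocked thread
operating on an invalid or reused fd number, which is why the close has
to wait for the BH.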

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 247b40f..240ccda 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -74,6 +74,13 @@ static void secondary_vm_do_failover(void)
         /* recover runstate to normal migration finish state */
         autostart = true;
     }
+    /* Make sure colo incoming thread not block in recv */
+    if (mis->from_src_file) {
+        qemu_file_shutdown(mis->from_src_file);
+    }
+    if (mis->to_src_file) {
+        qemu_file_shutdown(mis->to_src_file);
+    }
 
     old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
                                    FAILOVER_STATUS_COMPLETED);
@@ -99,6 +106,13 @@ static void primary_vm_do_failover(void)
     }
     migration_end();
 
+    if (s->from_dst_file) { /* Make sure colo thread no block in recv */
+        qemu_file_shutdown(s->from_dst_file);
+    }
+    if (s->to_dst_file) {
+        qemu_file_shutdown(s->to_dst_file);
+    }
+
     vm_start();
 
     old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
@@ -342,7 +356,7 @@ static void colo_process_checkpoint(MigrationState *s)
 
 out:
     current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-    if (ret < 0) {
+    if (ret < 0 || (!ret && !failover_request_is_active())) {
         error_report("%s: %s", __func__, strerror(-ret));
         qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_ERROR,
                                   true, strerror(-ret), NULL);
@@ -371,6 +385,11 @@ out:
     qsb_free(buffer);
     buffer = NULL;
 
+    /* Hope this not to be too long to loop here */
+    while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
+        ;
+    }
+    /* Must be called after failover BH is completed */
     if (s->from_dst_file) {
         qemu_fclose(s->from_dst_file);
     }
@@ -534,7 +553,7 @@ void *colo_process_incoming_thread(void *opaque)
 
 out:
     current_time = error_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-    if (ret < 0) {
+    if (ret < 0 || (!ret && !failover_request_is_active())) {
         error_report("colo incoming thread will exit, detect error: %s",
                      strerror(-ret));
         qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_ERROR,
@@ -573,6 +592,11 @@ out:
     */
     colo_release_ram_cache();
 
+    /* Hope this not to be too long to loop here */
+    while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
+        ;
+    }
+    /* Must be called after failover BH is completed */
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 25/38] COLO failover: Don't do failover during loading VM's state
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (23 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 24/38] COLO failover: Shutdown related socket fd when do failover zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 26/38] COLO: Control the checkpoint delay time by migrate-set-parameters command zhanghailiang
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

We should not do failover work while the main thread is loading the
VM's state, otherwise it will destroy the consistency of the VM's memory
and device state.

Here we add a new failover status 'RELAUNCH', which means we should
relaunch the failover process.
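
A stand-alone sketch of the deferral follows. It is single-threaded,
uses C11 atomics rather than QEMU's helpers, and all rl_* names are
hypothetical; in the patch the loading flag is only touched under the
iothread lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { RL_NONE, RL_REQUEST, RL_HANDLING, RL_COMPLETED, RL_RELAUNCH };

static atomic_int rl_state = RL_NONE;
static bool rl_loading; /* plays the role of vmstate_loading */

/* Compare-and-swap that returns the state seen before the attempt. */
static int rl_cas(int from, int to)
{
    int seen = from;

    atomic_compare_exchange_strong(&rl_state, &seen, to);
    return seen;
}

/* Failover BH on the secondary: returns true if failover really ran. */
static bool rl_secondary_bh(void)
{
    if (rl_cas(RL_REQUEST, RL_HANDLING) != RL_REQUEST) {
        return false;
    }
    if (rl_loading) {
        /* Too early: park the request instead of touching the VM. */
        rl_cas(RL_HANDLING, RL_RELAUNCH);
        return false;
    }
    rl_cas(RL_HANDLING, RL_COMPLETED);
    return true;
}

/* Called by the incoming thread once loading has finished: returns true
 * if a parked request was turned back into a fresh REQUEST. */
static bool rl_after_load(void)
{
    rl_loading = false;
    if (rl_cas(RL_RELAUNCH, RL_NONE) != RL_RELAUNCH) {
        return false;
    }
    rl_cas(RL_NONE, RL_REQUEST);
    return true;
}

static void relaunch_demo(void)
{
    rl_loading = true;
    rl_cas(RL_NONE, RL_REQUEST);  /* user asks for failover mid-load */
    assert(!rl_secondary_bh());   /* BH defers */
    assert(atomic_load(&rl_state) == RL_RELAUNCH);
    assert(rl_after_load());      /* request is re-issued */
    assert(rl_secondary_bh());    /* second BH run completes it */
}
```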

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/migration/failover.h |  2 ++
 migration/colo.c             | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/migration/failover.h b/include/migration/failover.h
index fba3931..e115d25 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -20,6 +20,8 @@ typedef enum COLOFailoverStatus {
     FAILOVER_STATUS_REQUEST = 1, /* Request but not handled */
     FAILOVER_STATUS_HANDLING = 2, /* In the process of handling failover */
     FAILOVER_STATUS_COMPLETED = 3, /* Finish the failover process */
+    /* Optional, Relaunch the failover process, again 'NONE' -> 'COMPLETED' */
+    FAILOVER_STATUS_RELAUNCH = 4,
 } COLOFailoverStatus;
 
 void failover_init_state(void);
diff --git a/migration/colo.c b/migration/colo.c
index 240ccda..9960bd6 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -34,6 +34,7 @@
  */
 #define DEFAULT_FAILOVER_DELAY 2000
 
+static bool vmstate_loading;
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -66,6 +67,19 @@ static void secondary_vm_do_failover(void)
     int old_state;
     MigrationIncomingState *mis = migration_incoming_get_current();
 
+    /* Can not do failover during the process of VM's loading VMstate, Or
+      * it will break the secondary VM.
+      */
+    if (vmstate_loading) {
+        old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+                                       FAILOVER_STATUS_RELAUNCH);
+        if (old_state != FAILOVER_STATUS_HANDLING) {
+            error_report("Unknown error while doing failover for secondary VM, "
+                         "old_state: %d", old_state);
+        }
+        return;
+    }
+
     migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
@@ -535,13 +549,23 @@ void *colo_process_incoming_thread(void *opaque)
 
         qemu_mutex_lock_iothread();
         qemu_system_reset(VMRESET_SILENT);
+        vmstate_loading = true;
         if (qemu_loadvm_state(fb) < 0) {
             error_report("COLO: loadvm failed");
+            vmstate_loading = false;
             qemu_mutex_unlock_iothread();
             goto out;
         }
+
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
 
+        if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
+            failover_set_state(FAILOVER_STATUS_RELAUNCH, FAILOVER_STATUS_NONE);
+            failover_request_active(NULL);
+            goto out;
+        }
+
         ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
         if (ret < 0) {
             goto out;
-- 
1.8.3.1


* [Qemu-devel] [PATCH COLO-Frame v10 26/38] COLO: Control the checkpoint delay time by migrate-set-parameters command
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (24 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 25/38] COLO failover: Don't do failover during loading VM's state zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 27/38] COLO: Process shutdown command for VM in COLO state zhanghailiang
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Luiz Capitulino, zhanghailiang

Add checkpoint-delay parameter for migrate-set-parameters, so that
we can control the checkpoint frequency when COLO is in periodic mode.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
v10: Fix related qmp command
---
 hmp.c                 |  7 +++++++
 migration/colo.c      | 12 +++++-------
 migration/migration.c | 28 +++++++++++++++++++++++++++-
 qapi-schema.json      | 19 ++++++++++++++++---
 qmp-commands.hx       |  2 +-
 5 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/hmp.c b/hmp.c
index 5e416d0..dedbadc 100644
--- a/hmp.c
+++ b/hmp.c
@@ -284,6 +284,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, " %s: %" PRId64,
             MigrationParameter_lookup[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT],
             params->x_cpu_throttle_increment);
+        monitor_printf(mon, " %s: %" PRId64,
+            MigrationParameter_lookup[MIGRATION_PARAMETER_CHECKPOINT_DELAY],
+            params->checkpoint_delay);
         monitor_printf(mon, "\n");
     }
 
@@ -1235,6 +1238,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
     bool has_decompress_threads = false;
     bool has_x_cpu_throttle_initial = false;
     bool has_x_cpu_throttle_increment = false;
+    bool has_checkpoint_delay = false;
     int i;
 
     for (i = 0; i < MIGRATION_PARAMETER_MAX; i++) {
@@ -1254,6 +1258,9 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                 break;
             case MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT:
                 has_x_cpu_throttle_increment = true;
+                break;
+            case MIGRATION_PARAMETER_CHECKPOINT_DELAY:
+                has_checkpoint_delay = true;
                 break;
             }
             qmp_migrate_set_parameters(has_compress_level, value,
@@ -1261,6 +1268,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
                                        has_decompress_threads, value,
                                        has_x_cpu_throttle_initial, value,
                                        has_x_cpu_throttle_increment, value,
+                                       has_checkpoint_delay, value,
                                        &err);
             break;
         }
diff --git a/migration/colo.c b/migration/colo.c
index 9960bd6..74e091d 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -19,12 +19,8 @@
 #include "qemu/sockets.h"
 #include "migration/failover.h"
 #include "qapi-event.h"
-
-/*
- * checkpoint interval: unit ms
- * Note: Please change this default value to 10000 when we support hybrid mode.
- */
-#define CHECKPOINT_MAX_PEROID 200
+#include "qmp-commands.h"
+#include "qapi-types.h"
 
 /*
  * The delay time before qemu begin the procedure of default failover treatment.
@@ -35,6 +31,7 @@
 #define DEFAULT_FAILOVER_DELAY 2000
 
 static bool vmstate_loading;
+
 /* colo buffer */
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -356,7 +353,8 @@ static void colo_process_checkpoint(MigrationState *s)
         }
 
         current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
+        if (current_time - checkpoint_time <
+            s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY]) {
             g_usleep(100000);
             continue;
         }
diff --git a/migration/migration.c b/migration/migration.c
index 227243e..41ec693 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -53,6 +53,11 @@
 /* Migration XBZRLE default cache size */
 #define DEFAULT_MIGRATE_CACHE_SIZE (64 * 1024 * 1024)
 
+/* The delay time (in ms) between two COLO checkpoints
+ * Note: Please change this default value to 10000 when we support hybrid mode.
+ */
+#define DEFAULT_MIGRATE_CHECKPOINT_DELAY 200
+
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
 
@@ -80,6 +85,8 @@ MigrationState *migrate_get_current(void)
                 DEFAULT_MIGRATE_X_CPU_THROTTLE_INITIAL,
         .parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
                 DEFAULT_MIGRATE_X_CPU_THROTTLE_INCREMENT,
+        .parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY] =
+                DEFAULT_MIGRATE_CHECKPOINT_DELAY,
     };
 
     return &current_migration;
@@ -426,6 +433,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
             s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL];
     params->x_cpu_throttle_increment =
             s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT];
+    params->checkpoint_delay =
+            s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY];
 
     return params;
 }
@@ -568,7 +577,10 @@ void qmp_migrate_set_parameters(bool has_compress_level,
                                 bool has_x_cpu_throttle_initial,
                                 int64_t x_cpu_throttle_initial,
                                 bool has_x_cpu_throttle_increment,
-                                int64_t x_cpu_throttle_increment, Error **errp)
+                                int64_t x_cpu_throttle_increment,
+                                bool has_checkpoint_delay,
+                                int64_t checkpoint_delay,
+                                Error **errp)
 {
     MigrationState *s = migrate_get_current();
 
@@ -603,6 +615,11 @@ void qmp_migrate_set_parameters(bool has_compress_level,
                    "x_cpu_throttle_increment",
                    "an integer in the range of 1 to 99");
     }
+    if (has_checkpoint_delay && (checkpoint_delay < 0)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                    "checkpoint_delay",
+                    "a non-negative integer");
+    }
 
     if (has_compress_level) {
         s->parameters[MIGRATION_PARAMETER_COMPRESS_LEVEL] = compress_level;
@@ -623,6 +640,10 @@ void qmp_migrate_set_parameters(bool has_compress_level,
         s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
                                                     x_cpu_throttle_increment;
     }
+
+    if (has_checkpoint_delay) {
+        s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY] = checkpoint_delay;
+    }
 }
 
 /* shared migration helpers */
@@ -743,6 +764,8 @@ static MigrationState *migrate_init(const MigrationParams *params)
             s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL];
     int x_cpu_throttle_increment =
             s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT];
+    int checkpoint_delay =
+            s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY];
 
     memcpy(enabled_capabilities, s->enabled_capabilities,
            sizeof(enabled_capabilities));
@@ -762,6 +785,9 @@ static MigrationState *migrate_init(const MigrationParams *params)
                 x_cpu_throttle_initial;
     s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
                 x_cpu_throttle_increment;
+    s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY] =
+                checkpoint_delay;
+
     s->bandwidth_limit = bandwidth_limit;
     migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 8cc1f60..143abea 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -617,11 +617,16 @@
 # @x-cpu-throttle-increment: throttle percentage increase each time
 #                            auto-converge detects that migration is not making
 #                            progress. The default value is 10. (Since 2.5)
+#
+# @checkpoint-delay: The delay time (in ms) between two COLO checkpoints in
+#          periodic mode.
+#
 # Since: 2.4
 ##
 { 'enum': 'MigrationParameter',
   'data': ['compress-level', 'compress-threads', 'decompress-threads',
-           'x-cpu-throttle-initial', 'x-cpu-throttle-increment'] }
+           'x-cpu-throttle-initial', 'x-cpu-throttle-increment',
+           'checkpoint-delay' ] }
 
 #
 # @migrate-set-parameters
@@ -641,6 +646,9 @@
 # @x-cpu-throttle-increment: throttle percentage increase each time
 #                            auto-converge detects that migration is not making
 #                            progress. The default value is 10. (Since 2.5)
+#
+# @checkpoint-delay: the delay time (in ms) between two COLO checkpoints
+#
 # Since: 2.4
 ##
 { 'command': 'migrate-set-parameters',
@@ -648,7 +656,8 @@
             '*compress-threads': 'int',
             '*decompress-threads': 'int',
             '*x-cpu-throttle-initial': 'int',
-            '*x-cpu-throttle-increment': 'int'} }
+            '*x-cpu-throttle-increment': 'int',
+            '*checkpoint-delay': 'int' } }
 
 #
 # @MigrationParameters
@@ -667,6 +676,8 @@
 #                            auto-converge detects that migration is not making
 #                            progress. The default value is 10. (Since 2.5)
 #
+# @checkpoint-delay: the delay time (in ms) between two COLO checkpoints
+#
 # Since: 2.4
 ##
 { 'struct': 'MigrationParameters',
@@ -674,7 +685,9 @@
             'compress-threads': 'int',
             'decompress-threads': 'int',
             'x-cpu-throttle-initial': 'int',
-            'x-cpu-throttle-increment': 'int'} }
+            'x-cpu-throttle-increment': 'int',
+            'checkpoint-delay': 'int'} }
+
 ##
 # @query-migrate-parameters
 #
diff --git a/qmp-commands.hx b/qmp-commands.hx
index eccdd17..2f3c081 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3556,7 +3556,7 @@ EQMP
     {
         .name       = "migrate-set-parameters",
         .args_type  =
-            "compress-level:i?,compress-threads:i?,decompress-threads:i?",
+            "compress-level:i?,compress-threads:i?,decompress-threads:i?,checkpoint-delay:i?",
         .mhandler.cmd_new = qmp_marshal_migrate_set_parameters,
     },
 SQMP
-- 
1.8.3.1

* [Qemu-devel] [PATCH COLO-Frame v10 27/38] COLO: Process shutdown command for VM in COLO state
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, Paolo Bonzini,
	zhanghailiang

If the VM is in COLO FT state, some extra work is needed before the normal
shutdown process. The SVM ignores a shutdown command issued directly to it.
When the PVM gets a shutdown command, it defers it to the next checkpoint and
then forwards it to the SVM before shutting down itself.
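
The routing described above can be sketched as a small standalone decision
function. This is an illustration only, not the code in vl.c: the SketchRole
and SketchShutdownAction types and sketch_route_shutdown are hypothetical
names, with the role standing in for the migration_incoming_in_colo_state()
and migration_in_colo_state() checks used by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical role of the QEMU instance that receives the request. */
typedef enum {
    SKETCH_ROLE_PLAIN,     /* not in COLO state */
    SKETCH_ROLE_PRIMARY,   /* PVM, cf. migration_in_colo_state() */
    SKETCH_ROLE_SECONDARY  /* SVM, cf. migration_incoming_in_colo_state() */
} SketchRole;

typedef enum {
    SKETCH_SHUTDOWN_NOW,      /* normal path: shut down immediately */
    SKETCH_SHUTDOWN_DEFERRED, /* PVM: set a flag, act at the next checkpoint */
    SKETCH_SHUTDOWN_IGNORED   /* SVM: wait for the PVM's guest-shutdown */
} SketchShutdownAction;

/* Decide what to do with a shutdown request, in the same order as the patch. */
static SketchShutdownAction
sketch_route_shutdown(SketchRole role, bool *colo_shutdown_requested)
{
    switch (role) {
    case SKETCH_ROLE_SECONDARY:
        return SKETCH_SHUTDOWN_IGNORED;
    case SKETCH_ROLE_PRIMARY:
        *colo_shutdown_requested = true;
        return SKETCH_SHUTDOWN_DEFERRED;
    default:
        return SKETCH_SHUTDOWN_NOW;
    }
}
```

Only the primary path touches the flag; the checkpoint loop later consumes it
and sends the guest-shutdown command to the secondary.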

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/sysemu/sysemu.h |  3 +++
 migration/colo.c        | 26 +++++++++++++++++++++++---
 qapi-schema.json        |  4 +++-
 vl.c                    | 26 ++++++++++++++++++++++++--
 4 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index c439975..7297678 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -52,6 +52,8 @@ typedef enum WakeupReason {
     QEMU_WAKEUP_REASON_OTHER,
 } WakeupReason;
 
+extern int colo_shutdown_requested;
+
 void qemu_system_reset_request(void);
 void qemu_system_suspend_request(void);
 void qemu_register_suspend_notifier(Notifier *notifier);
@@ -59,6 +61,7 @@ void qemu_system_wakeup_request(WakeupReason reason);
 void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
 void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
+void qemu_system_shutdown_request_core(void);
 void qemu_system_powerdown_request(void);
 void qemu_register_powerdown_notifier(Notifier *notifier);
 void qemu_system_debug_request(void);
diff --git a/migration/colo.c b/migration/colo.c
index 74e091d..e57cb71 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -210,7 +210,7 @@ static int colo_ctl_get(QEMUFile *f, uint32_t require)
 static int colo_do_checkpoint_transaction(MigrationState *s,
                                           QEMUSizedBuffer *buffer)
 {
-    int ret;
+    int colo_shutdown, ret;
     size_t size;
     QEMUFile *trans = NULL;
 
@@ -237,6 +237,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         ret = -1;
         goto out;
     }
+    colo_shutdown = colo_shutdown_requested;
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("run", "stop");
@@ -288,6 +289,15 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    if (colo_shutdown) {
+        colo_ctl_put(s->to_dst_file, COLO_COMMAND_GUEST_SHUTDOWN, 0);
+        qemu_fflush(s->to_dst_file);
+        colo_shutdown_requested = 0;
+        qemu_system_shutdown_request_core();
+        /* FIXME: just let the COLO thread exit? */
+        qemu_thread_exit(0);
+    }
+
     ret = 0;
     /* resume master */
     qemu_mutex_lock_iothread();
@@ -353,8 +363,9 @@ static void colo_process_checkpoint(MigrationState *s)
         }
 
         current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-        if (current_time - checkpoint_time <
-            s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY]) {
+        if ((current_time - checkpoint_time <
+            s->parameters[MIGRATION_PARAMETER_CHECKPOINT_DELAY]) &&
+            !colo_shutdown_requested) {
             g_usleep(100000);
             continue;
         }
@@ -444,6 +455,15 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
     case COLO_COMMAND_CHECKPOINT_REQUEST:
         *checkpoint_request = 1;
         return 0;
+    case COLO_COMMAND_GUEST_SHUTDOWN:
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        qemu_system_shutdown_request_core();
+        qemu_mutex_unlock_iothread();
+        /* The main thread will exit and terminate the whole
+         * process; do we need some cleanup?
+         */
+        qemu_thread_exit(0);
     default:
         return -EINVAL;
     }
diff --git a/qapi-schema.json b/qapi-schema.json
index 143abea..9b1512f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -740,12 +740,14 @@
 #
 # @vmstate-loaded: VM's state has been loaded by SVM
 #
+# @guest-shutdown: shutdown request from PVM to SVM
+#
 # Since: 2.5
 ##
 { 'enum': 'COLOCommand',
   'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
             'checkpoint-reply', 'vmstate-send', 'vmstate-size',
-            'vmstate-received', 'vmstate-loaded' ] }
+            'vmstate-received', 'vmstate-loaded', 'guest-shutdown' ] }
 
 ##
 # @COLOMode
diff --git a/vl.c b/vl.c
index c459a3e..eb12454 100644
--- a/vl.c
+++ b/vl.c
@@ -1615,6 +1615,8 @@ static NotifierList wakeup_notifiers =
     NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
 static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE);
 
+int colo_shutdown_requested;
+
 int qemu_shutdown_requested_get(void)
 {
     return shutdown_requested;
@@ -1740,6 +1742,10 @@ void qemu_system_guest_panicked(void)
 void qemu_system_reset_request(void)
 {
     if (no_reboot) {
+        qemu_system_shutdown_request();
+        if (!shutdown_requested) { /* COLO will handle it */
+            return;
+        }
         shutdown_requested = 1;
     } else {
         reset_requested = 1;
@@ -1808,13 +1814,29 @@ void qemu_system_killed(int signal, pid_t pid)
     qemu_system_shutdown_request();
 }
 
-void qemu_system_shutdown_request(void)
+void qemu_system_shutdown_request_core(void)
 {
-    trace_qemu_system_shutdown_request();
     shutdown_requested = 1;
     qemu_notify_event();
 }
 
+void qemu_system_shutdown_request(void)
+{
+    trace_qemu_system_shutdown_request();
+    /*
+     * If in COLO mode, we need to do some extra work before responding
+     * to the shutdown request.
+     */
+    if (migration_incoming_in_colo_state()) {
+        return; /* primary's responsibility */
+    }
+    if (migration_in_colo_state()) {
+        colo_shutdown_requested = 1;
+        return;
+    }
+    qemu_system_shutdown_request_core();
+}
+
 static void qemu_system_powerdown(void)
 {
     qapi_event_send_powerdown(&error_abort);
-- 
1.8.3.1

* [Qemu-devel] [PATCH COLO-Frame v10 28/38] COLO: Update the global runstate after going into colo state
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

If QEMU is started with -S, the runstate changes from 'prelaunch' to 'running'
once the VM goes into COLO state, so the stored global runstate must be
updated at that point.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index e57cb71..8a3cc1c 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -356,6 +356,11 @@ static void colo_process_checkpoint(MigrationState *s)
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
 
+    ret = global_state_store();
+    if (ret < 0) {
+        goto out;
+    }
+
     while (s->state == MIGRATION_STATUS_COLO) {
         if (failover_request_is_active()) {
             error_report("failover request");
-- 
1.8.3.1

* [Qemu-devel] [PATCH COLO-Frame v10 29/38] savevm: Split load vm state function qemu_loadvm_state
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

qemu_loadvm_state is too long; simplify it by moving the per-section
handling into two helper functions.
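
The shape of the resulting loop body can be sketched as a plain dispatch on the
section type byte. This is a standalone illustration, not the code in
savevm.c: the SK_* constants are assumed stand-ins for the QEMU_VM_SECTION_*
values, and sk_pick_handler is a hypothetical name that only reports which of
the two new helpers would be called.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for QEMU_VM_EOF and the QEMU_VM_SECTION_* type bytes. */
enum {
    SK_VM_EOF           = 0x00,
    SK_VM_SECTION_START = 0x01,
    SK_VM_SECTION_PART  = 0x02,
    SK_VM_SECTION_END   = 0x03,
    SK_VM_SECTION_FULL  = 0x04
};

typedef enum {
    SK_HANDLER_NONE,        /* EOF or a type this sketch does not handle */
    SK_HANDLER_START_FULL,  /* cf. qemu_loadvm_section_start_full() */
    SK_HANDLER_PART_END     /* cf. qemu_loadvm_section_part_end() */
} SkHandler;

/* Map a section type byte to the helper that would process it. */
static SkHandler sk_pick_handler(uint8_t section_type)
{
    switch (section_type) {
    case SK_VM_SECTION_START:
    case SK_VM_SECTION_FULL:
        return SK_HANDLER_START_FULL;
    case SK_VM_SECTION_PART:
    case SK_VM_SECTION_END:
        return SK_HANDLER_PART_END;
    default:
        return SK_HANDLER_NONE;
    }
}
```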

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 migration/savevm.c | 165 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 96 insertions(+), 69 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 0faf12b..1296cc3 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1053,6 +1053,100 @@ void loadvm_free_handlers(MigrationIncomingState *mis)
     }
 }
 
+static int
+qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
+{
+    uint32_t instance_id, version_id, section_id;
+    SaveStateEntry *se;
+    LoadStateEntry *le;
+    char idstr[256];
+    int ret;
+
+    /* Read section start */
+    section_id = qemu_get_be32(f);
+    if (!qemu_get_counted_string(f, idstr)) {
+        error_report("Unable to read ID string for section %u",
+                     section_id);
+        return -EINVAL;
+    }
+    instance_id = qemu_get_be32(f);
+    version_id = qemu_get_be32(f);
+
+    trace_qemu_loadvm_state_section_startfull(section_id, idstr,
+            instance_id, version_id);
+    /* Find savevm section */
+    se = find_se(idstr, instance_id);
+    if (se == NULL) {
+        error_report("Unknown savevm section or instance '%s' %d",
+                     idstr, instance_id);
+        ret = -EINVAL;
+        return ret;
+    }
+
+    /* Validate version */
+    if (version_id > se->version_id) {
+        error_report("savevm: unsupported version %d for '%s' v%d",
+                     version_id, idstr, se->version_id);
+        ret = -EINVAL;
+        return ret;
+    }
+
+    /* Add entry */
+    le = g_malloc0(sizeof(*le));
+
+    le->se = se;
+    le->section_id = section_id;
+    le->version_id = version_id;
+    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
+
+    ret = vmstate_load(f, le->se, le->version_id);
+    if (ret < 0) {
+        error_report("error while loading state for instance 0x%x of"
+                     " device '%s'", instance_id, idstr);
+        return ret;
+    }
+    if (!check_section_footer(f, le)) {
+        ret = -EINVAL;
+        return ret;
+    }
+
+    return 0;
+}
+
+static int
+qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
+{
+    uint32_t section_id;
+    LoadStateEntry *le;
+    int ret;
+
+    section_id = qemu_get_be32(f);
+
+    trace_qemu_loadvm_state_section_partend(section_id);
+    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
+        if (le->section_id == section_id) {
+            break;
+        }
+    }
+    if (le == NULL) {
+        error_report("Unknown savevm section %d", section_id);
+        ret = -EINVAL;
+        return ret;
+    }
+
+    ret = vmstate_load(f, le->se, le->version_id);
+    if (ret < 0) {
+        error_report("error while loading state section id %d(%s)",
+                     section_id, le->se->idstr);
+        return ret;
+    }
+    if (!check_section_footer(f, le)) {
+        ret = -EINVAL;
+        return ret;
+    }
+
+    return 0;
+}
 int qemu_loadvm_state(QEMUFile *f)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
@@ -1096,87 +1190,20 @@ int qemu_loadvm_state(QEMUFile *f)
     }
 
     while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
-        uint32_t instance_id, version_id, section_id;
-        SaveStateEntry *se;
-        LoadStateEntry *le;
-        char idstr[256];
 
         trace_qemu_loadvm_state_section(section_type);
         switch (section_type) {
         case QEMU_VM_SECTION_START:
         case QEMU_VM_SECTION_FULL:
-            /* Read section start */
-            section_id = qemu_get_be32(f);
-            if (!qemu_get_counted_string(f, idstr)) {
-                error_report("Unable to read ID string for section %u",
-                            section_id);
-                return -EINVAL;
-            }
-            instance_id = qemu_get_be32(f);
-            version_id = qemu_get_be32(f);
-
-            trace_qemu_loadvm_state_section_startfull(section_id, idstr,
-                                                      instance_id, version_id);
-            /* Find savevm section */
-            se = find_se(idstr, instance_id);
-            if (se == NULL) {
-                error_report("Unknown savevm section or instance '%s' %d",
-                             idstr, instance_id);
-                ret = -EINVAL;
-                goto out;
-            }
-
-            /* Validate version */
-            if (version_id > se->version_id) {
-                error_report("savevm: unsupported version %d for '%s' v%d",
-                             version_id, idstr, se->version_id);
-                ret = -EINVAL;
-                goto out;
-            }
-
-            /* Add entry */
-            le = g_malloc0(sizeof(*le));
-
-            le->se = se;
-            le->section_id = section_id;
-            le->version_id = version_id;
-            QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
-
-            ret = vmstate_load(f, le->se, le->version_id);
+            ret = qemu_loadvm_section_start_full(f, mis);
             if (ret < 0) {
-                error_report("error while loading state for instance 0x%x of"
-                             " device '%s'", instance_id, idstr);
-                goto out;
-            }
-            if (!check_section_footer(f, le)) {
-                ret = -EINVAL;
                 goto out;
             }
             break;
         case QEMU_VM_SECTION_PART:
         case QEMU_VM_SECTION_END:
-            section_id = qemu_get_be32(f);
-
-            trace_qemu_loadvm_state_section_partend(section_id);
-            QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
-                if (le->section_id == section_id) {
-                    break;
-                }
-            }
-            if (le == NULL) {
-                error_report("Unknown savevm section %d", section_id);
-                ret = -EINVAL;
-                goto out;
-            }
-
-            ret = vmstate_load(f, le->se, le->version_id);
+            ret = qemu_loadvm_section_part_end(f, mis);
             if (ret < 0) {
-                error_report("error while loading state section id %d(%s)",
-                             section_id, le->se->idstr);
-                goto out;
-            }
-            if (!check_section_footer(f, le)) {
-                ret = -EINVAL;
                 goto out;
             }
             break;
-- 
1.8.3.1

* [Qemu-devel] [PATCH COLO-Frame v10 30/38] COLO: Separate the process of saving/loading ram and device state
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Separate the saving/loading of RAM from that of device state when doing a
checkpoint, and add new helpers to save/load RAM and device state. With this
change, RAM can be transferred directly from master to slave without using a
QEMUSizedBuffer as an intermediate, which also reduces the extra memory used
during a checkpoint.

In addition, the call to colo_flush_ram_cache() moves out of ram_load() and
into the COLO incoming thread, just before the device state is loaded.
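
A back-of-the-envelope accounting of the memory saving, as a standalone
sketch: it ignores stream framing and command overhead, and the
SkCheckpointCost type and both function names are hypothetical. It only
compares how many bytes are staged in the intermediate buffer on the primary
before and after this change.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t wire_bytes;     /* bytes that end up on the migration channel */
    size_t buffered_bytes; /* bytes staged in the intermediate buffer first */
} SkCheckpointCost;

/* Before: RAM and device state are both staged in the buffer, then sent. */
static SkCheckpointCost sk_checkpoint_buffered(size_t ram, size_t dev)
{
    SkCheckpointCost c = { ram + dev, ram + dev };
    return c;
}

/* After: RAM goes straight to the channel; only device state is staged. */
static SkCheckpointCost sk_checkpoint_split(size_t ram, size_t dev)
{
    SkCheckpointCost c = { ram + dev, dev };
    return c;
}
```

The amount sent is the same in both cases; what shrinks is the staging buffer,
which no longer has to hold the dirty RAM of each checkpoint.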

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 include/sysemu/sysemu.h |   5 ++
 migration/colo.c        |  43 +++++++++++----
 migration/ram.c         |   8 ---
 migration/savevm.c      | 142 +++++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 177 insertions(+), 21 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 7297678..af1e1c7 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -94,7 +94,12 @@ int qemu_savevm_state_iterate(QEMUFile *f);
 void qemu_savevm_state_complete(QEMUFile *f);
 void qemu_savevm_state_cancel(void);
 uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size);
+int qemu_save_ram_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f);
 int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state_begin(QEMUFile *f);
+int qemu_load_ram_state(QEMUFile *f);
+int qemu_load_device_state(QEMUFile *f);
 
 typedef enum DisplayType
 {
diff --git a/migration/colo.c b/migration/colo.c
index 8a3cc1c..21cef34 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -250,21 +250,32 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
+    if (ret < 0) {
+        goto out;
+    }
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
-    qemu_savevm_state_header(trans);
-    qemu_savevm_state_begin(trans, &s->params);
-    qemu_mutex_lock_iothread();
-    qemu_savevm_state_complete(trans);
-    qemu_mutex_unlock_iothread();
-
-    qemu_fflush(trans);
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("save vm state begin error");
+        goto out;
+    }
 
-    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
+    qemu_mutex_lock_iothread();
+    /* Note: device state is saved into buffer */
+    ret = qemu_save_device_state(trans);
     if (ret < 0) {
+        error_report("save device state error");
+        qemu_mutex_unlock_iothread();
         goto out;
     }
+    qemu_fflush(trans);
+    qemu_save_ram_state(s->to_dst_file);
+    qemu_mutex_unlock_iothread();
+
     /* we send the total size of the vmstate first */
     size = qsb_get_length(buffer);
     ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SIZE, size);
@@ -544,6 +555,16 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        ret = qemu_loadvm_state_begin(mis->from_src_file);
+        if (ret < 0) {
+            error_report("load vm state begin error, ret=%d", ret);
+            goto out;
+        }
+        ret = qemu_load_ram_state(mis->from_src_file);
+        if (ret < 0) {
+            error_report("load ram state error");
+            goto out;
+        }
         /* read the VM state total size first */
         total_size = colo_ctl_get(mis->from_src_file,
                                   COLO_COMMAND_VMSTATE_SIZE);
@@ -573,8 +594,10 @@ void *colo_process_incoming_thread(void *opaque)
         qemu_mutex_lock_iothread();
         qemu_system_reset(VMRESET_SILENT);
         vmstate_loading = true;
-        if (qemu_loadvm_state(fb) < 0) {
-            error_report("COLO: loadvm failed");
+        colo_flush_ram_cache();
+        ret = qemu_load_device_state(fb);
+        if (ret < 0) {
+            error_report("COLO: load device state failed");
             vmstate_loading = false;
             qemu_mutex_unlock_iothread();
             goto out;
diff --git a/migration/ram.c b/migration/ram.c
index 8de5a5f..94bb47b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1601,7 +1601,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     int flags = 0, ret = 0;
     static uint64_t seq_iter;
     int len = 0;
-    bool need_flush = false;
 
     seq_iter++;
 
@@ -1671,7 +1670,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 break;
             }
 
-            need_flush = true;
             ch = qemu_get_byte(f);
             ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
             break;
@@ -1683,7 +1681,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 break;
             }
 
-            need_flush = true;
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
             break;
         case RAM_SAVE_FLAG_COMPRESS_PAGE:
@@ -1716,7 +1713,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                 ret = -EINVAL;
                 break;
             }
-            need_flush = true;
             break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
@@ -1737,10 +1733,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 
     rcu_read_unlock();
 
-    if (!ret  && ram_cache_enable && need_flush) {
-        DPRINTF("Flush ram_cache\n");
-        colo_flush_ram_cache();
-    }
     DPRINTF("Completed load of VM with exit code %d seq iteration "
             "%" PRIu64 "\n", ret, seq_iter);
     return ret;
diff --git a/migration/savevm.c b/migration/savevm.c
index 1296cc3..8dc4b64 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -752,6 +752,10 @@ void qemu_savevm_state_begin(QEMUFile *f,
             break;
         }
     }
+    if (migration_in_colo_state()) {
+        qemu_put_byte(f, QEMU_VM_EOF);
+        qemu_fflush(f);
+    }
 }
 
 /*
@@ -949,13 +953,44 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     return ret;
 }
 
-static int qemu_save_device_state(QEMUFile *f)
+int qemu_save_ram_state(QEMUFile *f)
 {
     SaveStateEntry *se;
+    int ret = 0;
 
-    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        if (!se->ops || !se->ops->save_live_complete) {
+            continue;
+        }
+        if (se->ops && se->ops->is_active) {
+            if (!se->ops->is_active(se->opaque)) {
+                continue;
+            }
+        }
+        trace_savevm_section_start(se->idstr, se->section_id);
+
+        save_section_header(f, se, QEMU_VM_SECTION_END);
+
+        ret = se->ops->save_live_complete(f, se->opaque);
+        trace_savevm_section_end(se->idstr, se->section_id, ret);
+        save_section_footer(f, se);
+        if (ret < 0) {
+            qemu_file_set_error(f, ret);
+            return ret;
+        }
+    }
+    qemu_put_byte(f, QEMU_VM_EOF);
 
+    return 0;
+}
+
+int qemu_save_device_state(QEMUFile *f)
+{
+    SaveStateEntry *se;
+
+    if (!migration_in_colo_state()) {
+        qemu_savevm_state_header(f);
+    }
     cpu_synchronize_all_states();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1264,6 +1299,107 @@ out:
     return ret;
 }
 
+int qemu_loadvm_state_begin(QEMUFile *f)
+{
+    uint8_t section_type;
+    int ret = -1;
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    if (!mis) {
+        error_report("qemu_loadvm_state_begin: no incoming migration state");
+        return -EINVAL;
+    }
+    /* CleanUp */
+    loadvm_free_handlers(mis);
+
+    if (qemu_savevm_state_blocked(NULL)) {
+        return -EINVAL;
+    }
+
+    if (!savevm_state.skip_configuration) {
+        if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
+            error_report("Configuration section missing");
+            return -EINVAL;
+        }
+        ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0);
+
+        if (ret) {
+            return ret;
+        }
+    }
+
+    while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
+        if (section_type != QEMU_VM_SECTION_START) {
+            error_report("Expected QEMU_VM_SECTION_START");
+            ret = -EINVAL;
+            goto out;
+        }
+        ret = qemu_loadvm_section_start_full(f, mis);
+        if (ret < 0) {
+            goto out;
+        }
+    }
+    ret = qemu_file_get_error(f);
+    if (ret == 0) {
+        return 0;
+    }
+out:
+    return ret;
+}
+
+int qemu_load_ram_state(QEMUFile *f)
+{
+    uint8_t section_type;
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret = -1;
+
+    while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
+        if (section_type != QEMU_VM_SECTION_PART &&
+            section_type != QEMU_VM_SECTION_END) {
+            error_report("load ram state: expected "
+                         "QEMU_VM_SECTION_PART or QEMU_VM_SECTION_END");
+            return -EINVAL;
+        }
+        ret = qemu_loadvm_section_part_end(f, mis);
+        if (ret < 0) {
+            goto out;
+        }
+    }
+    ret = qemu_file_get_error(f);
+    if (ret == 0) {
+        return 0;
+    }
+out:
+    return ret;
+}
+
+int qemu_load_device_state(QEMUFile *f)
+{
+    uint8_t section_type;
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret = -1;
+
+    while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
+        if (section_type != QEMU_VM_SECTION_FULL) {
+            error_report("load device state error: "
+                         "expected QEMU_VM_SECTION_FULL");
+            return -EINVAL;
+        }
+        ret = qemu_loadvm_section_start_full(f, mis);
+        if (ret < 0) {
+            goto out;
+        }
+    }
+
+    ret = qemu_file_get_error(f);
+
+    cpu_synchronize_all_post_init();
+    if (ret == 0) {
+        return 0;
+    }
+out:
+    return ret;
+}
 static BlockDriverState *find_vmstate_bs(void)
 {
     BlockDriverState *bs = NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 31/38] COLO: Split qemu_savevm_state_begin out of checkpoint process
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (29 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 30/38] COLO: Separate the process of saving/loading ram and device state zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets zhanghailiang
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

It is unnecessary to call qemu_savevm_state_begin() in every checkpoint.
It mainly sets up devices and does the first device state pass, and this data
does not change during later checkpoints. So split it out of
colo_do_checkpoint_transaction(); this avoids transferring the same data
again in every later checkpoint.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
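Note: the effect of the split can be illustrated with a toy model. This is
only a sketch and not QEMU code: ToyStream, the toy_* functions and the byte
sizes are hypothetical names and values chosen for illustration. It compares
resending the setup data in every checkpoint with sending it once before the
checkpoint loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
enum { TOY_SETUP_BYTES = 64, TOY_CHECKPOINT_BYTES = 16 };

/* Toy stand-in for the migration stream: it only counts what was sent. */
typedef struct {
    size_t bytes_sent;
    int setup_sections_sent;
} ToyStream;

/* One-time setup data, analogous to what qemu_savevm_state_begin() emits. */
void toy_send_setup(ToyStream *s)
{
    s->bytes_sent += TOY_SETUP_BYTES;
    s->setup_sections_sent++;
}

/* Per-checkpoint payload (RAM and device state in the real code). */
void toy_send_checkpoint(ToyStream *s)
{
    s->bytes_sent += TOY_CHECKPOINT_BYTES;
}

/* Before the split: the setup data is resent in every checkpoint. */
size_t toy_run_unsplit(int checkpoints)
{
    ToyStream s = { 0, 0 };

    for (int i = 0; i < checkpoints; i++) {
        toy_send_setup(&s);
        toy_send_checkpoint(&s);
    }
    return s.bytes_sent;
}

/* After the split: the setup data is sent once, ahead of the loop. */
size_t toy_run_split(int checkpoints)
{
    ToyStream s = { 0, 0 };

    toy_send_setup(&s);
    for (int i = 0; i < checkpoints; i++) {
        toy_send_checkpoint(&s);
    }
    assert(s.setup_sections_sent == 1);
    return s.bytes_sent;
}
```

With these made-up sizes, the saving grows linearly with the number of
checkpoints, since the setup cost is paid once instead of once per checkpoint.
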
 migration/colo.c | 51 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 21cef34..36f737a 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -254,15 +254,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     if (ret < 0) {
         goto out;
     }
-    /* Disable block migration */
-    s->params.blk = 0;
-    s->params.shared = 0;
-    qemu_savevm_state_begin(s->to_dst_file, &s->params);
-    ret = qemu_file_get_error(s->to_dst_file);
-    if (ret < 0) {
-        error_report("save vm state begin error\n");
-        goto out;
-    }
 
     qemu_mutex_lock_iothread();
     /* Note: device state is saved into buffer */
@@ -324,6 +315,21 @@ out:
     return ret;
 }
 
+static int colo_prepare_before_save(MigrationState *s)
+{
+    int ret;
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("save vm state begin error");
+        return ret;
+    }
+    return 0;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QEMUSizedBuffer *buffer = NULL;
@@ -346,6 +352,11 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    ret = colo_prepare_before_save(s);
+    if (ret < 0) {
+        goto out;
+    }
+
     /*
      * Wait for Secondary finish loading vm states and enter COLO
      * restore.
@@ -485,6 +496,18 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
     }
 }
 
+static int colo_prepare_before_load(QEMUFile *f)
+{
+    int ret;
+
+    ret = qemu_loadvm_state_begin(f);
+    if (ret < 0) {
+        error_report("load vm state begin error, ret=%d", ret);
+        return ret;
+    }
+    return 0;
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -523,6 +546,11 @@ void *colo_process_incoming_thread(void *opaque)
         goto out;
     }
 
+    ret = colo_prepare_before_load(mis->from_src_file);
+    if (ret < 0) {
+        goto out;
+    }
+
     ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
     if (ret < 0) {
         goto out;
@@ -555,11 +583,6 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        ret = qemu_loadvm_state_begin(mis->from_src_file);
-        if (ret < 0) {
-            error_report("load vm state begin error, ret=%d", ret);
-            goto out;
-        }
         ret = qemu_load_ram_state(mis->from_src_file);
         if (ret < 0) {
             error_report("load ram state error");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (30 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 31/38] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 12:39   ` Yang Hongyang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters zhanghailiang
                   ` (5 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah,
	zhanghailiang

For COLO or MC FT, we need a function that actively releases all the
buffered packets.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
---
v10: new patch
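
Note: the pattern used here (walk every filter, act only on those whose type
name is "filter-buffer") can be sketched outside QEMU as below. ToyFilter and
the toy_* functions are hypothetical stand-ins, not the real NetFilterState
or QTAILQ API, and the packet counters replace the real packet queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_TYPE_FILTER_BUFFER "filter-buffer"

/* Hypothetical stand-in for a netfilter object on a singly linked list. */
typedef struct ToyFilter {
    const char *type_name;
    int queued;               /* packets currently held back */
    int released;             /* packets passed on so far */
    struct ToyFilter *next;
} ToyFilter;

typedef void (*toy_filter_foreach)(ToyFilter *f, void *opaque);

/* Call func on every filter in the list, in list order. */
void toy_foreach_filter(ToyFilter *head, toy_filter_foreach func,
                        void *opaque)
{
    for (ToyFilter *f = head; f != NULL; f = f->next) {
        if (func) {
            func(f, opaque);
        }
    }
}

/* Callback: flush one filter, but only if it is a buffer filter. */
void toy_release_packets(ToyFilter *f, void *opaque)
{
    int *total = opaque;

    assert(f != NULL);
    if (strcmp(f->type_name, TOY_TYPE_FILTER_BUFFER) != 0) {
        return;
    }
    f->released += f->queued;
    if (total) {
        *total += f->queued;
    }
    f->queued = 0;
}

/* Release the packets of all buffer filters; returns how many were sent. */
int toy_release_all(ToyFilter *head)
{
    int total = 0;

    toy_foreach_filter(head, toy_release_packets, &total);
    return total;
}
```

Filters of any other type are skipped by the callback, so their queues are
left untouched.
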
---
 include/net/filter.h |  1 +
 include/net/net.h    |  4 ++++
 net/filter-buffer.c  | 15 +++++++++++++++
 net/net.c            | 24 ++++++++++++++++++++++++
 4 files changed, 44 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 2deda36..5a09607 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -73,5 +73,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
                                     const struct iovec *iov,
                                     int iovcnt,
                                     void *opaque);
+void filter_buffer_release_all(void);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 7af3e15..5c65c45 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -125,6 +125,10 @@ NetClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
                                               const char *client_str);
 typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
 void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
+typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
+                                       Error **errp);
+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+                            Error **errp);
 int qemu_can_send_packet(NetClientState *nc);
 ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
                           int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 57be149..b344901 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -14,6 +14,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi-visit.h"
 #include "qom/object.h"
+#include "net/net.h"
 
 #define TYPE_FILTER_BUFFER "filter-buffer"
 
@@ -163,6 +164,20 @@ out:
     error_propagate(errp, local_err);
 }
 
+static void filter_buffer_release_packets(NetFilterState *nf, void *opaque,
+                                          Error **errp)
+{
+    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+        filter_buffer_flush(nf);
+    }
+}
+
+/* public APIs */
+void filter_buffer_release_all(void)
+{
+    qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
+}
+
 static void filter_buffer_init(Object *obj)
 {
     object_property_add(obj, "interval", "int",
diff --git a/net/net.c b/net/net.c
index a3e9d1a..a333b01 100644
--- a/net/net.c
+++ b/net/net.c
@@ -259,6 +259,30 @@ static char *assign_name(NetClientState *nc1, const char *model)
     return g_strdup_printf("%s.%d", model, id);
 }
 
+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+                            Error **errp)
+{
+    NetClientState *nc;
+    NetFilterState *nf;
+
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+            continue;
+        }
+        QTAILQ_FOREACH(nf, &nc->filters, next) {
+            if (func) {
+                Error *local_err = NULL;
+
+                func(nf, opaque, &local_err);
+                if (local_err) {
+                    error_propagate(errp, local_err);
+                    return;
+                }
+            }
+        }
+    }
+}
+
 static void qemu_net_client_destructor(NetClientState *nc)
 {
     g_free(nc);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (31 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 12:41   ` Yang Hongyang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval zhanghailiang
                   ` (4 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah,
	zhanghailiang

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
---
v10: new patch
---
 include/net/filter.h |  1 +
 net/filter-buffer.c  | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5a09607..4499d60 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
                                     int iovcnt,
                                     void *opaque);
 void filter_buffer_release_all(void);
+void  filter_buffer_del_all_timers(void);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index b344901..5f0ea70 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
     qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
 }
 
+static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
+                                    Error **errp)
+{
+    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+        FilterBufferState *s = FILTER_BUFFER(nf);
+
+        if (s->interval) {
+            timer_del(&s->release_timer);
+        }
+    }
+}
+
+void filter_buffer_del_all_timers(void)
+{
+    qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
+}
+
 static void filter_buffer_init(Object *obj)
 {
     object_property_add(obj, "interval", "int",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (32 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 12:43   ` Yang Hongyang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev zhanghailiang
                   ` (3 subsequent siblings)
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah,
	zhanghailiang

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
---
v10: new patch
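
Note: with the check removed, an interval of zero is expected to mean "no
release timer, packets are released only on demand". The following toy model
sketches that behaviour. ToyBufferFilter and the toy_filter_* functions are
hypothetical; only the "arm the timer when the interval is non-zero" rule is
taken from the code in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the buffer filter state. */
typedef struct {
    unsigned interval_us;   /* 0 means: release only on demand */
    bool timer_armed;
    int queued;
} ToyBufferFilter;

/* Setup accepts any interval; a timer is armed only for a non-zero one. */
void toy_filter_setup(ToyBufferFilter *s, unsigned interval_us)
{
    s->interval_us = interval_us;
    s->timer_armed = (interval_us != 0);
    s->queued = 0;
}

/* Hold back one packet. */
void toy_filter_enqueue(ToyBufferFilter *s)
{
    s->queued++;
}

/* On-demand release; returns the number of packets passed on. */
int toy_filter_release(ToyBufferFilter *s)
{
    int n = s->queued;

    s->queued = 0;
    return n;
}

/* Stop periodic release; only meaningful if a timer was ever set up. */
void toy_filter_del_timer(ToyBufferFilter *s)
{
    if (s->interval_us) {
        s->timer_armed = false;
    }
}
```

In this model a filter created with interval 0 never releases anything by
itself, so the caller has to trigger the release explicitly at each
checkpoint.
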
---
 net/filter-buffer.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 5f0ea70..05313de 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -104,16 +104,6 @@ static void filter_buffer_setup(NetFilterState *nf, Error **errp)
 {
     FilterBufferState *s = FILTER_BUFFER(nf);
 
-    /*
-     * We may want to accept zero interval when VM FT solutions like MC
-     * or COLO use this filter to release packets on demand.
-     */
-    if (!s->interval) {
-        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
-                   "a non-zero interval");
-        return;
-    }
-
     s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
     if (s->interval) {
         timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (33 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 12:57   ` Yang Hongyang
  2015-11-04  2:56   ` Jason Wang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters zhanghailiang
                   ` (2 subsequent siblings)
  37 siblings, 2 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah,
	zhanghailiang

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
---
v10: new patch
---
 include/net/filter.h |  1 +
 include/net/net.h    |  3 ++
 net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/net.c            | 20 +++++++++++++
 4 files changed, 108 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 4499d60..b0954ba 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
                                     void *opaque);
 void filter_buffer_release_all(void);
 void  filter_buffer_del_all_timers(void);
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 5c65c45..e32bd90 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
                                        Error **errp);
 void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
                             Error **errp);
+typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
+                                    Error **errp);
+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
 int qemu_can_send_packet(NetClientState *nc);
 ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
                           int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 05313de..0dc1efb 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -15,6 +15,11 @@
 #include "qapi-visit.h"
 #include "qom/object.h"
 #include "net/net.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
+#include "monitor/monitor.h"
+
 
 #define TYPE_FILTER_BUFFER "filter-buffer"
 
@@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
     qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
 }
 
+static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
+                                     Error **errp)
+{
+    NetFilterState *nf;
+    bool found = false;
+
+    QTAILQ_FOREACH(nf, &nc->filters, next) {
+        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+            found = true;
+            break;
+        }
+    }
+
+    if (!found) {
+        QmpOutputVisitor *qov;
+        QmpInputVisitor *qiv;
+        Visitor *ov, *iv;
+        QObject *obj = NULL;
+        QDict *qdict;
+        void *dummy = NULL;
+        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
+        char *queue = (char *) opaque;
+        bool auto_add = true;
+        Error *err = NULL;
+
+        qov = qmp_output_visitor_new();
+        ov = qmp_output_get_visitor(qov);
+        visit_start_struct(ov, &dummy, NULL, NULL, 0, &err);
+        if (err) {
+            goto out;
+        }
+        visit_type_str(ov, &nc->name, "netdev", &err);
+        if (err) {
+            goto out;
+        }
+        visit_type_str(ov, &queue, "queue", &err);
+        if (err) {
+            goto out;
+        }
+        visit_type_bool(ov, &auto_add, "auto", &err);
+        if (err) {
+            goto out;
+        }
+        visit_end_struct(ov, &err);
+        if (err) {
+            goto out;
+        }
+        obj = qmp_output_get_qobject(qov);
+        g_assert(obj != NULL);
+        qdict = qobject_to_qdict(obj);
+        qmp_output_visitor_cleanup(qov);
+
+        qiv = qmp_input_visitor_new(obj);
+        iv = qmp_input_get_visitor(qiv);
+        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
+        qmp_input_visitor_cleanup(qiv);
+        qobject_decref(obj);
+out:
+        g_free(id);
+        if (err) {
+            error_propagate(errp, err);
+        }
+    }
+}
+/*
+ * Used by COLO or MC FT, which need to buffer the packets of all of the
+ * VM's net devices. Check each netdev and automatically add a
+ * filter-buffer to any netdev that does not already have one
+ * attached.
+ */
+void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
+{
+    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
+
+    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
+                        errp);
+    g_free(queue);
+}
+
 static void filter_buffer_init(Object *obj)
 {
     object_property_add(obj, "interval", "int",
diff --git a/net/net.c b/net/net.c
index a333b01..4fbe0af 100644
--- a/net/net.c
+++ b/net/net.c
@@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
     }
 }
 
+void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
+{
+    NetClientState *nc;
+
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+            continue;
+        }
+        if (func) {
+            Error *local_err = NULL;
+
+            func(nc, opaque, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                return;
+            }
+        }
+    }
+}
+
 static void qemu_net_client_destructor(NetClientState *nc)
 {
     g_free(nc);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (34 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 12:58   ` Yang Hongyang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 37/38] colo: Use the netfilter to buffer and release packets zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 38/38] COLO: Add block replication into colo process zhanghailiang
  37 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah,
	zhanghailiang

Add a new 'auto' property to netfilter to distinguish filters added by the
user from filters added automatically, so that only the latter are deleted.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
---
v10: new patch
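
Note: deleting only the flagged filters while walking the list can be
sketched as below. This is a toy model with hypothetical names (ToyFilter,
toy_filter_add, toy_auto_del, toy_count), not the QTAILQ or QOM code. It
unlinks a node before freeing it and never touches a freed node afterwards,
which is one common way to make removal during traversal safe. Whether the
real traversal needs the same care depends on how a filter is unlinked when
its object is deleted, which is not shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a netfilter with the new 'auto' flag. */
typedef struct ToyFilter {
    bool auto_add;
    struct ToyFilter *next;
} ToyFilter;

/* Push a new filter onto the front of the list; returns the new head. */
ToyFilter *toy_filter_add(ToyFilter *head, bool auto_add)
{
    ToyFilter *f = malloc(sizeof(*f));

    assert(f != NULL);
    f->auto_add = auto_add;
    f->next = head;
    return f;
}

/* Count the filters on the list. */
int toy_count(const ToyFilter *head)
{
    int n = 0;

    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}

/* Remove every auto-added filter; returns the number removed. */
int toy_auto_del(ToyFilter **head)
{
    ToyFilter **link = head;
    int removed = 0;

    while (*link) {
        ToyFilter *f = *link;

        if (f->auto_add) {
            *link = f->next;    /* unlink first, then free */
            free(f);
            removed++;
        } else {
            link = &f->next;    /* keep user-added filters */
        }
    }
    return removed;
}
```

Filters added by the user keep their position on the list; only entries
with the flag set are removed.
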
---
 include/net/filter.h |  2 ++
 net/filter-buffer.c  | 17 +++++++++++++++++
 net/filter.c         | 15 +++++++++++++++
 3 files changed, 34 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index b0954ba..46d3ef9 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -55,6 +55,7 @@ struct NetFilterState {
     char *netdev_id;
     NetClientState *netdev;
     NetFilterDirection direction;
+    bool auto_add;
     char info_str[256];
     QTAILQ_ENTRY(NetFilterState) next;
 };
@@ -76,5 +77,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
 void filter_buffer_release_all(void);
 void  filter_buffer_del_all_timers(void);
 void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
+void qemu_auto_del_filter_buffer(Error **errp);
 
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 0dc1efb..ea4481c 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -19,6 +19,7 @@
 #include "qapi/qmp-output-visitor.h"
 #include "qapi/qmp-input-visitor.h"
 #include "monitor/monitor.h"
+#include "qmp-commands.h"
 
 
 #define TYPE_FILTER_BUFFER "filter-buffer"
@@ -269,6 +270,22 @@ void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
     g_free(queue);
 }
 
+static void netdev_del_filter_buffer(NetFilterState *nf, void *opaque,
+                                     Error **errp)
+{
+    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER) &&
+        nf->auto_add) {
+        char *id = object_get_canonical_path_component(OBJECT(nf));
+
+        qmp_object_del(id, errp);
+    }
+}
+
+void qemu_auto_del_filter_buffer(Error **errp)
+{
+    qemu_foreach_netfilter(netdev_del_filter_buffer, NULL, errp);
+}
+
 static void filter_buffer_init(Object *obj)
 {
     object_property_add(obj, "interval", "int",
diff --git a/net/filter.c b/net/filter.c
index 326f2b5..dcbcb80 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -117,6 +117,18 @@ static void netfilter_set_direction(Object *obj, int direction, Error **errp)
     nf->direction = direction;
 }
 
+static bool netfilter_get_auto_flag(Object *obj, Error **errp)
+{
+    NetFilterState *nf = NETFILTER(obj);
+    return nf->auto_add;
+}
+
+static void netfilter_set_auto_flag(Object *obj, bool flag, Error **errp)
+{
+    NetFilterState *nf = NETFILTER(obj);
+    nf->auto_add = flag;
+}
+
 static void netfilter_init(Object *obj)
 {
     object_property_add_str(obj, "netdev",
@@ -126,6 +138,9 @@ static void netfilter_init(Object *obj)
                              NetFilterDirection_lookup,
                              netfilter_get_direction, netfilter_set_direction,
                              NULL);
+    object_property_add_bool(obj, "auto",
+                             netfilter_get_auto_flag, netfilter_set_auto_flag,
+                             NULL);
 }
 
 static void netfilter_complete(UserCreatable *uc, Error **errp)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 37/38] colo: Use the netfilter to buffer and release packets
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (35 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 38/38] COLO: Add block replication into colo process zhanghailiang
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
v10: Use the new API
---
 migration/colo.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 36f737a..25335db 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -21,6 +21,8 @@
 #include "qapi-event.h"
 #include "qmp-commands.h"
 #include "qapi-types.h"
+#include "net/filter.h"
+#include "net/net.h"
 
 /*
  * The delay time before qemu begin the procedure of default failover treatment.
@@ -59,6 +61,24 @@ static bool colo_runstate_is_stopped(void)
     return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
 }
 
+static int colo_init_filter_buffers(void)
+{
+    Error *local_err = NULL;
+
+    qemu_auto_add_filter_buffer(NET_FILTER_DIRECTION_RX, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        return -1;
+    }
+    filter_buffer_del_all_timers();
+    return 0;
+}
+
+static void colo_cleanup_filter_buffers(void)
+{
+    qemu_auto_del_filter_buffer(NULL);
+}
+
 static void secondary_vm_do_failover(void)
 {
     int old_state;
@@ -123,6 +143,7 @@ static void primary_vm_do_failover(void)
     if (s->to_dst_file) {
         qemu_file_shutdown(s->to_dst_file);
     }
+    colo_cleanup_filter_buffers();
 
     vm_start();
 
@@ -291,6 +312,8 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    filter_buffer_release_all();
+
     if (colo_shutdown) {
         colo_ctl_put(s->to_dst_file, COLO_COMMAND_GUEST_SHUTDOWN, 0);
         qemu_fflush(s->to_dst_file);
@@ -339,6 +362,12 @@ static void colo_process_checkpoint(MigrationState *s)
 
     failover_init_state();
 
+    ret = colo_init_filter_buffers();
+    if (ret < 0) {
+        ret = -EINVAL;
+        goto out;
+    }
+
     /* Dup the fd of to_dst_file */
     fd = dup(qemu_get_fd(s->to_dst_file));
     if (fd == -1) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 100+ messages in thread

* [Qemu-devel] [PATCH COLO-Frame v10 38/38] COLO: Add block replication into colo process
  2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
                   ` (36 preceding siblings ...)
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 37/38] colo: Use the netfilter to buffer and release packets zhanghailiang
@ 2015-11-03 11:56 ` zhanghailiang
  37 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 11:56 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah, zhanghailiang

Make sure the master starts block replication only after the slave's block replication has started.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
 migration/colo.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 migration/migration.c | 10 ---------
 trace-events          |  2 ++
 3 files changed, 63 insertions(+), 11 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 25335db..cb9c6db 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -23,6 +23,7 @@
 #include "qapi-types.h"
 #include "net/filter.h"
 #include "net/net.h"
+#include "block/block_int.h"
 
 /*
  * The delay time before qemu begin the procedure of default failover treatment.
@@ -83,6 +84,7 @@ static void secondary_vm_do_failover(void)
 {
     int old_state;
     MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
 
     /* Can not do failover during the process of VM's loading VMstate, Or
       * it will break the secondary VM.
@@ -100,6 +102,12 @@ static void secondary_vm_do_failover(void)
     migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
+    bdrv_stop_replication_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+    trace_colo_stop_block_replication("failover");
+
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
         /* recover runstate to normal migration finish state */
@@ -130,6 +138,7 @@ static void primary_vm_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
     int old_state;
+    Error *local_err = NULL;
 
     if (s->state != MIGRATION_STATUS_FAILED) {
         migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
@@ -145,6 +154,12 @@ static void primary_vm_do_failover(void)
     }
     colo_cleanup_filter_buffers();
 
+    bdrv_stop_replication_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+    trace_colo_stop_block_replication("failover");
+
     vm_start();
 
     old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
@@ -234,6 +249,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     int colo_shutdown, ret;
     size_t size;
     QEMUFile *trans = NULL;
+    Error *local_err = NULL;
 
     ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
     if (ret < 0) {
@@ -271,6 +287,16 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    /* we call this api although this may do nothing on primary side */
+    qemu_mutex_lock_iothread();
+    bdrv_do_checkpoint_all(&local_err);
+    qemu_mutex_unlock_iothread();
+    if (local_err) {
+        error_report_err(local_err);
+        ret = -1;
+        goto out;
+    }
+
     ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
     if (ret < 0) {
         goto out;
@@ -315,6 +341,10 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     filter_buffer_release_all();
 
     if (colo_shutdown) {
+        qemu_mutex_lock_iothread();
+        bdrv_stop_replication_all(false, NULL);
+        trace_colo_stop_block_replication("shutdown");
+        qemu_mutex_unlock_iothread();
         colo_ctl_put(s->to_dst_file, COLO_COMMAND_GUEST_SHUTDOWN, 0);
         qemu_fflush(s->to_dst_file);
         colo_shutdown_requested = 0;
@@ -359,6 +389,7 @@ static void colo_process_checkpoint(MigrationState *s)
     int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int64_t error_time;
     int fd, ret = 0;
+    Error *local_err = NULL;
 
     failover_init_state();
 
@@ -403,6 +434,15 @@ static void colo_process_checkpoint(MigrationState *s)
     }
 
     qemu_mutex_lock_iothread();
+    /* start block replication */
+    bdrv_start_replication_all(REPLICATION_MODE_PRIMARY, &local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        error_report_err(local_err);
+        ret = -EINVAL;
+        goto out;
+    }
+    trace_colo_start_block_replication();
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
@@ -514,6 +554,8 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
     case COLO_COMMAND_GUEST_SHUTDOWN:
         qemu_mutex_lock_iothread();
         vm_stop_force_state(RUN_STATE_COLO);
+        bdrv_stop_replication_all(false, NULL);
+        trace_colo_stop_block_replication("shutdown");
         qemu_system_shutdown_request_core();
         qemu_mutex_unlock_iothread();
         /* the main thread will exit and termiante the whole
@@ -545,6 +587,7 @@ void *colo_process_incoming_thread(void *opaque)
     int  total_size;
     int64_t error_time, current_time;
     int fd, ret = 0;
+    Error *local_err = NULL;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
@@ -580,6 +623,16 @@ void *colo_process_incoming_thread(void *opaque)
         goto out;
     }
 
+    qemu_mutex_lock_iothread();
+    /* start block replication */
+    bdrv_start_replication_all(REPLICATION_MODE_SECONDARY, &local_err);
+    qemu_mutex_unlock_iothread();
+    if (local_err) {
+        error_report_err(local_err);
+        goto out;
+    }
+    trace_colo_start_block_replication();
+
     ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
     if (ret < 0) {
         goto out;
@@ -655,8 +708,15 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        vmstate_loading = false;
+        /* discard colo disk buffer */
+        bdrv_do_checkpoint_all(&local_err);
         qemu_mutex_unlock_iothread();
+        if (local_err) {
+            vmstate_loading = false;
+            goto out;
+        }
+
+        vmstate_loading = false;
 
         if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
             failover_set_state(FAILOVER_STATUS_RELAUNCH, FAILOVER_STATUS_NONE);
diff --git a/migration/migration.c b/migration/migration.c
index 41ec693..72a2b63 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -328,16 +328,6 @@ static void process_incoming_migration_co(void *opaque)
         exit(EXIT_FAILURE);
     }
 
-    /* Make sure all file formats flush their mutable metadata */
-    bdrv_invalidate_cache_all(&local_err);
-    if (local_err) {
-        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
-        error_report_err(local_err);
-        migrate_decompress_threads_join();
-        exit(EXIT_FAILURE);
-    }
-
     /*
      * This must happen after all error conditions are dealt with and
      * we're sure the VM is going to be running on this host.
diff --git a/trace-events b/trace-events
index 61e89c7..8ab56b5 100644
--- a/trace-events
+++ b/trace-events
@@ -1503,6 +1503,8 @@ colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
 colo_ctl_get(const char *msg) "Receive '%s' cmd"
 colo_failover_set_state(int new_state) "new state %d"
+colo_start_block_replication(void) "Block replication is started"
+colo_stop_block_replication(const char *reason) "Block replication is stopped(reason: '%s')"
 
 # kvm-all.c
 kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
-- 
1.8.3.1


* Re: [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets zhanghailiang
@ 2015-11-03 12:39   ` Yang Hongyang
  2015-11-03 13:19     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Yang Hongyang @ 2015-11-03 12:39 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

On 2015-11-03 19:56, zhanghailiang wrote:
> For COLO or MC FT, We need a function to release all the buffered packets
> actively.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
> v10: new patch
> ---
>   include/net/filter.h |  1 +
>   include/net/net.h    |  4 ++++
>   net/filter-buffer.c  | 15 +++++++++++++++
>   net/net.c            | 24 ++++++++++++++++++++++++
>   4 files changed, 44 insertions(+)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index 2deda36..5a09607 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -73,5 +73,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>                                       const struct iovec *iov,
>                                       int iovcnt,
>                                       void *opaque);
> +void filter_buffer_release_all(void);
>
>   #endif /* QEMU_NET_FILTER_H */
> diff --git a/include/net/net.h b/include/net/net.h
> index 7af3e15..5c65c45 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -125,6 +125,10 @@ NetClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
>                                                 const char *client_str);
>   typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
>   void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
> +typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
> +                                       Error **errp);
> +void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
> +                            Error **errp);
>   int qemu_can_send_packet(NetClientState *nc);
>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>                             int iovcnt);
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index 57be149..b344901 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -14,6 +14,7 @@
>   #include "qapi/qmp/qerror.h"
>   #include "qapi-visit.h"
>   #include "qom/object.h"
> +#include "net/net.h"
>
>   #define TYPE_FILTER_BUFFER "filter-buffer"
>
> @@ -163,6 +164,20 @@ out:
>       error_propagate(errp, local_err);
>   }
>
> +static void filter_buffer_release_packets(NetFilterState *nf, void *opaque,
> +                                          Error **errp)
> +{
> +    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
> +        filter_buffer_flush(nf);
> +    }
> +}
> +
> +/* public APIs */
> +void filter_buffer_release_all(void)
> +{
> +    qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
> +}
> +
>   static void filter_buffer_init(Object *obj)
>   {
>       object_property_add(obj, "interval", "int",
> diff --git a/net/net.c b/net/net.c
> index a3e9d1a..a333b01 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -259,6 +259,30 @@ static char *assign_name(NetClientState *nc1, const char *model)
>       return g_strdup_printf("%s.%d", model, id);
>   }
>
> +void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
> +                            Error **errp)
> +{
> +    NetClientState *nc;
> +    NetFilterState *nf;
> +
> +    QTAILQ_FOREACH(nc, &net_clients, next) {

Walking every filter this way might cause a problem in the multiqueue
case. IIRC, Jason suggested implementing multiqueue by attaching the
same filter object to all queues, rather than attaching a clone of the
filter object to each of the other queues. If the same filter is
attached to all queues, walking the filters this way will call func
multiple (= number of queues) times for that one filter.
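
To illustrate the concern, here is a minimal standalone sketch. It is not
QEMU code: Filter, Queue, count_call, foreach_filter_naive and
foreach_filter_once are hypothetical names, and it assumes the same filter
object is linked into the filter list of every queue. The second walker
marks each filter on first visit so the callback runs once per object.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILTERS 4

/* Hypothetical stand-in for a netfilter object. */
typedef struct Filter {
    int calls;    /* how many times the callback ran on this filter */
    int visited;  /* scratch mark, only meaningful during one walk */
} Filter;

/* Hypothetical stand-in for one queue of a multiqueue netdev. */
typedef struct Queue {
    Filter *filters[MAX_FILTERS];
    size_t nfilters;
} Queue;

typedef void (*filter_fn)(Filter *f);

/* Example callback: just count invocations. */
static void count_call(Filter *f)
{
    f->calls++;
}

/* Naive walk: a filter shared by N queues is visited N times. */
static void foreach_filter_naive(Queue *queues, size_t nqueues, filter_fn fn)
{
    assert(fn != NULL);
    for (size_t q = 0; q < nqueues; q++) {
        for (size_t i = 0; i < queues[q].nfilters; i++) {
            fn(queues[q].filters[i]);
        }
    }
}

/* Deduplicating walk: call fn once per distinct filter object. */
static void foreach_filter_once(Queue *queues, size_t nqueues, filter_fn fn)
{
    assert(fn != NULL);
    for (size_t q = 0; q < nqueues; q++) {
        for (size_t i = 0; i < queues[q].nfilters; i++) {
            Filter *f = queues[q].filters[i];
            if (!f->visited) {
                f->visited = 1;
                fn(f);
            }
        }
    }
    /* Clear the marks so the next walk starts from a clean state. */
    for (size_t q = 0; q < nqueues; q++) {
        for (size_t i = 0; i < queues[q].nfilters; i++) {
            queues[q].filters[i]->visited = 0;
        }
    }
}
```

With one filter shared by three queues, the naive walker invokes the
callback three times and the deduplicating walker invokes it once. Whether
a mark field, a separate global filter list, or something else is the right
fix in QEMU is left open here.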

> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
> +            continue;
> +        }
> +        QTAILQ_FOREACH(nf, &nc->filters, next) {
> +            if (func) {
> +                Error *local_err = NULL;
> +
> +                func(nf, opaque, &local_err);
> +                if (local_err) {
> +                    error_propagate(errp, local_err);
> +                    return;
> +                }
> +            }
> +        }
> +    }
> +}
> +
>   static void qemu_net_client_destructor(NetClientState *nc)
>   {
>       g_free(nc);
>

-- 
Thanks,
Yang


* Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters zhanghailiang
@ 2015-11-03 12:41   ` Yang Hongyang
  2015-11-03 13:07     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Yang Hongyang @ 2015-11-03 12:41 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

Can you explain why this is needed? It seems that this API isn't
used anywhere in this series.

On 2015-11-03 19:56, zhanghailiang wrote:
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
> v10: new patch
> ---
>   include/net/filter.h |  1 +
>   net/filter-buffer.c  | 17 +++++++++++++++++
>   2 files changed, 18 insertions(+)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index 5a09607..4499d60 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>                                       int iovcnt,
>                                       void *opaque);
>   void filter_buffer_release_all(void);
> +void  filter_buffer_del_all_timers(void);
>
>   #endif /* QEMU_NET_FILTER_H */
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index b344901..5f0ea70 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
>       qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
>   }
>
> +static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
> +                                    Error **errp)
> +{
> +    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
> +        FilterBufferState *s = FILTER_BUFFER(nf);
> +
> +        if (s->interval) {
> +            timer_del(&s->release_timer);
> +        }
> +    }
> +}
> +
> +void filter_buffer_del_all_timers(void)
> +{
> +    qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
> +}
> +
>   static void filter_buffer_init(Object *obj)
>   {
>       object_property_add(obj, "interval", "int",
>

-- 
Thanks,
Yang


* Re: [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval zhanghailiang
@ 2015-11-03 12:43   ` Yang Hongyang
  2015-11-04  2:52     ` Jason Wang
  0 siblings, 1 reply; 100+ messages in thread
From: Yang Hongyang @ 2015-11-03 12:43 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

A commit message would be good to have here.

On 2015-11-03 19:56, zhanghailiang wrote:
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Cc: Jason Wang <jasowang@redhat.com>

Reviewed-by: Yang Hongyang <hongyang.yang@easystack.cn>

> ---
> v10: new patch
> ---
>   net/filter-buffer.c | 10 ----------
>   1 file changed, 10 deletions(-)
>
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index 5f0ea70..05313de 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -104,16 +104,6 @@ static void filter_buffer_setup(NetFilterState *nf, Error **errp)
>   {
>       FilterBufferState *s = FILTER_BUFFER(nf);
>
> -    /*
> -     * We may want to accept zero interval when VM FT solutions like MC
> -     * or COLO use this filter to release packets on demand.
> -     */
> -    if (!s->interval) {
> -        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
> -                   "a non-zero interval");
> -        return;
> -    }
> -
>       s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
>       if (s->interval) {
>           timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
>

-- 
Thanks,
Yang


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev zhanghailiang
@ 2015-11-03 12:57   ` Yang Hongyang
  2015-11-03 13:16     ` zhanghailiang
  2015-11-04  2:56   ` Jason Wang
  1 sibling, 1 reply; 100+ messages in thread
From: Yang Hongyang @ 2015-11-03 12:57 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah



On 2015-11-03 19:56, zhanghailiang wrote:
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
> v10: new patch
> ---
>   include/net/filter.h |  1 +
>   include/net/net.h    |  3 ++
>   net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/net.c            | 20 +++++++++++++
>   4 files changed, 108 insertions(+)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index 4499d60..b0954ba 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>                                       void *opaque);
>   void filter_buffer_release_all(void);
>   void  filter_buffer_del_all_timers(void);
> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
>
>   #endif /* QEMU_NET_FILTER_H */
> diff --git a/include/net/net.h b/include/net/net.h
> index 5c65c45..e32bd90 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>                                          Error **errp);
>   void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>                               Error **errp);
> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
> +                                    Error **errp);
> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
>   int qemu_can_send_packet(NetClientState *nc);
>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>                             int iovcnt);
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index 05313de..0dc1efb 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -15,6 +15,11 @@
>   #include "qapi-visit.h"
>   #include "qom/object.h"
>   #include "net/net.h"
> +#include "qapi/qmp/qdict.h"
> +#include "qapi/qmp-output-visitor.h"
> +#include "qapi/qmp-input-visitor.h"
> +#include "monitor/monitor.h"
> +
>
>   #define TYPE_FILTER_BUFFER "filter-buffer"
>
> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>       qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>   }
>
> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
> +                                     Error **errp)
> +{
> +    NetFilterState *nf;
> +    bool found = false;
> +
> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
> +        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
> +            found = true;

What if a filter-buffer is already attached to a netdev, but has an
interval set?
Is this API really necessary?

> +            break;
> +        }
> +    }
> +
> +    if (!found) {
> +        QmpOutputVisitor *qov;
> +        QmpInputVisitor *qiv;
> +        Visitor *ov, *iv;
> +        QObject *obj = NULL;
> +        QDict *qdict;
> +        void *dummy = NULL;
> +        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
> +        char *queue = (char *) opaque;
> +        bool auto_add = true;
> +        Error *err = NULL;
> +
> +        qov = qmp_output_visitor_new();
> +        ov = qmp_output_get_visitor(qov);
> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_type_str(ov, &nc->name, "netdev", &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_type_str(ov, &queue, "queue", &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_type_bool(ov, &auto_add, "auto", &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_end_struct(ov, &err);
> +        if (err) {
> +            goto out;
> +        }
> +        obj = qmp_output_get_qobject(qov);
> +        g_assert(obj != NULL);
> +        qdict = qobject_to_qdict(obj);
> +        qmp_output_visitor_cleanup(qov);
> +
> +        qiv = qmp_input_visitor_new(obj);
> +        iv = qmp_input_get_visitor(qiv);
> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
> +        qmp_input_visitor_cleanup(qiv);
> +        qobject_decref(obj);
> +out:
> +        g_free(id);
> +        if (err) {
> +            error_propagate(errp, err);
> +        }
> +    }
> +}
> +/*
> +* This will be used by COLO or MC FT, for which they will need
> +* to buffer all the packets of all VM's net devices, Here we check
> +* and automatically add netfilter for netdev that doesn't attach any buffer
> +* netfilter.
> +*/
> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
> +{
> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
> +
> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
> +                                        errp);
> +    g_free(queue);
> +}
> +
>   static void filter_buffer_init(Object *obj)
>   {
>       object_property_add(obj, "interval", "int",
> diff --git a/net/net.c b/net/net.c
> index a333b01..4fbe0af 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>       }
>   }
>
> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
> +{
> +    NetClientState *nc;
> +
> +    QTAILQ_FOREACH(nc, &net_clients, next) {
> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
> +            continue;
> +        }
> +        if (func) {
> +            Error *local_err = NULL;
> +
> +            func(nc, opaque, &local_err);
> +            if (local_err) {
> +                error_propagate(errp, local_err);
> +                return;
> +            }
> +        }
> +    }
> +}
> +
>   static void qemu_net_client_destructor(NetClientState *nc)
>   {
>       g_free(nc);
>

-- 
Thanks,
Yang


* Re: [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters zhanghailiang
@ 2015-11-03 12:58   ` Yang Hongyang
  0 siblings, 0 replies; 100+ messages in thread
From: Yang Hongyang @ 2015-11-03 12:58 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

On 2015-11-03 19:56, zhanghailiang wrote:
> We add a new property 'auto' for netfilter to distinguish if netfilter is
> added by user or automatically added.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Cc: Jason Wang <jasowang@redhat.com>
> ---
> v10: new patch
> ---
>   include/net/filter.h |  2 ++
>   net/filter-buffer.c  | 17 +++++++++++++++++
>   net/filter.c         | 15 +++++++++++++++
>   3 files changed, 34 insertions(+)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index b0954ba..46d3ef9 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -55,6 +55,7 @@ struct NetFilterState {
>       char *netdev_id;
>       NetClientState *netdev;
>       NetFilterDirection direction;
> +    bool auto_add;
>       char info_str[256];
>       QTAILQ_ENTRY(NetFilterState) next;
>   };
> @@ -76,5 +77,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>   void filter_buffer_release_all(void);
>   void  filter_buffer_del_all_timers(void);
>   void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
> +void qemu_auto_del_filter_buffer(Error **errp);
>
>   #endif /* QEMU_NET_FILTER_H */
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index 0dc1efb..ea4481c 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -19,6 +19,7 @@
>   #include "qapi/qmp-output-visitor.h"
>   #include "qapi/qmp-input-visitor.h"
>   #include "monitor/monitor.h"
> +#include "qmp-commands.h"
>
>
>   #define TYPE_FILTER_BUFFER "filter-buffer"
> @@ -269,6 +270,22 @@ void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>       g_free(queue);
>   }
>
> +static void netdev_del_filter_buffer(NetFilterState *nf, void *opaque,
> +                                     Error **errp)
> +{
> +    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER) &&
> +        nf->auto_add) {
> +        char *id = object_get_canonical_path_component(OBJECT(nf));
> +
> +        qmp_object_del(id, errp);
> +    }
> +}
> +
> +void qemu_auto_del_filter_buffer(Error **errp)
> +{
> +    qemu_foreach_netfilter(netdev_del_filter_buffer, NULL, errp);
> +}
> +
>   static void filter_buffer_init(Object *obj)
>   {
>       object_property_add(obj, "interval", "int",
> diff --git a/net/filter.c b/net/filter.c
> index 326f2b5..dcbcb80 100644
> --- a/net/filter.c
> +++ b/net/filter.c
> @@ -117,6 +117,18 @@ static void netfilter_set_direction(Object *obj, int direction, Error **errp)
>       nf->direction = direction;
>   }
>
> +static bool netfilter_get_auto_flag(Object *obj, Error **errp)
> +{
> +    NetFilterState *nf = NETFILTER(obj);
> +    return nf->auto_add;
> +}
> +
> +static void netfilter_set_auto_flag(Object *obj, bool flag, Error **errp)
> +{
> +    NetFilterState *nf = NETFILTER(obj);
> +    nf->auto_add = flag;
> +}
> +

This chunk of code should be in the previous patch.

>   static void netfilter_init(Object *obj)
>   {
>       object_property_add_str(obj, "netdev",
> @@ -126,6 +138,9 @@ static void netfilter_init(Object *obj)
>                                NetFilterDirection_lookup,
>                                netfilter_get_direction, netfilter_set_direction,
>                                NULL);
> +    object_property_add_bool(obj, "auto",
> +                             netfilter_get_auto_flag, netfilter_set_auto_flag,
> +                             NULL);
>   }

Ditto.

>
>   static void netfilter_complete(UserCreatable *uc, Error **errp)
>

-- 
Thanks,
Yang


* Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters
  2015-11-03 12:41   ` Yang Hongyang
@ 2015-11-03 13:07     ` zhanghailiang
  2015-11-04  2:51       ` Jason Wang
  0 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 13:07 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

Hi,

On 2015/11/3 20:41, Yang Hongyang wrote:
> Can you explain why this is needed? Seems that this api hasn't
> been used in this series.
>

We will call it in colo_init_filter_buffers(), which is introduced in patch 37.
We should remove the timers of the filter-buffers that were configured by users.
Otherwise there will be two places that release packets when COLO FT is enabled:
one in the timer callback, and the other in COLO when we do a checkpoint.
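
A minimal standalone sketch of that point, not QEMU code: PacketBuffer and
the buffer_* / checkpoint_release helpers are hypothetical names, and the
timer is modelled as a plain flag. It shows that with the timer still armed,
packets leave on a timer tick before any checkpoint, while with the timer
deleted they are held until the checkpoint releases them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one buffering filter. */
typedef struct PacketBuffer {
    int queued;        /* packets currently held back */
    int released;      /* packets already let through */
    bool timer_armed;  /* periodic release enabled */
} PacketBuffer;

/* Hold back n more packets. */
static void buffer_enqueue(PacketBuffer *b, int n)
{
    assert(n >= 0);
    b->queued += n;
}

/* Let every held packet through. */
static void buffer_flush(PacketBuffer *b)
{
    b->released += b->queued;
    b->queued = 0;
}

/* Timer callback: releases packets only while the timer is armed. */
static void buffer_timer_tick(PacketBuffer *b)
{
    if (b->timer_armed) {
        buffer_flush(b);
    }
}

/* Counterpart of deleting the release timer. */
static void buffer_del_timer(PacketBuffer *b)
{
    b->timer_armed = false;
}

/* Release path driven by the checkpoint, independent of the timer. */
static void checkpoint_release(PacketBuffer *b)
{
    buffer_flush(b);
}
```

Once the timer is deleted, checkpoint_release() is the only path that lets
packets out, so output stays tied to checkpoints.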


Thanks,
zhanghailiang

> On 2015-11-03 19:56, zhanghailiang wrote:
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>> ---
>> v10: new patch
>> ---
>>   include/net/filter.h |  1 +
>>   net/filter-buffer.c  | 17 +++++++++++++++++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/include/net/filter.h b/include/net/filter.h
>> index 5a09607..4499d60 100644
>> --- a/include/net/filter.h
>> +++ b/include/net/filter.h
>> @@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>>                                       int iovcnt,
>>                                       void *opaque);
>>   void filter_buffer_release_all(void);
>> +void  filter_buffer_del_all_timers(void);
>>
>>   #endif /* QEMU_NET_FILTER_H */
>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>> index b344901..5f0ea70 100644
>> --- a/net/filter-buffer.c
>> +++ b/net/filter-buffer.c
>> @@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
>>       qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
>>   }
>>
>> +static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
>> +                                    Error **errp)
>> +{
>> +    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
>> +        FilterBufferState *s = FILTER_BUFFER(nf);
>> +
>> +        if (s->interval) {
>> +            timer_del(&s->release_timer);
>> +        }
>> +    }
>> +}
>> +
>> +void filter_buffer_del_all_timers(void)
>> +{
>> +    qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>> +}
>> +
>>   static void filter_buffer_init(Object *obj)
>>   {
>>       object_property_add(obj, "interval", "int",
>>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-03 12:57   ` Yang Hongyang
@ 2015-11-03 13:16     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 13:16 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

On 2015/11/3 20:57, Yang Hongyang wrote:
>
>
> On 2015-11-03 19:56, zhanghailiang wrote:
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>> ---
>> v10: new patch
>> ---
>>   include/net/filter.h |  1 +
>>   include/net/net.h    |  3 ++
>>   net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/net.c            | 20 +++++++++++++
>>   4 files changed, 108 insertions(+)
>>
>> diff --git a/include/net/filter.h b/include/net/filter.h
>> index 4499d60..b0954ba 100644
>> --- a/include/net/filter.h
>> +++ b/include/net/filter.h
>> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>>                                       void *opaque);
>>   void filter_buffer_release_all(void);
>>   void  filter_buffer_del_all_timers(void);
>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
>>
>>   #endif /* QEMU_NET_FILTER_H */
>> diff --git a/include/net/net.h b/include/net/net.h
>> index 5c65c45..e32bd90 100644
>> --- a/include/net/net.h
>> +++ b/include/net/net.h
>> @@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>>                                          Error **errp);
>>   void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>                               Error **errp);
>> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
>> +                                    Error **errp);
>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
>>   int qemu_can_send_packet(NetClientState *nc);
>>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>>                             int iovcnt);
>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>> index 05313de..0dc1efb 100644
>> --- a/net/filter-buffer.c
>> +++ b/net/filter-buffer.c
>> @@ -15,6 +15,11 @@
>>   #include "qapi-visit.h"
>>   #include "qom/object.h"
>>   #include "net/net.h"
>> +#include "qapi/qmp/qdict.h"
>> +#include "qapi/qmp-output-visitor.h"
>> +#include "qapi/qmp-input-visitor.h"
>> +#include "monitor/monitor.h"
>> +
>>
>>   #define TYPE_FILTER_BUFFER "filter-buffer"
>>
>> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>>       qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>   }
>>
>> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
>> +                                     Error **errp)
>> +{
>> +    NetFilterState *nf;
>> +    bool found = false;
>> +
>> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
>> +        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
>> +            found = true;
>
> What if a filter-buffer is already attached to a netdev, but has an
> interval set?
> Is this API really necessary?
>

We will skip this netdev, but remove its filter-buffer timer. Meanwhile, we will
release all the packets together in the COLO checkpoint process.
Besides, we should resume the timer after exiting COLO. (We don't do this in this version.)

I don't know if it is a good idea to automatically add a filter-buffer for a device
that is not configured with one. But it really does reduce the complexity of testing.
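
To make the intended behaviour concrete, here is a minimal stand-alone sketch,
not the code from this patch. ModelFilter, ModelNetdev, model_colo_enter() and
model_colo_exit() are hypothetical names that only model the idea: a netdev that
already has a filter-buffer is skipped but its timer is stopped, a netdev without
one gets an on-demand buffer, and user timers are re-armed on exit (the part
that is still missing in this version).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILTERS 4

/* Hypothetical stand-ins for NetFilterState / NetClientState. */
typedef struct {
    bool is_buffer;        /* true for a "filter-buffer" instance */
    unsigned interval_us;  /* 0 means "release only on demand" */
    bool timer_armed;      /* periodic release timer is running */
    bool auto_added;       /* created by the auto-add pass, not by the user */
} ModelFilter;

typedef struct {
    ModelFilter filters[MAX_FILTERS];
    size_t nfilters;
} ModelNetdev;

/* Return the netdev's buffer filter, or NULL if it has none. */
static ModelFilter *model_find_buffer(ModelNetdev *nd)
{
    assert(nd != NULL);
    for (size_t i = 0; i < nd->nfilters; i++) {
        if (nd->filters[i].is_buffer) {
            return &nd->filters[i];
        }
    }
    return NULL;
}

/*
 * Entering COLO: keep a user-configured buffer but stop its timer, so the
 * checkpoint is the only release point; otherwise add an on-demand buffer.
 * Returns false if there is no free filter slot.
 */
static bool model_colo_enter(ModelNetdev *nd)
{
    ModelFilter *f = model_find_buffer(nd);

    if (f) {
        f->timer_armed = false;
        return true;
    }
    if (nd->nfilters == MAX_FILTERS) {
        return false;
    }
    nd->filters[nd->nfilters++] = (ModelFilter){
        .is_buffer = true, .interval_us = 0,
        .timer_armed = false, .auto_added = true,
    };
    return true;
}

/* Leaving COLO: re-arm the timers of user-configured buffers only. */
static void model_colo_exit(ModelNetdev *nd)
{
    assert(nd != NULL);
    for (size_t i = 0; i < nd->nfilters; i++) {
        ModelFilter *f = &nd->filters[i];

        if (f->is_buffer && !f->auto_added && f->interval_us) {
            f->timer_armed = true;
        }
    }
}
```

Tracking auto_added is an assumption of this sketch; it is one way to tell
the two kinds of filter apart when COLO exits.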

>> +            break;
>> +        }
>> +    }
>> +
>> +    if (!found) {
>> +        QmpOutputVisitor *qov;
>> +        QmpInputVisitor *qiv;
>> +        Visitor *ov, *iv;
>> +        QObject *obj = NULL;
>> +        QDict *qdict;
>> +        void *dummy = NULL;
>> +        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
>> +        char *queue = (char *) opaque;
>> +        bool auto_add = true;
>> +        Error *err = NULL;
>> +
>> +        qov = qmp_output_visitor_new();
>> +        ov = qmp_output_get_visitor(qov);
>> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_str(ov, &nc->name, "netdev", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_str(ov, &queue, "queue", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_bool(ov, &auto_add, "auto", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_end_struct(ov, &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        obj = qmp_output_get_qobject(qov);
>> +        g_assert(obj != NULL);
>> +        qdict = qobject_to_qdict(obj);
>> +        qmp_output_visitor_cleanup(qov);
>> +
>> +        qiv = qmp_input_visitor_new(obj);
>> +        iv = qmp_input_get_visitor(qiv);
>> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
>> +        qmp_input_visitor_cleanup(qiv);
>> +        qobject_decref(obj);
>> +out:
>> +        g_free(id);
>> +        if (err) {
>> +            error_propagate(errp, err);
>> +        }
>> +    }
>> +}
>> +/*
>> +* This will be used by COLO or MC FT, for which they will need
>> +* to buffer all the packets of all VM's net devices, Here we check
>> +* and automatically add netfilter for netdev that doesn't attach any buffer
>> +* netfilter.
>> +*/
>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>> +{
>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>> +
>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>> +                                        errp);
>> +    g_free(queue);
>> +}
>> +
>>   static void filter_buffer_init(Object *obj)
>>   {
>>       object_property_add(obj, "interval", "int",
>> diff --git a/net/net.c b/net/net.c
>> index a333b01..4fbe0af 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>       }
>>   }
>>
>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
>> +{
>> +    NetClientState *nc;
>> +
>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>> +            continue;
>> +        }
>> +        if (func) {
>> +            Error *local_err = NULL;
>> +
>> +            func(nc, opaque, &local_err);
>> +            if (local_err) {
>> +                error_propagate(errp, local_err);
>> +                return;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>>   static void qemu_net_client_destructor(NetClientState *nc)
>>   {
>>       g_free(nc);
>>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets
  2015-11-03 12:39   ` Yang Hongyang
@ 2015-11-03 13:19     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-03 13:19 UTC (permalink / raw)
  To: Yang Hongyang, qemu-devel
  Cc: lizhijian, quintela, Jason Wang, yunhong.jiang, eddie.dong,
	peter.huangpeng, dgilbert, arei.gonglei, stefanha, amit.shah

On 2015/11/3 20:39, Yang Hongyang wrote:
> On 2015-11-03 19:56, zhanghailiang wrote:
>> For COLO or MC FT, We need a function to release all the buffered packets
>> actively.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>> ---
>> v10: new patch
>> ---
>>   include/net/filter.h |  1 +
>>   include/net/net.h    |  4 ++++
>>   net/filter-buffer.c  | 15 +++++++++++++++
>>   net/net.c            | 24 ++++++++++++++++++++++++
>>   4 files changed, 44 insertions(+)
>>
>> diff --git a/include/net/filter.h b/include/net/filter.h
>> index 2deda36..5a09607 100644
>> --- a/include/net/filter.h
>> +++ b/include/net/filter.h
>> @@ -73,5 +73,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>>                                       const struct iovec *iov,
>>                                       int iovcnt,
>>                                       void *opaque);
>> +void filter_buffer_release_all(void);
>>
>>   #endif /* QEMU_NET_FILTER_H */
>> diff --git a/include/net/net.h b/include/net/net.h
>> index 7af3e15..5c65c45 100644
>> --- a/include/net/net.h
>> +++ b/include/net/net.h
>> @@ -125,6 +125,10 @@ NetClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
>>                                                 const char *client_str);
>>   typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
>>   void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
>> +typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>> +                                       Error **errp);
>> +void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>> +                            Error **errp);
>>   int qemu_can_send_packet(NetClientState *nc);
>>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>>                             int iovcnt);
>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>> index 57be149..b344901 100644
>> --- a/net/filter-buffer.c
>> +++ b/net/filter-buffer.c
>> @@ -14,6 +14,7 @@
>>   #include "qapi/qmp/qerror.h"
>>   #include "qapi-visit.h"
>>   #include "qom/object.h"
>> +#include "net/net.h"
>>
>>   #define TYPE_FILTER_BUFFER "filter-buffer"
>>
>> @@ -163,6 +164,20 @@ out:
>>       error_propagate(errp, local_err);
>>   }
>>
>> +static void filter_buffer_release_packets(NetFilterState *nf, void *opaque,
>> +                                          Error **errp)
>> +{
>> +    if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
>> +        filter_buffer_flush(nf);
>> +    }
>> +}
>> +
>> +/* public APIs */
>> +void filter_buffer_release_all(void)
>> +{
>> +    qemu_foreach_netfilter(filter_buffer_release_packets, NULL, NULL);
>> +}
>> +
>>   static void filter_buffer_init(Object *obj)
>>   {
>>       object_property_add(obj, "interval", "int",
>> diff --git a/net/net.c b/net/net.c
>> index a3e9d1a..a333b01 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -259,6 +259,30 @@ static char *assign_name(NetClientState *nc1, const char *model)
>>       return g_strdup_printf("%s.%d", model, id);
>>   }
>>
>> +void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>> +                            Error **errp)
>> +{
>> +    NetClientState *nc;
>> +    NetFilterState *nf;
>> +
>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>
> Going through every filter this way might cause problems in the
> multiqueue case. IIRC, Jason suggested that we implement multiqueue
> this way: attach the same filter to all queues instead of
> attaching a clone of the filter obj to the other queues. So if we
> attach the same filter to all queues, going through the filters
> this way will cause func to be called multiple (= number of queues) times.
>

Got it, I will investigate it.
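
One possible direction, shown only as a stand-alone sketch and not as QEMU code:
remember which filters have already been visited, so that a filter shared by
several queues is handed to func only once. SharedFilter, QueueModel and
model_foreach_unique_filter() are hypothetical names, and the fixed-size
"seen" array is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEEN 16

/* Hypothetical filter object; flush_count records how often it was visited. */
typedef struct {
    int flush_count;
} SharedFilter;

/* One queue of a multiqueue netdev, holding at most one filter. */
typedef struct {
    SharedFilter *filter;
} QueueModel;

typedef void (*model_filter_fn)(SharedFilter *f, void *opaque);

/*
 * Call fn once per distinct filter, even when several queues share it.
 * Returns the number of distinct filters visited.
 */
static size_t model_foreach_unique_filter(QueueModel *queues, size_t nqueues,
                                          model_filter_fn fn, void *opaque)
{
    SharedFilter *seen[MAX_SEEN];
    size_t nseen = 0;

    assert(fn != NULL);
    for (size_t q = 0; q < nqueues; q++) {
        SharedFilter *f = queues[q].filter;
        bool dup = false;

        if (!f) {
            continue;
        }
        for (size_t i = 0; i < nseen; i++) {
            if (seen[i] == f) {
                dup = true;
                break;
            }
        }
        if (dup) {
            continue;
        }
        assert(nseen < MAX_SEEN);
        seen[nseen++] = f;
        fn(f, opaque);
    }
    return nseen;
}

/* Example callback: count one flush per call. */
static void model_flush(SharedFilter *f, void *opaque)
{
    (void)opaque;
    f->flush_count++;
}
```

With four queues where two share one filter, one has its own filter and one has
none, the callback runs twice rather than three times.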

Thanks.
zhanghailiang

>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>> +            continue;
>> +        }
>> +        QTAILQ_FOREACH(nf, &nc->filters, next) {
>> +            if (func) {
>> +                Error *local_err = NULL;
>> +
>> +                func(nf, opaque, &local_err);
>> +                if (local_err) {
>> +                    error_propagate(errp, local_err);
>> +                    return;
>> +                }
>> +            }
>> +        }
>> +    }
>> +}
>> +
>>   static void qemu_net_client_destructor(NetClientState *nc)
>>   {
>>       g_free(nc);
>>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters
  2015-11-03 13:07     ` zhanghailiang
@ 2015-11-04  2:51       ` Jason Wang
  2015-11-04  3:08         ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Jason Wang @ 2015-11-04  2:51 UTC (permalink / raw)
  To: zhanghailiang, Yang Hongyang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah



On 11/03/2015 09:07 PM, zhanghailiang wrote:
> Hi,
>
> On 2015/11/3 20:41, Yang Hongyang wrote:
>> Can you explain why this is needed? Seems that this api hasn't
>> been used in this series.
>>
>
> We will call it in colo_init_filter_buffers(), which is introduced in
> patch 37.
> We should remove the timers of filter-buffers that are configured by
> users.
> Otherwise there will be two places that release packets when we enable
> COLO FT: one in the timer callback,
> and the other in COLO when we do a checkpoint.
>
>
> Thanks,
> zhanghailiang

Hi:

Then you'd better explain this in the commit log.

Thanks
>
>> On 2015-11-03 19:56, zhanghailiang wrote:
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Cc: Jason Wang <jasowang@redhat.com>
>>> ---
>>> v10: new patch
>>> ---
>>>   include/net/filter.h |  1 +
>>>   net/filter-buffer.c  | 17 +++++++++++++++++
>>>   2 files changed, 18 insertions(+)
>>>
>>> diff --git a/include/net/filter.h b/include/net/filter.h
>>> index 5a09607..4499d60 100644
>>> --- a/include/net/filter.h
>>> +++ b/include/net/filter.h
>>> @@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState
>>> *sender,
>>>                                       int iovcnt,
>>>                                       void *opaque);
>>>   void filter_buffer_release_all(void);
>>> +void  filter_buffer_del_all_timers(void);
>>>
>>>   #endif /* QEMU_NET_FILTER_H */
>>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>>> index b344901..5f0ea70 100644
>>> --- a/net/filter-buffer.c
>>> +++ b/net/filter-buffer.c
>>> @@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
>>>       qemu_foreach_netfilter(filter_buffer_release_packets, NULL,
>>> NULL);
>>>   }
>>>
>>> +static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
>>> +                                    Error **errp)
>>> +{
>>> +    if (!strcmp(object_get_typename(OBJECT(nf)),
>>> TYPE_FILTER_BUFFER)) {
>>> +        FilterBufferState *s = FILTER_BUFFER(nf);
>>> +
>>> +        if (s->interval) {
>>> +            timer_del(&s->release_timer);
>>> +        }
>>> +    }
>>> +}
>>> +
>>> +void filter_buffer_del_all_timers(void)
>>> +{
>>> +    qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>> +}
>>> +
>>>   static void filter_buffer_init(Object *obj)
>>>   {
>>>       object_property_add(obj, "interval", "int",
>>>
>>
>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval
  2015-11-03 12:43   ` Yang Hongyang
@ 2015-11-04  2:52     ` Jason Wang
  0 siblings, 0 replies; 100+ messages in thread
From: Jason Wang @ 2015-11-04  2:52 UTC (permalink / raw)
  To: Yang Hongyang, zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah



On 11/03/2015 08:43 PM, Yang Hongyang wrote:
> Some commit message would be better.

+1

>
> On 2015-11-03 19:56, zhanghailiang wrote:
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>
> Reviewed-by: Yang Hongyang <hongyang.yang@easystack.cn>
>
>> ---
>> v10: new patch
>> ---
>>   net/filter-buffer.c | 10 ----------
>>   1 file changed, 10 deletions(-)
>>
>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>> index 5f0ea70..05313de 100644
>> --- a/net/filter-buffer.c
>> +++ b/net/filter-buffer.c
>> @@ -104,16 +104,6 @@ static void filter_buffer_setup(NetFilterState
>> *nf, Error **errp)
>>   {
>>       FilterBufferState *s = FILTER_BUFFER(nf);
>>
>> -    /*
>> -     * We may want to accept zero interval when VM FT solutions like MC
>> -     * or COLO use this filter to release packets on demand.
>> -     */
>> -    if (!s->interval) {
>> -        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
>> -                   "a non-zero interval");
>> -        return;
>> -    }
>> -
>>       s->incoming_queue =
>> qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
>>       if (s->interval) {
>>           timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
>>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev zhanghailiang
  2015-11-03 12:57   ` Yang Hongyang
@ 2015-11-04  2:56   ` Jason Wang
  2015-11-04  3:07     ` zhanghailiang
  2015-11-05  7:43     ` zhanghailiang
  1 sibling, 2 replies; 100+ messages in thread
From: Jason Wang @ 2015-11-04  2:56 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah



On 11/03/2015 07:56 PM, zhanghailiang wrote:
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Cc: Jason Wang <jasowang@redhat.com>

Commit log please.

> ---
> v10: new patch
> ---
>  include/net/filter.h |  1 +
>  include/net/net.h    |  3 ++
>  net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  net/net.c            | 20 +++++++++++++
>  4 files changed, 108 insertions(+)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index 4499d60..b0954ba 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>                                      void *opaque);
>  void filter_buffer_release_all(void);
>  void  filter_buffer_del_all_timers(void);
> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
>  
>  #endif /* QEMU_NET_FILTER_H */
> diff --git a/include/net/net.h b/include/net/net.h
> index 5c65c45..e32bd90 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>                                         Error **errp);
>  void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>                              Error **errp);
> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
> +                                    Error **errp);
> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
>  int qemu_can_send_packet(NetClientState *nc);
>  ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>                            int iovcnt);
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index 05313de..0dc1efb 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -15,6 +15,11 @@
>  #include "qapi-visit.h"
>  #include "qom/object.h"
>  #include "net/net.h"
> +#include "qapi/qmp/qdict.h"
> +#include "qapi/qmp-output-visitor.h"
> +#include "qapi/qmp-input-visitor.h"
> +#include "monitor/monitor.h"
> +
>  
>  #define TYPE_FILTER_BUFFER "filter-buffer"
>  
> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>      qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>  }
>  
> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
> +                                     Error **errp)
> +{
> +    NetFilterState *nf;
> +    bool found = false;
> +
> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
> +        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
> +            found = true;
> +            break;
> +        }
> +    }
> +
> +    if (!found) {
> +        QmpOutputVisitor *qov;
> +        QmpInputVisitor *qiv;
> +        Visitor *ov, *iv;
> +        QObject *obj = NULL;
> +        QDict *qdict;
> +        void *dummy = NULL;
> +        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
> +        char *queue = (char *) opaque;
> +        bool auto_add = true;
> +        Error *err = NULL;
> +
> +        qov = qmp_output_visitor_new();
> +        ov = qmp_output_get_visitor(qov);
> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_type_str(ov, &nc->name, "netdev", &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_type_str(ov, &queue, "queue", &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_type_bool(ov, &auto_add, "auto", &err);
> +        if (err) {
> +            goto out;
> +        }
> +        visit_end_struct(ov, &err);
> +        if (err) {
> +            goto out;
> +        }
> +        obj = qmp_output_get_qobject(qov);
> +        g_assert(obj != NULL);
> +        qdict = qobject_to_qdict(obj);
> +        qmp_output_visitor_cleanup(qov);
> +
> +        qiv = qmp_input_visitor_new(obj);
> +        iv = qmp_input_get_visitor(qiv);
> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
> +        qmp_input_visitor_cleanup(qiv);
> +        qobject_decref(obj);
> +out:
> +        g_free(id);
> +        if (err) {
> +            error_propagate(errp, err);
> +        }
> +    }
> +}
> +/*
> +* This will be used by COLO or MC FT, for which they will need
> +* to buffer all the packets of all VM's net devices, Here we check
> +* and automatically add netfilter for netdev that doesn't attach any buffer
> +* netfilter.
> +*/
> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
> +{
> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
> +
> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
> +                                        errp);
> +    g_free(queue);
> +}
> +

This makes me think of the following questions:

- What if a NIC is hot-added after this "automatic" filter add?
- Maybe a better way is to have a default filter? It could be specified
through the qemu CLI or some other way (and the default filter could be
'nop', which means no filter)?
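
A rough stand-alone illustration of the default-filter idea follows. It is only
a model: DefaultFilterKind, DemoNetState and demo_netdev_add() are hypothetical
names, not existing QEMU interfaces. The point is that the default is applied
at netdev creation time, so a hot-added netdev picks it up as well.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_NETDEVS 8

/* Hypothetical default-filter setting; NOP means "no filter". */
typedef enum {
    DEFAULT_FILTER_NOP,
    DEFAULT_FILTER_BUFFER,
} DefaultFilterKind;

typedef struct {
    const char *id;
    DefaultFilterKind filter;   /* filter given to this netdev at creation */
} DemoNetdev;

typedef struct {
    DefaultFilterKind default_filter;   /* e.g. chosen on the command line */
    DemoNetdev devs[DEMO_MAX_NETDEVS];
    size_t ndevs;
} DemoNetState;

/*
 * Create a netdev and give it whatever the current default filter is.
 * Cold-plugged and hot-plugged netdevs both go through this one path.
 * Returns NULL when the table is full.
 */
static DemoNetdev *demo_netdev_add(DemoNetState *s, const char *id)
{
    DemoNetdev *nd;

    assert(s != NULL && id != NULL);
    if (s->ndevs == DEMO_MAX_NETDEVS) {
        return NULL;
    }
    nd = &s->devs[s->ndevs++];
    nd->id = id;
    nd->filter = s->default_filter;
    return nd;
}
```

Note that in this model a netdev created while the default was 'nop' keeps
'nop' after the default changes, so existing netdevs would still need a
separate pass.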

>  static void filter_buffer_init(Object *obj)
>  {
>      object_property_add(obj, "interval", "int",
> diff --git a/net/net.c b/net/net.c
> index a333b01..4fbe0af 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>      }
>  }
>  
> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
> +{
> +    NetClientState *nc;
> +
> +    QTAILQ_FOREACH(nc, &net_clients, next) {
> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
> +            continue;
> +        }
> +        if (func) {
> +            Error *local_err = NULL;
> +
> +            func(nc, opaque, &local_err);
> +            if (local_err) {
> +                error_propagate(errp, local_err);
> +                return;
> +            }
> +        }
> +    }
> +}
> +
>  static void qemu_net_client_destructor(NetClientState *nc)
>  {
>      g_free(nc);


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-04  2:56   ` Jason Wang
@ 2015-11-04  3:07     ` zhanghailiang
  2015-11-05  7:43     ` zhanghailiang
  1 sibling, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-04  3:07 UTC (permalink / raw)
  To: Jason Wang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah

On 2015/11/4 10:56, Jason Wang wrote:
>
>
> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>
> Commit log please.
>
>> ---
>> v10: new patch
>> ---
>>   include/net/filter.h |  1 +
>>   include/net/net.h    |  3 ++
>>   net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/net.c            | 20 +++++++++++++
>>   4 files changed, 108 insertions(+)
>>
>> diff --git a/include/net/filter.h b/include/net/filter.h
>> index 4499d60..b0954ba 100644
>> --- a/include/net/filter.h
>> +++ b/include/net/filter.h
>> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>>                                       void *opaque);
>>   void filter_buffer_release_all(void);
>>   void  filter_buffer_del_all_timers(void);
>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
>>
>>   #endif /* QEMU_NET_FILTER_H */
>> diff --git a/include/net/net.h b/include/net/net.h
>> index 5c65c45..e32bd90 100644
>> --- a/include/net/net.h
>> +++ b/include/net/net.h
>> @@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>>                                          Error **errp);
>>   void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>                               Error **errp);
>> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
>> +                                    Error **errp);
>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
>>   int qemu_can_send_packet(NetClientState *nc);
>>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>>                             int iovcnt);
>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>> index 05313de..0dc1efb 100644
>> --- a/net/filter-buffer.c
>> +++ b/net/filter-buffer.c
>> @@ -15,6 +15,11 @@
>>   #include "qapi-visit.h"
>>   #include "qom/object.h"
>>   #include "net/net.h"
>> +#include "qapi/qmp/qdict.h"
>> +#include "qapi/qmp-output-visitor.h"
>> +#include "qapi/qmp-input-visitor.h"
>> +#include "monitor/monitor.h"
>> +
>>
>>   #define TYPE_FILTER_BUFFER "filter-buffer"
>>
>> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>>       qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>   }
>>
>> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
>> +                                     Error **errp)
>> +{
>> +    NetFilterState *nf;
>> +    bool found = false;
>> +
>> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
>> +        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
>> +            found = true;
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (!found) {
>> +        QmpOutputVisitor *qov;
>> +        QmpInputVisitor *qiv;
>> +        Visitor *ov, *iv;
>> +        QObject *obj = NULL;
>> +        QDict *qdict;
>> +        void *dummy = NULL;
>> +        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
>> +        char *queue = (char *) opaque;
>> +        bool auto_add = true;
>> +        Error *err = NULL;
>> +
>> +        qov = qmp_output_visitor_new();
>> +        ov = qmp_output_get_visitor(qov);
>> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_str(ov, &nc->name, "netdev", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_str(ov, &queue, "queue", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_bool(ov, &auto_add, "auto", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_end_struct(ov, &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        obj = qmp_output_get_qobject(qov);
>> +        g_assert(obj != NULL);
>> +        qdict = qobject_to_qdict(obj);
>> +        qmp_output_visitor_cleanup(qov);
>> +
>> +        qiv = qmp_input_visitor_new(obj);
>> +        iv = qmp_input_get_visitor(qiv);
>> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
>> +        qmp_input_visitor_cleanup(qiv);
>> +        qobject_decref(obj);
>> +out:
>> +        g_free(id);
>> +        if (err) {
>> +            error_propagate(errp, err);
>> +        }
>> +    }
>> +}
>> +/*
>> +* This will be used by COLO or MC FT, for which they will need
>> +* to buffer all the packets of all VM's net devices, Here we check
>> +* and automatically add netfilter for netdev that doesn't attach any buffer
>> +* netfilter.
>> +*/
>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>> +{
>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>> +
>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>> +                                        errp);
>> +    g_free(queue);
>> +}
>> +
>
> This makes me think of the following questions:
>
> - What if a NIC is hot-added after this "automatic" filter add?

Actually, we don't support device hotplug when COLO is enabled; maybe we
could support it in the future.
If we supported hotplug, then yes, it would be a problem.

> - Maybe a better way is to have a default filter? It could be specified
> through the qemu CLI or some other way (and the default filter could be
> 'nop', which means no filter)?
>

That's a really good idea, and we could enable the packet 'buffer' capability only when needed.
I will investigate ... :)

Thanks,
zhanghailiang

>>   static void filter_buffer_init(Object *obj)
>>   {
>>       object_property_add(obj, "interval", "int",
>> diff --git a/net/net.c b/net/net.c
>> index a333b01..4fbe0af 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>       }
>>   }
>>
>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
>> +{
>> +    NetClientState *nc;
>> +
>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>> +            continue;
>> +        }
>> +        if (func) {
>> +            Error *local_err = NULL;
>> +
>> +            func(nc, opaque, &local_err);
>> +            if (local_err) {
>> +                error_propagate(errp, local_err);
>> +                return;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>>   static void qemu_net_client_destructor(NetClientState *nc)
>>   {
>>       g_free(nc);
>
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters
  2015-11-04  2:51       ` Jason Wang
@ 2015-11-04  3:08         ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-04  3:08 UTC (permalink / raw)
  To: Jason Wang, Yang Hongyang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah

On 2015/11/4 10:51, Jason Wang wrote:
>
>
> On 11/03/2015 09:07 PM, zhanghailiang wrote:
>> Hi,
>>
>> On 2015/11/3 20:41, Yang Hongyang wrote:
>>> Can you explain why this is needed? Seems that this api hasn't
>>> been used in this series.
>>>
>>
>> We will call it in colo_init_filter_buffers() which is introduced in
>> patch 37,
>> We should remove the timers of filter-buffers which are configured by
>> users.
>> Or there will be two places to release packets when we enable colo ft,
>> one in timer callback,
>> the other one in COLO when we do checkpoint.
>>
>>
>> Thanks,
>> zhanghailiang
>
> Hi:
>
> Then you'd better explain this in the commit log.
>

OK, I will fix it in the next version, thanks.
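
For the commit log, the reasoning can be illustrated with a small stand-alone
model (hypothetical names, not the QEMU implementation): while the timer is
armed, packets can be released both by the timer callback and by the COLO
checkpoint; once the timer is deleted, the checkpoint is the only release point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one filter-buffer instance. */
typedef struct {
    unsigned queued;       /* packets currently held back */
    unsigned released;     /* packets released so far */
    unsigned interval_us;  /* user-configured interval, 0 = no timer */
    bool timer_armed;      /* periodic release timer is running */
} BufModel;

/* Release everything that is currently buffered. */
static void buf_flush(BufModel *b)
{
    assert(b != NULL);
    b->released += b->queued;
    b->queued = 0;
}

/* Timer callback: releases packets only while the timer is armed. */
static void buf_timer_tick(BufModel *b)
{
    if (b->timer_armed) {
        buf_flush(b);
    }
}

/* Same condition as in the patch: only touch the timer if an interval is set. */
static void buf_del_timer(BufModel *b)
{
    if (b->interval_us) {
        b->timer_armed = false;
    }
}

/* COLO checkpoint: the release point that remains after timer deletion. */
static void buf_checkpoint(BufModel *b)
{
    buf_flush(b);
}
```

After buf_del_timer(), a later buf_timer_tick() leaves the queue untouched and
only buf_checkpoint() releases the packets.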

> Thanks
>>
>>> On 2015-11-03 19:56, zhanghailiang wrote:
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Cc: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>> v10: new patch
>>>> ---
>>>>    include/net/filter.h |  1 +
>>>>    net/filter-buffer.c  | 17 +++++++++++++++++
>>>>    2 files changed, 18 insertions(+)
>>>>
>>>> diff --git a/include/net/filter.h b/include/net/filter.h
>>>> index 5a09607..4499d60 100644
>>>> --- a/include/net/filter.h
>>>> +++ b/include/net/filter.h
>>>> @@ -74,5 +74,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState
>>>> *sender,
>>>>                                        int iovcnt,
>>>>                                        void *opaque);
>>>>    void filter_buffer_release_all(void);
>>>> +void  filter_buffer_del_all_timers(void);
>>>>
>>>>    #endif /* QEMU_NET_FILTER_H */
>>>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>>>> index b344901..5f0ea70 100644
>>>> --- a/net/filter-buffer.c
>>>> +++ b/net/filter-buffer.c
>>>> @@ -178,6 +178,23 @@ void filter_buffer_release_all(void)
>>>>        qemu_foreach_netfilter(filter_buffer_release_packets, NULL,
>>>> NULL);
>>>>    }
>>>>
>>>> +static void filter_buffer_del_timer(NetFilterState *nf, void *opaque,
>>>> +                                    Error **errp)
>>>> +{
>>>> +    if (!strcmp(object_get_typename(OBJECT(nf)),
>>>> TYPE_FILTER_BUFFER)) {
>>>> +        FilterBufferState *s = FILTER_BUFFER(nf);
>>>> +
>>>> +        if (s->interval) {
>>>> +            timer_del(&s->release_timer);
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>> +void filter_buffer_del_all_timers(void)
>>>> +{
>>>> +    qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>>> +}
>>>> +
>>>>    static void filter_buffer_init(Object *obj)
>>>>    {
>>>>        object_property_add(obj, "interval", "int",
>>>>
>>>
>>
>>
>
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-04  2:56   ` Jason Wang
  2015-11-04  3:07     ` zhanghailiang
@ 2015-11-05  7:43     ` zhanghailiang
  2015-11-05  8:52       ` Wen Congyang
  2015-11-05  9:19       ` Jason Wang
  1 sibling, 2 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-05  7:43 UTC (permalink / raw)
  To: Jason Wang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah

Hi Jason,

On 2015/11/4 10:56, Jason Wang wrote:
>
>
> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>
> Commit log please.
>
>> ---
>> v10: new patch
>> ---
>>   include/net/filter.h |  1 +
>>   include/net/net.h    |  3 ++
>>   net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/net.c            | 20 +++++++++++++
>>   4 files changed, 108 insertions(+)
>>
>> diff --git a/include/net/filter.h b/include/net/filter.h
>> index 4499d60..b0954ba 100644
>> --- a/include/net/filter.h
>> +++ b/include/net/filter.h
>> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>>                                       void *opaque);
>>   void filter_buffer_release_all(void);
>>   void  filter_buffer_del_all_timers(void);
>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
>>
>>   #endif /* QEMU_NET_FILTER_H */
>> diff --git a/include/net/net.h b/include/net/net.h
>> index 5c65c45..e32bd90 100644
>> --- a/include/net/net.h
>> +++ b/include/net/net.h
>> @@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>>                                          Error **errp);
>>   void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>                               Error **errp);
>> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
>> +                                    Error **errp);
>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
>>   int qemu_can_send_packet(NetClientState *nc);
>>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>>                             int iovcnt);
>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>> index 05313de..0dc1efb 100644
>> --- a/net/filter-buffer.c
>> +++ b/net/filter-buffer.c
>> @@ -15,6 +15,11 @@
>>   #include "qapi-visit.h"
>>   #include "qom/object.h"
>>   #include "net/net.h"
>> +#include "qapi/qmp/qdict.h"
>> +#include "qapi/qmp-output-visitor.h"
>> +#include "qapi/qmp-input-visitor.h"
>> +#include "monitor/monitor.h"
>> +
>>
>>   #define TYPE_FILTER_BUFFER "filter-buffer"
>>
>> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>>       qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>   }
>>
>> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
>> +                                     Error **errp)
>> +{
>> +    NetFilterState *nf;
>> +    bool found = false;
>> +
>> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
>> +        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
>> +            found = true;
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (!found) {
>> +        QmpOutputVisitor *qov;
>> +        QmpInputVisitor *qiv;
>> +        Visitor *ov, *iv;
>> +        QObject *obj = NULL;
>> +        QDict *qdict;
>> +        void *dummy = NULL;
>> +        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
>> +        char *queue = (char *) opaque;
>> +        bool auto_add = true;
>> +        Error *err = NULL;
>> +
>> +        qov = qmp_output_visitor_new();
>> +        ov = qmp_output_get_visitor(qov);
>> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_str(ov, &nc->name, "netdev", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_str(ov, &queue, "queue", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_type_bool(ov, &auto_add, "auto", &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        visit_end_struct(ov, &err);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +        obj = qmp_output_get_qobject(qov);
>> +        g_assert(obj != NULL);
>> +        qdict = qobject_to_qdict(obj);
>> +        qmp_output_visitor_cleanup(qov);
>> +
>> +        qiv = qmp_input_visitor_new(obj);
>> +        iv = qmp_input_get_visitor(qiv);
>> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
>> +        qmp_input_visitor_cleanup(qiv);
>> +        qobject_decref(obj);
>> +out:
>> +        g_free(id);
>> +        if (err) {
>> +            error_propagate(errp, err);
>> +        }
>> +    }
>> +}
>> +/*
>> +* This will be used by COLO or MC FT, for which they will need
>> +* to buffer all the packets of all VM's net devices, Here we check
>> +* and automatically add netfilter for netdev that doesn't attach any buffer
>> +* netfilter.
>> +*/
>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>> +{
>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>> +
>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>> +                                        errp);
>> +    g_free(queue);
>> +}
>> +
>
> This make me think for following questions:
>
> - What if a nic is hot added after this "automatically" filter add?
> - Maybe a better way is to have a default filter? It could be specified
> through qemu cli or other (And default filter could be 'nop' which means
> no filter) ?
>

I have thought about this. I'd like to add this default buffer filter quietly,
rather than through the qemu CLI. That way we can still keep any buffer filter configured by the user,
along with its delayed packet release capability, although the delay time will not be what
the user expects. (This only happens in COLO's periodic mode; in normal COLO mode the delay time
is almost the same as the user's configuration.)

What about calling netdev_add_filter_buffer() in each netdev's init()?
I didn't find a common code path for every netdev in their init paths.

Thanks,
zhanghailiang

>>   static void filter_buffer_init(Object *obj)
>>   {
>>       object_property_add(obj, "interval", "int",
>> diff --git a/net/net.c b/net/net.c
>> index a333b01..4fbe0af 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>       }
>>   }
>>
>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
>> +{
>> +    NetClientState *nc;
>> +
>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>> +            continue;
>> +        }
>> +        if (func) {
>> +            Error *local_err = NULL;
>> +
>> +            func(nc, opaque, &local_err);
>> +            if (local_err) {
>> +                error_propagate(errp, local_err);
>> +                return;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>>   static void qemu_net_client_destructor(NetClientState *nc)
>>   {
>>       g_free(nc);
>
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-05  7:43     ` zhanghailiang
@ 2015-11-05  8:52       ` Wen Congyang
  2015-11-05  9:21         ` Jason Wang
  2015-11-05  9:19       ` Jason Wang
  1 sibling, 1 reply; 100+ messages in thread
From: Wen Congyang @ 2015-11-05  8:52 UTC (permalink / raw)
  To: zhanghailiang, Jason Wang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

On 11/05/2015 03:43 PM, zhanghailiang wrote:
> Hi Jason,
> 
> On 2015/11/4 10:56, Jason Wang wrote:
>>
>>
>> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Cc: Jason Wang <jasowang@redhat.com>
>>
>> Commit log please.
>>
>>> ---
>>> v10: new patch
>>> ---
>>>   include/net/filter.h |  1 +
>>>   include/net/net.h    |  3 ++
>>>   net/filter-buffer.c  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   net/net.c            | 20 +++++++++++++
>>>   4 files changed, 108 insertions(+)
>>>
>>> diff --git a/include/net/filter.h b/include/net/filter.h
>>> index 4499d60..b0954ba 100644
>>> --- a/include/net/filter.h
>>> +++ b/include/net/filter.h
>>> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>>>                                       void *opaque);
>>>   void filter_buffer_release_all(void);
>>>   void  filter_buffer_del_all_timers(void);
>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp);
>>>
>>>   #endif /* QEMU_NET_FILTER_H */
>>> diff --git a/include/net/net.h b/include/net/net.h
>>> index 5c65c45..e32bd90 100644
>>> --- a/include/net/net.h
>>> +++ b/include/net/net.h
>>> @@ -129,6 +129,9 @@ typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>>>                                          Error **errp);
>>>   void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>>                               Error **errp);
>>> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
>>> +                                    Error **errp);
>>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp);
>>>   int qemu_can_send_packet(NetClientState *nc);
>>>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
>>>                             int iovcnt);
>>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>>> index 05313de..0dc1efb 100644
>>> --- a/net/filter-buffer.c
>>> +++ b/net/filter-buffer.c
>>> @@ -15,6 +15,11 @@
>>>   #include "qapi-visit.h"
>>>   #include "qom/object.h"
>>>   #include "net/net.h"
>>> +#include "qapi/qmp/qdict.h"
>>> +#include "qapi/qmp-output-visitor.h"
>>> +#include "qapi/qmp-input-visitor.h"
>>> +#include "monitor/monitor.h"
>>> +
>>>
>>>   #define TYPE_FILTER_BUFFER "filter-buffer"
>>>
>>> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>>>       qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>>   }
>>>
>>> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
>>> +                                     Error **errp)
>>> +{
>>> +    NetFilterState *nf;
>>> +    bool found = false;
>>> +
>>> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
>>> +        if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
>>> +            found = true;
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (!found) {
>>> +        QmpOutputVisitor *qov;
>>> +        QmpInputVisitor *qiv;
>>> +        Visitor *ov, *iv;
>>> +        QObject *obj = NULL;
>>> +        QDict *qdict;
>>> +        void *dummy = NULL;
>>> +        char *id = g_strdup_printf("%s-%s.0", nc->name, TYPE_FILTER_BUFFER);
>>> +        char *queue = (char *) opaque;
>>> +        bool auto_add = true;
>>> +        Error *err = NULL;
>>> +
>>> +        qov = qmp_output_visitor_new();
>>> +        ov = qmp_output_get_visitor(qov);
>>> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_type_str(ov, &nc->name, "netdev", &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_type_str(ov, &queue, "queue", &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_type_bool(ov, &auto_add, "auto", &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_end_struct(ov, &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        obj = qmp_output_get_qobject(qov);
>>> +        g_assert(obj != NULL);
>>> +        qdict = qobject_to_qdict(obj);
>>> +        qmp_output_visitor_cleanup(qov);
>>> +
>>> +        qiv = qmp_input_visitor_new(obj);
>>> +        iv = qmp_input_get_visitor(qiv);
>>> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
>>> +        qmp_input_visitor_cleanup(qiv);
>>> +        qobject_decref(obj);
>>> +out:
>>> +        g_free(id);
>>> +        if (err) {
>>> +            error_propagate(errp, err);
>>> +        }
>>> +    }
>>> +}
>>> +/*
>>> +* This will be used by COLO or MC FT, for which they will need
>>> +* to buffer all the packets of all VM's net devices, Here we check
>>> +* and automatically add netfilter for netdev that doesn't attach any buffer
>>> +* netfilter.
>>> +*/
>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>>> +{
>>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>>> +
>>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>>> +                                        errp);
>>> +    g_free(queue);
>>> +}
>>> +
>>
>> This make me think for following questions:
>>
>> - What if a nic is hot added after this "automatically" filter add?

IIRC, we don't allow the user to hotplug a device when colo is running.

Thanks
Wen Congyang

>> - Maybe a better way is to have a default filter? It could be specified
>> through qemu cli or other (And default filter could be 'nop' which means
>> no filter) ?
>>
> 
> I have thought about this. I'd like to add this default buffer filter quietly,
> not through qemu cli. In this way, we can still keep the buffer filter that configured by users,
> and keep its delay release packets capability. Though the delay time is not what
> users suppose. (This is only happened in COLO's periodic mode, in normal colo mode, the delay time
> is almost same with user's configure.)
> 
> What about call netdev_add_filter_buffer() in each netdev's init() ?
> I didn't found a common code path for every netdev in their init path.
> 
> Thanks,
> zhanghailiang
> 
>>>   static void filter_buffer_init(Object *obj)
>>>   {
>>>       object_property_add(obj, "interval", "int",
>>> diff --git a/net/net.c b/net/net.c
>>> index a333b01..4fbe0af 100644
>>> --- a/net/net.c
>>> +++ b/net/net.c
>>> @@ -283,6 +283,26 @@ void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>>       }
>>>   }
>>>
>>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque, Error **errp)
>>> +{
>>> +    NetClientState *nc;
>>> +
>>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>>> +            continue;
>>> +        }
>>> +        if (func) {
>>> +            Error *local_err = NULL;
>>> +
>>> +            func(nc, opaque, &local_err);
>>> +            if (local_err) {
>>> +                error_propagate(errp, local_err);
>>> +                return;
>>> +            }
>>> +        }
>>> +    }
>>> +}
>>> +
>>>   static void qemu_net_client_destructor(NetClientState *nc)
>>>   {
>>>       g_free(nc);
>>
>>
>> .
>>
> 
> 
> 
> .
> 


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-05  7:43     ` zhanghailiang
  2015-11-05  8:52       ` Wen Congyang
@ 2015-11-05  9:19       ` Jason Wang
  2015-11-05 10:58         ` zhanghailiang
  1 sibling, 1 reply; 100+ messages in thread
From: Jason Wang @ 2015-11-05  9:19 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah



On 11/05/2015 03:43 PM, zhanghailiang wrote:
> Hi Jason,
>
> On 2015/11/4 10:56, Jason Wang wrote:
>>
>>
>> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Cc: Jason Wang <jasowang@redhat.com>
>>
>> Commit log please.
>>
>>> ---
>>> v10: new patch
>>> ---
>>>   include/net/filter.h |  1 +
>>>   include/net/net.h    |  3 ++
>>>   net/filter-buffer.c  | 84
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   net/net.c            | 20 +++++++++++++
>>>   4 files changed, 108 insertions(+)
>>>
>>> diff --git a/include/net/filter.h b/include/net/filter.h
>>> index 4499d60..b0954ba 100644
>>> --- a/include/net/filter.h
>>> +++ b/include/net/filter.h
>>> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState
>>> *sender,
>>>                                       void *opaque);
>>>   void filter_buffer_release_all(void);
>>>   void  filter_buffer_del_all_timers(void);
>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction,
>>> Error **errp);
>>>
>>>   #endif /* QEMU_NET_FILTER_H */
>>> diff --git a/include/net/net.h b/include/net/net.h
>>> index 5c65c45..e32bd90 100644
>>> --- a/include/net/net.h
>>> +++ b/include/net/net.h
>>> @@ -129,6 +129,9 @@ typedef void
>>> (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>>>                                          Error **errp);
>>>   void qemu_foreach_netfilter(qemu_netfilter_foreach func, void
>>> *opaque,
>>>                               Error **errp);
>>> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
>>> +                                    Error **errp);
>>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque,
>>> Error **errp);
>>>   int qemu_can_send_packet(NetClientState *nc);
>>>   ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec
>>> *iov,
>>>                             int iovcnt);
>>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>>> index 05313de..0dc1efb 100644
>>> --- a/net/filter-buffer.c
>>> +++ b/net/filter-buffer.c
>>> @@ -15,6 +15,11 @@
>>>   #include "qapi-visit.h"
>>>   #include "qom/object.h"
>>>   #include "net/net.h"
>>> +#include "qapi/qmp/qdict.h"
>>> +#include "qapi/qmp-output-visitor.h"
>>> +#include "qapi/qmp-input-visitor.h"
>>> +#include "monitor/monitor.h"
>>> +
>>>
>>>   #define TYPE_FILTER_BUFFER "filter-buffer"
>>>
>>> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>>>       qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>>   }
>>>
>>> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
>>> +                                     Error **errp)
>>> +{
>>> +    NetFilterState *nf;
>>> +    bool found = false;
>>> +
>>> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
>>> +        if (!strcmp(object_get_typename(OBJECT(nf)),
>>> TYPE_FILTER_BUFFER)) {
>>> +            found = true;
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (!found) {
>>> +        QmpOutputVisitor *qov;
>>> +        QmpInputVisitor *qiv;
>>> +        Visitor *ov, *iv;
>>> +        QObject *obj = NULL;
>>> +        QDict *qdict;
>>> +        void *dummy = NULL;
>>> +        char *id = g_strdup_printf("%s-%s.0", nc->name,
>>> TYPE_FILTER_BUFFER);
>>> +        char *queue = (char *) opaque;
>>> +        bool auto_add = true;
>>> +        Error *err = NULL;
>>> +
>>> +        qov = qmp_output_visitor_new();
>>> +        ov = qmp_output_get_visitor(qov);
>>> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_type_str(ov, &nc->name, "netdev", &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_type_str(ov, &queue, "queue", &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_type_bool(ov, &auto_add, "auto", &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        visit_end_struct(ov, &err);
>>> +        if (err) {
>>> +            goto out;
>>> +        }
>>> +        obj = qmp_output_get_qobject(qov);
>>> +        g_assert(obj != NULL);
>>> +        qdict = qobject_to_qdict(obj);
>>> +        qmp_output_visitor_cleanup(qov);
>>> +
>>> +        qiv = qmp_input_visitor_new(obj);
>>> +        iv = qmp_input_get_visitor(qiv);
>>> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
>>> +        qmp_input_visitor_cleanup(qiv);
>>> +        qobject_decref(obj);
>>> +out:
>>> +        g_free(id);
>>> +        if (err) {
>>> +            error_propagate(errp, err);
>>> +        }
>>> +    }
>>> +}
>>> +/*
>>> +* This will be used by COLO or MC FT, for which they will need
>>> +* to buffer all the packets of all VM's net devices, Here we check
>>> +* and automatically add netfilter for netdev that doesn't attach
>>> any buffer
>>> +* netfilter.
>>> +*/
>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction,
>>> Error **errp)
>>> +{
>>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>>> +
>>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>>> +                                        errp);
>>> +    g_free(queue);
>>> +}
>>> +
>>
>> This make me think for following questions:
>>
>> - What if a nic is hot added after this "automatically" filter add?
>> - Maybe a better way is to have a default filter? It could be specified
>> through qemu cli or other (And default filter could be 'nop' which means
>> no filter) ?
>>
>
> I have thought about this. I'd like to add this default buffer filter
> quietly,
> not through qemu cli. In this way, we can still keep the buffer filter
> that configured by users,

Actually, this does not break the ones added by the user. We support
attaching more than one filter to a single netdev.

If I understand the case correctly (I was only partially CCed on this
series), before each synchronization you need to:

1) add a buffer filter to each netdev
2) release all buffers on demand
3) delete all buffer filters

You can just remove step 1 if you know every device has a default buffer
filter. Step 3 could also be removed if you can make the buffer filter
stop buffering packets, through a new command or some other mechanism.
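
The default-filter idea above can be illustrated with a small standalone
sketch. It is not QEMU code: DefaultBufferFilter, filter_receive(),
filter_release_all() and filter_set_buffering() are hypothetical names, and
packets are reduced to integer ids. It only shows that a filter which is
always attached, plus a buffering switch, covers steps 1 and 3 without
adding or deleting filter objects.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 16

/* Hypothetical stand-in for a per-netdev default buffer filter. */
typedef struct DefaultBufferFilter {
    bool buffering;             /* false: packets pass straight through */
    int pending[MAX_PENDING];   /* ids of queued packets */
    int npending;               /* number of queued packets */
    int delivered;              /* packets handed on to the next stage */
} DefaultBufferFilter;

/*
 * Receive path: pass the packet through when buffering is off, queue it
 * when buffering is on. Returns false if the queue is full, in which case
 * the caller has to retry after a release.
 */
static bool filter_receive(DefaultBufferFilter *f, int pkt)
{
    assert(f);
    if (!f->buffering) {
        f->delivered++;
        return true;
    }
    if (f->npending == MAX_PENDING) {
        return false;
    }
    f->pending[f->npending++] = pkt;
    return true;
}

/* Step 2: release everything queued so far, on demand. */
static void filter_release_all(DefaultBufferFilter *f)
{
    assert(f);
    f->delivered += f->npending;
    f->npending = 0;
}

/*
 * Replaces steps 1 and 3: the filter stays attached and only its mode
 * changes. Turning buffering off flushes whatever is still queued.
 */
static void filter_set_buffering(DefaultBufferFilter *f, bool on)
{
    assert(f);
    if (!on) {
        filter_release_all(f);
    }
    f->buffering = on;
}
```

In this model a checkpoint cycle would call filter_set_buffering(f, true)
once, filter_release_all(f) at each synchronization point, and
filter_set_buffering(f, false) when fault tolerance is stopped.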

> and keep its delay release packets capability. Though the delay time
> is not what
> users suppose. (This is only happened in COLO's periodic mode, in
> normal colo mode, the delay time
> is almost same with user's configure.)

This is not good unless you want to limit the buffer filter to COLO
only. I would also like to know the role of management: technically it
can do all three of the steps above (and management looks like a better
place to do this).

Thanks

>
> What about call netdev_add_filter_buffer() in each netdev's init() ?
> I didn't found a common code path for every netdev in their init path.
>
> Thanks,
> zhanghailiang
>
>>>   static void filter_buffer_init(Object *obj)
>>>   {
>>>       object_property_add(obj, "interval", "int",
>>> diff --git a/net/net.c b/net/net.c
>>> index a333b01..4fbe0af 100644
>>> --- a/net/net.c
>>> +++ b/net/net.c
>>> @@ -283,6 +283,26 @@ void
>>> qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>>       }
>>>   }
>>>
>>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque,
>>> Error **errp)
>>> +{
>>> +    NetClientState *nc;
>>> +
>>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>>> +            continue;
>>> +        }
>>> +        if (func) {
>>> +            Error *local_err = NULL;
>>> +
>>> +            func(nc, opaque, &local_err);
>>> +            if (local_err) {
>>> +                error_propagate(errp, local_err);
>>> +                return;
>>> +            }
>>> +        }
>>> +    }
>>> +}
>>> +
>>>   static void qemu_net_client_destructor(NetClientState *nc)
>>>   {
>>>       g_free(nc);
>>
>>
>> .
>>
>
>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-05  8:52       ` Wen Congyang
@ 2015-11-05  9:21         ` Jason Wang
  2015-11-05  9:33           ` Wen Congyang
  0 siblings, 1 reply; 100+ messages in thread
From: Jason Wang @ 2015-11-05  9:21 UTC (permalink / raw)
  To: Wen Congyang, zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah



On 11/05/2015 04:52 PM, Wen Congyang wrote:
> On 11/05/2015 03:43 PM, zhanghailiang wrote:
>> > Hi Jason,
>> > 
>> > On 2015/11/4 10:56, Jason Wang wrote:
>>> >>
>>> >>
>>> >> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>>>> >>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> >>> Cc: Jason Wang <jasowang@redhat.com>
>>> >>
>>> >> Commit log please.
>>> >>
>>>> >>> ---
>>>> >>> v10: new patch
>>>> >>> ---

[...]

>>>> >>> +}
>>>> >>> +/*
>>>> >>> +* This will be used by COLO or MC FT, for which they will need
>>>> >>> +* to buffer all the packets of all VM's net devices, Here we check
>>>> >>> +* and automatically add netfilter for netdev that doesn't attach any buffer
>>>> >>> +* netfilter.
>>>> >>> +*/
>>>> >>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>>>> >>> +{
>>>> >>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>>>> >>> +
>>>> >>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>>>> >>> +                                        errp);
>>>> >>> +    g_free(queue);
>>>> >>> +}
>>>> >>> +
>>> >>
>>> >> This make me think for following questions:
>>> >>
>>> >> - What if a nic is hot added after this "automatically" filter add?
> IIRC, we don't allow the user to hotplug a device when colo is running.
>
> Thanks
> Wen Congyang
>

Even in the future? And how would you forbid the user from doing this? Through management?

Thanks


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-05  9:21         ` Jason Wang
@ 2015-11-05  9:33           ` Wen Congyang
  0 siblings, 0 replies; 100+ messages in thread
From: Wen Congyang @ 2015-11-05  9:33 UTC (permalink / raw)
  To: Jason Wang, zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah

On 11/05/2015 05:21 PM, Jason Wang wrote:
> 
> 
> On 11/05/2015 04:52 PM, Wen Congyang wrote:
>> On 11/05/2015 03:43 PM, zhanghailiang wrote:
>>>> Hi Jason,
>>>>
>>>> On 2015/11/4 10:56, Jason Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>>>>>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>>>>>> Cc: Jason Wang <jasowang@redhat.com>
>>>>>>
>>>>>> Commit log please.
>>>>>>
>>>>>>>> ---
>>>>>>>> v10: new patch
>>>>>>>> ---
> 
> [...]
> 
>>>>>>>> +}
>>>>>>>> +/*
>>>>>>>> +* This will be used by COLO or MC FT, for which they will need
>>>>>>>> +* to buffer all the packets of all VM's net devices, Here we check
>>>>>>>> +* and automatically add netfilter for netdev that doesn't attach any buffer
>>>>>>>> +* netfilter.
>>>>>>>> +*/
>>>>>>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction, Error **errp)
>>>>>>>> +{
>>>>>>>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>>>>>>>> +
>>>>>>>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>>>>>>>> +                                        errp);
>>>>>>>> +    g_free(queue);
>>>>>>>> +}
>>>>>>>> +
>>>>>>
>>>>>> This make me think for following questions:
>>>>>>
>>>>>> - What if a nic is hot added after this "automatically" filter add?
>> IIRC, we don't allow the user to hotplug a device when colo is running.
>>
>> Thanks
>> Wen Congyang
>>
> 
> Even in the future? And how could forbid the user to do this, management?

If we allow the user to hotplug a device, we need to automatically hotplug the same device
in the secondary qemu, which is hard to implement for now.

We could add a new flag (for example, MONITOR_CMD_COLO_UNSUPPORTED) and do the check
in handle_user_command() and handle_qmp_command().
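
A minimal standalone sketch of such a flag check might look like the
following. Only the flag name comes from the suggestion above; the flag
value, the CmdEntry table, find_cmd() and cmd_allowed() are assumptions made
for illustration and do not reflect the actual monitor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Flag name taken from the suggestion above; the bit value is assumed. */
#define MONITOR_CMD_COLO_UNSUPPORTED (1u << 0)

/* Hypothetical command table entry: a name plus per-command flags. */
typedef struct CmdEntry {
    const char *name;
    unsigned flags;
} CmdEntry;

/* Illustrative table: hotplug commands are marked as unsupported in COLO. */
static const CmdEntry cmd_table[] = {
    { "device_add",   MONITOR_CMD_COLO_UNSUPPORTED },
    { "netdev_add",   MONITOR_CMD_COLO_UNSUPPORTED },
    { "query-status", 0 },
};

/* Look a command up by name; returns NULL if it is not in the table. */
static const CmdEntry *find_cmd(const char *name)
{
    size_t i;

    assert(name);
    for (i = 0; i < sizeof(cmd_table) / sizeof(cmd_table[0]); i++) {
        if (strcmp(cmd_table[i].name, name) == 0) {
            return &cmd_table[i];
        }
    }
    return NULL;
}

/*
 * The check both command handlers would share in this sketch: unknown
 * commands are rejected, and flagged commands are rejected while COLO
 * is running.
 */
static bool cmd_allowed(const char *name, bool colo_running)
{
    const CmdEntry *cmd = find_cmd(name);

    if (!cmd) {
        return false;
    }
    return !(colo_running && (cmd->flags & MONITOR_CMD_COLO_UNSUPPORTED));
}
```

Keeping the restriction in the command table means the HMP and QMP paths
can apply the same test, rather than each hotplug handler checking the COLO
state on its own.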

Thanks
Wen Congyang

> 
> Thanks
> .
> 


* Re: [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev
  2015-11-05  9:19       ` Jason Wang
@ 2015-11-05 10:58         ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-05 10:58 UTC (permalink / raw)
  To: Jason Wang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, dgilbert,
	peter.huangpeng, arei.gonglei, stefanha, amit.shah

On 2015/11/5 17:19, Jason Wang wrote:
>
>
> On 11/05/2015 03:43 PM, zhanghailiang wrote:
>> Hi Jason,
>>
>> On 2015/11/4 10:56, Jason Wang wrote:
>>>
>>>
>>> On 11/03/2015 07:56 PM, zhanghailiang wrote:
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Cc: Jason Wang <jasowang@redhat.com>
>>>
>>> Commit log please.
>>>
>>>> ---
>>>> v10: new patch
>>>> ---
>>>>    include/net/filter.h |  1 +
>>>>    include/net/net.h    |  3 ++
>>>>    net/filter-buffer.c  | 84
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    net/net.c            | 20 +++++++++++++
>>>>    4 files changed, 108 insertions(+)
>>>>
>>>> diff --git a/include/net/filter.h b/include/net/filter.h
>>>> index 4499d60..b0954ba 100644
>>>> --- a/include/net/filter.h
>>>> +++ b/include/net/filter.h
>>>> @@ -75,5 +75,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState
>>>> *sender,
>>>>                                        void *opaque);
>>>>    void filter_buffer_release_all(void);
>>>>    void  filter_buffer_del_all_timers(void);
>>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction,
>>>> Error **errp);
>>>>
>>>>    #endif /* QEMU_NET_FILTER_H */
>>>> diff --git a/include/net/net.h b/include/net/net.h
>>>> index 5c65c45..e32bd90 100644
>>>> --- a/include/net/net.h
>>>> +++ b/include/net/net.h
>>>> @@ -129,6 +129,9 @@ typedef void
>>>> (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
>>>>                                           Error **errp);
>>>>    void qemu_foreach_netfilter(qemu_netfilter_foreach func, void
>>>> *opaque,
>>>>                                Error **errp);
>>>> +typedef void (*qemu_netdev_foreach)(NetClientState *nc, void *opaque,
>>>> +                                    Error **errp);
>>>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque,
>>>> Error **errp);
>>>>    int qemu_can_send_packet(NetClientState *nc);
>>>>    ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec
>>>> *iov,
>>>>                              int iovcnt);
>>>> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
>>>> index 05313de..0dc1efb 100644
>>>> --- a/net/filter-buffer.c
>>>> +++ b/net/filter-buffer.c
>>>> @@ -15,6 +15,11 @@
>>>>    #include "qapi-visit.h"
>>>>    #include "qom/object.h"
>>>>    #include "net/net.h"
>>>> +#include "qapi/qmp/qdict.h"
>>>> +#include "qapi/qmp-output-visitor.h"
>>>> +#include "qapi/qmp-input-visitor.h"
>>>> +#include "monitor/monitor.h"
>>>> +
>>>>
>>>>    #define TYPE_FILTER_BUFFER "filter-buffer"
>>>>
>>>> @@ -185,6 +190,85 @@ void filter_buffer_del_all_timers(void)
>>>>        qemu_foreach_netfilter(filter_buffer_del_timer, NULL, NULL);
>>>>    }
>>>>
>>>> +static void netdev_add_filter_buffer(NetClientState *nc, void *opaque,
>>>> +                                     Error **errp)
>>>> +{
>>>> +    NetFilterState *nf;
>>>> +    bool found = false;
>>>> +
>>>> +    QTAILQ_FOREACH(nf, &nc->filters, next) {
>>>> +        if (!strcmp(object_get_typename(OBJECT(nf)),
>>>> TYPE_FILTER_BUFFER)) {
>>>> +            found = true;
>>>> +            break;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (!found) {
>>>> +        QmpOutputVisitor *qov;
>>>> +        QmpInputVisitor *qiv;
>>>> +        Visitor *ov, *iv;
>>>> +        QObject *obj = NULL;
>>>> +        QDict *qdict;
>>>> +        void *dummy = NULL;
>>>> +        char *id = g_strdup_printf("%s-%s.0", nc->name,
>>>> TYPE_FILTER_BUFFER);
>>>> +        char *queue = (char *) opaque;
>>>> +        bool auto_add = true;
>>>> +        Error *err = NULL;
>>>> +
>>>> +        qov = qmp_output_visitor_new();
>>>> +        ov = qmp_output_get_visitor(qov);
>>>> +        visit_start_struct(ov,  &dummy, NULL, NULL, 0, &err);
>>>> +        if (err) {
>>>> +            goto out;
>>>> +        }
>>>> +        visit_type_str(ov, &nc->name, "netdev", &err);
>>>> +        if (err) {
>>>> +            goto out;
>>>> +        }
>>>> +        visit_type_str(ov, &queue, "queue", &err);
>>>> +        if (err) {
>>>> +            goto out;
>>>> +        }
>>>> +        visit_type_bool(ov, &auto_add, "auto", &err);
>>>> +        if (err) {
>>>> +            goto out;
>>>> +        }
>>>> +        visit_end_struct(ov, &err);
>>>> +        if (err) {
>>>> +            goto out;
>>>> +        }
>>>> +        obj = qmp_output_get_qobject(qov);
>>>> +        g_assert(obj != NULL);
>>>> +        qdict = qobject_to_qdict(obj);
>>>> +        qmp_output_visitor_cleanup(qov);
>>>> +
>>>> +        qiv = qmp_input_visitor_new(obj);
>>>> +        iv = qmp_input_get_visitor(qiv);
>>>> +        object_add(TYPE_FILTER_BUFFER, id, qdict, iv, &err);
>>>> +        qmp_input_visitor_cleanup(qiv);
>>>> +        qobject_decref(obj);
>>>> +out:
>>>> +        g_free(id);
>>>> +        if (err) {
>>>> +            error_propagate(errp, err);
>>>> +        }
>>>> +    }
>>>> +}
>>>> +/*
>>>> +* This will be used by COLO or MC FT, for which they will need
>>>> +* to buffer all the packets of all VM's net devices, Here we check
>>>> +* and automatically add netfilter for netdev that doesn't attach
>>>> any buffer
>>>> +* netfilter.
>>>> +*/
>>>> +void qemu_auto_add_filter_buffer(NetFilterDirection direction,
>>>> Error **errp)
>>>> +{
>>>> +    char *queue = g_strdup(NetFilterDirection_lookup[direction]);
>>>> +
>>>> +    qemu_foreach_netdev(netdev_add_filter_buffer, queue,
>>>> +                                        errp);
>>>> +    g_free(queue);
>>>> +}
>>>> +
>>>
>>> This makes me think of the following questions:
>>>
>>> - What if a nic is hot added after this "automatically" filter add?
>>> - Maybe a better way is to have a default filter? It could be specified
>>> through qemu cli or other (And default filter could be 'nop' which means
>>> no filter) ?
>>>
>>
>> I have thought about this. I'd like to add this default buffer filter
>> quietly,
>> not through qemu cli. In this way, we can still keep the buffer filter
>> that configured by users,
>
> Actually, this does not break the ones added by the user. We support
> attaching more than one filter to a single netdev.
>

Yes, and the packets will go through the default buffer filter before
the ones added by users. We only control the default buffer filter.

> If I understand the case correctly (I was only partially cced in this
> series). Before each synchronization, you need:
>
> 1) add a buffer filter to each netdev
> 2) release all buffers on demand
> 3) delete all buffer filters
>

Actually, for now, we only do step 1) in COLO's init process, and do step 3)
when exiting COLO.

> You can just remove step 1 if you know every device has a default buffer
> filter. And step 3 could also be removed if you can tell the buffer filter
> not to buffer any packets, through a new command or some other means.
>

Agreed. We will not let the default buffer filter buffer any packets before
we enter the COLO process.
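
To make the behaviour agreed on here concrete, below is a minimal sketch, assuming a default filter that passes packets straight through until buffering is switched on, and that releases everything it holds at a checkpoint. The SketchBufferFilter type and the sketch_* functions are hypothetical names for illustration only, not the net/filter-buffer.c code, and packets are modelled as plain ints.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_MAX 64

/* Hypothetical stand-in for a buffer filter; not QEMU's NetFilterState. */
typedef struct {
    bool buffering;                        /* false: pass packets through */
    int queue[SKETCH_QUEUE_MAX];           /* packets held back while buffering */
    size_t queued;
    int delivered[SKETCH_QUEUE_MAX * 2];   /* packets that reached the peer */
    size_t ndelivered;
} SketchBufferFilter;

/* Record a packet as delivered (silently dropped if the log is full). */
static void sketch_deliver(SketchBufferFilter *f, int pkt)
{
    if (f->ndelivered < sizeof(f->delivered) / sizeof(f->delivered[0])) {
        f->delivered[f->ndelivered++] = pkt;
    }
}

/* Returns true if the packet was delivered or queued, false if the queue is full. */
bool sketch_filter_receive(SketchBufferFilter *f, int pkt)
{
    if (!f->buffering) {
        /* Before COLO starts: behave like a 'nop' filter. */
        sketch_deliver(f, pkt);
        return true;
    }
    if (f->queued == SKETCH_QUEUE_MAX) {
        return false;
    }
    f->queue[f->queued++] = pkt;
    return true;
}

/* Called at a checkpoint: flush all held packets in arrival order. */
void sketch_filter_release(SketchBufferFilter *f)
{
    for (size_t i = 0; i < f->queued; i++) {
        sketch_deliver(f, f->queue[i]);
    }
    f->queued = 0;
}
```

With this shape, entering COLO would only need to set 'buffering' on each default filter, and leaving COLO would release and clear it, so no filter has to be added or deleted at runtime.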

>> and keep its delayed packet release capability, though the delay time
>> is not what
>> users expect. (This only happens in COLO's periodic mode; in
>> normal colo mode, the delay time
>> is almost the same as the user's configuration.)
>
> This is not good unless you want to limit the buffer filter only for

Er, maybe I didn't describe it clearly. I mean that adding a default buffer
filter will not break the ones added by users. We only manage the default buffer
filter, but the delay time of a buffer filter added by users will change:
New delay time = Checkpoint period + configured delay time.

Thanks,
zhanghailiang

> COLO. I also want to know the role of management: technically it can do
> all of the above 3 steps (and management looks like a better place to
> do this).
>
> Thanks
>
>>
>> What about calling netdev_add_filter_buffer() in each netdev's init()?
>> I didn't find a common code path for every netdev in their init path.
>>
>> Thanks,
>> zhanghailiang
>>
>>>>    static void filter_buffer_init(Object *obj)
>>>>    {
>>>>        object_property_add(obj, "interval", "int",
>>>> diff --git a/net/net.c b/net/net.c
>>>> index a333b01..4fbe0af 100644
>>>> --- a/net/net.c
>>>> +++ b/net/net.c
>>>> @@ -283,6 +283,26 @@ void
>>>> qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
>>>>        }
>>>>    }
>>>>
>>>> +void qemu_foreach_netdev(qemu_netdev_foreach func, void *opaque,
>>>> Error **errp)
>>>> +{
>>>> +    NetClientState *nc;
>>>> +
>>>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>>>> +        if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
>>>> +            continue;
>>>> +        }
>>>> +        if (func) {
>>>> +            Error *local_err = NULL;
>>>> +
>>>> +            func(nc, opaque, &local_err);
>>>> +            if (local_err) {
>>>> +                error_propagate(errp, local_err);
>>>> +                return;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>>    static void qemu_net_client_destructor(NetClientState *nc)
>>>>    {
>>>>        g_free(nc);
>>>
>>>
>>> .
>>>
>>
>>
>>
>
>
> .
>

* Re: [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
@ 2015-11-05 14:52   ` Eric Blake
  2015-11-06  7:36     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-05 14:52 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

On 11/03/2015 04:56 AM, zhanghailiang wrote:
> configure --enable-colo/--disable-colo to switch COLO
> support on/off.
> COLO support is off by default.

Off by default risks bit-rot for people not building it; it's generally
best to default to off only if the feature requires dragging in extra
libraries or similar build items not likely to be present in all
development environments (and even then, auto-probing for required
prerequisites is nicer than hard-coding off). Would on-by-default be a
saner choice, or would that drag in too many extra dependencies for
building?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support
  2015-11-05 14:52   ` Eric Blake
@ 2015-11-06  7:36     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-06  7:36 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

Hi Eric,

On 2015/11/5 22:52, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> configure --enable-colo/--disable-colo to switch COLO
>> support on/off.
>> COLO support is off by default.
>
> Off by default risks bit-rot for people not building it; it's generally
> best to default to off only if the feature requires dragging in extra
> libraries or similar build items not likely to be present in all
> development environments (and even then, auto-probing for required
> prerequisites is nicer than hard-coding off). Would on-by-default be a
> saner choice, or would that drag in too many extra dependencies for
> building?
>

In the old version, which used a kernel proxy, we had to
install the relevant nfnetlink libraries.

For the new version, we have dropped the kernel proxy, so it is OK to turn it on by default.
I will fix it in the next version :)

Thanks,
zhanghailiang

* Re: [Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node zhanghailiang
@ 2015-11-06 16:36   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 16:36 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We can tell whether the VM on the destination should go into COLO mode by
> referring to the info that has been migrated from the PVM.
> 
> We skip this section if colo is not enabled (i.e.
> migrate_set_capability colo off), so that it does not break migration compatibility
> regardless of --enable-colo/--disable-colo on the source/destination.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
> v10:
> - Use VMSTATE_BOOL instead of VMSTATE_UINT32 for 'colo_requested' (Dave's suggestion).
> ---
>  include/migration/colo.h |  2 ++
>  migration/Makefile.objs  |  1 +
>  migration/colo-comm.c    | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
>  vl.c                     |  3 ++-
>  4 files changed, 55 insertions(+), 1 deletion(-)
>  create mode 100644 migration/colo-comm.c
> 
> diff --git a/include/migration/colo.h b/include/migration/colo.h
> index c60a590..9b6662d 100644
> --- a/include/migration/colo.h
> +++ b/include/migration/colo.h
> @@ -14,7 +14,9 @@
>  #define QEMU_COLO_H
>  
>  #include "qemu-common.h"
> +#include "migration/migration.h"
>  
>  bool colo_supported(void);
> +void colo_info_mig_init(void);
>  
>  #endif
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 5a25d39..cb7bd30 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -1,5 +1,6 @@
>  common-obj-y += migration.o tcp.o
>  common-obj-$(CONFIG_COLO) += colo.o
> +common-obj-y += colo-comm.o
>  common-obj-y += vmstate.o
>  common-obj-y += qemu-file.o qemu-file-buf.o qemu-file-unix.o qemu-file-stdio.o
>  common-obj-y += xbzrle.o
> diff --git a/migration/colo-comm.c b/migration/colo-comm.c
> new file mode 100644
> index 0000000..fb407e0
> --- /dev/null
> +++ b/migration/colo-comm.c
> @@ -0,0 +1,50 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2015 FUJITSU LIMITED
> + * Copyright (c) 2015 Intel Corporation
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later. See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include <migration/colo.h>
> +#include "trace.h"
> +
> +typedef struct {
> +     bool colo_requested;
> +} COLOInfo;
> +
> +static COLOInfo colo_info;
> +
> +static void colo_info_pre_save(void *opaque)
> +{
> +    COLOInfo *s = opaque;
> +
> +    s->colo_requested = migrate_colo_enabled();
> +}
> +
> +static bool colo_info_need(void *opaque)
> +{
> +   return migrate_colo_enabled();
> +}
> +
> +static const VMStateDescription colo_state = {
> +     .name = "COLOState",
> +     .version_id = 1,
> +     .minimum_version_id = 1,
> +     .pre_save = colo_info_pre_save,
> +     .needed = colo_info_need,
> +     .fields = (VMStateField[]) {
> +         VMSTATE_BOOL(colo_requested, COLOInfo),
> +         VMSTATE_END_OF_LIST()
> +        },
> +};
> +
> +void colo_info_mig_init(void)
> +{
> +    vmstate_register(NULL, 0, &colo_state, &colo_info);
> +}
> diff --git a/vl.c b/vl.c
> index f5f7c3f..10e6cbe 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -91,6 +91,7 @@ int main(int argc, char **argv)
>  #include "sysemu/dma.h"
>  #include "audio/audio.h"
>  #include "migration/migration.h"
> +#include "migration/colo.h"
>  #include "sysemu/kvm.h"
>  #include "qapi/qmp/qjson.h"
>  #include "qemu/option.h"
> @@ -4421,7 +4422,7 @@ int main(int argc, char **argv, char **envp)
>  
>      blk_mig_init();
>      ram_mig_init();
> -
> +    colo_info_mig_init();
>      /* If the currently selected machine wishes to override the units-per-bus
>       * property of its default HBA interface type, do so now. */
>      if (machine_class->units_per_default_bus) {
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration zhanghailiang
@ 2015-11-06 16:48   ` Dr. David Alan Gilbert
  2015-11-13 16:42   ` Eric Blake
  1 sibling, 0 replies; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 16:48 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Add a migration state, MIGRATION_STATUS_COLO, and enter this state
> after the first live migration has finished successfully.
> 
> We reuse the migration thread, so if colo is enabled by the user, the migration
> thread will go into the colo process.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
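
For readers following the state handling in the hunks below, here is a minimal sketch of why migration_completion() must leave the state at ACTIVE when COLO is enabled. It assumes migrate_set_state() only changes the state when the current value equals the expected old state, which is how the calls in this patch read. SketchStatus and sketch_set_state() are hypothetical names, and C11 atomics stand in for whatever primitive QEMU uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical subset of the migration states touched by this patch. */
typedef enum {
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_COLO,
    SKETCH_STATUS_COMPLETED,
    SKETCH_STATUS_CANCELLING
} SketchStatus;

/*
 * Move *state from old_state to new_state only if it currently holds
 * old_state. Returns true when the transition happened.
 */
bool sketch_set_state(_Atomic int *state, int old_state, int new_state)
{
    /* On failure, old_state is overwritten with the observed value,
     * which this sketch simply discards. */
    return atomic_compare_exchange_strong(state, &old_state, new_state);
}
```

Under that assumption, ACTIVE => COLO in migrate_start_colo_process() can only succeed if nothing has already moved the state to COMPLETED, and COLO => COMPLETED at the end of colo_process_checkpoint() is a no-op if the state was changed in the meantime (for example to a cancelling state).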

> ---
> v10: Simplify process by dropping colo thread and reusing migration thread.
>      (Dave's suggestion)
> ---
>  include/migration/colo.h |  3 +++
>  migration/colo.c         | 31 +++++++++++++++++++++++++++++++
>  migration/migration.c    | 19 +++++++++++++++----
>  qapi-schema.json         |  2 +-
>  stubs/migration-colo.c   |  9 +++++++++
>  trace-events             |  3 +++
>  6 files changed, 62 insertions(+), 5 deletions(-)
> 
> diff --git a/include/migration/colo.h b/include/migration/colo.h
> index 9b6662d..f462f06 100644
> --- a/include/migration/colo.h
> +++ b/include/migration/colo.h
> @@ -19,4 +19,7 @@
>  bool colo_supported(void);
>  void colo_info_mig_init(void);
>  
> +void migrate_start_colo_process(MigrationState *s);
> +bool migration_in_colo_state(void);
> +
>  #endif
> diff --git a/migration/colo.c b/migration/colo.c
> index 2c40d2e..cf0ccb8 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -10,9 +10,40 @@
>   * later.  See the COPYING file in the top-level directory.
>   */
>  
> +#include "sysemu/sysemu.h"
>  #include "migration/colo.h"
> +#include "trace.h"
>  
>  bool colo_supported(void)
>  {
>      return true;
>  }
> +
> +bool migration_in_colo_state(void)
> +{
> +    MigrationState *s = migrate_get_current();
> +
> +    return (s->state == MIGRATION_STATUS_COLO);
> +}
> +
> +static void colo_process_checkpoint(MigrationState *s)
> +{
> +    qemu_mutex_lock_iothread();
> +    vm_start();
> +    qemu_mutex_unlock_iothread();
> +    trace_colo_vm_state_change("stop", "run");
> +
> +    /*TODO: COLO checkpoint savevm loop*/
> +
> +    migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
> +                      MIGRATION_STATUS_COMPLETED);
> +}
> +
> +void migrate_start_colo_process(MigrationState *s)
> +{
> +    qemu_mutex_unlock_iothread();
> +    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                      MIGRATION_STATUS_COLO);
> +    colo_process_checkpoint(s);
> +    qemu_mutex_lock_iothread();
> +}
> diff --git a/migration/migration.c b/migration/migration.c
> index b179464..cf83531 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -475,6 +475,10 @@ MigrationInfo *qmp_query_migrate(Error **errp)
>  
>          get_xbzrle_cache_stats(info);
>          break;
> +    case MIGRATION_STATUS_COLO:
> +        info->has_status = true;
> +        /* TODO: display COLO specific information (checkpoint info etc.) */
> +        break;
>      case MIGRATION_STATUS_COMPLETED:
>          get_xbzrle_cache_stats(info);
>  
> @@ -793,7 +797,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
>  
>      if (s->state == MIGRATION_STATUS_ACTIVE ||
>          s->state == MIGRATION_STATUS_SETUP ||
> -        s->state == MIGRATION_STATUS_CANCELLING) {
> +        s->state == MIGRATION_STATUS_CANCELLING ||
> +        s->state == MIGRATION_STATUS_COLO) {
>          error_setg(errp, QERR_MIGRATION_ACTIVE);
>          return;
>      }
> @@ -1030,8 +1035,11 @@ static void migration_completion(MigrationState *s, bool *old_vm_running,
>          goto fail;
>      }
>  
> -    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                      MIGRATION_STATUS_COMPLETED);
> +    if (!migrate_colo_enabled()) {
> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                          MIGRATION_STATUS_COMPLETED);
> +    }
> +
>      return;
>  
>  fail:
> @@ -1056,6 +1064,7 @@ static void *migration_thread(void *opaque)
>      int64_t max_size = 0;
>      int64_t start_time = initial_time;
>      bool old_vm_running = false;
> +    bool enable_colo = migrate_colo_enabled();
>  
>      rcu_register_thread();
>  
> @@ -1130,7 +1139,9 @@ static void *migration_thread(void *opaque)
>          }
>          runstate_set(RUN_STATE_POSTMIGRATE);
>      } else {
> -        if (old_vm_running) {
> +        if (s->state == MIGRATION_STATUS_ACTIVE && enable_colo) {
> +            migrate_start_colo_process(s);
> +        } else if (old_vm_running) {
>              vm_start();
>          }
>      }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index cb5e5fd..22251ec 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -439,7 +439,7 @@
>  ##
>  { 'enum': 'MigrationStatus',
>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
> -            'active', 'completed', 'failed' ] }
> +            'active', 'completed', 'failed', 'colo' ] }
>  
>  ##
>  # @MigrationInfo
> diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
> index 3d817df..acddca6 100644
> --- a/stubs/migration-colo.c
> +++ b/stubs/migration-colo.c
> @@ -16,3 +16,12 @@ bool colo_supported(void)
>  {
>      return false;
>  }
> +
> +bool migration_in_colo_state(void)
> +{
> +    return false;
> +}
> +
> +void migrate_start_colo_process(MigrationState *s)
> +{
> +}
> diff --git a/trace-events b/trace-events
> index 72136b9..9cd6391 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1497,6 +1497,9 @@ rdma_start_incoming_migration_after_rdma_listen(void) ""
>  rdma_start_outgoing_migration_after_rdma_connect(void) ""
>  rdma_start_outgoing_migration_after_rdma_source_init(void) ""
>  
> +# migration/colo.c
> +colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
> +
>  # kvm-all.c
>  kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
>  kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm zhanghailiang
@ 2015-11-06 17:29   ` Dr. David Alan Gilbert
  2015-11-09  6:09     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 17:29 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Switch from normal migration loadvm process into COLO checkpoint process if
> COLO mode is enabled.
> We add three new members to struct MigrationIncomingState, 'have_colo_incoming_thread'
> and 'colo_incoming_thread' record the colo related threads for secondary VM,
> 'migration_incoming_co' records the original migration incoming coroutine.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
> v10: fix a bug about fd leak which is found by Dave.

<snip>

> diff --git a/migration/migration.c b/migration/migration.c
> index cf83531..7d8cd38 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -288,6 +288,27 @@ static void process_incoming_migration_co(void *opaque)
>                        MIGRATION_STATUS_ACTIVE);
>      ret = qemu_loadvm_state(f);
>  
> +    if (!ret) {
> +        /* Make sure all file formats flush their mutable metadata */
> +        bdrv_invalidate_cache_all(&local_err);
> +        if (local_err) {
> +            error_report_err(local_err);
> +            migrate_decompress_threads_join();
> +            exit(EXIT_FAILURE);
> +        }
> +    }

Are you moving this code? Because I think the bdrv_invalidate_cache_all is a few lines
below here - just....

> +    /* we get colo info, and know if we are in colo mode */
> +    if (!ret && migration_incoming_enable_colo()) {
> +        mis->migration_incoming_co = qemu_coroutine_self();
> +        qemu_thread_create(&mis->colo_incoming_thread, "colo incoming",
> +             colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
> +        mis->have_colo_incoming_thread = true;
> +        qemu_coroutine_yield();
> +
> +        /* Wait checkpoint incoming thread exit before free resource */
> +        qemu_thread_join(&mis->colo_incoming_thread);
> +    }
> +
>      qemu_fclose(f);
>      free_xbzrle_decoded_buf();
>      migration_incoming_state_destroy();


.... here in my current head world; so shouldn't you be deleting
the bdrv_invalidate_cache_all here?

(Otherwise OK)

Dave

> diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
> index acddca6..c12516e 100644
> --- a/stubs/migration-colo.c
> +++ b/stubs/migration-colo.c
> @@ -22,6 +22,16 @@ bool migration_in_colo_state(void)
>      return false;
>  }
>  
> +bool migration_incoming_in_colo_state(void)
> +{
> +    return false;
> +}
> +
>  void migrate_start_colo_process(MigrationState *s)
>  {
>  }
> +
> +void *colo_process_incoming_thread(void *opaque)
> +{
> +    return NULL;
> +}
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol zhanghailiang
@ 2015-11-06 18:26   ` Dr. David Alan Gilbert
  2015-11-09  6:51     ` zhanghailiang
  2015-11-13 16:46   ` Eric Blake
  1 sibling, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 18:26 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We need a user-defined communication protocol to control the checkpoint
> process.
> 
> A new checkpoint request is started by the Primary VM, and the interaction
> proceeds as below:
> Checkpoint synchronizing points,
> 
>                        Primary                         Secondary
> 'checkpoint-request'   @ ----------------------------->
>                                                        Suspend (In hybrid mode)
> 'checkpoint-reply'     <------------------------------ @
>                        Suspend&Save state

Why is this initial pair necessary?  Can't you just start with the vmstate-send
and save the extra request pair/round trip?  On the 2nd checkpoint we know
the SVM already received the previous checkpoint because we got its first
vmstate-load.

I guess in full-COLO (rather than simple checkpoint) you can get the secondary to
do some of its stopping/cleanup after it sends the checkpoint-reply
but before vmstate-send, so you can hide some of the time.

(Perhaps add a comment to explain)

> 'vmstate-send'         @ ----------------------------->
>                        Send state                      Receive state
> 'vmstate-received'     <------------------------------ @
>                        Release packets                 Load state
> 'vmstate-load'         <------------------------------ @
>                        Resume                          Resume (In hybrid mode)
> 
>                        Start Comparing (In hybrid mode)
> NOTE:
>  1) '@' marks who sends the message
>  2) Every sync-point is synchronized by the two sides with only
>     one handshake (single direction) for low latency.
>     If stricter synchronization is required, an opposite-direction
>     sync-point should be added.
>  3) Since sync-points are single direction, the remote side may
>     go forward a lot when this side just receives the sync-point.
>  4) For now, we only support 'periodic' checkpoint, for which
>     the Secondary VM is not running; later we will support 'hybrid' mode.
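
The sync points in the diagram above can be sketched as a small lookup of "which command comes next, and who sends it". This is only an illustration under assumptions: the SKETCH_CMD_* names mirror the COLOCommand enum that this patch adds to qapi-schema.json, sketch_next_cmd() and sketch_sent_by_primary() are hypothetical helpers, and 'vmstate-size' is left out of the round because this patch does not send it yet.

```c
#include <stdbool.h>

/* Mirrors the order of the COLOCommand enum added by this patch. */
typedef enum {
    SKETCH_CMD_INVALID = 0,
    SKETCH_CMD_CHECKPOINT_READY,
    SKETCH_CMD_CHECKPOINT_REQUEST,
    SKETCH_CMD_CHECKPOINT_REPLY,
    SKETCH_CMD_VMSTATE_SEND,
    SKETCH_CMD_VMSTATE_SIZE,
    SKETCH_CMD_VMSTATE_RECEIVED,
    SKETCH_CMD_VMSTATE_LOADED,
    SKETCH_CMD_MAX
} SketchCmd;

/* Next command expected on the wire after 'prev' within the checkpoint loop. */
SketchCmd sketch_next_cmd(SketchCmd prev)
{
    switch (prev) {
    case SKETCH_CMD_CHECKPOINT_READY:    /* sent once, before the loop */
    case SKETCH_CMD_VMSTATE_LOADED:      /* end of a round, start the next */
        return SKETCH_CMD_CHECKPOINT_REQUEST;
    case SKETCH_CMD_CHECKPOINT_REQUEST:
        return SKETCH_CMD_CHECKPOINT_REPLY;
    case SKETCH_CMD_CHECKPOINT_REPLY:
        return SKETCH_CMD_VMSTATE_SEND;
    case SKETCH_CMD_VMSTATE_SEND:
        return SKETCH_CMD_VMSTATE_RECEIVED;
    case SKETCH_CMD_VMSTATE_RECEIVED:
        return SKETCH_CMD_VMSTATE_LOADED;
    default:
        return SKETCH_CMD_INVALID;
    }
}

/* True for commands the Primary sends; the Secondary sends the rest. */
bool sketch_sent_by_primary(SketchCmd cmd)
{
    return cmd == SKETCH_CMD_CHECKPOINT_REQUEST ||
           cmd == SKETCH_CMD_VMSTATE_SEND;
}
```

Written this way, each round is two Primary messages and three Secondary messages, and dropping the initial request/reply pair would leave 'vmstate-send' as the first message of every round.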
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Cc: Eric Blake <eblake@redhat.com>
> ---
> v10:
> - Rename enum COLOCmd to COLOCommand (Eric's suggestion).
> - Remove unused 'ram-steal'
> ---
>  migration/colo.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  qapi-schema.json |  27 ++++++++
>  trace-events     |   2 +
>  3 files changed, 219 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 4fdf3a9..2510762 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -10,10 +10,12 @@
>   * later.  See the COPYING file in the top-level directory.
>   */
>  
> +#include <unistd.h>
>  #include "sysemu/sysemu.h"
>  #include "migration/colo.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
> +#include "qemu/sockets.h"
>  
>  bool colo_supported(void)
>  {
> @@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void)
>      return mis && (mis->state == MIGRATION_STATUS_COLO);
>  }
>  
> +/* colo checkpoint control helper */
> +static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
> +{
> +    int ret = 0;
> +
> +    qemu_put_be32(f, cmd);
> +    qemu_put_be64(f, value);
> +    qemu_fflush(f);
> +
> +    ret = qemu_file_get_error(f);
> +    trace_colo_ctl_put(COLOCommand_lookup[cmd], value);
> +
> +    return ret;
> +}
> +
> +static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd)
> +{
> +    int ret = 0;
> +
> +    *cmd = qemu_get_be32(f);
> +    ret = qemu_file_get_error(f);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    if (*cmd >= COLO_COMMAND_MAX) {
> +        error_report("Invalid colo command, get cmd:%d", *cmd);
> +        return -EINVAL;
> +    }
> +    trace_colo_ctl_get(COLOCommand_lookup[*cmd]);
> +
> +    return 0;
> +}
> +
> +static int colo_ctl_get(QEMUFile *f, uint32_t require)
> +{
> +    int ret;
> +    uint32_t cmd;
> +    uint64_t value;
> +
> +    ret = colo_ctl_get_cmd(f, &cmd);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    if (cmd != require) {
> +        error_report("Unexpect colo command, expect:%d, but get cmd:%d",
> +                     require, cmd);
> +        return -EINVAL;
> +    }
> +
> +    value = qemu_get_be64(f);
> +    ret = qemu_file_get_error(f);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    return value;
> +}

Should the return type be uint64_t since you're returning value?
But then you're also using it to return an error code; so perhaps
it might be better to have a     uint64_t *value    parameter to
return the value separately; or define the range that the value
can actually take.

(Also very minor typo: 'got' not 'get' in a few errors)
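
Something like the following is what I mean by returning the value separately. It is only a sketch under assumptions: SketchStream is a hypothetical in-memory stand-in for QEMUFile, and sketch_get_be()/sketch_ctl_get() are illustrative names, not existing functions. The wire layout (a big-endian 32-bit command followed by a big-endian 64-bit value) follows colo_ctl_put() above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stream standing in for QEMUFile. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} SketchStream;

/* Read an nbytes-wide big-endian integer; returns 0 or a negative errno. */
static int sketch_get_be(SketchStream *s, unsigned nbytes, uint64_t *out)
{
    uint64_t v = 0;

    if (s->len - s->pos < nbytes) {
        return -EIO;                 /* short read */
    }
    for (unsigned i = 0; i < nbytes; i++) {
        v = (v << 8) | s->data[s->pos++];
    }
    *out = v;
    return 0;
}

/*
 * Return value carries only success (0) or a negative errno;
 * the 64-bit payload comes back through *value.
 */
int sketch_ctl_get(SketchStream *s, uint32_t require, uint64_t *value)
{
    uint64_t cmd;
    int ret = sketch_get_be(s, 4, &cmd);

    if (ret < 0) {
        return ret;
    }
    if (cmd != require) {
        return -EINVAL;              /* unexpected command */
    }
    return sketch_get_be(s, 8, value);
}
```

With the error code and the payload separated, a value that does not fit in an int can no longer be truncated or mistaken for an error by the caller's 'ret < 0' check.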

> +static int colo_do_checkpoint_transaction(MigrationState *s)
> +{
> +    int ret;
> +
> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_REPLY);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* TODO: suspend and save vm state to colo buffer */
> +
> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* TODO: send vmstate to Secondary */
> +
> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_LOADED);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    /* TODO: resume Primary */
> +
> +out:
> +    return ret;
> +}
> +
>  static void colo_process_checkpoint(MigrationState *s)
>  {
>      int fd, ret = 0;
> @@ -51,12 +150,27 @@ static void colo_process_checkpoint(MigrationState *s)
>          goto out;
>      }
>  
> +    /*
> +     * Wait for Secondary finish loading vm states and enter COLO
> +     * restore.
> +     */
> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_READY);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
>      qemu_mutex_lock_iothread();
>      vm_start();
>      qemu_mutex_unlock_iothread();
>      trace_colo_vm_state_change("stop", "run");
>  
> -    /*TODO: COLO checkpoint savevm loop*/
> +    while (s->state == MIGRATION_STATUS_COLO) {
> +        /* start a colo checkpoint */
> +        ret = colo_do_checkpoint_transaction(s);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +    }
>  
>  out:
>      if (ret < 0) {
> @@ -79,6 +193,39 @@ void migrate_start_colo_process(MigrationState *s)
>      qemu_mutex_lock_iothread();
>  }
>  
> +/*
> + * return:
> + * 0: start a checkpoint
> + * -1: some error happened, exit colo restore
> + */
> +static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
> +{
> +    int ret;
> +    uint32_t cmd;
> +    uint64_t value;
> +
> +    ret = colo_ctl_get_cmd(f, &cmd);
> +    if (ret < 0) {
> +        /* do failover ? */
> +        return ret;
> +    }
> +    /* Fix me: this value should be 0, which is not so good,
> +     * should be used for checking ?
> +     */
> +    value = qemu_get_be64(f);
> +    if (value != 0) {

Should this output an error message as well?

> +        return -EINVAL;
> +    }
> +
> +    switch (cmd) {
> +    case COLO_COMMAND_CHECKPOINT_REQUEST:
> +        *checkpoint_request = 1;
> +        return 0;
> +    default:
> +        return -EINVAL;
> +    }
> +}
> +
>  void *colo_process_incoming_thread(void *opaque)
>  {
>      MigrationIncomingState *mis = opaque;
> @@ -98,7 +245,48 @@ void *colo_process_incoming_thread(void *opaque)
>          error_report("colo incoming thread: Open QEMUFile to_src_file failed");
>          goto out;
>      }
> -    /* TODO: COLO checkpoint restore loop */
> +
> +    ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
> +    if (ret < 0) {
> +        goto out;
> +    }
> +
> +    while (mis->state == MIGRATION_STATUS_COLO) {
> +        int request = 0;
> +        int ret = colo_wait_handle_cmd(mis->from_src_file, &request);
> +
> +        if (ret < 0) {
> +            break;
> +        } else {
> +            if (!request) {
> +                continue;
> +            }
> +        }
> +
> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +
> +        ret = colo_ctl_get(mis->from_src_file, COLO_COMMAND_VMSTATE_SEND);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +
> +        /* TODO: read migration data into colo buffer */
> +
> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +
> +        /* TODO: load vm state */
> +
> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +    }
>  
>  out:
>      if (ret < 0) {
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 22251ec..5c4fe6d 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -702,6 +702,33 @@
>              '*tls-port': 'int', '*cert-subject': 'str' } }
>  
>  ##
> +# @COLOCommand
> +#
> +# The colo command
> +#
> +# @invalid: unknown command
> +#
> +# @checkpoint-ready: SVM is ready for checkpointing
> +#
> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
> +#
> +# @checkpoint-reply: SVM gets PVM's checkpoint request
> +#
> +# @vmstate-send: VM's state will be sent by PVM.
> +#
> +# @vmstate-size: The total size of VMstate.
> +#
> +# @vmstate-received: VM's state has been received by SVM
> +#
> +# @vmstate-loaded: VM's state has been loaded by SVM
> +#
> +# Since: 2.5
> +##
> +{ 'enum': 'COLOCommand',
> +  'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
> +            'checkpoint-reply', 'vmstate-send', 'vmstate-size',
> +            'vmstate-received', 'vmstate-loaded' ] }
> +
>  # @MouseInfo:
>  #
>  # Information about a mouse device.
> diff --git a/trace-events b/trace-events
> index 9cd6391..ee4679c 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1499,6 +1499,8 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
>  
>  # migration/colo.c
>  colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
> +colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
> +colo_ctl_get(const char *msg) "Receive '%s' cmd"
>  
>  # kvm-all.c
>  kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
@ 2015-11-06 18:28   ` Dr. David Alan Gilbert
  2015-11-13 16:47   ` Eric Blake
  1 sibling, 0 replies; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 18:28 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, qemu-devel, arei.gonglei, stefanha,
	amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Guest will enter this state when paused to save/restore VM state
> under colo checkpoint.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> 
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  qapi-schema.json | 7 ++++++-
>  vl.c             | 8 ++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 5c4fe6d..49f2a90 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -154,12 +154,15 @@
>  # @watchdog: the watchdog action is configured to pause and has been triggered
>  #
>  # @guest-panicked: guest has been panicked as a result of guest OS panic
> +#
> +# @colo: guest is paused to save/restore VM state under colo checkpoint (since
> +# 2.5)
>  ##
>  { 'enum': 'RunState',
>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
> -            'guest-panicked' ] }
> +            'guest-panicked', 'colo' ] }
>  
>  ##
>  # @StatusInfo:
> @@ -434,6 +437,8 @@
>  #
>  # @failed: some error occurred during migration process.
>  #
> +# @colo: VM is in the process of fault tolerance. (since 2.5)
> +#
>  # Since: 2.3
>  #
>  ##
> diff --git a/vl.c b/vl.c
> index 10e6cbe..c459a3e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -586,6 +586,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_INMIGRATE, RUN_STATE_WATCHDOG },
>      { RUN_STATE_INMIGRATE, RUN_STATE_GUEST_PANICKED },
>      { RUN_STATE_INMIGRATE, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },
>  
>      { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
>      { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
> @@ -595,6 +596,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>  
>      { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
>      { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_PAUSED, RUN_STATE_COLO},
>  
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
> @@ -605,9 +607,12 @@ static const RunStateTransition runstate_transitions_def[] = {
>  
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
> +    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
>  
>      { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>  
> +    { RUN_STATE_COLO, RUN_STATE_RUNNING },
> +
>      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
>      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
>      { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
> @@ -618,6 +623,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
>      { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
>      { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
> +    { RUN_STATE_RUNNING, RUN_STATE_COLO},
>  
>      { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
>  
> @@ -628,9 +634,11 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>      { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>      { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>  
>      { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>      { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},
>  
>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
>      { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb zhanghailiang
@ 2015-11-06 18:30   ` Dr. David Alan Gilbert
  2015-11-09  8:14     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 18:30 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer
> VM state:
> One is qsb_put_buffer(), which put the content of a given QEMUSizedBuffer
> into QEMUFile, this is used to send buffered VM state to secondary.
> Another is qsb_fill_buffer(), read 'size' bytes of data from the file into
> qsb, this is used to get VM state from socket into a buffer.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  include/migration/qemu-file.h |  3 ++-
>  migration/qemu-file-buf.c     | 58 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
> index 29a338d..de42d5b 100644
> --- a/include/migration/qemu-file.h
> +++ b/include/migration/qemu-file.h
> @@ -144,7 +144,8 @@ ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count,
>                         uint8_t *buf);
>  ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf,
>                       off_t pos, size_t count);
> -
> +void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size);
> +int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size);

I made most of the qemu_file code use size_t back in August; can you update
this please?
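
As a rough sketch of the size_t variant (assumptions: struct demo_iov, struct demo_qsb and struct demo_sink are hypothetical stand-ins for struct iovec, QEMUSizedBuffer and QEMUFile, and returning the byte count is only a convenience of the sketch, since the patch's function returns void):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iovec. */
struct demo_iov {
    void *iov_base;
    size_t iov_len;
};

/* Hypothetical stand-in for QEMUSizedBuffer: only the fields the loop needs. */
struct demo_qsb {
    struct demo_iov *iov;
    int n_iov;
};

/* Hypothetical fixed-size sink standing in for a QEMUFile. */
struct demo_sink {
    unsigned char data[64];
    size_t len;
};

static void demo_sink_put(struct demo_sink *s, const void *buf, size_t n)
{
    assert(n <= sizeof(s->data) - s->len);   /* sketch only: no growth */
    memcpy(s->data + s->len, buf, n);
    s->len += n;
}

/* Copy at most 'size' bytes from qsb into the sink; returns bytes copied. */
static size_t demo_qsb_put_buffer(struct demo_sink *f,
                                  const struct demo_qsb *qsb, size_t size)
{
    size_t done = 0;

    for (int i = 0; i < qsb->n_iov && size > 0; i++) {
        /* Both operands are size_t, so no signed/unsigned comparison. */
        size_t l = qsb->iov[i].iov_len < size ? qsb->iov[i].iov_len : size;

        demo_sink_put(f, qsb->iov[i].iov_base, l);
        size -= l;
        done += l;
    }
    return done;
}
```

With size_t throughout, the MIN() against iov_len no longer mixes int and size_t.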

Dave

>  
>  /*
>   * For use on files opened with qemu_bufopen
> diff --git a/migration/qemu-file-buf.c b/migration/qemu-file-buf.c
> index 49516b8..e58004d 100644
> --- a/migration/qemu-file-buf.c
> +++ b/migration/qemu-file-buf.c
> @@ -366,6 +366,64 @@ ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *source,
>      return count;
>  }
>  
> +
> +/**
> + * Put the content of a given QEMUSizedBuffer into QEMUFile.
> + *
> + * @f: A QEMUFile
> + * @qsb: A QEMUSizedBuffer
> + * @size: size of content to write
> + */
> +void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size)
> +{
> +    int i, l;
> +
> +    for (i = 0; i < qsb->n_iov && size > 0; i++) {
> +        l = MIN(qsb->iov[i].iov_len, size);
> +        qemu_put_buffer(f, qsb->iov[i].iov_base, l);
> +        size -= l;
> +    }
> +}
> +
> +/*
> + * Read 'size' bytes of data from the file into qsb.
> + * always fill from pos 0 and used after qsb_create().
> + *
> + * It will return size bytes unless there was an error, in which case it will
> + * return as many as it managed to read (assuming blocking fd's which
> + * all current QEMUFile are)
> + */
> +int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size)
> +{
> +    ssize_t rc = qsb_grow(qsb, size);
> +    int pending = size, i;
> +    qsb->used = 0;
> +    uint8_t *buf = NULL;
> +
> +    if (rc < 0) {
> +        return rc;
> +    }
> +
> +    for (i = 0; i < qsb->n_iov && pending > 0; i++) {
> +        int doneone = 0;
> +        /* read until iov full */
> +        while (doneone < qsb->iov[i].iov_len && pending > 0) {
> +            int readone = 0;
> +            buf = qsb->iov[i].iov_base;
> +            readone = qemu_get_buffer(f, buf,
> +                                MIN(qsb->iov[i].iov_len - doneone, pending));
> +            if (readone == 0) {
> +                return qsb->used;
> +            }
> +            buf += readone;
> +            doneone += readone;
> +            pending -= readone;
> +            qsb->used += readone;
> +        }
> +    }
> +    return qsb->used;
> +}
> +
>  typedef struct QEMUBuffer {
>      QEMUSizedBuffer *qsb;
>      QEMUFile *file;
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint zhanghailiang
@ 2015-11-06 18:59   ` Dr. David Alan Gilbert
  2015-11-09  9:17     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-06 18:59 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> The main process of checkpoint is to synchronize SVM with PVM.
> VM's state includes ram and device state. So we will migrate PVM's
> state to SVM when do checkpoint, just like migration does.
> 
> We will cache PVM's state in slave, we use QEMUSizedBuffer
> to store the data, we need to know the size of VM state, so in master,
> we use qsb to store VM state temporarily, get the data size by call qsb_get_length()
> and then migrate the data to the qsb in the secondary side.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
>  migration/colo.c   | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++----
>  migration/ram.c    | 47 +++++++++++++++++++++++++++++--------
>  migration/savevm.c |  2 +-
>  3 files changed, 101 insertions(+), 16 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 2510762..b865513 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -17,6 +17,9 @@
>  #include "qemu/error-report.h"
>  #include "qemu/sockets.h"
>  
> +/* colo buffer */
> +#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> +
>  bool colo_supported(void)
>  {
>      return true;
> @@ -94,9 +97,12 @@ static int colo_ctl_get(QEMUFile *f, uint32_t require)
>      return value;
>  }
>  
> -static int colo_do_checkpoint_transaction(MigrationState *s)
> +static int colo_do_checkpoint_transaction(MigrationState *s,
> +                                          QEMUSizedBuffer *buffer)
>  {
>      int ret;
> +    size_t size;
> +    QEMUFile *trans = NULL;
>  
>      ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
>      if (ret < 0) {
> @@ -107,15 +113,47 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
>      if (ret < 0) {
>          goto out;
>      }
> +    /* Reset colo buffer and open it for write */
> +    qsb_set_length(buffer, 0);
> +    trans = qemu_bufopen("w", buffer);
> +    if (!trans) {
> +        error_report("Open colo buffer for write failed");
> +        goto out;
> +    }
>  
> -    /* TODO: suspend and save vm state to colo buffer */
> +    qemu_mutex_lock_iothread();
> +    vm_stop_force_state(RUN_STATE_COLO);
> +    qemu_mutex_unlock_iothread();
> +    trace_colo_vm_state_change("run", "stop");
> +
> +    /* Disable block migration */
> +    s->params.blk = 0;
> +    s->params.shared = 0;
> +    qemu_savevm_state_header(trans);
> +    qemu_savevm_state_begin(trans, &s->params);
> +    qemu_mutex_lock_iothread();
> +    qemu_savevm_state_complete(trans);
> +    qemu_mutex_unlock_iothread();
> +
> +    qemu_fflush(trans);
>  
>      ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
>      if (ret < 0) {
>          goto out;
>      }
> +    /* we send the total size of the vmstate first */
> +    size = qsb_get_length(buffer);
> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SIZE, size);
> +    if (ret < 0) {
> +        goto out;
> +    }
>  
> -    /* TODO: send vmstate to Secondary */
> +    qsb_put_buffer(s->to_dst_file, buffer, size);
> +    qemu_fflush(s->to_dst_file);
> +    ret = qemu_file_get_error(s->to_dst_file);
> +    if (ret < 0) {
> +        goto out;
> +    }
>  
>      ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
>      if (ret < 0) {
> @@ -127,14 +165,24 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
>          goto out;
>      }
>  
> -    /* TODO: resume Primary */
> +    ret = 0;
> +    /* resume master */
> +    qemu_mutex_lock_iothread();
> +    vm_start();
> +    qemu_mutex_unlock_iothread();
> +    trace_colo_vm_state_change("stop", "run");
>  
>  out:
> +    if (trans) {
> +        qemu_fclose(trans);
> +    }
> +
>      return ret;
>  }
>  
>  static void colo_process_checkpoint(MigrationState *s)
>  {
> +    QEMUSizedBuffer *buffer = NULL;
>      int fd, ret = 0;
>  
>      /* Dup the fd of to_dst_file */
> @@ -159,6 +207,13 @@ static void colo_process_checkpoint(MigrationState *s)
>          goto out;
>      }
>  
> +    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
> +    if (buffer == NULL) {
> +        ret = -ENOMEM;
> +        error_report("Failed to allocate buffer!");

Please say 'Failed to allocate colo buffer'; QEMU has lots and lots of buffers.

> +        goto out;
> +    }
> +
>      qemu_mutex_lock_iothread();
>      vm_start();
>      qemu_mutex_unlock_iothread();
> @@ -166,7 +221,7 @@ static void colo_process_checkpoint(MigrationState *s)
>  
>      while (s->state == MIGRATION_STATUS_COLO) {
>          /* start a colo checkpoint */
> -        ret = colo_do_checkpoint_transaction(s);
> +        ret = colo_do_checkpoint_transaction(s, buffer);
>          if (ret < 0) {
>              goto out;
>          }
> @@ -179,6 +234,9 @@ out:
>      migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>                        MIGRATION_STATUS_COMPLETED);
>  
> +    qsb_free(buffer);
> +    buffer = NULL;
> +
>      if (s->from_dst_file) {
>          qemu_fclose(s->from_dst_file);
>      }
> diff --git a/migration/ram.c b/migration/ram.c
> index a25bcc7..5784c15 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -38,6 +38,7 @@
>  #include "trace.h"
>  #include "exec/ram_addr.h"
>  #include "qemu/rcu_queue.h"
> +#include "migration/colo.h"
>  
>  #ifdef DEBUG_MIGRATION_RAM
>  #define DPRINTF(fmt, ...) \
> @@ -1165,15 +1166,8 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
>      }
>  }
>  
> -/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
> - * long-running RCU critical section.  When rcu-reclaims in the code
> - * start to become numerous it will be necessary to reduce the
> - * granularity of these critical sections.
> - */
> -
> -static int ram_save_setup(QEMUFile *f, void *opaque)
> +static int ram_save_init_globals(void)
>  {
> -    RAMBlock *block;
>      int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
>  
>      dirty_rate_high_cnt = 0;
> @@ -1233,6 +1227,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      migration_bitmap_sync();
>      qemu_mutex_unlock_ramlist();
>      qemu_mutex_unlock_iothread();
> +    rcu_read_unlock();
> +
> +    return 0;
> +}

It surprises me you want migration_bitmap_sync in ram_save_init_globals(),
but I guess you want the first sync at the start.

> +/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
> + * long-running RCU critical section.  When rcu-reclaims in the code
> + * start to become numerous it will be necessary to reduce the
> + * granularity of these critical sections.
> + */
> +
> +static int ram_save_setup(QEMUFile *f, void *opaque)
> +{
> +    RAMBlock *block;
> +
> +    /*
> +     * migration has already setup the bitmap, reuse it.
> +     */
> +    if (!migration_in_colo_state()) {
> +        if (ram_save_init_globals() < 0) {
> +            return -1;
> +         }
> +    }
> +
> +    rcu_read_lock();
>  
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
> @@ -1332,7 +1351,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>      while (true) {
>          int pages;
>  
> -        pages = ram_find_and_save_block(f, true, &bytes_transferred);
> +        pages = ram_find_and_save_block(f, !migration_in_colo_state(),
> +                                        &bytes_transferred);
>          /* no more blocks to sent */
>          if (pages == 0) {
>              break;
> @@ -1343,8 +1363,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>      ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>  
>      rcu_read_unlock();
> +    /*
> +     * Since we need to reuse dirty bitmap in colo,
> +     * don't cleanup the bitmap.
> +     */
> +    if (!migrate_colo_enabled() ||
> +        migration_has_failed(migrate_get_current())) {
> +        migration_end();
> +    }
>  
> -    migration_end();
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>  
>      return 0;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index dbcc39a..0faf12b 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -48,7 +48,7 @@
>  #include "qemu/iov.h"
>  #include "block/snapshot.h"
>  #include "block/qapi.h"
> -
> +#include "migration/colo.h"
>  
>  #ifndef ETH_P_RARP
>  #define ETH_P_RARP 0x8035

Wrong patch?


So other than those minor things:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

but watch out for the recent changes to migration_end() that went in
a few days ago.

Dave

> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm
  2015-11-06 17:29   ` Dr. David Alan Gilbert
@ 2015-11-09  6:09     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-09  6:09 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/7 1:29, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> Switch from normal migration loadvm process into COLO checkpoint process if
>> COLO mode is enabled.
>> We add three new members to struct MigrationIncomingState, 'have_colo_incoming_thread'
>> and 'colo_incoming_thread' record the colo related threads for secondary VM,
>> 'migration_incoming_co' records the original migration incoming coroutine.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> ---
>> v10: fix a bug about fd leak which is found by Dave.
>
> <snip>
>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index cf83531..7d8cd38 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -288,6 +288,27 @@ static void process_incoming_migration_co(void *opaque)
>>                         MIGRATION_STATUS_ACTIVE);
>>       ret = qemu_loadvm_state(f);
>>
>> +    if (!ret) {
>> +        /* Make sure all file formats flush their mutable metadata */
>> +        bdrv_invalidate_cache_all(&local_err);
>> +        if (local_err) {
>> +            error_report_err(local_err);
>> +            migrate_decompress_threads_join();
>> +            exit(EXIT_FAILURE);
>> +        }
>> +    }
>
> Are you moving this code? Because I think the bdrv_invalidate_cache_all is a few lines
> below here - just....
>
>> +    /* we get colo info, and know if we are in colo mode */
>> +    if (!ret && migration_incoming_enable_colo()) {
>> +        mis->migration_incoming_co = qemu_coroutine_self();
>> +        qemu_thread_create(&mis->colo_incoming_thread, "colo incoming",
>> +             colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
>> +        mis->have_colo_incoming_thread = true;
>> +        qemu_coroutine_yield();
>> +
>> +        /* Wait checkpoint incoming thread exit before free resource */
>> +        qemu_thread_join(&mis->colo_incoming_thread);
>> +    }
>> +
>>       qemu_fclose(f);
>>       free_xbzrle_decoded_buf();
>>       migration_incoming_state_destroy();
>
>
> .... here in my current head world; so shouldn't you be deleting
> the bdrv_invalidate_cache_all here?
>

Good catch! I deleted it in patch 38; that deletion should be moved into
this patch. I will fix it in the next version.

Thanks,
zhanghailiang

> (Otherwise OK)
>
> Dave
>
>> diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
>> index acddca6..c12516e 100644
>> --- a/stubs/migration-colo.c
>> +++ b/stubs/migration-colo.c
>> @@ -22,6 +22,16 @@ bool migration_in_colo_state(void)
>>       return false;
>>   }
>>
>> +bool migration_incoming_in_colo_state(void)
>> +{
>> +    return false;
>> +}
>> +
>>   void migrate_start_colo_process(MigrationState *s)
>>   {
>>   }
>> +
>> +void *colo_process_incoming_thread(void *opaque)
>> +{
>> +    return NULL;
>> +}
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

* Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol
  2015-11-06 18:26   ` Dr. David Alan Gilbert
@ 2015-11-09  6:51     ` zhanghailiang
  2015-11-09  7:33       ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-09  6:51 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/7 2:26, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We need communications protocol of user-defined to control the checkpoint
>> process.
>>
>> The new checkpoint request is started by Primary VM, and the interactive process
>> like below:
>> Checkpoint synchronizing points,
>>
>>                         Primary                         Secondary
>> 'checkpoint-request'   @ ----------------------------->
>>                                                         Suspend (In hybrid mode)
>> 'checkpoint-reply'     <------------------------------ @
>>                         Suspend&Save state
>
> Why is this initial pair necessary?  Can't you just start with the vmstate-send
> and save the extra request pair/round trip?  On the 2nd checkpoint we know
> the SVM already received the previous checkpoint because we got it's first
> vmstate-load.
>

Yes, we can certainly drop this handshake in simple checkpoint mode.
But we still need to do some preparatory work in simple checkpoint mode,
and I'm not sure whether that work is time-consuming. We chose to do it
before sending 'checkpoint-reply' so as to reduce the VM's stop time as much as possible.

> I guess in full-COLO (rather than simple checkpoint) you can get the secondary to
> do some of it's stopping/cleanup after it sends the checkpoint-reply

Actually, we do it before it sends 'checkpoint-reply' :)

> but before vmstate-send, so you can hide some of the time.
>

> (Perhaps add a comment to explain)
>

OK, I will add more comments about this.
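
To make the ordering concrete, here is a small sketch of one checkpoint round as a table of messages in wire order. The demo_* names are hypothetical; they only mirror the commands in this patch and are not QEMU code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names mirroring the COLO commands of one checkpoint round. */
enum demo_cmd {
    DEMO_CHECKPOINT_REQUEST,
    DEMO_CHECKPOINT_REPLY,
    DEMO_VMSTATE_SEND,
    DEMO_VMSTATE_RECEIVED,
    DEMO_VMSTATE_LOADED,
};

struct demo_step {
    enum demo_cmd cmd;
    bool from_primary;   /* true: Primary -> Secondary */
};

/* One checkpoint round, in the order the messages go on the wire. */
static const struct demo_step demo_round[] = {
    { DEMO_CHECKPOINT_REQUEST, true  },
    /* The Secondary does its preparatory work before sending the reply. */
    { DEMO_CHECKPOINT_REPLY,   false },
    { DEMO_VMSTATE_SEND,       true  },
    { DEMO_VMSTATE_RECEIVED,   false },
    { DEMO_VMSTATE_LOADED,     false },
};

#define DEMO_ROUND_LEN (sizeof(demo_round) / sizeof(demo_round[0]))

/* Returns true if 'seen' consists of whole rounds in the expected order. */
static bool demo_sequence_valid(const enum demo_cmd *seen, size_t n)
{
    assert(seen != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (seen[i] != demo_round[i % DEMO_ROUND_LEN].cmd) {
            return false;
        }
    }
    return n % DEMO_ROUND_LEN == 0;
}
```

Dropping the initial request/reply pair would remove the first two rows of the table, at the cost of losing the point where the Secondary can hide its preparation time.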

>> 'vmstate-send'         @ ----------------------------->
>>                         Send state                      Receive state
>> 'vmstate-received'     <------------------------------ @
>>                         Release packets                 Load state
>> 'vmstate-load'         <------------------------------ @
>>                         Resume                          Resume (In hybrid mode)
>>
>>                         Start Comparing (In hybrid mode)
>> NOTE:
>>   1) '@' who sends the message
>>   2) Every sync-point is synchronized by two sides with only
>>      one handshake(single direction) for low-latency.
>>      If more strict synchronization is required, a opposite direction
>>      sync-point should be added.
>>   3) Since sync-points are single direction, the remote side may
>>      go forward a lot when this side just receives the sync-point.
>>   4) For now, we only support 'periodic' checkpoint, for which
>>     the Secondary VM is not running, later we will support 'hybrid' mode.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> ---
>> v10:
>> - Rename enum COLOCmd to COLOCommand (Eric's suggestion).
>> - Remove unused 'ram-steal'
>> ---
>>   migration/colo.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   qapi-schema.json |  27 ++++++++
>>   trace-events     |   2 +
>>   3 files changed, 219 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 4fdf3a9..2510762 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -10,10 +10,12 @@
>>    * later.  See the COPYING file in the top-level directory.
>>    */
>>
>> +#include <unistd.h>
>>   #include "sysemu/sysemu.h"
>>   #include "migration/colo.h"
>>   #include "trace.h"
>>   #include "qemu/error-report.h"
>> +#include "qemu/sockets.h"
>>
>>   bool colo_supported(void)
>>   {
>> @@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void)
>>       return mis && (mis->state == MIGRATION_STATUS_COLO);
>>   }
>>
>> +/* colo checkpoint control helper */
>> +static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
>> +{
>> +    int ret = 0;
>> +
>> +    qemu_put_be32(f, cmd);
>> +    qemu_put_be64(f, value);
>> +    qemu_fflush(f);
>> +
>> +    ret = qemu_file_get_error(f);
>> +    trace_colo_ctl_put(COLOCommand_lookup[cmd], value);
>> +
>> +    return ret;
>> +}
>> +
>> +static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd)
>> +{
>> +    int ret = 0;
>> +
>> +    *cmd = qemu_get_be32(f);
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +    if (*cmd >= COLO_COMMAND_MAX) {
>> +        error_report("Invalid colo command, get cmd:%d", *cmd);
>> +        return -EINVAL;
>> +    }
>> +    trace_colo_ctl_get(COLOCommand_lookup[*cmd]);
>> +
>> +    return 0;
>> +}
>> +
>> +static int colo_ctl_get(QEMUFile *f, uint32_t require)
>> +{
>> +    int ret;
>> +    uint32_t cmd;
>> +    uint64_t value;
>> +
>> +    ret = colo_ctl_get_cmd(f, &cmd);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +    if (cmd != require) {
>> +        error_report("Unexpect colo command, expect:%d, but get cmd:%d",
>> +                     require, cmd);
>> +        return -EINVAL;
>> +    }
>> +
>> +    value = qemu_get_be64(f);
>> +    ret = qemu_file_get_error(f);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
>> +    return value;
>> +}
>
> Should the return type be uint64_t since you're returning value?
> But then you're also using it to return an error code; so perhaps
> it might be better to have a     uint64_t *value    parameter to
> return the value separately; or define the range that the value
> can actually take.
>

Good catch. Using a parameter to return 'value' is a good idea;
I will fix it in the next version.
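
A minimal sketch of that shape, assuming a hypothetical in-memory struct demo_stream in place of QEMUFile (demo_get_be() and demo_ctl_get() are illustrative names, not the code that will be posted):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stream standing in for a QEMUFile. */
struct demo_stream {
    const uint8_t *buf;
    size_t len;
    size_t pos;
    int err;             /* sticky error, 0 if none */
};

/* Read an nbytes-wide big-endian integer; sets s->err on a short read. */
static uint64_t demo_get_be(struct demo_stream *s, int nbytes)
{
    uint64_t v = 0;

    assert(nbytes > 0 && nbytes <= 8);
    if (s->err || s->len - s->pos < (size_t)nbytes) {
        s->err = -EIO;
        return 0;
    }
    for (int i = 0; i < nbytes; i++) {
        v = (v << 8) | s->buf[s->pos++];
    }
    return v;
}

/*
 * Returns 0 on success or a negative errno. The payload goes out through
 * 'value', so the full uint64_t range stays usable.
 */
static int demo_ctl_get(struct demo_stream *s, uint32_t require,
                        uint64_t *value)
{
    uint32_t cmd = (uint32_t)demo_get_be(s, 4);
    uint64_t v;

    if (s->err) {
        return s->err;
    }
    if (cmd != require) {
        return -EINVAL;
    }
    v = demo_get_be(s, 8);
    if (s->err) {
        return s->err;
    }
    *value = v;
    return 0;
}
```

Keeping the error code and the payload in separate channels means a large size value can never be mistaken for a negative return.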

Thanks,
zhanghailiang

> (Also very minor typo: 'got' not 'get' in a few errors)
>

>> +static int colo_do_checkpoint_transaction(MigrationState *s)
>> +{
>> +    int ret;
>> +
>> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_REPLY);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: suspend and save vm state to colo buffer */
>> +
>> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: send vmstate to Secondary */
>> +
>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_LOADED);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: resume Primary */
>> +
>> +out:
>> +    return ret;
>> +}
>> +
>>   static void colo_process_checkpoint(MigrationState *s)
>>   {
>>       int fd, ret = 0;
>> @@ -51,12 +150,27 @@ static void colo_process_checkpoint(MigrationState *s)
>>           goto out;
>>       }
>>
>> +    /*
>> +     * Wait for Secondary finish loading vm states and enter COLO
>> +     * restore.
>> +     */
>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_READY);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>>       qemu_mutex_lock_iothread();
>>       vm_start();
>>       qemu_mutex_unlock_iothread();
>>       trace_colo_vm_state_change("stop", "run");
>>
>> -    /*TODO: COLO checkpoint savevm loop*/
>> +    while (s->state == MIGRATION_STATUS_COLO) {
>> +        /* start a colo checkpoint */
>> +        ret = colo_do_checkpoint_transaction(s);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +    }
>>
>>   out:
>>       if (ret < 0) {
>> @@ -79,6 +193,39 @@ void migrate_start_colo_process(MigrationState *s)
>>       qemu_mutex_lock_iothread();
>>   }
>>
>> +/*
>> + * return:
>> + * 0: start a checkpoint
>> + * -1: some error happened, exit colo restore
>> + */
>> +static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
>> +{
>> +    int ret;
>> +    uint32_t cmd;
>> +    uint64_t value;
>> +
>> +    ret = colo_ctl_get_cmd(f, &cmd);
>> +    if (ret < 0) {
>> +        /* do failover ? */
>> +        return ret;
>> +    }
>> +    /* Fix me: this value should be 0, which is not so good,
>> +     * should be used for checking ?
>> +     */
>> +    value = qemu_get_be64(f);
>> +    if (value != 0) {
>
> should output error message as well?
>
>> +        return -EINVAL;
>> +    }
>> +
>> +    switch (cmd) {
>> +    case COLO_COMMAND_CHECKPOINT_REQUEST:
>> +        *checkpoint_request = 1;
>> +        return 0;
>> +    default:
>> +        return -EINVAL;
>> +    }
>> +}
>> +
>>   void *colo_process_incoming_thread(void *opaque)
>>   {
>>       MigrationIncomingState *mis = opaque;
>> @@ -98,7 +245,48 @@ void *colo_process_incoming_thread(void *opaque)
>>           error_report("colo incoming thread: Open QEMUFile to_src_file failed");
>>           goto out;
>>       }
>> -    /* TODO: COLO checkpoint restore loop */
>> +
>> +    ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    while (mis->state == MIGRATION_STATUS_COLO) {
>> +        int request = 0;
>> +        int ret = colo_wait_handle_cmd(mis->from_src_file, &request);
>> +
>> +        if (ret < 0) {
>> +            break;
>> +        } else {
>> +            if (!request) {
>> +                continue;
>> +            }
>> +        }
>> +
>> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +
>> +        ret = colo_ctl_get(mis->from_src_file, COLO_COMMAND_VMSTATE_SEND);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: read migration data into colo buffer */
>> +
>> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +
>> +        /* TODO: load vm state */
>> +
>> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +    }
>>
>>   out:
>>       if (ret < 0) {
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 22251ec..5c4fe6d 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -702,6 +702,33 @@
>>               '*tls-port': 'int', '*cert-subject': 'str' } }
>>
>>   ##
>> +# @COLOCommand
>> +#
>> +# The colo command
>> +#
>> +# @invalid: unknown command
>> +#
>> +# @checkpoint-ready: SVM is ready for checkpointing
>> +#
>> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
>> +#
>> +# @checkpoint-reply: SVM gets PVM's checkpoint request
>> +#
>> +# @vmstate-send: VM's state will be sent by PVM.
>> +#
>> +# @vmstate-size: The total size of VMstate.
>> +#
>> +# @vmstate-received: VM's state has been received by SVM
>> +#
>> +# @vmstate-loaded: VM's state has been loaded by SVM
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'enum': 'COLOCommand',
>> +  'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
>> +            'checkpoint-reply', 'vmstate-send', 'vmstate-size',
>> +            'vmstate-received', 'vmstate-loaded' ] }
>> +
>>   # @MouseInfo:
>>   #
>>   # Information about a mouse device.
>> diff --git a/trace-events b/trace-events
>> index 9cd6391..ee4679c 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1499,6 +1499,8 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
>>
>>   # migration/colo.c
>>   colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
>> +colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
>> +colo_ctl_get(const char *msg) "Receive '%s' cmd"
>>
>>   # kvm-all.c
>>   kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol
  2015-11-09  6:51     ` zhanghailiang
@ 2015-11-09  7:33       ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-09  7:33 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/9 14:51, zhanghailiang wrote:
> On 2015/11/7 2:26, Dr. David Alan Gilbert wrote:
>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>> We need a user-defined communication protocol to control the checkpoint
>>> process.
>>>
>>> A new checkpoint request is started by the Primary VM, and the interaction
>>> proceeds as below:
>>> Checkpoint synchronizing points,
>>>
>>>                         Primary                         Secondary
>>> 'checkpoint-request'   @ ----------------------------->
>>>                                                         Suspend (In hybrid mode)
>>> 'checkpoint-reply'     <------------------------------ @
>>>                         Suspend&Save state
>>
>> Why is this initial pair necessary?  Can't you just start with the vmstate-send
>> and save the extra request pair/round trip?  On the 2nd checkpoint we know
>> the SVM already received the previous checkpoint because we got its first
>> vmstate-load.
>>
>

Er, I made a mistake: before the 'checkpoint-request' command there is a 'checkpoint-ready'
message, sent by the SVM to the PVM, to tell the PVM that the SVM is ready for checkpointing.
The Secondary does its initial work before sending 'checkpoint-ready'.

So yes, you are right: this 'checkpoint-reply' is unnecessary in simple checkpoint mode.
I will remove it, or maybe just add more comments about this, and mention 'checkpoint-ready' in the comment.
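
To make the simplified ordering concrete, here is a small sketch of the
message sequence if the request/reply pair were dropped: one
'checkpoint-ready' from the Secondary, then 'vmstate-send',
'vmstate-received', 'vmstate-loaded' repeated for every checkpoint. The
SketchCmd enum and the sketch_* helpers are hypothetical and exist only for
this example; the names merely mirror the COLOCommand values in the patch.

```c
#include <stdbool.h>

/* Names mirror COLOCommand; the numeric values are local to this sketch. */
typedef enum {
    CMD_CHECKPOINT_READY,
    CMD_VMSTATE_SEND,
    CMD_VMSTATE_RECEIVED,
    CMD_VMSTATE_LOADED,
} SketchCmd;

typedef enum { FROM_PRIMARY, FROM_SECONDARY } SketchSender;

/* Only 'vmstate-send' comes from the Primary; the rest come from the Secondary. */
static SketchSender sketch_sender(SketchCmd cmd)
{
    return cmd == CMD_VMSTATE_SEND ? FROM_PRIMARY : FROM_SECONDARY;
}

/*
 * Expected successor of a command in the simplified order:
 * ready -> send -> received -> loaded -> send -> ...
 */
static SketchCmd sketch_next(SketchCmd cmd)
{
    switch (cmd) {
    case CMD_CHECKPOINT_READY:
    case CMD_VMSTATE_LOADED:
        return CMD_VMSTATE_SEND;
    case CMD_VMSTATE_SEND:
        return CMD_VMSTATE_RECEIVED;
    case CMD_VMSTATE_RECEIVED:
    default:
        return CMD_VMSTATE_LOADED;
    }
}

/* Check that a trace of n commands follows the simplified order. */
static bool sketch_trace_ok(const SketchCmd *trace, int n)
{
    int i;

    if (n == 0 || trace[0] != CMD_CHECKPOINT_READY) {
        return false;
    }
    for (i = 1; i < n; i++) {
        if (trace[i] != sketch_next(trace[i - 1])) {
            return false;
        }
    }
    return true;
}
```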

> Yes, we can certainly drop this handshake in simple checkpoint mode.
> But we still need to do some initial (preparatory) work in simple checkpoint mode,
> and I'm not sure whether this initial work is time-consuming or not. We chose to do this preparatory work
> before sending the 'checkpoint-reply' to reduce the VM's stop time as much as we can.
>
>> I guess in full-COLO (rather than simple checkpoint) you can get the secondary to
>> do some of its stopping/cleanup after it sends the checkpoint-reply
>
> Actually, we do it before it sends 'checkpoint-reply' :)
>
>> but before vmstate-send, so you can hide some of the time.
>>
>
>> (Perhaps add a comment to explain)
>>
>
> OK, I will add more comments about this.
>
>>> 'vmstate-send'         @ ----------------------------->
>>>                         Send state                      Receive state
>>> 'vmstate-received'     <------------------------------ @
>>>                         Release packets                 Load state
>>> 'vmstate-loaded'       <------------------------------ @
>>>                         Resume                          Resume (In hybrid mode)
>>>
>>>                         Start Comparing (In hybrid mode)
>>> NOTE:
>>>   1) '@' who sends the message
>>>   2) Every sync-point is synchronized by the two sides with only
>>>      one handshake (single direction) for low latency.
>>>      If stricter synchronization is required, an opposite-direction
>>>      sync-point should be added.
>>>   3) Since sync-points are single direction, the remote side may
>>>      go forward a lot when this side just receives the sync-point.
>>>   4) For now, we only support 'periodic' checkpoint, for which
>>>     the Secondary VM is not running, later we will support 'hybrid' mode.
>>>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>> Cc: Eric Blake <eblake@redhat.com>
>>> ---
>>> v10:
>>> - Rename enum COLOCmd to COLOCommand (Eric's suggestion).
>>> - Remove unused 'ram-steal'
>>> ---
>>>   migration/colo.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   qapi-schema.json |  27 ++++++++
>>>   trace-events     |   2 +
>>>   3 files changed, 219 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/migration/colo.c b/migration/colo.c
>>> index 4fdf3a9..2510762 100644
>>> --- a/migration/colo.c
>>> +++ b/migration/colo.c
>>> @@ -10,10 +10,12 @@
>>>    * later.  See the COPYING file in the top-level directory.
>>>    */
>>>
>>> +#include <unistd.h>
>>>   #include "sysemu/sysemu.h"
>>>   #include "migration/colo.h"
>>>   #include "trace.h"
>>>   #include "qemu/error-report.h"
>>> +#include "qemu/sockets.h"
>>>
>>>   bool colo_supported(void)
>>>   {
>>> @@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void)
>>>       return mis && (mis->state == MIGRATION_STATUS_COLO);
>>>   }
>>>
>>> +/* colo checkpoint control helper */
>>> +static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
>>> +{
>>> +    int ret = 0;
>>> +
>>> +    qemu_put_be32(f, cmd);
>>> +    qemu_put_be64(f, value);
>>> +    qemu_fflush(f);
>>> +
>>> +    ret = qemu_file_get_error(f);
>>> +    trace_colo_ctl_put(COLOCommand_lookup[cmd], value);
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd)
>>> +{
>>> +    int ret = 0;
>>> +
>>> +    *cmd = qemu_get_be32(f);
>>> +    ret = qemu_file_get_error(f);
>>> +    if (ret < 0) {
>>> +        return ret;
>>> +    }
>>> +    if (*cmd >= COLO_COMMAND_MAX) {
>>> +        error_report("Invalid colo command, get cmd:%d", *cmd);
>>> +        return -EINVAL;
>>> +    }
>>> +    trace_colo_ctl_get(COLOCommand_lookup[*cmd]);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int colo_ctl_get(QEMUFile *f, uint32_t require)
>>> +{
>>> +    int ret;
>>> +    uint32_t cmd;
>>> +    uint64_t value;
>>> +
>>> +    ret = colo_ctl_get_cmd(f, &cmd);
>>> +    if (ret < 0) {
>>> +        return ret;
>>> +    }
>>> +    if (cmd != require) {
>>> +        error_report("Unexpect colo command, expect:%d, but get cmd:%d",
>>> +                     require, cmd);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    value = qemu_get_be64(f);
>>> +    ret = qemu_file_get_error(f);
>>> +    if (ret < 0) {
>>> +        return ret;
>>> +    }
>>> +
>>> +    return value;
>>> +}
>>
>> Should the return type be uint64_t since you're returning value?
>> But then you're also using it to return an error code; so perhaps
>> it might be better to have a     uint64_t *value    parameter to
>> return the value separately; or define the range that the value
>> can actually take.
>>
>
> Good catch. Use parameter to return 'value' is a good idea,
> Will fix it in next version.
>
> Thanks,
> zhanghailiang
>
>> (Also very minor typo: 'got' not 'get' in a few errors)
>>
>
>>> +static int colo_do_checkpoint_transaction(MigrationState *s)
>>> +{
>>> +    int ret;
>>> +
>>> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_REPLY);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    /* TODO: suspend and save vm state to colo buffer */
>>> +
>>> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    /* TODO: send vmstate to Secondary */
>>> +
>>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_LOADED);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    /* TODO: resume Primary */
>>> +
>>> +out:
>>> +    return ret;
>>> +}
>>> +
>>>   static void colo_process_checkpoint(MigrationState *s)
>>>   {
>>>       int fd, ret = 0;
>>> @@ -51,12 +150,27 @@ static void colo_process_checkpoint(MigrationState *s)
>>>           goto out;
>>>       }
>>>
>>> +    /*
>>> +     * Wait for Secondary finish loading vm states and enter COLO
>>> +     * restore.
>>> +     */
>>> +    ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_CHECKPOINT_READY);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>>       qemu_mutex_lock_iothread();
>>>       vm_start();
>>>       qemu_mutex_unlock_iothread();
>>>       trace_colo_vm_state_change("stop", "run");
>>>
>>> -    /*TODO: COLO checkpoint savevm loop*/
>>> +    while (s->state == MIGRATION_STATUS_COLO) {
>>> +        /* start a colo checkpoint */
>>> +        ret = colo_do_checkpoint_transaction(s);
>>> +        if (ret < 0) {
>>> +            goto out;
>>> +        }
>>> +    }
>>>
>>>   out:
>>>       if (ret < 0) {
>>> @@ -79,6 +193,39 @@ void migrate_start_colo_process(MigrationState *s)
>>>       qemu_mutex_lock_iothread();
>>>   }
>>>
>>> +/*
>>> + * return:
>>> + * 0: start a checkpoint
>>> + * -1: some error happened, exit colo restore
>>> + */
>>> +static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
>>> +{
>>> +    int ret;
>>> +    uint32_t cmd;
>>> +    uint64_t value;
>>> +
>>> +    ret = colo_ctl_get_cmd(f, &cmd);
>>> +    if (ret < 0) {
>>> +        /* do failover ? */
>>> +        return ret;
>>> +    }
>>> +    /* Fix me: this value should be 0, which is not so good,
>>> +     * should be used for checking ?
>>> +     */
>>> +    value = qemu_get_be64(f);
>>> +    if (value != 0) {
>>
>> should output error message as well?
>>
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    switch (cmd) {
>>> +    case COLO_COMMAND_CHECKPOINT_REQUEST:
>>> +        *checkpoint_request = 1;
>>> +        return 0;
>>> +    default:
>>> +        return -EINVAL;
>>> +    }
>>> +}
>>> +
>>>   void *colo_process_incoming_thread(void *opaque)
>>>   {
>>>       MigrationIncomingState *mis = opaque;
>>> @@ -98,7 +245,48 @@ void *colo_process_incoming_thread(void *opaque)
>>>           error_report("colo incoming thread: Open QEMUFile to_src_file failed");
>>>           goto out;
>>>       }
>>> -    /* TODO: COLO checkpoint restore loop */
>>> +
>>> +    ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
>>> +    if (ret < 0) {
>>> +        goto out;
>>> +    }
>>> +
>>> +    while (mis->state == MIGRATION_STATUS_COLO) {
>>> +        int request = 0;
>>> +        int ret = colo_wait_handle_cmd(mis->from_src_file, &request);
>>> +
>>> +        if (ret < 0) {
>>> +            break;
>>> +        } else {
>>> +            if (!request) {
>>> +                continue;
>>> +            }
>>> +        }
>>> +
>>> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY, 0);
>>> +        if (ret < 0) {
>>> +            goto out;
>>> +        }
>>> +
>>> +        ret = colo_ctl_get(mis->from_src_file, COLO_COMMAND_VMSTATE_SEND);
>>> +        if (ret < 0) {
>>> +            goto out;
>>> +        }
>>> +
>>> +        /* TODO: read migration data into colo buffer */
>>> +
>>> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
>>> +        if (ret < 0) {
>>> +            goto out;
>>> +        }
>>> +
>>> +        /* TODO: load vm state */
>>> +
>>> +        ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>>> +        if (ret < 0) {
>>> +            goto out;
>>> +        }
>>> +    }
>>>
>>>   out:
>>>       if (ret < 0) {
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index 22251ec..5c4fe6d 100644
>>> --- a/qapi-schema.json
>>> +++ b/qapi-schema.json
>>> @@ -702,6 +702,33 @@
>>>               '*tls-port': 'int', '*cert-subject': 'str' } }
>>>
>>>   ##
>>> +# @COLOCommand
>>> +#
>>> +# The colo command
>>> +#
>>> +# @invalid: unknown command
>>> +#
>>> +# @checkpoint-ready: SVM is ready for checkpointing
>>> +#
>>> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
>>> +#
>>> +# @checkpoint-reply: SVM gets PVM's checkpoint request
>>> +#
>>> +# @vmstate-send: VM's state will be sent by PVM.
>>> +#
>>> +# @vmstate-size: The total size of VMstate.
>>> +#
>>> +# @vmstate-received: VM's state has been received by SVM
>>> +#
>>> +# @vmstate-loaded: VM's state has been loaded by SVM
>>> +#
>>> +# Since: 2.5
>>> +##
>>> +{ 'enum': 'COLOCommand',
>>> +  'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
>>> +            'checkpoint-reply', 'vmstate-send', 'vmstate-size',
>>> +            'vmstate-received', 'vmstate-loaded' ] }
>>> +
>>>   # @MouseInfo:
>>>   #
>>>   # Information about a mouse device.
>>> diff --git a/trace-events b/trace-events
>>> index 9cd6391..ee4679c 100644
>>> --- a/trace-events
>>> +++ b/trace-events
>>> @@ -1499,6 +1499,8 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
>>>
>>>   # migration/colo.c
>>>   colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
>>> +colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
>>> +colo_ctl_get(const char *msg) "Receive '%s' cmd"
>>>
>>>   # kvm-all.c
>>>   kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
>>> --
>>> 1.8.3.1
>>>
>>>
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>
>> .
>>
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb
  2015-11-06 18:30   ` Dr. David Alan Gilbert
@ 2015-11-09  8:14     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-09  8:14 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/7 2:30, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer
>> VM state:
>> One is qsb_put_buffer(), which puts the content of a given QEMUSizedBuffer
>> into a QEMUFile; this is used to send buffered VM state to the secondary.
>> The other is qsb_fill_buffer(), which reads 'size' bytes of data from the file into
>> the qsb; this is used to get VM state from the socket into a buffer.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>>   include/migration/qemu-file.h |  3 ++-
>>   migration/qemu-file-buf.c     | 58 +++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
>> index 29a338d..de42d5b 100644
>> --- a/include/migration/qemu-file.h
>> +++ b/include/migration/qemu-file.h
>> @@ -144,7 +144,8 @@ ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count,
>>                          uint8_t *buf);
>>   ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf,
>>                        off_t pos, size_t count);
>> -
>> +void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size);
>> +int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size);
>
> I made most of the qemu_file use size_t back in August; can you update
> this please.

Of course, will do that in the next version, thanks.
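
As a rough illustration of the size_t variant (a sketch only, not the
updated patch): the loop below has the same shape as qsb_put_buffer(), but
every length is a size_t, so comparing iov_len with the remaining size no
longer mixes signed and unsigned types. sketch_put_buffer() is a hypothetical
name, and the flat 'dst' buffer stands in for the QEMUFile so the example is
self-contained.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy up to 'size' bytes from a scattered buffer into 'dst'.
 * 'dst' stands in for the QEMUFile that qsb_put_buffer() writes to.
 * Returns the number of bytes copied.
 */
static size_t sketch_put_buffer(unsigned char *dst, const struct iovec *iov,
                                size_t n_iov, size_t size)
{
    size_t done = 0;
    size_t i;

    for (i = 0; i < n_iov && size > 0; i++) {
        /* Both operands are size_t, so there is no signed/unsigned mix. */
        size_t l = iov[i].iov_len < size ? iov[i].iov_len : size;

        memcpy(dst + done, iov[i].iov_base, l);
        done += l;
        size -= l;
    }
    return done;
}
```

Returning the byte count is an addition for the sketch; the function in the
patch returns void.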

> Dave
>
>>
>>   /*
>>    * For use on files opened with qemu_bufopen
>> diff --git a/migration/qemu-file-buf.c b/migration/qemu-file-buf.c
>> index 49516b8..e58004d 100644
>> --- a/migration/qemu-file-buf.c
>> +++ b/migration/qemu-file-buf.c
>> @@ -366,6 +366,64 @@ ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *source,
>>       return count;
>>   }
>>
>> +
>> +/**
>> + * Put the content of a given QEMUSizedBuffer into QEMUFile.
>> + *
>> + * @f: A QEMUFile
>> + * @qsb: A QEMUSizedBuffer
>> + * @size: size of content to write
>> + */
>> +void qsb_put_buffer(QEMUFile *f, QEMUSizedBuffer *qsb, int size)
>> +{
>> +    int i, l;
>> +
>> +    for (i = 0; i < qsb->n_iov && size > 0; i++) {
>> +        l = MIN(qsb->iov[i].iov_len, size);
>> +        qemu_put_buffer(f, qsb->iov[i].iov_base, l);
>> +        size -= l;
>> +    }
>> +}
>> +
>> +/*
>> + * Read 'size' bytes of data from the file into qsb.
>> + * always fill from pos 0 and used after qsb_create().
>> + *
>> + * It will return size bytes unless there was an error, in which case it will
>> + * return as many as it managed to read (assuming blocking fd's which
>> + * all current QEMUFile are)
>> + */
>> +int qsb_fill_buffer(QEMUSizedBuffer *qsb, QEMUFile *f, int size)
>> +{
>> +    ssize_t rc = qsb_grow(qsb, size);
>> +    int pending = size, i;
>> +    qsb->used = 0;
>> +    uint8_t *buf = NULL;
>> +
>> +    if (rc < 0) {
>> +        return rc;
>> +    }
>> +
>> +    for (i = 0; i < qsb->n_iov && pending > 0; i++) {
>> +        int doneone = 0;
>> +        /* read until iov full */
>> +        while (doneone < qsb->iov[i].iov_len && pending > 0) {
>> +            int readone = 0;
>> +            buf = qsb->iov[i].iov_base;
>> +            readone = qemu_get_buffer(f, buf,
>> +                                MIN(qsb->iov[i].iov_len - doneone, pending));
>> +            if (readone == 0) {
>> +                return qsb->used;
>> +            }
>> +            buf += readone;
>> +            doneone += readone;
>> +            pending -= readone;
>> +            qsb->used += readone;
>> +        }
>> +    }
>> +    return qsb->used;
>> +}
>> +
>>   typedef struct QEMUBuffer {
>>       QEMUSizedBuffer *qsb;
>>       QEMUFile *file;
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint
  2015-11-06 18:59   ` Dr. David Alan Gilbert
@ 2015-11-09  9:17     ` zhanghailiang
  2015-11-13 18:53       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-09  9:17 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/7 2:59, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> The main process of a checkpoint is to synchronize SVM with PVM.
>> A VM's state includes RAM and device state, so we migrate PVM's
>> state to SVM when doing a checkpoint, just as migration does.
>>
>> We cache PVM's state on the secondary side in a QEMUSizedBuffer (qsb).
>> The secondary needs to know the size of the VM state, so on the primary side
>> we first store the VM state in a qsb temporarily, get the data size by calling qsb_get_length(),
>> and then migrate the data to the qsb on the secondary side.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> ---
>>   migration/colo.c   | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++----
>>   migration/ram.c    | 47 +++++++++++++++++++++++++++++--------
>>   migration/savevm.c |  2 +-
>>   3 files changed, 101 insertions(+), 16 deletions(-)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 2510762..b865513 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -17,6 +17,9 @@
>>   #include "qemu/error-report.h"
>>   #include "qemu/sockets.h"
>>
>> +/* colo buffer */
>> +#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>> +
>>   bool colo_supported(void)
>>   {
>>       return true;
>> @@ -94,9 +97,12 @@ static int colo_ctl_get(QEMUFile *f, uint32_t require)
>>       return value;
>>   }
>>
>> -static int colo_do_checkpoint_transaction(MigrationState *s)
>> +static int colo_do_checkpoint_transaction(MigrationState *s,
>> +                                          QEMUSizedBuffer *buffer)
>>   {
>>       int ret;
>> +    size_t size;
>> +    QEMUFile *trans = NULL;
>>
>>       ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
>>       if (ret < 0) {
>> @@ -107,15 +113,47 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
>>       if (ret < 0) {
>>           goto out;
>>       }
>> +    /* Reset colo buffer and open it for write */
>> +    qsb_set_length(buffer, 0);
>> +    trans = qemu_bufopen("w", buffer);
>> +    if (!trans) {
>> +        error_report("Open colo buffer for write failed");
>> +        goto out;
>> +    }
>>
>> -    /* TODO: suspend and save vm state to colo buffer */
>> +    qemu_mutex_lock_iothread();
>> +    vm_stop_force_state(RUN_STATE_COLO);
>> +    qemu_mutex_unlock_iothread();
>> +    trace_colo_vm_state_change("run", "stop");
>> +
>> +    /* Disable block migration */
>> +    s->params.blk = 0;
>> +    s->params.shared = 0;
>> +    qemu_savevm_state_header(trans);
>> +    qemu_savevm_state_begin(trans, &s->params);
>> +    qemu_mutex_lock_iothread();
>> +    qemu_savevm_state_complete(trans);
>> +    qemu_mutex_unlock_iothread();
>> +
>> +    qemu_fflush(trans);
>>
>>       ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
>>       if (ret < 0) {
>>           goto out;
>>       }
>> +    /* we send the total size of the vmstate first */
>> +    size = qsb_get_length(buffer);
>> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SIZE, size);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>>
>> -    /* TODO: send vmstate to Secondary */
>> +    qsb_put_buffer(s->to_dst_file, buffer, size);
>> +    qemu_fflush(s->to_dst_file);
>> +    ret = qemu_file_get_error(s->to_dst_file);
>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>>
>>       ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
>>       if (ret < 0) {
>> @@ -127,14 +165,24 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
>>           goto out;
>>       }
>>
>> -    /* TODO: resume Primary */
>> +    ret = 0;
>> +    /* resume master */
>> +    qemu_mutex_lock_iothread();
>> +    vm_start();
>> +    qemu_mutex_unlock_iothread();
>> +    trace_colo_vm_state_change("stop", "run");
>>
>>   out:
>> +    if (trans) {
>> +        qemu_fclose(trans);
>> +    }
>> +
>>       return ret;
>>   }
>>
>>   static void colo_process_checkpoint(MigrationState *s)
>>   {
>> +    QEMUSizedBuffer *buffer = NULL;
>>       int fd, ret = 0;
>>
>>       /* Dup the fd of to_dst_file */
>> @@ -159,6 +207,13 @@ static void colo_process_checkpoint(MigrationState *s)
>>           goto out;
>>       }
>>
>> +    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
>> +    if (buffer == NULL) {
>> +        ret = -ENOMEM;
>> +        error_report("Failed to allocate buffer!");
>
> Please say 'Failed to allocate colo buffer'; QEMU has lots and lots of buffers.
>

OK, will fix it in the next version.

>> +        goto out;
>> +    }
>> +
>>       qemu_mutex_lock_iothread();
>>       vm_start();
>>       qemu_mutex_unlock_iothread();
>> @@ -166,7 +221,7 @@ static void colo_process_checkpoint(MigrationState *s)
>>
>>       while (s->state == MIGRATION_STATUS_COLO) {
>>           /* start a colo checkpoint */
>> -        ret = colo_do_checkpoint_transaction(s);
>> +        ret = colo_do_checkpoint_transaction(s, buffer);
>>           if (ret < 0) {
>>               goto out;
>>           }
>> @@ -179,6 +234,9 @@ out:
>>       migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>>                         MIGRATION_STATUS_COMPLETED);
>>
>> +    qsb_free(buffer);
>> +    buffer = NULL;
>> +
>>       if (s->from_dst_file) {
>>           qemu_fclose(s->from_dst_file);
>>       }
>> diff --git a/migration/ram.c b/migration/ram.c
>> index a25bcc7..5784c15 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -38,6 +38,7 @@
>>   #include "trace.h"
>>   #include "exec/ram_addr.h"
>>   #include "qemu/rcu_queue.h"
>> +#include "migration/colo.h"
>>
>>   #ifdef DEBUG_MIGRATION_RAM
>>   #define DPRINTF(fmt, ...) \
>> @@ -1165,15 +1166,8 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
>>       }
>>   }
>>
>> -/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>> - * long-running RCU critical section.  When rcu-reclaims in the code
>> - * start to become numerous it will be necessary to reduce the
>> - * granularity of these critical sections.
>> - */
>> -
>> -static int ram_save_setup(QEMUFile *f, void *opaque)
>> +static int ram_save_init_globals(void)
>>   {
>> -    RAMBlock *block;
>>       int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
>>
>>       dirty_rate_high_cnt = 0;
>> @@ -1233,6 +1227,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>       migration_bitmap_sync();
>>       qemu_mutex_unlock_ramlist();
>>       qemu_mutex_unlock_iothread();
>> +    rcu_read_unlock();
>> +
>> +    return 0;
>> +}
>
> It surprises me you want migration_bitmap_sync in ram_save_init_globals(),
> but I guess you want the first sync at the start.
>

Er, sorry, I don't quite understand.
Here I just split part of the code of ram_save_setup()
into a helper function, ram_save_init_globals(), to make it clearer.
We can't do the initial work twice. Is there anything wrong?
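
The shape of that split, reduced to a stand-alone sketch: the one-time
setup lives in its own helper and is skipped when the setup path is entered
again in COLO state. All sketch_* names are hypothetical stand-ins (for
migration_in_colo_state(), ram_save_init_globals() and ram_save_setup()
respectively); none of this is the real migration code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the migration state and the RAM globals. */
static bool sketch_in_colo_state;
static int sketch_init_count;

/* Plays the role of ram_save_init_globals(): one-time bitmap setup. */
static int sketch_init_globals(void)
{
    sketch_init_count++;
    return 0;
}

/* Plays the role of ram_save_setup(): skips the one-time work in COLO state. */
static int sketch_save_setup(void)
{
    if (!sketch_in_colo_state) {
        if (sketch_init_globals() < 0) {
            return -1;
        }
    }
    /* ... per-call work (writing the RAM block list) would go here ... */
    return 0;
}
```

The first call (normal migration) runs the initialization once; later calls
made for each checkpoint reuse the existing state.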


>> +/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>> + * long-running RCU critical section.  When rcu-reclaims in the code
>> + * start to become numerous it will be necessary to reduce the
>> + * granularity of these critical sections.
>> + */
>> +
>> +static int ram_save_setup(QEMUFile *f, void *opaque)
>> +{
>> +    RAMBlock *block;
>> +
>> +    /*
>> +     * migration has already setup the bitmap, reuse it.
>> +     */
>> +    if (!migration_in_colo_state()) {
>> +        if (ram_save_init_globals() < 0) {
>> +            return -1;
>> +         }
>> +    }
>> +
>> +    rcu_read_lock();
>>
>>       qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>
>> @@ -1332,7 +1351,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>       while (true) {
>>           int pages;
>>
>> -        pages = ram_find_and_save_block(f, true, &bytes_transferred);
>> +        pages = ram_find_and_save_block(f, !migration_in_colo_state(),
>> +                                        &bytes_transferred);
>>           /* no more blocks to sent */
>>           if (pages == 0) {
>>               break;
>> @@ -1343,8 +1363,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>       ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>>
>>       rcu_read_unlock();
>> +    /*
>> +     * Since we need to reuse dirty bitmap in colo,
>> +     * don't cleanup the bitmap.
>> +     */
>> +    if (!migrate_colo_enabled() ||
>> +        migration_has_failed(migrate_get_current())) {
>> +        migration_end();
>> +    }
>>
>> -    migration_end();
>>       qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>>
>>       return 0;
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index dbcc39a..0faf12b 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -48,7 +48,7 @@
>>   #include "qemu/iov.h"
>>   #include "block/snapshot.h"
>>   #include "block/qapi.h"
>> -
>> +#include "migration/colo.h"
>>
>>   #ifndef ETH_P_RARP
>>   #define ETH_P_RARP 0x8035
>
> Wrong patch?
>

No, we call migration_in_colo_state() in qemu_savevm_state_begin(),
so we have to include "migration/colo.h".

>
> So other than those minor things:
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> but watch out for the recent changes to migrate_end that went in
> a few days ago.
>

Thanks for reminding me, I have rebased on that. ;)

zhanghailiang

> Dave
>
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

* Re: [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
@ 2015-11-13 15:39   ` Dr. David Alan Gilbert
  2015-11-16  7:57     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-13 15:39 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We should not load PVM's state directly into SVM, because errors may occur
> while SVM is receiving the data, which would break SVM.
> 
> We need to make sure all data has been received before loading the state
> into SVM. We use extra memory to cache this data (PVM's RAM). The RAM cache
> on the secondary side is initially the same as SVM/PVM's memory. During a
> checkpoint, we first cache the dirty pages of PVM in this RAM cache, so the
> cache always matches PVM's memory at every checkpoint; we then flush the
> cached RAM to SVM after we have received all of PVM's state.
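
For readers following along, here is a minimal standalone sketch of the
staging idea described above. It is not the patch's code: demo_block,
demo_cache_init() and demo_load_target() are hypothetical names, and a single
fixed-size buffer stands in for a RAMBlock. It only shows that, while the
cache is enabled, incoming data lands in the cache and the memory the SVM
runs on stays untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_RAM_SIZE 64  /* tiny block size, for illustration only */

/* Hypothetical stand-in for one RAMBlock on the secondary side. */
struct demo_block {
    unsigned char host[DEMO_RAM_SIZE];   /* memory the SVM actually runs on */
    unsigned char cache[DEMO_RAM_SIZE];  /* staging copy of the PVM's RAM */
};

/* Plays the role of the patch's 'ram_cache_enable' flag. */
static bool demo_cache_enabled;

/* Start with the cache identical to the SVM's memory, then enable it. */
static void demo_cache_init(struct demo_block *b)
{
    memcpy(b->cache, b->host, sizeof(b->host));
    demo_cache_enabled = true;
}

/* Pick where incoming data for 'offset' should be written. */
static unsigned char *demo_load_target(struct demo_block *b, size_t offset)
{
    assert(offset < DEMO_RAM_SIZE);
    return (demo_cache_enabled ? b->cache : b->host) + offset;
}
```

Keeping the choice in one small helper also avoids repeating the same if/else
at every place that resolves a load address.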
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> v10: Split the process of dirty pages recording into a new patch
> ---
>  include/exec/ram_addr.h  |  1 +
>  include/migration/colo.h |  3 +++
>  migration/colo.c         | 14 +++++++++--
>  migration/ram.c          | 61 ++++++++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 75 insertions(+), 4 deletions(-)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 3360ac5..e7c4310 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -28,6 +28,7 @@ struct RAMBlock {
>      struct rcu_head rcu;
>      struct MemoryRegion *mr;
>      uint8_t *host;
> +    uint8_t *host_cache; /* For colo, VM's ram cache */

I suggest you make the name have 'colo' in it; e.g. colo_cache;
'host_cache' is a bit generic.

>      ram_addr_t offset;
>      ram_addr_t used_length;
>      ram_addr_t max_length;
> diff --git a/include/migration/colo.h b/include/migration/colo.h
> index 2676c4a..8edd5f1 100644
> --- a/include/migration/colo.h
> +++ b/include/migration/colo.h
> @@ -29,4 +29,7 @@ bool migration_incoming_enable_colo(void);
>  void migration_incoming_exit_colo(void);
>  void *colo_process_incoming_thread(void *opaque);
>  bool migration_incoming_in_colo_state(void);
> +/* ram cache */
> +int colo_init_ram_cache(void);
> +void colo_release_ram_cache(void);
>  #endif
> diff --git a/migration/colo.c b/migration/colo.c
> index b865513..25f85b2 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -304,6 +304,12 @@ void *colo_process_incoming_thread(void *opaque)
>          goto out;
>      }
>  
> +    ret = colo_init_ram_cache();
> +    if (ret < 0) {
> +        error_report("Failed to initialize ram cache");
> +        goto out;
> +    }
> +
>      ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
>      if (ret < 0) {
>          goto out;
> @@ -331,14 +337,14 @@ void *colo_process_incoming_thread(void *opaque)
>              goto out;
>          }
>  
> -        /* TODO: read migration data into colo buffer */
> +        /* TODO Load VM state */
>  
>          ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
>          if (ret < 0) {
>              goto out;
>          }
>  
> -        /* TODO: load vm state */
> +        /* TODO: flush vm state */

Do you really need to update/change the TODOs here?

>          ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>          if (ret < 0) {
> @@ -352,6 +358,10 @@ out:
>                       strerror(-ret));
>      }
>  
> +    qemu_mutex_lock_iothread();
> +    colo_release_ram_cache();
> +    qemu_mutex_unlock_iothread();
> +
>      if (mis->to_src_file) {
>          qemu_fclose(mis->to_src_file);
>      }
> diff --git a/migration/ram.c b/migration/ram.c
> index 5784c15..b094dc3 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -222,6 +222,7 @@ static RAMBlock *last_sent_block;
>  static ram_addr_t last_offset;
>  static QemuMutex migration_bitmap_mutex;
>  static uint64_t migration_dirty_pages;
> +static bool ram_cache_enable;
>  static uint32_t last_version;
>  static bool ram_bulk_stage;
>  
> @@ -1446,7 +1447,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>              return NULL;
>          }
>  
> -        return block->host + offset;
> +        if (ram_cache_enable) {
> +            return block->host_cache + offset;
> +        } else {
> +            return block->host + offset;
> +        }
>      }
>  
>      len = qemu_get_byte(f);
> @@ -1456,7 +1461,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>          if (!strncmp(id, block->idstr, sizeof(id)) &&
>              block->max_length > offset) {
> -            return block->host + offset;
> +            if (ram_cache_enable) {
> +                return block->host_cache + offset;
> +            } else {
> +                return block->host + offset;
> +            }
>          }
>      }
>  
> @@ -1707,6 +1716,54 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      return ret;
>  }
>  
> +/*
> + * colo cache: this is for secondary VM, we cache the whole
> + * memory of the secondary VM, it will be called after first migration.
> + */
> +int colo_init_ram_cache(void)
> +{
> +    RAMBlock *block;
> +
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        block->host_cache = qemu_anon_ram_alloc(block->used_length, NULL);
> +        if (!block->host_cache) {
> +            goto out_locked;
> +        }

Please print an error message; stating the function, block name and size that
failed.
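
A rough sketch of the kind of message meant here, written as standalone C so
it can be compiled on its own. demo_format_alloc_error() is a hypothetical
helper; in QEMU the text would more likely be passed straight to
error_report() with __func__, the block's idstr and its used_length.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build an allocation-failure message naming the function, the block and
 * the size that failed. Returns what snprintf() returns.
 */
static int demo_format_alloc_error(char *buf, size_t len, const char *func,
                                   const char *idstr, uint64_t size)
{
    assert(buf != NULL && func != NULL && idstr != NULL);
    return snprintf(buf, len,
                    "%s: failed to allocate RAM cache for block '%s' "
                    "(size %" PRIu64 " bytes)", func, idstr, size);
}
```

With the function name, block name and size in one line, a failure on a large
block can be told apart from a failure on a small ROM region without a
debugger.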

> +        memcpy(block->host_cache, block->host, block->used_length);
> +    }
> +    rcu_read_unlock();
> +    ram_cache_enable = true;
> +    return 0;
> +
> +out_locked:
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (block->host_cache) {
> +            qemu_anon_ram_free(block->host_cache, block->used_length);
> +            block->host_cache = NULL;
> +        }
> +    }
> +
> +    rcu_read_unlock();
> +    return -errno;
> +}
> +
> +void colo_release_ram_cache(void)
> +{
> +    RAMBlock *block;
> +
> +    ram_cache_enable = false;
> +
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (block->host_cache) {
> +            qemu_anon_ram_free(block->host_cache, block->used_length);
> +            block->host_cache = NULL;
> +        }
> +    }
> +    rcu_read_unlock();
> +}
> +
>  static SaveVMHandlers savevm_ram_handlers = {
>      .save_live_setup = ram_save_setup,
>      .save_live_iterate = ram_save_iterate,
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration zhanghailiang
@ 2015-11-13 16:01   ` Eric Blake
  2015-11-16  8:35     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-13 16:01 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah

On 11/03/2015 04:56 AM, zhanghailiang wrote:
> We add a helper function, colo_supported(), to indicate whether colo is
> supported. It controls whether the 'x-colo' string is shown to users, who
> can use the qmp command 'query-migrate-capabilities' or the hmp command
> 'info migrate_capabilities' to learn whether colo is supported.
> 
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: Amit Shah <amit.shah@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> v10:
> - Rename capability 'colo' to experimental 'x-colo' (Eric's suggestion).
> - Rename migrate_enable_colo() to migrate_colo_enabled() (Eric's suggestion).

> +++ b/qapi-schema.json
> @@ -540,11 +540,15 @@
>  # @auto-converge: If enabled, QEMU will automatically throttle down the guest
>  #          to speed up convergence of RAM migration. (since 1.6)
>  #
> +# @x-colo: If enabled, migration will never end, and the state of the VM on the
> +#        primary side will be migrated continuously to the VM on secondary
> +#        side. (since 2.5)

I think this has missed 2.5, so you'll need to tweak it to say 2.6.

With that fixed,
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it zhanghailiang
@ 2015-11-13 16:02   ` Dr. David Alan Gilbert
  2015-11-16  8:46     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-13 16:02 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We should not destroy the state of the SVM (Secondary VM) until we have received
> the whole state from the PVM (Primary VM), in case the primary fails in the middle
> of sending it. So here we cache the device state on the Secondary before restoring it.
> 
> Besides, we should call qemu_system_reset() before loading the VM state,
> which ensures the data is intact.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/colo.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 46 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 25f85b2..1339774 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -287,6 +287,9 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
>  void *colo_process_incoming_thread(void *opaque)
>  {
>      MigrationIncomingState *mis = opaque;
> +    QEMUFile *fb = NULL;
> +    QEMUSizedBuffer *buffer = NULL; /* Cache incoming device state */
> +    int  total_size;
>      int fd, ret = 0;
>  
>      migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> @@ -310,6 +313,12 @@ void *colo_process_incoming_thread(void *opaque)
>          goto out;
>      }
>  
> +    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
> +    if (buffer == NULL) {
> +        error_report("Failed to allocate colo buffer!");
> +        goto out;
> +    }
> +
>      ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
>      if (ret < 0) {
>          goto out;
> @@ -337,19 +346,50 @@ void *colo_process_incoming_thread(void *opaque)
>              goto out;
>          }
>  
> -        /* TODO Load VM state */
> +        /* read the VM state total size first */
> +        total_size = colo_ctl_get(mis->from_src_file,
> +                                  COLO_COMMAND_VMSTATE_SIZE);
> +        if (total_size <= 0) {

Error message?

> +            goto out;
> +        }

OK, and when you fix up the colo_ctl_get in the previous patch to
take a separate pointer for value, you can make total_size a size_t.
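
To illustrate the shape being suggested, here is a standalone sketch that
separates the status code from the payload. Everything in it is an
assumption made for the example: demo_stream stands in for a QEMUFile, the
wire layout (a 32-bit big-endian command followed by a 64-bit big-endian
value) is a guess, and demo_ctl_get() is not the real colo_ctl_get().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stream standing in for a QEMUFile. */
struct demo_stream {
    const uint8_t *data;
    size_t len;
    size_t pos;
};

/* Read an nbytes-wide big-endian integer; -EIO if the stream is too short. */
static int demo_get_be(struct demo_stream *s, unsigned nbytes, uint64_t *out)
{
    uint64_t v = 0;

    if (s->len - s->pos < nbytes) {
        return -EIO;
    }
    for (unsigned i = 0; i < nbytes; i++) {
        v = (v << 8) | s->data[s->pos++];
    }
    *out = v;
    return 0;
}

/*
 * Expect 'expected_cmd' on the stream. Returns 0 on success or a negative
 * errno; the payload is returned separately through *value.
 */
static int demo_ctl_get(struct demo_stream *s, uint32_t expected_cmd,
                        uint64_t *value)
{
    uint64_t cmd;
    int ret;

    assert(value != NULL);
    ret = demo_get_be(s, 4, &cmd);
    if (ret < 0) {
        return ret;
    }
    if (cmd != expected_cmd) {
        return -EINVAL;
    }
    return demo_get_be(s, 8, value);
}
```

With the value out of the return path, a caller can store it in a size_t (or
uint64_t) and a payload of zero is no longer confused with an error.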


Other than those, it looks good.

Dave

> +        /* read vm device state into colo buffer */
> +        ret = qsb_fill_buffer(buffer, mis->from_src_file, total_size);
> +        if (ret != total_size) {
> +            error_report("can't get all migration data");
> +            goto out;
> +        }
>  
>          ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
>          if (ret < 0) {
>              goto out;
>          }
>  
> +        /* open colo buffer for read */
> +        fb = qemu_bufopen("r", buffer);
> +        if (!fb) {
> +            error_report("can't open colo buffer for read");
> +            goto out;
> +        }
> +
> +        qemu_mutex_lock_iothread();
> +        qemu_system_reset(VMRESET_SILENT);
> +        if (qemu_loadvm_state(fb) < 0) {
> +            error_report("COLO: loadvm failed");
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +        qemu_mutex_unlock_iothread();
> +
>          /* TODO: flush vm state */
>  
>          ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>          if (ret < 0) {
>              goto out;
>          }
> +
> +        qemu_fclose(fb);
> +        fb = NULL;
>      }
>  
>  out:
> @@ -358,6 +398,11 @@ out:
>                       strerror(-ret));
>      }
>  
> +    if (fb) {
> +        qemu_fclose(fb);
> +    }
> +    qsb_free(buffer);
> +
>      qemu_mutex_lock_iothread();
>      colo_release_ram_cache();
>      qemu_mutex_unlock_iothread();
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap zhanghailiang
@ 2015-11-13 16:19   ` Dr. David Alan Gilbert
  2015-11-16  9:07     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-13 16:19 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We need to record the addresses of the dirty pages received from PVM.
> This helps when flushing the cached pages into SVM.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
> v10:
> - New patch split from v9's patch 13
> - Rebase to master to use 'migration_bitmap_rcu'
> ---
>  migration/ram.c | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index b094dc3..70879bd 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1448,6 +1448,18 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>          }
>  
>          if (ram_cache_enable) {
> +            unsigned long *bitmap;
> +            long k = (block->mr->ram_addr + offset) >> TARGET_PAGE_BITS;
> +
> +            bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
> +            /*
> +            * During colo checkpoint, we need bitmap of these migrated pages.
> +            * It help us to decide which pages in ram cache should be flushed
> +            * into VM's RAM later.
> +            */
> +            if (!test_and_set_bit(k, bitmap)) {
> +                migration_dirty_pages++;
> +            }

I don't like having this in host_from_stream_offset; if you look
at the current ram_load there is only a single call to host_from_stream_offset, so it's
now much easier for you to move it into a separate function.
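
As a sketch of what such a separate function might look like, here is a
standalone version. demo_record_received_page() and demo_test_and_set_bit()
are hypothetical names; the latter only mimics the "set the bit and return
its previous value" behaviour the patch relies on, over a plain array of
unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set bit 'nr' in 'map' and return whether it was already set. */
static bool demo_test_and_set_bit(unsigned long nr, unsigned long *map)
{
    unsigned long mask = 1UL << (nr % DEMO_BITS_PER_LONG);
    unsigned long *word = map + nr / DEMO_BITS_PER_LONG;
    bool old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/*
 * Record that 'page' was received from the primary. The dirty counter is
 * bumped only the first time a given page is seen in this checkpoint.
 */
static void demo_record_received_page(unsigned long *bitmap,
                                      unsigned long page,
                                      uint64_t *dirty_pages)
{
    assert(bitmap != NULL && dirty_pages != NULL);
    if (!demo_test_and_set_bit(page, bitmap)) {
        (*dirty_pages)++;
    }
}
```

The caller would invoke this once, next to the single lookup in ram_load,
instead of duplicating the bitmap update in both branches of the lookup
function.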

>              return block->host_cache + offset;
>          } else {
>              return block->host + offset;
> @@ -1462,6 +1474,13 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>          if (!strncmp(id, block->idstr, sizeof(id)) &&
>              block->max_length > offset) {
>              if (ram_cache_enable) {
> +                unsigned long *bitmap;
> +                long k = (block->mr->ram_addr + offset) >> TARGET_PAGE_BITS;
> +
> +                bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
> +                if (!test_and_set_bit(k, bitmap)) {
> +                    migration_dirty_pages++;
> +                }
>                  return block->host_cache + offset;
>              } else {
>                  return block->host + offset;
> @@ -1723,6 +1742,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  int colo_init_ram_cache(void)
>  {
>      RAMBlock *block;
> +    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>  
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> @@ -1734,6 +1754,15 @@ int colo_init_ram_cache(void)
>      }
>      rcu_read_unlock();
>      ram_cache_enable = true;
> +    /*
> +    * Record the dirty pages that sent by PVM, we use this dirty bitmap together
> +    * with to decide which page in cache should be flushed into SVM's RAM. Here
> +    * we use the same name 'migration_bitmap_rcu' as for migration.
> +    */
> +    migration_bitmap_rcu = g_new(struct BitmapRcu, 1);

Please update that to g_new0 (I changed the other use when I added postcopy).

Dave

> +    migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
> +    migration_dirty_pages = 0;
> +
>      return 0;
>  
>  out_locked:
> @@ -1751,9 +1780,15 @@ out_locked:
>  void colo_release_ram_cache(void)
>  {
>      RAMBlock *block;
> +    struct BitmapRcu *bitmap = migration_bitmap_rcu;
>  
>      ram_cache_enable = false;
>  
> +    atomic_rcu_set(&migration_bitmap_rcu, NULL);
> +    if (bitmap) {
> +        call_rcu(bitmap, migration_bitmap_free, rcu);
> +    }
> +
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>          if (block->host_cache) {
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory zhanghailiang
@ 2015-11-13 16:38   ` Dr. David Alan Gilbert
  2015-11-16 12:46     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-13 16:38 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> While the VM is running, PVM may dirty some pages. We transfer PVM's dirty
> pages to SVM and store them in SVM's RAM cache at the next checkpoint.
> So the content of SVM's RAM cache will always be the same as PVM's memory
> after a checkpoint.
> 
> Instead of flushing the whole content of PVM's RAM cache into SVM's memory,
> we do this in a more efficient way:
> only flush pages dirtied by PVM since the last checkpoint.
> In this way, we can ensure SVM's memory is the same as PVM's.
> 
> Besides, we must make sure the RAM cache is flushed before loading the device state.

Yes, just a couple of minor comments below; mostly OK.

> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> v10: trace the number of dirty pages that are received.
> ---
>  include/migration/colo.h |  1 +
>  migration/colo.c         |  2 --
>  migration/ram.c          | 40 ++++++++++++++++++++++++++++++++++++++++
>  trace-events             |  1 +
>  4 files changed, 42 insertions(+), 2 deletions(-)
> 
> diff --git a/include/migration/colo.h b/include/migration/colo.h
> index 8edd5f1..be2890b 100644
> --- a/include/migration/colo.h
> +++ b/include/migration/colo.h
> @@ -32,4 +32,5 @@ bool migration_incoming_in_colo_state(void);
>  /* ram cache */
>  int colo_init_ram_cache(void);
>  void colo_release_ram_cache(void);
> +void colo_flush_ram_cache(void);
>  #endif
> diff --git a/migration/colo.c b/migration/colo.c
> index 1339774..0efab21 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -381,8 +381,6 @@ void *colo_process_incoming_thread(void *opaque)
>          }
>          qemu_mutex_unlock_iothread();
>  
> -        /* TODO: flush vm state */
> -
>          ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>          if (ret < 0) {
>              goto out;
> diff --git a/migration/ram.c b/migration/ram.c
> index 70879bd..d7e0e37 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1601,6 +1601,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      int flags = 0, ret = 0;
>      static uint64_t seq_iter;
>      int len = 0;
> +    bool need_flush = false;
>  
>      seq_iter++;
>  
> @@ -1669,6 +1670,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  ret = -EINVAL;
>                  break;
>              }
> +
> +            need_flush = true;
>              ch = qemu_get_byte(f);
>              ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
>              break;
> @@ -1679,6 +1682,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  ret = -EINVAL;
>                  break;
>              }
> +
> +            need_flush = true;
>              qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>              break;
>          case RAM_SAVE_FLAG_COMPRESS_PAGE:
> @@ -1711,6 +1716,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                  ret = -EINVAL;
>                  break;
>              }
> +            need_flush = true;

You can probably move the 'need_flush' to the big if near the top of the loop in the
current version.
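
A standalone sketch of that idea: decide once, from the record's flags,
whether it carries page data, rather than setting need_flush in every case
arm. The DEMO_FLAG_* names and values are illustrative stand-ins for the
RAM_SAVE_FLAG_* constants and should be treated as assumptions, as should
the simplified loop over a plain array of flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values, assumed for this example only. */
enum {
    DEMO_FLAG_COMPRESS      = 0x02,
    DEMO_FLAG_MEM_SIZE      = 0x04,
    DEMO_FLAG_PAGE          = 0x08,
    DEMO_FLAG_EOS           = 0x10,
    DEMO_FLAG_XBZRLE        = 0x40,
    DEMO_FLAG_COMPRESS_PAGE = 0x100,
};

/* Every record type that writes page contents into the RAM cache. */
#define DEMO_PAGE_DATA_FLAGS \
    (DEMO_FLAG_COMPRESS | DEMO_FLAG_PAGE | \
     DEMO_FLAG_COMPRESS_PAGE | DEMO_FLAG_XBZRLE)

/* Walk a stream of record flags and report whether a flush is needed. */
static bool demo_stream_needs_flush(const int *flags, size_t n)
{
    bool need_flush = false;

    assert(flags != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (flags[i] & DEMO_FLAG_EOS) {
            break;
        }
        if (flags[i] & DEMO_PAGE_DATA_FLAGS) {
            /* One place covers all page-carrying record types. */
            need_flush = true;
        }
    }
    return need_flush;
}
```

A new page-carrying record type then only needs to be added to the mask, and
it cannot be forgotten in one of the case arms.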

>              break;
>          case RAM_SAVE_FLAG_EOS:
>              /* normal exit */
> @@ -1730,6 +1736,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      }
>  
>      rcu_read_unlock();
> +
> +    if (!ret  && ram_cache_enable && need_flush) {
> +        DPRINTF("Flush ram_cache\n");

trace_

> +        colo_flush_ram_cache();
> +    }
>      DPRINTF("Completed load of VM with exit code %d seq iteration "
>              "%" PRIu64 "\n", ret, seq_iter);
>      return ret;
> @@ -1799,6 +1810,35 @@ void colo_release_ram_cache(void)
>      rcu_read_unlock();
>  }
>  
> +/*
> + * Flush content of RAM cache into SVM's memory.
> + * Only flush the pages that be dirtied by PVM or SVM or both.
> + */
> +void colo_flush_ram_cache(void)
> +{
> +    RAMBlock *block = NULL;
> +    void *dst_host;
> +    void *src_host;
> +    ram_addr_t  offset = 0;
> +
> +    trace_colo_flush_ram_cache(migration_dirty_pages);
> +    rcu_read_lock();
> +    block = QLIST_FIRST_RCU(&ram_list.blocks);
> +    while (block) {
> +        offset = migration_bitmap_find_and_reset_dirty(block, offset);

You'll need to rework that a little (I split that into
migration_bitmap_find_dirty and migration_bitmap_clear_dirty)
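
A standalone sketch of how the flush loop might look once finding and
clearing are separate steps. demo_find_dirty() and demo_clear_dirty() are
hypothetical stand-ins named after the split mentioned above; their
signatures are not those of the real functions, and the example uses one
flat buffer and a tiny page size instead of a RAMBlock list.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 8  /* tiny page size, for illustration only */
#define DEMO_NUM_PAGES 6
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the first dirty page at or after 'start', or nr_pages if none. */
static size_t demo_find_dirty(const unsigned long *bitmap, size_t start,
                              size_t nr_pages)
{
    for (size_t p = start; p < nr_pages; p++) {
        unsigned long mask = 1UL << (p % DEMO_BITS_PER_LONG);
        if (bitmap[p / DEMO_BITS_PER_LONG] & mask) {
            return p;
        }
    }
    return nr_pages;
}

/* Clear the dirty bit for 'page' and drop the dirty counter if it was set. */
static void demo_clear_dirty(unsigned long *bitmap, size_t page,
                             uint64_t *dirty_pages)
{
    unsigned long mask = 1UL << (page % DEMO_BITS_PER_LONG);
    unsigned long *word = &bitmap[page / DEMO_BITS_PER_LONG];

    if (*word & mask) {
        *word &= ~mask;
        (*dirty_pages)--;
    }
}

/* Copy every dirty page from the cache into the VM's RAM. */
static void demo_flush_cache(unsigned char *ram, const unsigned char *cache,
                             unsigned long *bitmap, size_t nr_pages,
                             uint64_t *dirty_pages)
{
    size_t page = demo_find_dirty(bitmap, 0, nr_pages);

    while (page < nr_pages) {
        demo_clear_dirty(bitmap, page, dirty_pages);
        memcpy(ram + page * DEMO_PAGE_SIZE,
               cache + page * DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);
        page = demo_find_dirty(bitmap, page + 1, nr_pages);
    }
    assert(*dirty_pages == 0);
}
```

The final assert mirrors the one in the patch: after the walk, no page
should still be counted as dirty.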

> +        if (offset >= block->used_length) {
> +            offset = 0;
> +            block = QLIST_NEXT_RCU(block, next);
> +        } else {
> +            dst_host = block->host + offset;
> +            src_host = block->host_cache + offset;
> +            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
> +        }
> +    }
> +    rcu_read_unlock();
> +    assert(migration_dirty_pages == 0);
> +}
> +
>  static SaveVMHandlers savevm_ram_handlers = {
>      .save_live_setup = ram_save_setup,
>      .save_live_iterate = ram_save_iterate,
> diff --git a/trace-events b/trace-events
> index ee4679c..c98bc13 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1232,6 +1232,7 @@ qemu_file_fclose(void) ""
>  migration_bitmap_sync_start(void) ""
>  migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64""
>  migration_throttle(void) ""
> +colo_flush_ram_cache(uint64_t dirty_pages) "dirty_pages %" PRIu64""
>  
>  # hw/display/qxl.c
>  disable qxl_interface_set_mm_time(int qid, uint32_t mm_time) "%d %d"
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration zhanghailiang
  2015-11-06 16:48   ` Dr. David Alan Gilbert
@ 2015-11-13 16:42   ` Eric Blake
  2015-11-16 13:00     ` zhanghailiang
  1 sibling, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-13 16:42 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

On 11/03/2015 04:56 AM, zhanghailiang wrote:
> Add a migration state, MIGRATION_STATUS_COLO, which is entered after the
> first live migration has finished successfully.
> 
> We reuse the migration thread, so if colo is enabled by the user, the migration
> thread will go into the colo process.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
> v10: Simplify process by dropping colo thread and reusing migration thread.
>      (Dave's suggestion)
> ---

> +++ b/qapi-schema.json
> @@ -439,7 +439,7 @@
>  ##
>  { 'enum': 'MigrationStatus',
>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
> -            'active', 'completed', 'failed' ] }
> +            'active', 'completed', 'failed', 'colo' ] }
>  

Missing documentation of the new state, including a '(since 2.6)' tag.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol zhanghailiang
  2015-11-06 18:26   ` Dr. David Alan Gilbert
@ 2015-11-13 16:46   ` Eric Blake
  2015-11-17  7:04     ` zhanghailiang
  1 sibling, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-13 16:46 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

On 11/03/2015 04:56 AM, zhanghailiang wrote:
> We need a user-defined communication protocol to control the checkpoint
> process.
> 
> A new checkpoint request is started by the Primary VM, and the interaction
> proceeds as below:
> Checkpoint synchronizing points:
> 
>                        Primary                         Secondary
> 'checkpoint-request'   @ ----------------------------->
>                                                        Suspend (In hybrid mode)
> 'checkpoint-reply'     <------------------------------ @
>                        Suspend&Save state
> 'vmstate-send'         @ ----------------------------->
>                        Send state                      Receive state
> 'vmstate-received'     <------------------------------ @
>                        Release packets                 Load state
> 'vmstate-loaded'       <------------------------------ @
>                        Resume                          Resume (In hybrid mode)
> 
>                        Start Comparing (In hybrid mode)
> NOTE:
>  1) '@' marks the side that sends the message.
>  2) Every sync-point is synchronized by the two sides with only
>     one handshake (single direction) for low latency.
>     If stricter synchronization is required, an opposite-direction
>     sync-point should be added.
>  3) Since sync-points are single direction, the remote side may
>     have gone forward a lot by the time this side receives the sync-point.
>  4) For now, we only support 'periodic' checkpoint, in which
>     the Secondary VM is not running; later we will support 'hybrid' mode.
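
For reference, the round shown in the diagram can be written down as a small
table and checked mechanically. This is only a sketch: the demo_* names are
hypothetical, the enum order follows the COLOCommand list in the schema, and
'checkpoint-ready' and 'vmstate-size' are left out of the round because the
diagram does not show them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the order of the COLOCommand enum; names are hypothetical. */
enum demo_colo_cmd {
    DEMO_CMD_INVALID = 0,
    DEMO_CMD_CHECKPOINT_READY,
    DEMO_CMD_CHECKPOINT_REQUEST,
    DEMO_CMD_CHECKPOINT_REPLY,
    DEMO_CMD_VMSTATE_SEND,
    DEMO_CMD_VMSTATE_SIZE,
    DEMO_CMD_VMSTATE_RECEIVED,
    DEMO_CMD_VMSTATE_LOADED,
};

enum demo_side { DEMO_PRIMARY, DEMO_SECONDARY };

struct demo_step {
    enum demo_side sender;  /* the '@' side in the diagram */
    enum demo_colo_cmd cmd;
};

/* One checkpoint round, in the order drawn in the diagram. */
static const struct demo_step demo_checkpoint_round[] = {
    { DEMO_PRIMARY,   DEMO_CMD_CHECKPOINT_REQUEST },
    { DEMO_SECONDARY, DEMO_CMD_CHECKPOINT_REPLY },
    { DEMO_PRIMARY,   DEMO_CMD_VMSTATE_SEND },
    { DEMO_SECONDARY, DEMO_CMD_VMSTATE_RECEIVED },
    { DEMO_SECONDARY, DEMO_CMD_VMSTATE_LOADED },
};

/* Check that an observed message trace is exactly one well-formed round. */
static bool demo_round_is_valid(const struct demo_step *seen, size_t n)
{
    const size_t expected =
        sizeof(demo_checkpoint_round) / sizeof(demo_checkpoint_round[0]);

    assert(seen != NULL || n == 0);
    if (n != expected) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        if (seen[i].sender != demo_checkpoint_round[i].sender ||
            seen[i].cmd != demo_checkpoint_round[i].cmd) {
            return false;
        }
    }
    return true;
}
```

A table like this is mostly useful in tests or tracing, where an
out-of-order message should be reported rather than silently accepted.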
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Cc: Eric Blake <eblake@redhat.com>
> ---
> v10:
> - Rename enum COLOCmd to COLOCommand (Eric's suggestion).
> - Remove unused 'ram-steal'

Interface review only:


> +++ b/qapi-schema.json
> @@ -702,6 +702,33 @@
>              '*tls-port': 'int', '*cert-subject': 'str' } }
>  
>  ##
> +# @COLOCommand
> +#
> +# The colo command

Still might be nice to spell out what COLO means here, but it's fairly
obvious this will be related to anything else COLO, so I'm not too worried.

> +#
> +# @invalid: unknown command
> +#
> +# @checkpoint-ready: SVM is ready for checkpointing
> +#
> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
> +#
> +# @checkpoint-reply: SVM gets PVM's checkpoint request
> +#
> +# @vmstate-send: VM's state will be sent by PVM.
> +#
> +# @vmstate-size: The total size of VMstate.
> +#
> +# @vmstate-received: VM's state has been received by SVM
> +#
> +# @vmstate-loaded: VM's state has been loaded by SVM
> +#
> +# Since: 2.5

Will need a tweak to say 2.6.  Otherwise looks okay.

> +##
> +{ 'enum': 'COLOCommand',
> +  'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
> +            'checkpoint-reply', 'vmstate-send', 'vmstate-size',
> +            'vmstate-received', 'vmstate-loaded' ] }
> +


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
  2015-11-06 18:28   ` Dr. David Alan Gilbert
@ 2015-11-13 16:47   ` Eric Blake
  2015-11-17  7:15     ` zhanghailiang
  1 sibling, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-13 16:47 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah

On 11/03/2015 04:56 AM, zhanghailiang wrote:
> Guest will enter this state when paused to save/restore VM state
> under colo checkpoint.
> 
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  qapi-schema.json | 7 ++++++-
>  vl.c             | 8 ++++++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 5c4fe6d..49f2a90 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -154,12 +154,15 @@
>  # @watchdog: the watchdog action is configured to pause and has been triggered
>  #
>  # @guest-panicked: guest has been panicked as a result of guest OS panic
> +#
> +# @colo: guest is paused to save/restore VM state under colo checkpoint (since
> +# 2.5)

Will need a tweak to 2.6;

>  ##
>  { 'enum': 'RunState',
>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
> -            'guest-panicked' ] }
> +            'guest-panicked', 'colo' ] }
>  
>  ##
>  # @StatusInfo:
> @@ -434,6 +437,8 @@
>  #
>  # @failed: some error occurred during migration process.
>  #
> +# @colo: VM is in the process of fault tolerance. (since 2.5)

Likewise.  But my R-b still stands after that minor tweak.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover zhanghailiang
@ 2015-11-13 16:59   ` Eric Blake
  2015-11-17  8:03     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-13 16:59 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Luiz Capitulino

[-- Attachment #1: Type: text/plain, Size: 3006 bytes --]

On 11/03/2015 04:56 AM, zhanghailiang wrote:
> We leave users to choose whatever heartbeat solution they want, if the heartbeat
> is lost, or other errors they detect, they can use experimental command
> 'x_colo_lost_heartbeat' to tell COLO to do failover, COLO will do operations
> accordingly.
> 
> For example, if the command is sent to the PVM, the Primary side will
> exit COLO mode and take over operation. If sent to the Secondary, the
> secondary will run failover work, then take over server operation to
> become the new Primary.
> 
> Cc: Luiz Capitulino <lcapitulino@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
> v10: Rename command colo_lost_hearbeat to experimental 'x_colo_lost_heartbeat'
> ---

> @@ -29,6 +30,9 @@ bool migration_incoming_enable_colo(void);
>  void migration_incoming_exit_colo(void);
>  void *colo_process_incoming_thread(void *opaque);
>  bool migration_incoming_in_colo_state(void);
> +
> +int get_colo_mode(void);

Should this return an enum type instead of an int?


> +++ b/migration/colo-comm.c
> @@ -20,6 +20,17 @@ typedef struct {
>  
>  static COLOInfo colo_info;
>  
> +int get_colo_mode(void)
> +{
> +    if (migration_in_colo_state()) {
> +        return COLO_MODE_PRIMARY;
> +    } else if (migration_incoming_in_colo_state()) {
> +        return COLO_MODE_SECONDARY;
> +    } else {
> +        return COLO_MODE_UNKNOWN;
> +    }
> +}

Particularly since it is always returning values of the same enum.

Not fatal to the patch, so much as a style issue.
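
A minimal sketch of what I mean, assuming the QAPI-generated COLOMode
enum is visible where get_colo_mode() is declared. The enum is
redeclared here, and the two state predicates are replaced by
hypothetical stubs, only so that the fragment stands alone:

```c
#include <stdbool.h>

/* Mirrors the 'COLOMode' QAPI enum added by this patch; redeclared
 * here only to keep the sketch self-contained. */
typedef enum COLOMode {
    COLO_MODE_UNKNOWN = 0,
    COLO_MODE_PRIMARY,
    COLO_MODE_SECONDARY,
} COLOMode;

/* Hypothetical stand-ins for the real migration state predicates. */
static bool stub_in_colo_state;
static bool stub_incoming_in_colo_state;

static bool migration_in_colo_state(void)
{
    return stub_in_colo_state;
}

static bool migration_incoming_in_colo_state(void)
{
    return stub_incoming_in_colo_state;
}

/* Same logic as in the patch; only the return type changes. */
COLOMode get_colo_mode(void)
{
    if (migration_in_colo_state()) {
        return COLO_MODE_PRIMARY;
    } else if (migration_incoming_in_colo_state()) {
        return COLO_MODE_SECONDARY;
    } else {
        return COLO_MODE_UNKNOWN;
    }
}
```

With the enum as the return type, callers such as
qmp_x_colo_lost_heartbeat() can compare against COLO_MODE_UNKNOWN
without an implicit int conversion.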


> +void qmp_x_colo_lost_heartbeat(Error **errp)
> +{
> +    if (get_colo_mode() == COLO_MODE_UNKNOWN) {
> +        error_setg(errp, QERR_FEATURE_DISABLED, "colo");
> +        return;

We've slowly been trying to get rid of QERR_ usage.  But you aren't the
first user, and a global cleanup may be better. So I can overlook it for
now.

> +++ b/qapi-schema.json
> @@ -734,6 +734,32 @@
>              'checkpoint-reply', 'vmstate-send', 'vmstate-size',
>              'vmstate-received', 'vmstate-loaded' ] }
>  
> +##
> +# @COLOMode
> +#
> +# The colo mode
> +#
> +# @unknown: unknown mode
> +#
> +# @primary: master side
> +#
> +# @secondary: slave side
> +#
> +# Since: 2.5
> +##
> +{ 'enum': 'COLOMode',
> +  'data': [ 'unknown', 'primary', 'secondary'] }
> +
> +##
> +# @x-colo-lost-heartbeat
> +#
> +# Tell qemu that heartbeat is lost, request it to do takeover procedures.
> +#

The docs here are rather short, compared to your commit message (in
particular, the fact that it causes a different action depending on
whether it is sent to primary [takeover] or secondary [failover]).

> +# Since: 2.5

2.6 in both places.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically zhanghailiang
@ 2015-11-13 18:34   ` Dr. David Alan Gilbert
  2015-11-17  9:11     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-13 18:34 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Do checkpoint periodically, the default interval is 200ms.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
>  migration/colo.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 0efab21..a6791f4 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -11,12 +11,19 @@
>   */
>  
>  #include <unistd.h>
> +#include "qemu/timer.h"
>  #include "sysemu/sysemu.h"
>  #include "migration/colo.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
>  #include "qemu/sockets.h"
>  
> +/*
> + * checkpoint interval: unit ms
> + * Note: Please change this default value to 10000 when we support hybrid mode.
> + */
> +#define CHECKPOINT_MAX_PEROID 200

Why not put the patch that makes this a configurable parameter before this,
and then we can use it straight away?

>  /* colo buffer */
>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>  
> @@ -183,6 +190,7 @@ out:
>  static void colo_process_checkpoint(MigrationState *s)
>  {
>      QEMUSizedBuffer *buffer = NULL;
> +    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      int fd, ret = 0;
>  
>      /* Dup the fd of to_dst_file */
> @@ -220,11 +228,17 @@ static void colo_process_checkpoint(MigrationState *s)
>      trace_colo_vm_state_change("stop", "run");
>  
>      while (s->state == MIGRATION_STATUS_COLO) {
> +        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> +        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
> +            g_usleep(100000);
> +            continue;
> +        }

I'm a bit concerned at the 100ms wait, when the period is 200ms; 
depending how the times work out, couldn't we end up waiting for just
under 300ms? - that's a big error - and it's even more weird when
we make it configurable later.

I don't think we've got a sleep-until, which is a shame; but how
about something like:

   if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
       int64_t delay_ms;
       delay_ms = CHECKPOINT_MAX_PEROID - (current_time - checkpoint_time);
       g_usleep (delay_ms * 1000);
   }
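
The delay computation could also be pulled out into a small helper.
This is only a sketch: the helper name is hypothetical, the macro keeps
the spelling used in the patch, and the clamp for a backwards clock
step is my assumption (QEMU_CLOCK_HOST is not monotonic):

```c
#include <stdint.h>

#define CHECKPOINT_MAX_PEROID 200 /* ms; spelling as in the patch */

/*
 * Hypothetical helper, not part of the patch: milliseconds left until
 * the next checkpoint is due, or 0 if it is already due.
 */
int64_t colo_checkpoint_delay_ms(int64_t current_time,
                                 int64_t checkpoint_time)
{
    int64_t elapsed = current_time - checkpoint_time;

    if (elapsed < 0) {
        /* The host clock stepped backwards; never wait longer than
         * one full period. */
        elapsed = 0;
    }
    if (elapsed >= CHECKPOINT_MAX_PEROID) {
        return 0;
    }
    return CHECKPOINT_MAX_PEROID - elapsed;
}
```

The loop would then sleep for the returned value times 1000 with
g_usleep() when it is non-zero, so the worst-case interval stays close
to the configured period instead of drifting towards 300ms.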

Dave

>          /* start a colo checkpoint */
>          ret = colo_do_checkpoint_transaction(s, buffer);
>          if (ret < 0) {
>              goto out;
>          }
> +        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      }
>  
>  out:
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint
  2015-11-09  9:17     ` zhanghailiang
@ 2015-11-13 18:53       ` Dr. David Alan Gilbert
  2015-11-17 10:20         ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-13 18:53 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> On 2015/11/7 2:59, Dr. David Alan Gilbert wrote:
> >* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>The main process of checkpoint is to synchronize SVM with PVM.
> >>VM's state includes ram and device state. So we will migrate PVM's
> >>state to SVM when do checkpoint, just like migration does.
> >>
> >>We will cache PVM's state in slave, we use QEMUSizedBuffer
> >>to store the data, we need to know the size of VM state, so in master,
> >>we use qsb to store VM state temporarily, get the data size by call qsb_get_length()
> >>and then migrate the data to the qsb in the secondary side.
> >>
> >>Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> >>Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >>Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> >>---
> >>  migration/colo.c   | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++----
> >>  migration/ram.c    | 47 +++++++++++++++++++++++++++++--------
> >>  migration/savevm.c |  2 +-
> >>  3 files changed, 101 insertions(+), 16 deletions(-)
> >>
> >>diff --git a/migration/colo.c b/migration/colo.c
> >>index 2510762..b865513 100644
> >>--- a/migration/colo.c
> >>+++ b/migration/colo.c
> >>@@ -17,6 +17,9 @@
> >>  #include "qemu/error-report.h"
> >>  #include "qemu/sockets.h"
> >>
> >>+/* colo buffer */
> >>+#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> >>+
> >>  bool colo_supported(void)
> >>  {
> >>      return true;
> >>@@ -94,9 +97,12 @@ static int colo_ctl_get(QEMUFile *f, uint32_t require)
> >>      return value;
> >>  }
> >>
> >>-static int colo_do_checkpoint_transaction(MigrationState *s)
> >>+static int colo_do_checkpoint_transaction(MigrationState *s,
> >>+                                          QEMUSizedBuffer *buffer)
> >>  {
> >>      int ret;
> >>+    size_t size;
> >>+    QEMUFile *trans = NULL;
> >>
> >>      ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
> >>      if (ret < 0) {
> >>@@ -107,15 +113,47 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
> >>      if (ret < 0) {
> >>          goto out;
> >>      }
> >>+    /* Reset colo buffer and open it for write */
> >>+    qsb_set_length(buffer, 0);
> >>+    trans = qemu_bufopen("w", buffer);
> >>+    if (!trans) {
> >>+        error_report("Open colo buffer for write failed");
> >>+        goto out;
> >>+    }
> >>
> >>-    /* TODO: suspend and save vm state to colo buffer */
> >>+    qemu_mutex_lock_iothread();
> >>+    vm_stop_force_state(RUN_STATE_COLO);
> >>+    qemu_mutex_unlock_iothread();
> >>+    trace_colo_vm_state_change("run", "stop");
> >>+
> >>+    /* Disable block migration */
> >>+    s->params.blk = 0;
> >>+    s->params.shared = 0;
> >>+    qemu_savevm_state_header(trans);
> >>+    qemu_savevm_state_begin(trans, &s->params);
> >>+    qemu_mutex_lock_iothread();
> >>+    qemu_savevm_state_complete(trans);
> >>+    qemu_mutex_unlock_iothread();
> >>+
> >>+    qemu_fflush(trans);
> >>
> >>      ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
> >>      if (ret < 0) {
> >>          goto out;
> >>      }
> >>+    /* we send the total size of the vmstate first */
> >>+    size = qsb_get_length(buffer);
> >>+    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SIZE, size);
> >>+    if (ret < 0) {
> >>+        goto out;
> >>+    }
> >>
> >>-    /* TODO: send vmstate to Secondary */
> >>+    qsb_put_buffer(s->to_dst_file, buffer, size);
> >>+    qemu_fflush(s->to_dst_file);
> >>+    ret = qemu_file_get_error(s->to_dst_file);
> >>+    if (ret < 0) {
> >>+        goto out;
> >>+    }
> >>
> >>      ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
> >>      if (ret < 0) {
> >>@@ -127,14 +165,24 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
> >>          goto out;
> >>      }
> >>
> >>-    /* TODO: resume Primary */
> >>+    ret = 0;
> >>+    /* resume master */
> >>+    qemu_mutex_lock_iothread();
> >>+    vm_start();
> >>+    qemu_mutex_unlock_iothread();
> >>+    trace_colo_vm_state_change("stop", "run");
> >>
> >>  out:
> >>+    if (trans) {
> >>+        qemu_fclose(trans);
> >>+    }
> >>+
> >>      return ret;
> >>  }
> >>
> >>  static void colo_process_checkpoint(MigrationState *s)
> >>  {
> >>+    QEMUSizedBuffer *buffer = NULL;
> >>      int fd, ret = 0;
> >>
> >>      /* Dup the fd of to_dst_file */
> >>@@ -159,6 +207,13 @@ static void colo_process_checkpoint(MigrationState *s)
> >>          goto out;
> >>      }
> >>
> >>+    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
> >>+    if (buffer == NULL) {
> >>+        ret = -ENOMEM;
> >>+        error_report("Failed to allocate buffer!");
> >
> >Please say 'Failed to allocate colo buffer'; QEMU has lots and lots of buffers.
> >
> 
> OK, will fix it in next version.
> 
> >>+        goto out;
> >>+    }
> >>+
> >>      qemu_mutex_lock_iothread();
> >>      vm_start();
> >>      qemu_mutex_unlock_iothread();
> >>@@ -166,7 +221,7 @@ static void colo_process_checkpoint(MigrationState *s)
> >>
> >>      while (s->state == MIGRATION_STATUS_COLO) {
> >>          /* start a colo checkpoint */
> >>-        ret = colo_do_checkpoint_transaction(s);
> >>+        ret = colo_do_checkpoint_transaction(s, buffer);
> >>          if (ret < 0) {
> >>              goto out;
> >>          }
> >>@@ -179,6 +234,9 @@ out:
> >>      migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
> >>                        MIGRATION_STATUS_COMPLETED);
> >>
> >>+    qsb_free(buffer);
> >>+    buffer = NULL;
> >>+
> >>      if (s->from_dst_file) {
> >>          qemu_fclose(s->from_dst_file);
> >>      }
> >>diff --git a/migration/ram.c b/migration/ram.c
> >>index a25bcc7..5784c15 100644
> >>--- a/migration/ram.c
> >>+++ b/migration/ram.c
> >>@@ -38,6 +38,7 @@
> >>  #include "trace.h"
> >>  #include "exec/ram_addr.h"
> >>  #include "qemu/rcu_queue.h"
> >>+#include "migration/colo.h"
> >>
> >>  #ifdef DEBUG_MIGRATION_RAM
> >>  #define DPRINTF(fmt, ...) \
> >>@@ -1165,15 +1166,8 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
> >>      }
> >>  }
> >>
> >>-/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
> >>- * long-running RCU critical section.  When rcu-reclaims in the code
> >>- * start to become numerous it will be necessary to reduce the
> >>- * granularity of these critical sections.
> >>- */
> >>-
> >>-static int ram_save_setup(QEMUFile *f, void *opaque)
> >>+static int ram_save_init_globals(void)
> >>  {
> >>-    RAMBlock *block;
> >>      int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
> >>
> >>      dirty_rate_high_cnt = 0;
> >>@@ -1233,6 +1227,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >>      migration_bitmap_sync();
> >>      qemu_mutex_unlock_ramlist();
> >>      qemu_mutex_unlock_iothread();
> >>+    rcu_read_unlock();
> >>+
> >>+    return 0;
> >>+}
> >
> >It surprises me you want migration_bitmap_sync in ram_save_init_globals(),
> >but I guess you want the first sync at the start.
> >
> 
> Er, sorry, I don't quite understand.
> Here I just split part of the code of ram_save_setup()
> into a helper function, ram_save_init_globals(), to make it clearer.
> We can't do the initialization twice. Is there anything wrong?

No, that's OK - it just seemed odd for a function like 'init_globals'
to do such a big side effect of doing the sync; but yes, it makes sense
since it's just a split.

> >>+/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
> >>+ * long-running RCU critical section.  When rcu-reclaims in the code
> >>+ * start to become numerous it will be necessary to reduce the
> >>+ * granularity of these critical sections.
> >>+ */
> >>+
> >>+static int ram_save_setup(QEMUFile *f, void *opaque)
> >>+{
> >>+    RAMBlock *block;
> >>+
> >>+    /*
> >>+     * migration has already setup the bitmap, reuse it.
> >>+     */
> >>+    if (!migration_in_colo_state()) {
> >>+        if (ram_save_init_globals() < 0) {
> >>+            return -1;
> >>+         }
> >>+    }
> >>+
> >>+    rcu_read_lock();
> >>
> >>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
> >>
> >>@@ -1332,7 +1351,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >>      while (true) {
> >>          int pages;
> >>
> >>-        pages = ram_find_and_save_block(f, true, &bytes_transferred);
> >>+        pages = ram_find_and_save_block(f, !migration_in_colo_state(),
> >>+                                        &bytes_transferred);
> >>          /* no more blocks to sent */
> >>          if (pages == 0) {
> >>              break;
> >>@@ -1343,8 +1363,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >>      ram_control_after_iterate(f, RAM_CONTROL_FINISH);
> >>
> >>      rcu_read_unlock();
> >>+    /*
> >>+     * Since we need to reuse dirty bitmap in colo,
> >>+     * don't cleanup the bitmap.
> >>+     */
> >>+    if (!migrate_colo_enabled() ||
> >>+        migration_has_failed(migrate_get_current())) {
> >>+        migration_end();
> >>+    }
> >>
> >>-    migration_end();
> >>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> >>
> >>      return 0;
> >>diff --git a/migration/savevm.c b/migration/savevm.c
> >>index dbcc39a..0faf12b 100644
> >>--- a/migration/savevm.c
> >>+++ b/migration/savevm.c
> >>@@ -48,7 +48,7 @@
> >>  #include "qemu/iov.h"
> >>  #include "block/snapshot.h"
> >>  #include "block/qapi.h"
> >>-
> >>+#include "migration/colo.h"
> >>
> >>  #ifndef ETH_P_RARP
> >>  #define ETH_P_RARP 0x8035
> >
> >Wrong patch?
> >
> 
> No, we call migration_in_colo_state() in qemu_savevm_state_begin(),
> so we have to include "migration/colo.h".

I don't think you use it in savevm.c until patch 30, so you can add
the #include in patch 30 (or whichever is the patch that first needs it).

Dave


> 
> >
> >So other than those minor things:
> >
> >Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >
> >but watch out for the recent changes to migrate_end that went in
> >a few days ago.
> >
> 
> Thanks for reminding me, I have rebased on that. ;)
> 
> zhanghailiang
> 
> >Dave
> >
> >>--
> >>1.8.3.1
> >>
> >>
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> >.
> >
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  2015-11-13 15:39   ` Dr. David Alan Gilbert
@ 2015-11-16  7:57     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-16  7:57 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/13 23:39, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We should not load PVM's state directly into SVM, because there maybe some
>> errors happen when SVM is receving data, which will break SVM.
>>
>> We need to ensure receving all data before load the state into SVM. We use
>> an extra memory to cache these data (PVM's ram). The ram cache in secondary side
>> is initially the same as SVM/PVM's memory. And in the process of checkpoint,
>> we cache the dirty pages of PVM into this ram cache firstly, so this ram cache
>> always the same as PVM's memory at every checkpoint, then we flush this cached ram
>> to SVM after we receive all PVM's state.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> v10: Split the process of dirty pages recording into a new patch
>> ---
>>   include/exec/ram_addr.h  |  1 +
>>   include/migration/colo.h |  3 +++
>>   migration/colo.c         | 14 +++++++++--
>>   migration/ram.c          | 61 ++++++++++++++++++++++++++++++++++++++++++++++--
>>   4 files changed, 75 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>> index 3360ac5..e7c4310 100644
>> --- a/include/exec/ram_addr.h
>> +++ b/include/exec/ram_addr.h
>> @@ -28,6 +28,7 @@ struct RAMBlock {
>>       struct rcu_head rcu;
>>       struct MemoryRegion *mr;
>>       uint8_t *host;
>> +    uint8_t *host_cache; /* For colo, VM's ram cache */
>
> I suggest you make the name have 'colo' in it; e.g. colo_cache;
> 'host_cache' is a bit generic.
>

Hmm, this change makes sense, will update it in next version.

>>       ram_addr_t offset;
>>       ram_addr_t used_length;
>>       ram_addr_t max_length;
>> diff --git a/include/migration/colo.h b/include/migration/colo.h
>> index 2676c4a..8edd5f1 100644
>> --- a/include/migration/colo.h
>> +++ b/include/migration/colo.h
>> @@ -29,4 +29,7 @@ bool migration_incoming_enable_colo(void);
>>   void migration_incoming_exit_colo(void);
>>   void *colo_process_incoming_thread(void *opaque);
>>   bool migration_incoming_in_colo_state(void);
>> +/* ram cache */
>> +int colo_init_ram_cache(void);
>> +void colo_release_ram_cache(void);
>>   #endif
>> diff --git a/migration/colo.c b/migration/colo.c
>> index b865513..25f85b2 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -304,6 +304,12 @@ void *colo_process_incoming_thread(void *opaque)
>>           goto out;
>>       }
>>
>> +    ret = colo_init_ram_cache();
>> +    if (ret < 0) {
>> +        error_report("Failed to initialize ram cache");
>> +        goto out;
>> +    }
>> +
>>       ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
>>       if (ret < 0) {
>>           goto out;
>> @@ -331,14 +337,14 @@ void *colo_process_incoming_thread(void *opaque)
>>               goto out;
>>           }
>>
>> -        /* TODO: read migration data into colo buffer */
>> +        /* TODO Load VM state */
>>
>>           ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
>>           if (ret < 0) {
>>               goto out;
>>           }
>>
>> -        /* TODO: load vm state */
>> +        /* TODO: flush vm state */
>
> Do you really need to update/change the TODOs here?
>

No, I will drop this ;)

>>           ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>>           if (ret < 0) {
>> @@ -352,6 +358,10 @@ out:
>>                        strerror(-ret));
>>       }
>>
>> +    qemu_mutex_lock_iothread();
>> +    colo_release_ram_cache();
>> +    qemu_mutex_unlock_iothread();
>> +
>>       if (mis->to_src_file) {
>>           qemu_fclose(mis->to_src_file);
>>       }
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 5784c15..b094dc3 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -222,6 +222,7 @@ static RAMBlock *last_sent_block;
>>   static ram_addr_t last_offset;
>>   static QemuMutex migration_bitmap_mutex;
>>   static uint64_t migration_dirty_pages;
>> +static bool ram_cache_enable;
>>   static uint32_t last_version;
>>   static bool ram_bulk_stage;
>>
>> @@ -1446,7 +1447,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>               return NULL;
>>           }
>>
>> -        return block->host + offset;
>> +        if (ram_cache_enable) {
>> +            return block->host_cache + offset;
>> +        } else {
>> +            return block->host + offset;
>> +        }
>>       }
>>
>>       len = qemu_get_byte(f);
>> @@ -1456,7 +1461,11 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>       QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>           if (!strncmp(id, block->idstr, sizeof(id)) &&
>>               block->max_length > offset) {
>> -            return block->host + offset;
>> +            if (ram_cache_enable) {
>> +                return block->host_cache + offset;
>> +            } else {
>> +                return block->host + offset;
>> +            }
>>           }
>>       }
>>
>> @@ -1707,6 +1716,54 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>       return ret;
>>   }
>>
>> +/*
>> + * colo cache: this is for secondary VM, we cache the whole
>> + * memory of the secondary VM, it will be called after first migration.
>> + */
>> +int colo_init_ram_cache(void)
>> +{
>> +    RAMBlock *block;
>> +
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        block->host_cache = qemu_anon_ram_alloc(block->used_length, NULL);
>> +        if (!block->host_cache) {
>> +            goto out_locked;
>> +        }
>
> Please print an error message; stating the function, block name and size that
> failed.
>

Good idea, will fix in next version, thanks.
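
Something along these lines, as a rough sketch only. The helper name is
hypothetical and it formats into a caller-supplied buffer so that the
fragment stands alone; in colo_init_ram_cache() the same format string
would go straight to error_report() with __func__, block->idstr and
block->used_length:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the allocation-failure message naming the
 * function, the RAM block and the size that failed. Returns the
 * snprintf() result.
 */
int colo_format_alloc_error(char *buf, size_t len, const char *func,
                            const char *idstr, uint64_t size)
{
    assert(buf != NULL);
    return snprintf(buf, len,
                    "%s: failed to allocate colo cache for block %s"
                    " (size %" PRIu64 " bytes)",
                    func, idstr, size);
}
```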

>> +        memcpy(block->host_cache, block->host, block->used_length);
>> +    }
>> +    rcu_read_unlock();
>> +    ram_cache_enable = true;
>> +    return 0;
>> +
>> +out_locked:
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (block->host_cache) {
>> +            qemu_anon_ram_free(block->host_cache, block->used_length);
>> +            block->host_cache = NULL;
>> +        }
>> +    }
>> +
>> +    rcu_read_unlock();
>> +    return -errno;
>> +}
>> +
>> +void colo_release_ram_cache(void)
>> +{
>> +    RAMBlock *block;
>> +
>> +    ram_cache_enable = false;
>> +
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (block->host_cache) {
>> +            qemu_anon_ram_free(block->host_cache, block->used_length);
>> +            block->host_cache = NULL;
>> +        }
>> +    }
>> +    rcu_read_unlock();
>> +}
>> +
>>   static SaveVMHandlers savevm_ram_handlers = {
>>       .save_live_setup = ram_save_setup,
>>       .save_live_iterate = ram_save_iterate,
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration
  2015-11-13 16:01   ` Eric Blake
@ 2015-11-16  8:35     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-16  8:35 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah

On 2015/11/14 0:01, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> We add helper function colo_supported() to indicate whether
>> colo is supported or not, with which we use to control whether or not
>> showing 'x-colo' string to users, they can use qmp command
>> 'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
>> to learn if colo is supported.
>>
>> Cc: Juan Quintela <quintela@redhat.com>
>> Cc: Amit Shah <amit.shah@redhat.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> v10:
>> - Rename capability 'colo' to experimental 'x-colo' (Eric's suggestion).
>> - Rename migrate_enable_colo() to migrate_colo_enabled() (Eric's suggestion).
>
>> +++ b/qapi-schema.json
>> @@ -540,11 +540,15 @@
>>   # @auto-converge: If enabled, QEMU will automatically throttle down the guest
>>   #          to speed up convergence of RAM migration. (since 1.6)
>>   #
>> +# @x-colo: If enabled, migration will never end, and the state of the VM on the
>> +#        primary side will be migrated continuously to the VM on secondary
>> +#        side. (since 2.5)
>
> I think this has missed 2.5, so you'll need to tweak it to say 2.6.
>

Yes, will update it in next version.

> With that fixed,
> Reviewed-by: Eric Blake <eblake@redhat.com>
>

Thanks.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it
  2015-11-13 16:02   ` Dr. David Alan Gilbert
@ 2015-11-16  8:46     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-16  8:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/14 0:02, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We should not destroy the state of SVM (Secondary VM) until we receive the whole
>> state from the PVM (Primary VM), in case the primary fails in the middle of sending
>> the state, so, here we cache the device state in Secondary before restore it.
>>
>> Besides, we should call qemu_system_reset() before load VM state,
>> which can ensure the data is intact.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>>   migration/colo.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 46 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 25f85b2..1339774 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -287,6 +287,9 @@ static int colo_wait_handle_cmd(QEMUFile *f, int *checkpoint_request)
>>   void *colo_process_incoming_thread(void *opaque)
>>   {
>>       MigrationIncomingState *mis = opaque;
>> +    QEMUFile *fb = NULL;
>> +    QEMUSizedBuffer *buffer = NULL; /* Cache incoming device state */
>> +    int  total_size;
>>       int fd, ret = 0;
>>
>>       migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>> @@ -310,6 +313,12 @@ void *colo_process_incoming_thread(void *opaque)
>>           goto out;
>>       }
>>
>> +    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
>> +    if (buffer == NULL) {
>> +        error_report("Failed to allocate colo buffer!");
>> +        goto out;
>> +    }
>> +
>>       ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_READY, 0);
>>       if (ret < 0) {
>>           goto out;
>> @@ -337,19 +346,50 @@ void *colo_process_incoming_thread(void *opaque)
>>               goto out;
>>           }
>>
>> -        /* TODO Load VM state */
>> +        /* read the VM state total size first */
>> +        total_size = colo_ctl_get(mis->from_src_file,
>> +                                  COLO_COMMAND_VMSTATE_SIZE);
>> +        if (total_size <= 0) {
>
> Error message?
>

OK, we need one.

>> +            goto out;
>> +        }
>
> OK, and when you fix up the colo_ctl_get in the previous patch to
> take a separate pointer for value, you can make total_size a size_t.
>

Yes, I have updated it after addressing your review comment on patch 11.
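
For reference, the shape I have in mind is roughly the following. It is
a sketch under assumptions: FakeStream and fake_get_be64() are
hypothetical stand-ins for QEMUFile and its read helper, and the
function name and error codes are illustrative, not the final code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a QEMUFile: alternating command/value words. */
typedef struct {
    const uint64_t *words;
    size_t count;
    size_t pos;
} FakeStream;

static int fake_get_be64(FakeStream *s, uint64_t *out)
{
    if (s->pos >= s->count) {
        return -EIO;
    }
    *out = s->words[s->pos++];
    return 0;
}

/*
 * Returns 0 on success or a negative errno value; the payload is
 * stored through 'value', so it is no longer squeezed into an int
 * return value that also carries the error code.
 */
int colo_ctl_get_value(FakeStream *s, uint64_t require, uint64_t *value)
{
    uint64_t cmd;
    int ret;

    assert(value != NULL);
    ret = fake_get_be64(s, &cmd);
    if (ret < 0) {
        return ret;
    }
    if (cmd != require) {
        return -EINVAL; /* unexpected command */
    }
    return fake_get_be64(s, value);
}
```

With that, total_size in colo_process_incoming_thread() can be a size_t
assigned from the value, and the error path only checks the return code.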

>
> Other than those, it looks good.
>

Thanks.

> Dave
>
>> +        /* read vm device state into colo buffer */
>> +        ret = qsb_fill_buffer(buffer, mis->from_src_file, total_size);
>> +        if (ret != total_size) {
>> +            error_report("can't get all migration data");
>> +            goto out;
>> +        }
>>
>>           ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_RECEIVED, 0);
>>           if (ret < 0) {
>>               goto out;
>>           }
>>
>> +        /* open colo buffer for read */
>> +        fb = qemu_bufopen("r", buffer);
>> +        if (!fb) {
>> +            error_report("can't open colo buffer for read");
>> +            goto out;
>> +        }
>> +
>> +        qemu_mutex_lock_iothread();
>> +        qemu_system_reset(VMRESET_SILENT);
>> +        if (qemu_loadvm_state(fb) < 0) {
>> +            error_report("COLO: loadvm failed");
>> +            qemu_mutex_unlock_iothread();
>> +            goto out;
>> +        }
>> +        qemu_mutex_unlock_iothread();
>> +
>>           /* TODO: flush vm state */
>>
>>           ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>>           if (ret < 0) {
>>               goto out;
>>           }
>> +
>> +        qemu_fclose(fb);
>> +        fb = NULL;
>>       }
>>
>>   out:
>> @@ -358,6 +398,11 @@ out:
>>                        strerror(-ret));
>>       }
>>
>> +    if (fb) {
>> +        qemu_fclose(fb);
>> +    }
>> +    qsb_free(buffer);
>> +
>>       qemu_mutex_lock_iothread();
>>       colo_release_ram_cache();
>>       qemu_mutex_unlock_iothread();
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap
  2015-11-13 16:19   ` Dr. David Alan Gilbert
@ 2015-11-16  9:07     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-16  9:07 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/14 0:19, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We need to record the addresses of the dirty pages received from the PVM.
>> This will help when flushing the cached pages into the SVM's RAM.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>> v10:
>> - New patch split from v9's patch 13
>> - Rebase to master to use 'migration_bitmap_rcu'
>> ---
>>   migration/ram.c | 35 +++++++++++++++++++++++++++++++++++
>>   1 file changed, 35 insertions(+)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index b094dc3..70879bd 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1448,6 +1448,18 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>           }
>>
>>           if (ram_cache_enable) {
>> +            unsigned long *bitmap;
>> +            long k = (block->mr->ram_addr + offset) >> TARGET_PAGE_BITS;
>> +
>> +            bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
>> +            /*
>> +            * During colo checkpoint, we need bitmap of these migrated pages.
>> +            * It help us to decide which pages in ram cache should be flushed
>> +            * into VM's RAM later.
>> +            */
>> +            if (!test_and_set_bit(k, bitmap)) {
>> +                migration_dirty_pages++;
>> +            }
>
> I don't like having this in host_from_stream_offset; if you look
> at the current ram_load there is only a single call to host_from_stream_offset, so it's
> now much easier for you to move it into a separate function.
>

Hmm, that's really a good idea; I will split it out in the next version.
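
A rough idea of what the split-out helper could look like, as a standalone sketch rather than the eventual QEMU change. `record_received_page` and `test_and_set_bit_sketch` are hypothetical names; the bit operation here is deliberately non-atomic, and `page_bits` plays the role of TARGET_PAGE_BITS.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set bit 'nr' and report whether it was already set (non-atomic sketch). */
static bool test_and_set_bit_sketch(unsigned long nr, unsigned long *map)
{
    unsigned long mask = 1UL << (nr % BITS_PER_LONG);
    unsigned long *word = &map[nr / BITS_PER_LONG];
    bool was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/*
 * Hypothetical helper: mark the page at 'addr' as received and count it
 * once. 'page_bits' plays the role of TARGET_PAGE_BITS.
 */
static void record_received_page(unsigned long *bitmap, uint64_t addr,
                                 unsigned page_bits, uint64_t *dirty_pages)
{
    unsigned long page = (unsigned long)(addr >> page_bits);

    if (!test_and_set_bit_sketch(page, bitmap)) {
        (*dirty_pages)++;
    }
}
```

Keeping the bookkeeping in one function would let the single caller in ram_load invoke it once, instead of duplicating the block in both branches of host_from_stream_offset.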

>>               return block->host_cache + offset;
>>           } else {
>>               return block->host + offset;
>> @@ -1462,6 +1474,13 @@ static inline void *host_from_stream_offset(QEMUFile *f,
>>           if (!strncmp(id, block->idstr, sizeof(id)) &&
>>               block->max_length > offset) {
>>               if (ram_cache_enable) {
>> +                unsigned long *bitmap;
>> +                long k = (block->mr->ram_addr + offset) >> TARGET_PAGE_BITS;
>> +
>> +                bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
>> +                if (!test_and_set_bit(k, bitmap)) {
>> +                    migration_dirty_pages++;
>> +                }
>>                   return block->host_cache + offset;
>>               } else {
>>                   return block->host + offset;
>> @@ -1723,6 +1742,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>   int colo_init_ram_cache(void)
>>   {
>>       RAMBlock *block;
>> +    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>>
>>       rcu_read_lock();
>>       QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> @@ -1734,6 +1754,15 @@ int colo_init_ram_cache(void)
>>       }
>>       rcu_read_unlock();
>>       ram_cache_enable = true;
>> +    /*
>> +    * Record the dirty pages that sent by PVM, we use this dirty bitmap together
>> +    * with to decide which page in cache should be flushed into SVM's RAM. Here
>> +    * we use the same name 'migration_bitmap_rcu' as for migration.
>> +    */
>> +    migration_bitmap_rcu = g_new(struct BitmapRcu, 1);
>
> Please update that to g_new0 (I changed the other use when I added postcopy).
>

OK, thanks.

> Dave
>
>> +    migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
>> +    migration_dirty_pages = 0;
>> +
>>       return 0;
>>
>>   out_locked:
>> @@ -1751,9 +1780,15 @@ out_locked:
>>   void colo_release_ram_cache(void)
>>   {
>>       RAMBlock *block;
>> +    struct BitmapRcu *bitmap = migration_bitmap_rcu;
>>
>>       ram_cache_enable = false;
>>
>> +    atomic_rcu_set(&migration_bitmap_rcu, NULL);
>> +    if (bitmap) {
>> +        call_rcu(bitmap, migration_bitmap_free, rcu);
>> +    }
>> +
>>       rcu_read_lock();
>>       QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>>           if (block->host_cache) {
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory
  2015-11-13 16:38   ` Dr. David Alan Gilbert
@ 2015-11-16 12:46     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-16 12:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/14 0:38, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> While the VM is running, the PVM may dirty some pages. We transfer the
>> PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the
>> next checkpoint, so the content of the SVM's RAM cache is always the same
>> as the PVM's memory after a checkpoint.
>>
>> Instead of flushing the whole cached copy of the PVM's RAM into the SVM's
>> memory, we do this in a more efficient way:
>> only flush the pages that were dirtied by the PVM since the last checkpoint.
>> In this way, we can ensure the SVM's memory is the same as the PVM's.
>>
>> Besides, we must flush the RAM cache before loading the device state.
>
> Yes, just a couple of minor comments below; mostly OK.
>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> v10: trace the number of dirty pages that be received.
>> ---
>>   include/migration/colo.h |  1 +
>>   migration/colo.c         |  2 --
>>   migration/ram.c          | 40 ++++++++++++++++++++++++++++++++++++++++
>>   trace-events             |  1 +
>>   4 files changed, 42 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/migration/colo.h b/include/migration/colo.h
>> index 8edd5f1..be2890b 100644
>> --- a/include/migration/colo.h
>> +++ b/include/migration/colo.h
>> @@ -32,4 +32,5 @@ bool migration_incoming_in_colo_state(void);
>>   /* ram cache */
>>   int colo_init_ram_cache(void);
>>   void colo_release_ram_cache(void);
>> +void colo_flush_ram_cache(void);
>>   #endif
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 1339774..0efab21 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -381,8 +381,6 @@ void *colo_process_incoming_thread(void *opaque)
>>           }
>>           qemu_mutex_unlock_iothread();
>>
>> -        /* TODO: flush vm state */
>> -
>>           ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_VMSTATE_LOADED, 0);
>>           if (ret < 0) {
>>               goto out;
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 70879bd..d7e0e37 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1601,6 +1601,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>       int flags = 0, ret = 0;
>>       static uint64_t seq_iter;
>>       int len = 0;
>> +    bool need_flush = false;
>>
>>       seq_iter++;
>>
>> @@ -1669,6 +1670,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                   ret = -EINVAL;
>>                   break;
>>               }
>> +
>> +            need_flush = true;
>>               ch = qemu_get_byte(f);
>>               ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
>>               break;
>> @@ -1679,6 +1682,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                   ret = -EINVAL;
>>                   break;
>>               }
>> +
>> +            need_flush = true;
>>               qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>>               break;
>>           case RAM_SAVE_FLAG_COMPRESS_PAGE:
>> @@ -1711,6 +1716,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                   ret = -EINVAL;
>>                   break;
>>               }
>> +            need_flush = true;
>
> You can probably move the 'need_flush' to the big if near the top of the loop in the
> current version.
>

Good catch, I will fix it in the next version.
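
The suggestion amounts to setting the flag once, in a check shared by every page-carrying record type, instead of once per case. A standalone sketch of that shape follows; the `FLAG_*` values and `stream_needs_flush` are illustrative and not taken from QEMU's RAM_SAVE_FLAG_* definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values; the real RAM_SAVE_FLAG_* constants live in QEMU. */
enum {
    FLAG_COMPRESS = 0x02,
    FLAG_PAGE     = 0x08,
    FLAG_EOS      = 0x10,
    FLAG_XBZRLE   = 0x40,
};

/* Return true if any record in 'flags' carries page data. */
static bool stream_needs_flush(const int *flags, size_t n)
{
    const int page_mask = FLAG_COMPRESS | FLAG_PAGE | FLAG_XBZRLE;
    bool need_flush = false;

    for (size_t i = 0; i < n; i++) {
        if (flags[i] & page_mask) {
            /* One place covers every page-carrying record type. */
            need_flush = true;
        }
        if (flags[i] & FLAG_EOS) {
            break;
        }
    }
    return need_flush;
}
```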

>>               break;
>>           case RAM_SAVE_FLAG_EOS:
>>               /* normal exit */
>> @@ -1730,6 +1736,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>       }
>>
>>       rcu_read_unlock();
>> +
>> +    if (!ret  && ram_cache_enable && need_flush) {
>> +        DPRINTF("Flush ram_cache\n");
>
> trace_

Got it.

>
>> +        colo_flush_ram_cache();
>> +    }
>>       DPRINTF("Completed load of VM with exit code %d seq iteration "
>>               "%" PRIu64 "\n", ret, seq_iter);
>>       return ret;
>> @@ -1799,6 +1810,35 @@ void colo_release_ram_cache(void)
>>       rcu_read_unlock();
>>   }
>>
>> +/*
>> + * Flush content of RAM cache into SVM's memory.
>> + * Only flush the pages that be dirtied by PVM or SVM or both.
>> + */
>> +void colo_flush_ram_cache(void)
>> +{
>> +    RAMBlock *block = NULL;
>> +    void *dst_host;
>> +    void *src_host;
>> +    ram_addr_t  offset = 0;
>> +
>> +    trace_colo_flush_ram_cache(migration_dirty_pages);
>> +    rcu_read_lock();
>> +    block = QLIST_FIRST_RCU(&ram_list.blocks);
>> +    while (block) {
>> +        offset = migration_bitmap_find_and_reset_dirty(block, offset);
>
> You'll need to rework that a little (I split that into
> migration_bitmap_find_dirty and migration_bitmap_clear_dirty)
>

Yes, I have rebased it in my private branch after your post-copy series was merged. ;)

Thanks,
zhanghailiang
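
To illustrate the flush loop once the lookup and the reset are separate steps, here is a standalone sketch over a flat bitmap. `find_dirty`, `clear_dirty` and `flush_cache` are hypothetical names that only mirror the roles of migration_bitmap_find_dirty and migration_bitmap_clear_dirty; their real signatures are not shown in this thread and are not assumed here.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096u
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return the first dirty page index >= start, or npages if none. */
static size_t find_dirty(const unsigned long *map, size_t npages, size_t start)
{
    for (size_t i = start; i < npages; i++) {
        if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))) {
            return i;
        }
    }
    return npages;
}

/* Clear one dirty bit and keep the running dirty-page count in step. */
static void clear_dirty(unsigned long *map, size_t page, size_t *dirty_pages)
{
    map[page / BITS_PER_LONG] &= ~(1UL << (page % BITS_PER_LONG));
    (*dirty_pages)--;
}

/* Copy every dirty page from 'cache' to 'ram'; returns pages copied. */
static size_t flush_cache(unsigned char *ram, const unsigned char *cache,
                          unsigned long *map, size_t npages,
                          size_t *dirty_pages)
{
    size_t copied = 0;
    size_t page = find_dirty(map, npages, 0);

    while (page < npages) {
        clear_dirty(map, page, dirty_pages);
        memcpy(ram + page * PAGE_SIZE_SKETCH, cache + page * PAGE_SIZE_SKETCH,
               PAGE_SIZE_SKETCH);
        copied++;
        page = find_dirty(map, npages, page + 1);
    }
    return copied;
}
```

After a full pass the dirty count should be zero, which is the condition the quoted patch asserts at the end of colo_flush_ram_cache.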

>> +        if (offset >= block->used_length) {
>> +            offset = 0;
>> +            block = QLIST_NEXT_RCU(block, next);
>> +        } else {
>> +            dst_host = block->host + offset;
>> +            src_host = block->host_cache + offset;
>> +            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
>> +        }
>> +    }
>> +    rcu_read_unlock();
>> +    assert(migration_dirty_pages == 0);
>> +}
>> +
>>   static SaveVMHandlers savevm_ram_handlers = {
>>       .save_live_setup = ram_save_setup,
>>       .save_live_iterate = ram_save_iterate,
>> diff --git a/trace-events b/trace-events
>> index ee4679c..c98bc13 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1232,6 +1232,7 @@ qemu_file_fclose(void) ""
>>   migration_bitmap_sync_start(void) ""
>>   migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64""
>>   migration_throttle(void) ""
>> +colo_flush_ram_cache(uint64_t dirty_pages) "dirty_pages %" PRIu64""
>>
>>   # hw/display/qxl.c
>>   disable qxl_interface_set_mm_time(int qid, uint32_t mm_time) "%d %d"
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration
  2015-11-13 16:42   ` Eric Blake
@ 2015-11-16 13:00     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-16 13:00 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

On 2015/11/14 0:42, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> Add a migration state, MIGRATION_STATUS_COLO, which is entered
>> after the first live migration has finished successfully.
>>
>> We reuse the migration thread, so if COLO is enabled by the user, the migration
>> thread will go into the COLO process.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> v10: Simplify process by dropping colo thread and reusing migration thread.
>>       (Dave's suggestion)
>> ---
>
>> +++ b/qapi-schema.json
>> @@ -439,7 +439,7 @@
>>   ##
>>   { 'enum': 'MigrationStatus',
>>     'data': [ 'none', 'setup', 'cancelling', 'cancelled',
>> -            'active', 'completed', 'failed' ] }
>> +            'active', 'completed', 'failed', 'colo' ] }
>>
>
> Missing documentation of the new state, including a '(since 2.6)' tag.
>

Good catch, I will fix it in the next version, thanks.


* Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol
  2015-11-13 16:46   ` Eric Blake
@ 2015-11-17  7:04     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-17  7:04 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, stefanha, amit.shah

On 2015/11/14 0:46, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> We need a user-defined communication protocol to control the checkpoint
>> process.
>>
>> A new checkpoint request is started by the Primary VM, and the interaction
>> proceeds as shown below.
>> Checkpoint synchronization points:
>>
>>                         Primary                         Secondary
>> 'checkpoint-request'   @ ----------------------------->
>>                                                         Suspend (In hybrid mode)
>> 'checkpoint-reply'     <------------------------------ @
>>                         Suspend&Save state
>> 'vmstate-send'         @ ----------------------------->
>>                         Send state                      Receive state
>> 'vmstate-received'     <------------------------------ @
>>                         Release packets                 Load state
>> 'vmstate-loaded'       <------------------------------ @
>>                         Resume                          Resume (In hybrid mode)
>>
>>                         Start Comparing (In hybrid mode)
>> NOTE:
>>   1) '@' marks the side that sends the message.
>>   2) Every sync-point is synchronized by the two sides with only
>>      one handshake (single direction) for low latency.
>>      If stricter synchronization is required, a sync-point in the
>>      opposite direction should be added.
>>   3) Since sync-points are single direction, the remote side may
>>      have gone a long way forward by the time this side receives the sync-point.
>>   4) For now, we only support 'periodic' checkpoint, in which
>>      the Secondary VM is not running; later we will support 'hybrid' mode.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> ---
>> v10:
>> - Rename enum COLOCmd to COLOCommand (Eric's suggestion).
>> - Remove unused 'ram-steal'
>
> Interface review only:
>
>
>> +++ b/qapi-schema.json
>> @@ -702,6 +702,33 @@
>>               '*tls-port': 'int', '*cert-subject': 'str' } }
>>
>>   ##
>> +# @COLOCommand
>> +#
>> +# The colo command
>
> Still might be nice to spell out what COLO means here, but it's fairly
> obvious this will be related to anything else COLO, so I'm not too worried.
>

Hmm, I think the best way to solve the abbreviation problem is to give the full
name where it first appears ;)

>> +#
>> +# @invalid: unknown command
>> +#
>> +# @checkpoint-ready: SVM is ready for checkpointing
>> +#
>> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
>> +#
>> +# @checkpoint-reply: SVM gets PVM's checkpoint request
>> +#
>> +# @vmstate-send: VM's state will be sent by PVM.
>> +#
>> +# @vmstate-size: The total size of VMstate.
>> +#
>> +# @vmstate-received: VM's state has been received by SVM
>> +#
>> +# @vmstate-loaded: VM's state has been loaded by SVM
>> +#
>> +# Since: 2.5
>
> Will need a tweak to say 2.6.  Otherwise looks okay.
>

OK, I will update it. Thanks.

>> +##
>> +{ 'enum': 'COLOCommand',
>> +  'data': [ 'invalid', 'checkpoint-ready', 'checkpoint-request',
>> +            'checkpoint-reply', 'vmstate-send', 'vmstate-size',
>> +            'vmstate-received', 'vmstate-loaded' ] }
>> +
>
>
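
One way to picture a single checkpoint round is as a fixed table of (command, sender) pairs. The sketch below is standalone C with hypothetical names (`Cmd`, `Step`, `round_is_valid`). The enum order follows the quoted COLOCommand schema; placing 'vmstate-size' between 'vmstate-send' and 'vmstate-received' is an assumption drawn from the order in which the incoming thread reads it in patch 14, not something the diagram states.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the order of the quoted COLOCommand enum. */
typedef enum {
    CMD_INVALID,
    CMD_CHECKPOINT_READY,
    CMD_CHECKPOINT_REQUEST,
    CMD_CHECKPOINT_REPLY,
    CMD_VMSTATE_SEND,
    CMD_VMSTATE_SIZE,
    CMD_VMSTATE_RECEIVED,
    CMD_VMSTATE_LOADED,
} Cmd;

typedef enum { FROM_PRIMARY, FROM_SECONDARY } Sender;

typedef struct {
    Cmd cmd;
    Sender sender;
} Step;

/* One checkpoint round, in the order the messages are exchanged. */
static const Step checkpoint_round[] = {
    { CMD_CHECKPOINT_REQUEST, FROM_PRIMARY },
    { CMD_CHECKPOINT_REPLY,   FROM_SECONDARY },
    { CMD_VMSTATE_SEND,       FROM_PRIMARY },
    { CMD_VMSTATE_SIZE,       FROM_PRIMARY },   /* assumed position */
    { CMD_VMSTATE_RECEIVED,   FROM_SECONDARY },
    { CMD_VMSTATE_LOADED,     FROM_SECONDARY },
};

#define ROUND_LEN (sizeof(checkpoint_round) / sizeof(checkpoint_round[0]))

/* Check that 'seen' is exactly one well-ordered checkpoint round. */
static bool round_is_valid(const Step *seen, size_t n)
{
    if (n != ROUND_LEN) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        if (seen[i].cmd != checkpoint_round[i].cmd ||
            seen[i].sender != checkpoint_round[i].sender) {
            return false;
        }
    }
    return true;
}
```

'checkpoint-ready' is left out of the table because the secondary sends it before the checkpoint loop starts, as seen in the incoming thread.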


* Re: [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO
  2015-11-13 16:47   ` Eric Blake
@ 2015-11-17  7:15     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-17  7:15 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah

On 2015/11/14 0:47, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> Guest will enter this state when paused to save/restore VM state
>> under colo checkpoint.
>>
>> Cc: Eric Blake <eblake@redhat.com>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>>   qapi-schema.json | 7 ++++++-
>>   vl.c             | 8 ++++++++
>>   2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 5c4fe6d..49f2a90 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -154,12 +154,15 @@
>>   # @watchdog: the watchdog action is configured to pause and has been triggered
>>   #
>>   # @guest-panicked: guest has been panicked as a result of guest OS panic
>> +#
>> +# @colo: guest is paused to save/restore VM state under colo checkpoint (since
>> +# 2.5)
>
> Will need a tweak to 2.6;
>

OK, I will update it in the next version.

>>   ##
>>   { 'enum': 'RunState',
>>     'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>>               'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>>               'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
>> -            'guest-panicked' ] }
>> +            'guest-panicked', 'colo' ] }
>>
>>   ##
>>   # @StatusInfo:
>> @@ -434,6 +437,8 @@
>>   #
>>   # @failed: some error occurred during migration process.
>>   #
>> +# @colo: VM is in the process of fault tolerance. (since 2.5)
>
> Likewise.  But my R-b still stands after that minor tweak.
>

Thanks.


* Re: [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover
  2015-11-13 16:59   ` Eric Blake
@ 2015-11-17  8:03     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-17  8:03 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Luiz Capitulino

On 2015/11/14 0:59, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> We leave users to choose whatever heartbeat solution they want. If the heartbeat
>> is lost, or they detect other errors, they can use the experimental command
>> 'x_colo_lost_heartbeat' to tell COLO to do failover, and COLO will act
>> accordingly.
>>
>> For example, if the command is sent to the PVM, the Primary side will
>> exit COLO mode and take over operation. If sent to the Secondary, the
>> secondary will run failover work, then take over server operation to
>> become the new Primary.
>>
>> Cc: Luiz Capitulino <lcapitulino@redhat.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> ---
>> v10: Rename command colo_lost_heartbeat to experimental 'x_colo_lost_heartbeat'
>> ---
>
>> @@ -29,6 +30,9 @@ bool migration_incoming_enable_colo(void);
>>   void migration_incoming_exit_colo(void);
>>   void *colo_process_incoming_thread(void *opaque);
>>   bool migration_incoming_in_colo_state(void);
>> +
>> +int get_colo_mode(void);
>
> Should this return an enum type instead of an int?
>
>
>> +++ b/migration/colo-comm.c
>> @@ -20,6 +20,17 @@ typedef struct {
>>
>>   static COLOInfo colo_info;
>>
>> +int get_colo_mode(void)
>> +{
>> +    if (migration_in_colo_state()) {
>> +        return COLO_MODE_PRIMARY;
>> +    } else if (migration_incoming_in_colo_state()) {
>> +        return COLO_MODE_SECONDARY;
>> +    } else {
>> +        return COLO_MODE_UNKNOWN;
>> +    }
>> +}
>
> Particularly since it is always returning values of the same enum.
>
> Not fatal to the patch, so much as a style issue.
>

Seems reasonable. I will fix it in next version.
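
As a sketch of the style point, the same decision can be typed with the enum. This is standalone C: the enum below is written out by hand in the order of the quoted 'COLOMode' schema, the two booleans stand in for migration_in_colo_state() and migration_incoming_in_colo_state(), and `colo_mode_from_state` and `colo_mode_name` are hypothetical names.

```c
#include <stdbool.h>

typedef enum {
    COLO_MODE_UNKNOWN,
    COLO_MODE_PRIMARY,
    COLO_MODE_SECONDARY,
} COLOMode;

/*
 * Same decision as the quoted get_colo_mode(), but typed as COLOMode.
 * The two booleans stand in for migration_in_colo_state() and
 * migration_incoming_in_colo_state().
 */
static COLOMode colo_mode_from_state(bool outgoing_in_colo,
                                     bool incoming_in_colo)
{
    if (outgoing_in_colo) {
        return COLO_MODE_PRIMARY;
    } else if (incoming_in_colo) {
        return COLO_MODE_SECONDARY;
    }
    return COLO_MODE_UNKNOWN;
}

/* A switch over the enum lets -Wswitch flag any mode left unhandled. */
static const char *colo_mode_name(COLOMode mode)
{
    switch (mode) {
    case COLO_MODE_PRIMARY:
        return "primary";
    case COLO_MODE_SECONDARY:
        return "secondary";
    case COLO_MODE_UNKNOWN:
        break;
    }
    return "unknown";
}
```

Returning the enum type costs nothing at run time and lets the compiler warn when a caller's switch misses a mode.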

>
>> +void qmp_x_colo_lost_heartbeat(Error **errp)
>> +{
>> +    if (get_colo_mode() == COLO_MODE_UNKNOWN) {
>> +        error_setg(errp, QERR_FEATURE_DISABLED, "colo");
>> +        return;
>
> We've slowly been trying to get rid of QERR_ usage.  But you aren't the
> first user, and a global cleanup may be better. So I can overlook it for
> now.
>

Yes, there are still several places in QEMU that use 'QERR_FEATURE_DISABLED'.
How should we clean them up? Change it to 'error_setg(errp, "COLO feature is not enabled")' here?

>> +++ b/qapi-schema.json
>> @@ -734,6 +734,32 @@
>>               'checkpoint-reply', 'vmstate-send', 'vmstate-size',
>>               'vmstate-received', 'vmstate-loaded' ] }
>>
>> +##
>> +# @COLOMode
>> +#
>> +# The colo mode
>> +#
>> +# @unknown: unknown mode
>> +#
>> +# @primary: master side
>> +#
>> +# @secondary: slave side
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'enum': 'COLOMode',
>> +  'data': [ 'unknown', 'primary', 'secondary'] }
>> +
>> +##
>> +# @x-colo-lost-heartbeat
>> +#
>> +# Tell qemu that heartbeat is lost, request it to do takeover procedures.
>> +#
>
> The docs here are rather short, compared to your commit message (in
> particular, the fact that it causes a different action depending on
> whether it is sent to primary [takeover] or secondary [failover]).
>

Ok, I will add more comments here. Thanks.

>> +# Since: 2.5
>
> 2.6 in both places.
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically
  2015-11-13 18:34   ` Dr. David Alan Gilbert
@ 2015-11-17  9:11     ` zhanghailiang
  2015-11-17 10:08       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 100+ messages in thread
From: zhanghailiang @ 2015-11-17  9:11 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/14 2:34, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> Do checkpoint periodically, the default interval is 200ms.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> ---
>>   migration/colo.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 0efab21..a6791f4 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -11,12 +11,19 @@
>>    */
>>
>>   #include <unistd.h>
>> +#include "qemu/timer.h"
>>   #include "sysemu/sysemu.h"
>>   #include "migration/colo.h"
>>   #include "trace.h"
>>   #include "qemu/error-report.h"
>>   #include "qemu/sockets.h"
>>
>> +/*
>> + * checkpoint interval: unit ms
>> + * Note: Please change this default value to 10000 when we support hybrid mode.
>> + */
>> +#define CHECKPOINT_MAX_PEROID 200
>
> Why not put the patch that makes this a configurable parameter before this,
> and then we can use it straight away?
>

Do you mean setting this value with the "migrate_set_parameter" command?
I have implemented that in patch 26.

>>   /* colo buffer */
>>   #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>>
>> @@ -183,6 +190,7 @@ out:
>>   static void colo_process_checkpoint(MigrationState *s)
>>   {
>>       QEMUSizedBuffer *buffer = NULL;
>> +    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       int fd, ret = 0;
>>
>>       /* Dup the fd of to_dst_file */
>> @@ -220,11 +228,17 @@ static void colo_process_checkpoint(MigrationState *s)
>>       trace_colo_vm_state_change("stop", "run");
>>
>>       while (s->state == MIGRATION_STATUS_COLO) {
>> +        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> +        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
>> +            g_usleep(100000);
>> +            continue;
>> +        }
>
> I'm a bit concerned at the 100ms wait, when the period is 200ms;
> depending how the times work out, couldn't we end up waiting for just
> under 300ms? - that's a big error - and it's even more weird when
> we make it configurable later.
>

Agreed.

> I don't think we've got a sleep-until, which is a shame; but how
> about something like:
>
>     if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
>         int64_t delay_ms;
>         delay_ms = CHECKPOINT_MAX_PERIOD - (current_time - checkpoint_time);
>         g_usleep (delay_ms * 1000);
>     }
>

That's a reasonable modification. I will fix it like that in next version.

Thanks,
zhanghailiang
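
The arithmetic behind the suggested change can be isolated into a small pure function, sketched below in standalone C. `checkpoint_delay_ms` is a hypothetical name, and waiting one full period when the clock has stepped backwards is an assumption of this sketch, not something discussed in the thread.

```c
#include <stdint.h>

/*
 * Milliseconds still to wait before the next checkpoint is due.
 * Returns 0 when the period has already elapsed.
 */
static int64_t checkpoint_delay_ms(int64_t now_ms, int64_t last_checkpoint_ms,
                                   int64_t period_ms)
{
    int64_t elapsed = now_ms - last_checkpoint_ms;

    if (elapsed < 0) {
        /* The host clock can step backwards; wait one full period. */
        return period_ms;
    }
    return elapsed < period_ms ? period_ms - elapsed : 0;
}
```

The caller would then sleep for the returned delay times 1000 microseconds, as in the g_usleep call suggested above, so the wait never overshoots the period the way a fixed 100ms sleep can.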

> Dave
>
>>           /* start a colo checkpoint */
>>           ret = colo_do_checkpoint_transaction(s, buffer);
>>           if (ret < 0) {
>>               goto out;
>>           }
>> +        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       }
>>
>>   out:
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically
  2015-11-17  9:11     ` zhanghailiang
@ 2015-11-17 10:08       ` Dr. David Alan Gilbert
  2015-11-17 10:29         ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-17 10:08 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> On 2015/11/14 2:34, Dr. David Alan Gilbert wrote:
> >* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>Do checkpoint periodically, the default interval is 200ms.
> >>
> >>Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> >>Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> >>---
> >>  migration/colo.c | 14 ++++++++++++++
> >>  1 file changed, 14 insertions(+)
> >>
> >>diff --git a/migration/colo.c b/migration/colo.c
> >>index 0efab21..a6791f4 100644
> >>--- a/migration/colo.c
> >>+++ b/migration/colo.c
> >>@@ -11,12 +11,19 @@
> >>   */
> >>
> >>  #include <unistd.h>
> >>+#include "qemu/timer.h"
> >>  #include "sysemu/sysemu.h"
> >>  #include "migration/colo.h"
> >>  #include "trace.h"
> >>  #include "qemu/error-report.h"
> >>  #include "qemu/sockets.h"
> >>
> >>+/*
> >>+ * checkpoint interval: unit ms
> >>+ * Note: Please change this default value to 10000 when we support hybrid mode.
> >>+ */
> >>+#define CHECKPOINT_MAX_PEROID 200
> >
> >Why not put the patch that makes this a configurable parameter before this,
> >and then we can use it straight away?
> >
> 
> Do you mean setting this value with the "migrate_set_parameter" command?
> I have implemented that in patch 26.

Yes, I mean reorder the patch series; put the migrate_set_parameter addition
before this patch, and then use it straight away.

Dave

> >>  /* colo buffer */
> >>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> >>
> >>@@ -183,6 +190,7 @@ out:
> >>  static void colo_process_checkpoint(MigrationState *s)
> >>  {
> >>      QEMUSizedBuffer *buffer = NULL;
> >>+    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >>      int fd, ret = 0;
> >>
> >>      /* Dup the fd of to_dst_file */
> >>@@ -220,11 +228,17 @@ static void colo_process_checkpoint(MigrationState *s)
> >>      trace_colo_vm_state_change("stop", "run");
> >>
> >>      while (s->state == MIGRATION_STATUS_COLO) {
> >>+        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >>+        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
> >>+            g_usleep(100000);
> >>+            continue;
> >>+        }
> >
> >I'm a bit concerned at the 100ms wait, when the period is 200ms;
> >depending how the times work out, couldn't we end up waiting for just
> >under 300ms? - that's a big error - and it's even more weird when
> >we make it configurable later.
> >
> 
> Agreed.
> 
> >I don't think we've got a sleep-until, which is a shame; but how
> >about something like:
> >
> >    if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
> >        int64_t delay_ms;
> >        delay_ms = CHECKPOINT_MAX_PERIOD - (current_time - checkpoint_time);
> >        g_usleep (delay_ms * 1000);
> >    }
> >
> 
> That's a reasonable modification. I will fix it like that in next version.
> 
> Thanks,
> zhanghailiang
> 
> >Dave
> >
> >>          /* start a colo checkpoint */
> >>          ret = colo_do_checkpoint_transaction(s, buffer);
> >>          if (ret < 0) {
> >>              goto out;
> >>          }
> >>+        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >>      }
> >>
> >>  out:
> >>--
> >>1.8.3.1
> >>
> >>
> >--
> >Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >
> >.
> >
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint
  2015-11-13 18:53       ` Dr. David Alan Gilbert
@ 2015-11-17 10:20         ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-17 10:20 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/14 2:53, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/11/7 2:59, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> The main job of a checkpoint is to synchronize the SVM with the PVM.
>>>> The VM's state includes RAM and device state, so we migrate the PVM's
>>>> state to the SVM when doing a checkpoint, just like migration does.
>>>>
>>>> We cache the PVM's state on the secondary side, using a QEMUSizedBuffer
>>>> to store the data. We need to know the size of the VM state, so on the
>>>> primary side we store the VM state in a qsb temporarily, get the data size
>>>> by calling qsb_get_length(), and then migrate the data to the qsb on the
>>>> secondary side.
>>>>
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> ---
>>>>   migration/colo.c   | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++----
>>>>   migration/ram.c    | 47 +++++++++++++++++++++++++++++--------
>>>>   migration/savevm.c |  2 +-
>>>>   3 files changed, 101 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/migration/colo.c b/migration/colo.c
>>>> index 2510762..b865513 100644
>>>> --- a/migration/colo.c
>>>> +++ b/migration/colo.c
>>>> @@ -17,6 +17,9 @@
>>>>   #include "qemu/error-report.h"
>>>>   #include "qemu/sockets.h"
>>>>
>>>> +/* colo buffer */
>>>> +#define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>>>> +
>>>>   bool colo_supported(void)
>>>>   {
>>>>       return true;
>>>> @@ -94,9 +97,12 @@ static int colo_ctl_get(QEMUFile *f, uint32_t require)
>>>>       return value;
>>>>   }
>>>>
>>>> -static int colo_do_checkpoint_transaction(MigrationState *s)
>>>> +static int colo_do_checkpoint_transaction(MigrationState *s,
>>>> +                                          QEMUSizedBuffer *buffer)
>>>>   {
>>>>       int ret;
>>>> +    size_t size;
>>>> +    QEMUFile *trans = NULL;
>>>>
>>>>       ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_CHECKPOINT_REQUEST, 0);
>>>>       if (ret < 0) {
>>>> @@ -107,15 +113,47 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
>>>>       if (ret < 0) {
>>>>           goto out;
>>>>       }
>>>> +    /* Reset colo buffer and open it for write */
>>>> +    qsb_set_length(buffer, 0);
>>>> +    trans = qemu_bufopen("w", buffer);
>>>> +    if (!trans) {
>>>> +        error_report("Open colo buffer for write failed");
>>>> +        goto out;
>>>> +    }
>>>>
>>>> -    /* TODO: suspend and save vm state to colo buffer */
>>>> +    qemu_mutex_lock_iothread();
>>>> +    vm_stop_force_state(RUN_STATE_COLO);
>>>> +    qemu_mutex_unlock_iothread();
>>>> +    trace_colo_vm_state_change("run", "stop");
>>>> +
>>>> +    /* Disable block migration */
>>>> +    s->params.blk = 0;
>>>> +    s->params.shared = 0;
>>>> +    qemu_savevm_state_header(trans);
>>>> +    qemu_savevm_state_begin(trans, &s->params);
>>>> +    qemu_mutex_lock_iothread();
>>>> +    qemu_savevm_state_complete(trans);
>>>> +    qemu_mutex_unlock_iothread();
>>>> +
>>>> +    qemu_fflush(trans);
>>>>
>>>>       ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SEND, 0);
>>>>       if (ret < 0) {
>>>>           goto out;
>>>>       }
>>>> +    /* we send the total size of the vmstate first */
>>>> +    size = qsb_get_length(buffer);
>>>> +    ret = colo_ctl_put(s->to_dst_file, COLO_COMMAND_VMSTATE_SIZE, size);
>>>> +    if (ret < 0) {
>>>> +        goto out;
>>>> +    }
>>>>
>>>> -    /* TODO: send vmstate to Secondary */
>>>> +    qsb_put_buffer(s->to_dst_file, buffer, size);
>>>> +    qemu_fflush(s->to_dst_file);
>>>> +    ret = qemu_file_get_error(s->to_dst_file);
>>>> +    if (ret < 0) {
>>>> +        goto out;
>>>> +    }
>>>>
>>>>       ret = colo_ctl_get(s->from_dst_file, COLO_COMMAND_VMSTATE_RECEIVED);
>>>>       if (ret < 0) {
>>>> @@ -127,14 +165,24 @@ static int colo_do_checkpoint_transaction(MigrationState *s)
>>>>           goto out;
>>>>       }
>>>>
>>>> -    /* TODO: resume Primary */
>>>> +    ret = 0;
>>>> +    /* resume master */
>>>> +    qemu_mutex_lock_iothread();
>>>> +    vm_start();
>>>> +    qemu_mutex_unlock_iothread();
>>>> +    trace_colo_vm_state_change("stop", "run");
>>>>
>>>>   out:
>>>> +    if (trans) {
>>>> +        qemu_fclose(trans);
>>>> +    }
>>>> +
>>>>       return ret;
>>>>   }
>>>>
>>>>   static void colo_process_checkpoint(MigrationState *s)
>>>>   {
>>>> +    QEMUSizedBuffer *buffer = NULL;
>>>>       int fd, ret = 0;
>>>>
>>>>       /* Dup the fd of to_dst_file */
>>>> @@ -159,6 +207,13 @@ static void colo_process_checkpoint(MigrationState *s)
>>>>           goto out;
>>>>       }
>>>>
>>>> +    buffer = qsb_create(NULL, COLO_BUFFER_BASE_SIZE);
>>>> +    if (buffer == NULL) {
>>>> +        ret = -ENOMEM;
>>>> +        error_report("Failed to allocate buffer!");
>>>
>>> Please say 'Failed to allocate colo buffer'; QEMU has lots and lots of buffers.
>>>
>>
>> OK, will fix it in next version.
>>
>>>> +        goto out;
>>>> +    }
>>>> +
>>>>       qemu_mutex_lock_iothread();
>>>>       vm_start();
>>>>       qemu_mutex_unlock_iothread();
>>>> @@ -166,7 +221,7 @@ static void colo_process_checkpoint(MigrationState *s)
>>>>
>>>>       while (s->state == MIGRATION_STATUS_COLO) {
>>>>           /* start a colo checkpoint */
>>>> -        ret = colo_do_checkpoint_transaction(s);
>>>> +        ret = colo_do_checkpoint_transaction(s, buffer);
>>>>           if (ret < 0) {
>>>>               goto out;
>>>>           }
>>>> @@ -179,6 +234,9 @@ out:
>>>>       migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>>>>                         MIGRATION_STATUS_COMPLETED);
>>>>
>>>> +    qsb_free(buffer);
>>>> +    buffer = NULL;
>>>> +
>>>>       if (s->from_dst_file) {
>>>>           qemu_fclose(s->from_dst_file);
>>>>       }
>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>> index a25bcc7..5784c15 100644
>>>> --- a/migration/ram.c
>>>> +++ b/migration/ram.c
>>>> @@ -38,6 +38,7 @@
>>>>   #include "trace.h"
>>>>   #include "exec/ram_addr.h"
>>>>   #include "qemu/rcu_queue.h"
>>>> +#include "migration/colo.h"
>>>>
>>>>   #ifdef DEBUG_MIGRATION_RAM
>>>>   #define DPRINTF(fmt, ...) \
>>>> @@ -1165,15 +1166,8 @@ void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
>>>>       }
>>>>   }
>>>>
>>>> -/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>>>> - * long-running RCU critical section.  When rcu-reclaims in the code
>>>> - * start to become numerous it will be necessary to reduce the
>>>> - * granularity of these critical sections.
>>>> - */
>>>> -
>>>> -static int ram_save_setup(QEMUFile *f, void *opaque)
>>>> +static int ram_save_init_globals(void)
>>>>   {
>>>> -    RAMBlock *block;
>>>>       int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
>>>>
>>>>       dirty_rate_high_cnt = 0;
>>>> @@ -1233,6 +1227,31 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>>>       migration_bitmap_sync();
>>>>       qemu_mutex_unlock_ramlist();
>>>>       qemu_mutex_unlock_iothread();
>>>> +    rcu_read_unlock();
>>>> +
>>>> +    return 0;
>>>> +}
>>>
>>> It surprises me you want migration_bitmap_sync in ram_save_init_globals(),
>>> but I guess you want the first sync at the start.
>>>
>>
>> Er, sorry, I don't quite understand.
>> Here I just split part of the code of ram_save_setup()
>> into a helper function, ram_save_init_globals(), to make it clearer.
>> We can't do the initialization work twice. Is there anything wrong?
>
> No, that's OK - it just seemed odd for a function like 'init_globals'
> to do such a big side effect of doing the sync; but yes, it makes sense
> since it's just a split.
>
>>>> +/* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>>>> + * long-running RCU critical section.  When rcu-reclaims in the code
>>>> + * start to become numerous it will be necessary to reduce the
>>>> + * granularity of these critical sections.
>>>> + */
>>>> +
>>>> +static int ram_save_setup(QEMUFile *f, void *opaque)
>>>> +{
>>>> +    RAMBlock *block;
>>>> +
>>>> +    /*
>>>> +     * migration has already setup the bitmap, reuse it.
>>>> +     */
>>>> +    if (!migration_in_colo_state()) {
>>>> +        if (ram_save_init_globals() < 0) {
>>>> +            return -1;
>>>> +         }
>>>> +    }
>>>> +
>>>> +    rcu_read_lock();
>>>>
>>>>       qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>>>>
>>>> @@ -1332,7 +1351,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>>>       while (true) {
>>>>           int pages;
>>>>
>>>> -        pages = ram_find_and_save_block(f, true, &bytes_transferred);
>>>> +        pages = ram_find_and_save_block(f, !migration_in_colo_state(),
>>>> +                                        &bytes_transferred);
>>>>           /* no more blocks to sent */
>>>>           if (pages == 0) {
>>>>               break;
>>>> @@ -1343,8 +1363,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>>>       ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>>>>
>>>>       rcu_read_unlock();
>>>> +    /*
>>>> +     * Since we need to reuse dirty bitmap in colo,
>>>> +     * don't cleanup the bitmap.
>>>> +     */
>>>> +    if (!migrate_colo_enabled() ||
>>>> +        migration_has_failed(migrate_get_current())) {
>>>> +        migration_end();
>>>> +    }
>>>>
>>>> -    migration_end();
>>>>       qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>>>>
>>>>       return 0;
>>>> diff --git a/migration/savevm.c b/migration/savevm.c
>>>> index dbcc39a..0faf12b 100644
>>>> --- a/migration/savevm.c
>>>> +++ b/migration/savevm.c
>>>> @@ -48,7 +48,7 @@
>>>>   #include "qemu/iov.h"
>>>>   #include "block/snapshot.h"
>>>>   #include "block/qapi.h"
>>>> -
>>>> +#include "migration/colo.h"
>>>>
>>>>   #ifndef ETH_P_RARP
>>>>   #define ETH_P_RARP 0x8035
>>>
>>> Wrong patch?
>>>
>>
>> No, we call migration_in_colo_state() in qemu_savevm_state_begin(),
>> so we have to include "migration/colo.h".
>
> I don't think you use it in savevm.c until patch 30, so you can add
> the #include in patch 30 (or whichever is the patch that first needs it).
>

Ha, I see what you mean. And yes, you are right:
we shouldn't call migration_in_colo_state() in qemu_savevm_state_begin() here.
Good catch. I will fix it in the next version. Thanks.


> Dave
>
>
>>
>>>
>>> So other than those minor things:
>>>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>
>>> but watch out for the recent changes to migrate_end that went in
>>> a few days ago.
>>>
>>
>> Thanks for reminding me, I have rebased that. ;)
>>
>> zhanghailiang
>>
>>> Dave
>>>
>>>> --
>>>> 1.8.3.1
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>
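
The "send the total size first, then the data" step in the quoted patch
(COLO_COMMAND_VMSTATE_SIZE followed by qsb_put_buffer()) is a form of
length-prefixed framing. A minimal sketch of that idea follows, assuming an
8-byte big-endian length field; frame_put() and frame_get() are hypothetical
names used only for illustration and are not QEMU functions, and the real
patch carries the size inside a COLO control command rather than a bare
prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU API): write an 8-byte big-endian length
 * followed by the payload. Returns the number of bytes written, or 0 if
 * the output buffer is too small.
 */
static size_t frame_put(uint8_t *out, size_t out_cap,
                        const uint8_t *payload, size_t len)
{
    assert(out != NULL && (payload != NULL || len == 0));

    if (out_cap < 8 || len > out_cap - 8) {
        return 0;
    }
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)((uint64_t)len >> (56 - 8 * i));
    }
    if (len > 0) {
        memcpy(out + 8, payload, len);
    }
    return 8 + len;
}

/*
 * Hypothetical helper (not a QEMU API): parse a frame produced by
 * frame_put(). Returns 0 on success and -1 if the input is shorter than
 * the length field claims, so the receiver never reads past the data.
 */
static int frame_get(const uint8_t *in, size_t in_len,
                     const uint8_t **payload, size_t *len)
{
    uint64_t size = 0;

    if (in_len < 8) {
        return -1;
    }
    for (int i = 0; i < 8; i++) {
        size = (size << 8) | in[i];
    }
    if (size > in_len - 8) {
        return -1; /* truncated frame */
    }
    *payload = in + 8;
    *len = (size_t)size;
    return 0;
}
```

Knowing the size up front is what lets the secondary side allocate or resize
its own buffer and detect a short read before it tries to load the state.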


* Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically
  2015-11-17 10:08       ` Dr. David Alan Gilbert
@ 2015-11-17 10:29         ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-17 10:29 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/17 18:08, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/11/14 2:34, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> Do checkpoint periodically, the default interval is 200ms.
>>>>
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> ---
>>>>   migration/colo.c | 14 ++++++++++++++
>>>>   1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/migration/colo.c b/migration/colo.c
>>>> index 0efab21..a6791f4 100644
>>>> --- a/migration/colo.c
>>>> +++ b/migration/colo.c
>>>> @@ -11,12 +11,19 @@
>>>>    */
>>>>
>>>>   #include <unistd.h>
>>>> +#include "qemu/timer.h"
>>>>   #include "sysemu/sysemu.h"
>>>>   #include "migration/colo.h"
>>>>   #include "trace.h"
>>>>   #include "qemu/error-report.h"
>>>>   #include "qemu/sockets.h"
>>>>
>>>> +/*
>>>> + * checkpoint interval: unit ms
>>>> + * Note: Please change this default value to 10000 when we support hybrid mode.
>>>> + */
>>>> +#define CHECKPOINT_MAX_PEROID 200
>>>
>>> Why not put the patch that makes this a configurable parameter before this,
>>> and then we can use it straight away?
>>>
>>
>> Do you mean setting this value with the "migrate_set_parameter" command?
>> I have implemented that in patch 26.
>
> Yes, I mean reorder the patch series; put the migrate_set_parameter addition
> before this patch, and then use it straight away.

OK, I will reorder them, thanks.

>
>>>>   /* colo buffer */
>>>>   #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>>>>
>>>> @@ -183,6 +190,7 @@ out:
>>>>   static void colo_process_checkpoint(MigrationState *s)
>>>>   {
>>>>       QEMUSizedBuffer *buffer = NULL;
>>>> +    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>>>       int fd, ret = 0;
>>>>
>>>>       /* Dup the fd of to_dst_file */
>>>> @@ -220,11 +228,17 @@ static void colo_process_checkpoint(MigrationState *s)
>>>>       trace_colo_vm_state_change("stop", "run");
>>>>
>>>>       while (s->state == MIGRATION_STATUS_COLO) {
>>>> +        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>>> +        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
>>>> +            g_usleep(100000);
>>>> +            continue;
>>>> +        }
>>>
>>> I'm a bit concerned at the 100ms wait, when the period is 200ms;
>>> depending how the times work out, couldn't we end up waiting for just
>>> under 300ms? - that's a big error - and it's even more weird when
>>> we make it configurable later.
>>>
>>
>> Agreed.
>>
>>> I don't think we've got a sleep-until, which is a shame; but how
>>> about something like:
>>>
>>>     if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
>>>         int64_t delay_ms;
>>>         delay_ms = CHECKPOINT_MAX_PEROID - (current_time - checkpoint_time);
>>>         g_usleep (delay_ms * 1000);
>>>     }
>>>
>>
>> That's a reasonable modification. I will fix it like that in next version.
>>
>> Thanks,
>> zhanghailiang
>>
>>> Dave
>>>
>>>>           /* start a colo checkpoint */
>>>>           ret = colo_do_checkpoint_transaction(s, buffer);
>>>>           if (ret < 0) {
>>>>               goto out;
>>>>           }
>>>> +        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>>>       }
>>>>
>>>>   out:
>>>> --
>>>> 1.8.3.1
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>
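
The "sleep only for the remainder of the period" suggestion quoted above can
be sketched as a small pure helper. This is an illustration under
assumptions: checkpoint_delay_ms() and its parameter names are hypothetical
and not part of the patch, and the branch for a clock that went backwards is
an extra guard added here (a host wall clock can be stepped), not something
the thread discusses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): return how many
 * milliseconds to sleep so that checkpoints start period_ms apart.
 */
static int64_t checkpoint_delay_ms(int64_t now_ms,
                                   int64_t last_checkpoint_ms,
                                   int64_t period_ms)
{
    int64_t elapsed;

    assert(period_ms > 0);

    elapsed = now_ms - last_checkpoint_ms;
    if (elapsed < 0) {
        /* Clock stepped backwards: wait at most one full period. */
        return period_ms;
    }
    /* Sleep only for what is left of the period, never a fixed 100ms. */
    return elapsed < period_ms ? period_ms - elapsed : 0;
}
```

In the checkpoint loop the result would presumably be passed to
g_usleep(delay * 1000) before the next checkpoint starts, which keeps the
worst-case interval close to the configured period instead of up to 100ms
past it.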


* Re: [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process zhanghailiang
@ 2015-11-20 15:51   ` Dr. David Alan Gilbert
  2015-11-23  5:56     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Dr. David Alan Gilbert @ 2015-11-20 15:51 UTC (permalink / raw)
  To: zhanghailiang
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> When handling failover, we do different things according to the different stage
> of failover process, here we introduce a global atomic variable to record the
> status of failover.
> 
> We add four failover status to indicate the different stage of failover process.
> You should use the helpers to get and set the value.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>  include/migration/failover.h | 10 ++++++++++
>  migration/colo-failover.c    | 37 +++++++++++++++++++++++++++++++++++++
>  migration/colo.c             |  4 ++++
>  trace-events                 |  1 +
>  4 files changed, 52 insertions(+)
> 
> diff --git a/include/migration/failover.h b/include/migration/failover.h
> index 1785b52..882c625 100644
> --- a/include/migration/failover.h
> +++ b/include/migration/failover.h
> @@ -15,6 +15,16 @@
>  
>  #include "qemu-common.h"
>  
> +typedef enum COLOFailoverStatus {
> +    FAILOVER_STATUS_NONE = 0,
> +    FAILOVER_STATUS_REQUEST = 1, /* Request but not handled */
> +    FAILOVER_STATUS_HANDLING = 2, /* In the process of handling failover */
> +    FAILOVER_STATUS_COMPLETED = 3, /* Finish the failover process */
> +} COLOFailoverStatus;

OK - there's a couple of typos later, but other than those:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> +
> +void failover_init_state(void);
> +int failover_set_state(int old_state, int new_state);
> +int failover_get_state(void);
>  void failover_request_active(Error **errp);
>  
>  #endif
> diff --git a/migration/colo-failover.c b/migration/colo-failover.c
> index e3897c6..ae06c16 100644
> --- a/migration/colo-failover.c
> +++ b/migration/colo-failover.c
> @@ -14,22 +14,59 @@
>  #include "migration/failover.h"
>  #include "qmp-commands.h"
>  #include "qapi/qmp/qerror.h"
> +#include "qemu/error-report.h"
> +#include "trace.h"
>  
>  static QEMUBH *failover_bh;
> +static COLOFailoverStatus failover_state;
>  
>  static void colo_failover_bh(void *opaque)
>  {
> +    int old_state;
> +
>      qemu_bh_delete(failover_bh);
>      failover_bh = NULL;
> +    old_state = failover_set_state(FAILOVER_STATUS_REQUEST,
> +                                   FAILOVER_STATUS_HANDLING);
> +    if (old_state != FAILOVER_STATUS_REQUEST) {
> +        error_report(" Unkown error for failover, old_state=%d", old_state);

Typo 'Unkown'

> +        return;
> +    }
>      /*TODO: Do failover work */
>  }
>  
>  void failover_request_active(Error **errp)
>  {
> +   if (failover_set_state(FAILOVER_STATUS_NONE, FAILOVER_STATUS_REQUEST)
> +         != FAILOVER_STATUS_NONE) {
> +        error_setg(errp, "COLO failover is already actived");
> +        return;
> +    }
>      failover_bh = qemu_bh_new(colo_failover_bh, NULL);
>      qemu_bh_schedule(failover_bh);
>  }
>  
> +void failover_init_state(void)
> +{
> +    failover_state = FAILOVER_STATUS_NONE;
> +}
> +
> +int failover_set_state(int old_state, int new_state)
> +{
> +    int old;
> +
> +    old = atomic_cmpxchg(&failover_state, old_state, new_state);;

Typo double ;;

> +    if (old == old_state) {
> +        trace_colo_failover_set_state(new_state);
> +    }
> +    return old;
> +}
> +
> +int failover_get_state(void)
> +{
> +    return atomic_read(&failover_state);
> +}
> +
>  void qmp_x_colo_lost_heartbeat(Error **errp)
>  {
>      if (get_colo_mode() == COLO_MODE_UNKNOWN) {
> diff --git a/migration/colo.c b/migration/colo.c
> index 64daee9..7732f60 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -194,6 +194,8 @@ static void colo_process_checkpoint(MigrationState *s)
>      int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      int fd, ret = 0;
>  
> +    failover_init_state();
> +
>      /* Dup the fd of to_dst_file */
>      fd = dup(qemu_get_fd(s->to_dst_file));
>      if (fd == -1) {
> @@ -310,6 +312,8 @@ void *colo_process_incoming_thread(void *opaque)
>      migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>                        MIGRATION_STATUS_COLO);
>  
> +    failover_init_state();
> +
>      fd = dup(qemu_get_fd(mis->from_src_file));
>      if (fd < 0) {
>          ret = -errno;
> diff --git a/trace-events b/trace-events
> index c98bc13..61e89c7 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1502,6 +1502,7 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
>  colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
>  colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
>  colo_ctl_get(const char *msg) "Receive '%s' cmd"
> +colo_failover_set_state(int new_state) "new state %d"
>  
>  # kvm-all.c
>  kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
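
The state handling in the patch above relies on a compare-and-swap that
returns the previous value, so a caller can tell whether its own transition
won. A minimal standalone sketch of the same pattern is below, written with
C11 <stdatomic.h> instead of QEMU's atomic helpers; that substitution is an
assumption made so the example builds outside the QEMU tree. The function
names mirror the patch, but this is not the patch's code.

```c
#include <assert.h>
#include <stdatomic.h>

enum {
    FAILOVER_STATUS_NONE = 0,
    FAILOVER_STATUS_REQUEST = 1,   /* requested but not handled */
    FAILOVER_STATUS_HANDLING = 2,  /* failover in progress */
    FAILOVER_STATUS_COMPLETED = 3, /* failover finished */
};

static atomic_int failover_state = FAILOVER_STATUS_NONE;

/*
 * Move from old_state to new_state only if the current state is
 * old_state. Returns the state seen before the call: equal to old_state
 * when the transition happened, the actual current state otherwise.
 */
static int failover_set_state(int old_state, int new_state)
{
    int expected = old_state;

    assert(new_state >= FAILOVER_STATUS_NONE &&
           new_state <= FAILOVER_STATUS_COMPLETED);

    /* On failure, C11 stores the observed value into 'expected'. */
    atomic_compare_exchange_strong(&failover_state, &expected, new_state);
    return expected;
}

static int failover_get_state(void)
{
    return atomic_load(&failover_state);
}
```

With this shape, a second failover request made while the first is still
pending gets FAILOVER_STATUS_REQUEST back rather than FAILOVER_STATUS_NONE,
which is the condition failover_request_active() reports as an error.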


* Re: [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error
  2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error zhanghailiang
@ 2015-11-20 21:50   ` Eric Blake
  2015-11-23  6:01     ` zhanghailiang
  0 siblings, 1 reply; 100+ messages in thread
From: Eric Blake @ 2015-11-20 21:50 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Michael Roth


On 11/03/2015 04:56 AM, zhanghailiang wrote:
> If some errors happen during VM's COLO FT stage, it's important to notify the users
> of this event. Together with 'colo_lost_heartbeat', users can intervene in COLO's
> failover work immediately.
> If users don't want to get involved in COLO's failover verdict,
> it is still necessary to notify users that we exit COLO mode.

s/exit/exited/

> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
>  docs/qmp-events.txt | 17 +++++++++++++++++
>  migration/colo.c    | 13 +++++++++++++
>  qapi-schema.json    | 16 ++++++++++++++++
>  qapi/event.json     | 17 +++++++++++++++++
>  4 files changed, 63 insertions(+)
> 
> diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
> index d2f1ce4..165dd76 100644
> --- a/docs/qmp-events.txt
> +++ b/docs/qmp-events.txt
> @@ -184,6 +184,23 @@ Example:
>  Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
>  event.
>  
> +COLO_EXIT
> +---------
> +
> +Emitted when VM finishes COLO mode due to some errors happening or
> +the request of users.

s/the/at the/


> +++ b/qapi-schema.json
> @@ -751,6 +751,22 @@
>    'data': [ 'unknown', 'primary', 'secondary'] }
>  
>  ##
> +# @COLOExitReason
> +#
> +# The reason of COLO exit

s/of/for a/

> +#
> +# @unknow: unknown reason

s/unknow/unknown/

> +#
> +# @request: COLO exit is due to an external request
> +#
> +# @error: COLO exit is due to an internal error
> +#
> +# Since: 2.5

2.6 (but you already know that throughout the series, so I'll quit
pointing it out)


> +++ b/qapi/event.json
> @@ -255,6 +255,23 @@
>    'data': {'status': 'MigrationStatus'}}
>  
>  ##
> +# @COLO_EXIT
> +#
> +# Emitted when VM finishes COLO mode due to some errors happening or
> +# the request of users.

s/the/at the/

> +#
> +# @mode: @COLOMode describing which side of VM is exit.

Maybe:

@mode: Which COLO mode the VM was in when it exited.

> +#
> +# @reason: @COLOExitReason describing the reason of colo exit.

@reason: describes the reason for the COLO exit.

> +#
> +# @error: #optional, error message. Only present on error happening.
> +#
> +# Since: 2.5
> +##
> +{ 'event': 'COLO_EXIT',
> +  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason', '*error': 'str' } }

Other than typos, the interface seems okay.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




* Re: [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process
  2015-11-20 15:51   ` Dr. David Alan Gilbert
@ 2015-11-23  5:56     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-23  5:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	qemu-devel, arei.gonglei, stefanha, amit.shah

On 2015/11/20 23:51, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> When handling failover, we do different things according to the different stage
>> of failover process, here we introduce a global atomic variable to record the
>> status of failover.
>>
>> We add four failover status to indicate the different stage of failover process.
>> You should use the helpers to get and set the value.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   include/migration/failover.h | 10 ++++++++++
>>   migration/colo-failover.c    | 37 +++++++++++++++++++++++++++++++++++++
>>   migration/colo.c             |  4 ++++
>>   trace-events                 |  1 +
>>   4 files changed, 52 insertions(+)
>>
>> diff --git a/include/migration/failover.h b/include/migration/failover.h
>> index 1785b52..882c625 100644
>> --- a/include/migration/failover.h
>> +++ b/include/migration/failover.h
>> @@ -15,6 +15,16 @@
>>
>>   #include "qemu-common.h"
>>
>> +typedef enum COLOFailoverStatus {
>> +    FAILOVER_STATUS_NONE = 0,
>> +    FAILOVER_STATUS_REQUEST = 1, /* Request but not handled */
>> +    FAILOVER_STATUS_HANDLING = 2, /* In the process of handling failover */
>> +    FAILOVER_STATUS_COMPLETED = 3, /* Finish the failover process */
>> +} COLOFailoverStatus;
>
> OK - there's a couple of typo's later, but other than those:
>

I will fix them all in the next version, thanks.

> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
>> +
>> +void failover_init_state(void);
>> +int failover_set_state(int old_state, int new_state);
>> +int failover_get_state(void);
>>   void failover_request_active(Error **errp);
>>
>>   #endif
>> diff --git a/migration/colo-failover.c b/migration/colo-failover.c
>> index e3897c6..ae06c16 100644
>> --- a/migration/colo-failover.c
>> +++ b/migration/colo-failover.c
>> @@ -14,22 +14,59 @@
>>   #include "migration/failover.h"
>>   #include "qmp-commands.h"
>>   #include "qapi/qmp/qerror.h"
>> +#include "qemu/error-report.h"
>> +#include "trace.h"
>>
>>   static QEMUBH *failover_bh;
>> +static COLOFailoverStatus failover_state;
>>
>>   static void colo_failover_bh(void *opaque)
>>   {
>> +    int old_state;
>> +
>>       qemu_bh_delete(failover_bh);
>>       failover_bh = NULL;
>> +    old_state = failover_set_state(FAILOVER_STATUS_REQUEST,
>> +                                   FAILOVER_STATUS_HANDLING);
>> +    if (old_state != FAILOVER_STATUS_REQUEST) {
>> +        error_report(" Unkown error for failover, old_state=%d", old_state);
>
> Typo 'Unkown'
>
>> +        return;
>> +    }
>>       /*TODO: Do failover work */
>>   }
>>
>>   void failover_request_active(Error **errp)
>>   {
>> +   if (failover_set_state(FAILOVER_STATUS_NONE, FAILOVER_STATUS_REQUEST)
>> +         != FAILOVER_STATUS_NONE) {
>> +        error_setg(errp, "COLO failover is already actived");
>> +        return;
>> +    }
>>       failover_bh = qemu_bh_new(colo_failover_bh, NULL);
>>       qemu_bh_schedule(failover_bh);
>>   }
>>
>> +void failover_init_state(void)
>> +{
>> +    failover_state = FAILOVER_STATUS_NONE;
>> +}
>> +
>> +int failover_set_state(int old_state, int new_state)
>> +{
>> +    int old;
>> +
>> +    old = atomic_cmpxchg(&failover_state, old_state, new_state);;
>
> Typo double ;;
>
>> +    if (old == old_state) {
>> +        trace_colo_failover_set_state(new_state);
>> +    }
>> +    return old;
>> +}
>> +
>> +int failover_get_state(void)
>> +{
>> +    return atomic_read(&failover_state);
>> +}
>> +
>>   void qmp_x_colo_lost_heartbeat(Error **errp)
>>   {
>>       if (get_colo_mode() == COLO_MODE_UNKNOWN) {
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 64daee9..7732f60 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -194,6 +194,8 @@ static void colo_process_checkpoint(MigrationState *s)
>>       int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>       int fd, ret = 0;
>>
>> +    failover_init_state();
>> +
>>       /* Dup the fd of to_dst_file */
>>       fd = dup(qemu_get_fd(s->to_dst_file));
>>       if (fd == -1) {
>> @@ -310,6 +312,8 @@ void *colo_process_incoming_thread(void *opaque)
>>       migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>>                         MIGRATION_STATUS_COLO);
>>
>> +    failover_init_state();
>> +
>>       fd = dup(qemu_get_fd(mis->from_src_file));
>>       if (fd < 0) {
>>           ret = -errno;
>> diff --git a/trace-events b/trace-events
>> index c98bc13..61e89c7 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1502,6 +1502,7 @@ rdma_start_outgoing_migration_after_rdma_source_init(void) ""
>>   colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
>>   colo_ctl_put(const char *msg, uint64_t value) "Send '%s' cmd, value: %" PRIu64""
>>   colo_ctl_get(const char *msg) "Receive '%s' cmd"
>> +colo_failover_set_state(int new_state) "new state %d"
>>
>>   # kvm-all.c
>>   kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
>> --
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>


* Re: [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error
  2015-11-20 21:50   ` Eric Blake
@ 2015-11-23  6:01     ` zhanghailiang
  0 siblings, 0 replies; 100+ messages in thread
From: zhanghailiang @ 2015-11-23  6:01 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: lizhijian, quintela, Markus Armbruster, yunhong.jiang,
	eddie.dong, peter.huangpeng, dgilbert, arei.gonglei, stefanha,
	amit.shah, Michael Roth

On 2015/11/21 5:50, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> If some errors happen during VM's COLO FT stage, it's important to notify the users
>> of this event. Together with 'colo_lost_heartbeat', users can intervene in COLO's
>> failover work immediately.
>> If users don't want to get involved in COLO's failover verdict,
>> it is still necessary to notify users that we exit COLO mode.
>
> s/exit/exited/
>
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> ---
>>   docs/qmp-events.txt | 17 +++++++++++++++++
>>   migration/colo.c    | 13 +++++++++++++
>>   qapi-schema.json    | 16 ++++++++++++++++
>>   qapi/event.json     | 17 +++++++++++++++++
>>   4 files changed, 63 insertions(+)
>>
>> diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
>> index d2f1ce4..165dd76 100644
>> --- a/docs/qmp-events.txt
>> +++ b/docs/qmp-events.txt
>> @@ -184,6 +184,23 @@ Example:
>>   Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
>>   event.
>>
>> +COLO_EXIT
>> +---------
>> +
>> +Emitted when VM finishes COLO mode due to some errors happening or
>> +the request of users.
>
> s/the/at the/
>
>
>> +++ b/qapi-schema.json
>> @@ -751,6 +751,22 @@
>>     'data': [ 'unknown', 'primary', 'secondary'] }
>>
>>   ##
>> +# @COLOExitReason
>> +#
>> +# The reason of COLO exit
>
> s/of/for a/
>
>> +#
>> +# @unknow: unknown reason
>
> s/unknow/unknown/
>
>> +#
>> +# @request: COLO exit is due to an external request
>> +#
>> +# @error: COLO exit is due to an internal error
>> +#
>> +# Since: 2.5
>
> 2.6 (but you already know that throughout the series, so I'll quit
> pointing it out)
>
>
>> +++ b/qapi/event.json
>> @@ -255,6 +255,23 @@
>>     'data': {'status': 'MigrationStatus'}}
>>
>>   ##
>> +# @COLO_EXIT
>> +#
>> +# Emitted when VM finishes COLO mode due to some errors happening or
>> +# the request of users.
>
> s/the/at the/
>
>> +#
>> +# @mode: @COLOMode describing which side of VM is exit.
>
> Maybe:
>
> @mode: Which COLO mode the VM was in when it exited.
>
>> +#
>> +# @reason: @COLOExitReason describing the reason of colo exit.
>
> @reason: describes the reason for the COLO exit.
>
>> +#
>> +# @error: #optional, error message. Only present on error happening.
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'event': 'COLO_EXIT',
>> +  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason', '*error': 'str' } }
>
> Other than typos, the interface seems okay.
>

OK, I will fix them in the next version, thanks.



Thread overview: 100+ messages
2015-11-03 11:56 [Qemu-devel] [PATCH COLO-Frame v10 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT) zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 01/38] configure: Add parameter for configure to enable/disable COLO support zhanghailiang
2015-11-05 14:52   ` Eric Blake
2015-11-06  7:36     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 02/38] migration: Introduce capability 'x-colo' to migration zhanghailiang
2015-11-13 16:01   ` Eric Blake
2015-11-16  8:35     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 03/38] COLO: migrate colo related info to secondary node zhanghailiang
2015-11-06 16:36   ` Dr. David Alan Gilbert
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 04/38] migration: Add state records for migration incoming zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 05/38] migration: Integrate COLO checkpoint process into migration zhanghailiang
2015-11-06 16:48   ` Dr. David Alan Gilbert
2015-11-13 16:42   ` Eric Blake
2015-11-16 13:00     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm zhanghailiang
2015-11-06 17:29   ` Dr. David Alan Gilbert
2015-11-09  6:09     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 07/38] migration: Rename the'file' member of MigrationState and MigrationIncomingState zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 08/38] COLO/migration: establish a new communication path from destination to source zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol zhanghailiang
2015-11-06 18:26   ` Dr. David Alan Gilbert
2015-11-09  6:51     ` zhanghailiang
2015-11-09  7:33       ` zhanghailiang
2015-11-13 16:46   ` Eric Blake
2015-11-17  7:04     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 10/38] COLO: Add a new RunState RUN_STATE_COLO zhanghailiang
2015-11-06 18:28   ` Dr. David Alan Gilbert
2015-11-13 16:47   ` Eric Blake
2015-11-17  7:15     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 11/38] QEMUSizedBuffer: Introduce two help functions for qsb zhanghailiang
2015-11-06 18:30   ` Dr. David Alan Gilbert
2015-11-09  8:14     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 12/38] COLO: Save PVM state to secondary side when do checkpoint zhanghailiang
2015-11-06 18:59   ` Dr. David Alan Gilbert
2015-11-09  9:17     ` zhanghailiang
2015-11-13 18:53       ` Dr. David Alan Gilbert
2015-11-17 10:20         ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 13/38] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
2015-11-13 15:39   ` Dr. David Alan Gilbert
2015-11-16  7:57     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 14/38] COLO: Load VMState into qsb before restore it zhanghailiang
2015-11-13 16:02   ` Dr. David Alan Gilbert
2015-11-16  8:46     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 15/38] ram/COLO: Record pages received from PVM by re-using migration dirty bitmap zhanghailiang
2015-11-13 16:19   ` Dr. David Alan Gilbert
2015-11-16  9:07     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 16/38] COLO: Flush PVM's cached RAM into SVM's memory zhanghailiang
2015-11-13 16:38   ` Dr. David Alan Gilbert
2015-11-16 12:46     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize PVM's state to SVM periodically zhanghailiang
2015-11-13 18:34   ` Dr. David Alan Gilbert
2015-11-17  9:11     ` zhanghailiang
2015-11-17 10:08       ` Dr. David Alan Gilbert
2015-11-17 10:29         ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover zhanghailiang
2015-11-13 16:59   ` Eric Blake
2015-11-17  8:03     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 19/38] COLO failover: Introduce state to record failover process zhanghailiang
2015-11-20 15:51   ` Dr. David Alan Gilbert
2015-11-23  5:56     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 20/38] COLO: Implement failover work for Primary VM zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 21/38] COLO: Implement failover work for Secondary VM zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 22/38] COLO: implement default failover treatment zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 23/38] qmp event: Add event notification for COLO error zhanghailiang
2015-11-20 21:50   ` Eric Blake
2015-11-23  6:01     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 24/38] COLO failover: Shutdown related socket fd when do failover zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 25/38] COLO failover: Don't do failover during loading VM's state zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 26/38] COLO: Control the checkpoint delay time by migrate-set-parameters command zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 27/38] COLO: Process shutdown command for VM in COLO state zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 28/38] COLO: Update the global runstate after going into colo state zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 29/38] savevm: Split load vm state function qemu_loadvm_state zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 30/38] COLO: Separate the process of saving/loading ram and device state zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 31/38] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 32/38] netfilter: Add a public API to release all the buffered packets zhanghailiang
2015-11-03 12:39   ` Yang Hongyang
2015-11-03 13:19     ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 33/38] netfilter: Introduce an API to delete the timer of all buffer-filters zhanghailiang
2015-11-03 12:41   ` Yang Hongyang
2015-11-03 13:07     ` zhanghailiang
2015-11-04  2:51       ` Jason Wang
2015-11-04  3:08         ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 34/38] filter-buffer: Accept zero interval zhanghailiang
2015-11-03 12:43   ` Yang Hongyang
2015-11-04  2:52     ` Jason Wang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 35/38] netfilter: Introduce a API to automatically add filter-buffer for each netdev zhanghailiang
2015-11-03 12:57   ` Yang Hongyang
2015-11-03 13:16     ` zhanghailiang
2015-11-04  2:56   ` Jason Wang
2015-11-04  3:07     ` zhanghailiang
2015-11-05  7:43     ` zhanghailiang
2015-11-05  8:52       ` Wen Congyang
2015-11-05  9:21         ` Jason Wang
2015-11-05  9:33           ` Wen Congyang
2015-11-05  9:19       ` Jason Wang
2015-11-05 10:58         ` zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 36/38] netfilter: Introduce an API to delete all the automatically added netfilters zhanghailiang
2015-11-03 12:58   ` Yang Hongyang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 37/38] colo: Use the netfilter to buffer and release packets zhanghailiang
2015-11-03 11:56 ` [Qemu-devel] [PATCH COLO-Frame v10 38/38] COLO: Add block replication into colo process zhanghailiang
