* [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Dong eddie, Jiang yunhong,
	Xu Quan, Jason Wang

Hi,
This series tries to integrate the COLO frame with block replication
and net compare. The block replication and COLO proxy (net compare)
parts have already been merged upstream as of the last version. We need
to integrate all of them to realize the complete capability of COLO.

Besides, there are some optimizations to the COLO frame, including
separating the process of saving RAM and device state, and using
a COLO_EXIT event to notify users that the VM has exited COLO. Most
of these parts were reviewed a long time ago in an older version.

Please review, thanks.

Cc: Dong eddie <eddie.dong@intel.com>
Cc: Jiang yunhong <yunhong.jiang@intel.com>
Cc: Xu Quan <xuquan8@huawei.com>
Cc: Jason Wang <jasowang@redhat.com> 

zhanghailiang (15):
  net/colo: Add notifier/callback related helpers for filter
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Handle shutdown command for VM in COLO state
  COLO: Add block replication into colo process
  COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  ram/COLO: Record the dirty pages that SVM received
  COLO: Flush PVM's cached RAM into SVM's memory
  qmp event: Add COLO_EXIT event to notify users while exited from COLO
  savevm: split save/find loadvm_handlers entry into two helper
    functions
  savevm: split the process of different stages for loadvm/savevm
  COLO: Separate the process of saving/loading ram and device state
  COLO: Split qemu_savevm_state_begin out of checkpoint process
  COLO: flush host dirty ram from cache

 include/exec/ram_addr.h       |   1 +
 include/migration/colo.h      |   1 +
 include/migration/migration.h |   5 +
 include/sysemu/sysemu.h       |   9 ++
 migration/colo.c              | 232 +++++++++++++++++++++++++++++++++++++++---
 migration/migration.c         |   2 +-
 migration/ram.c               | 149 ++++++++++++++++++++++++++-
 migration/savevm.c            | 114 +++++++++++++++++----
 migration/trace-events        |   2 +
 net/colo-compare.c            | 104 ++++++++++++++++++-
 net/colo-compare.h            |  22 ++++
 net/colo.c                    |  92 +++++++++++++++++
 net/colo.h                    |  18 ++++
 qapi-schema.json              |  18 +++-
 qapi/event.json               |  21 ++++
 vl.c                          |  19 +++-
 16 files changed, 764 insertions(+), 45 deletions(-)
 create mode 100644 net/colo-compare.h

-- 
1.8.3.1


* [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Jason Wang

We will use this notifier to help COLO notify filter objects
to do something, such as doing a checkpoint or processing a failover event.
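For illustration, a filter consuming these events could look roughly
like the sketch below. "MyFilterState", "my_filter_handle_event",
"local_err" and "worker_context" are placeholders, not part of this
patch; the real consumer is the colo-compare object wired up later in
this series:

    static void my_filter_handle_event(void *opaque, int value)
    {
        FilterNotifier *notify = opaque;    /* dispatch passes the notifier */
        MyFilterState *s = notify->opaque;  /* opaque given at creation */

        if (value == COLO_CHECKPOINT) {
            /* e.g. flush this filter's buffered packets */
        }
    }

    /* setup, attached to the filter's worker GMainContext: */
    s->notifier = filter_notifier_new(my_filter_handle_event, s, &local_err);
    g_source_attach(&s->notifier->source, s->worker_context);

    /* COLO frame side, to kick the filter: */
    filter_notifier_set(s->notifier, COLO_CHECKPOINT);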

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo.h | 18 ++++++++++++
 2 files changed, 110 insertions(+)

diff --git a/net/colo.c b/net/colo.c
index 8cc166b..1697150 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -15,6 +15,7 @@
 #include "qemu/osdep.h"
 #include "trace.h"
 #include "net/colo.h"
+#include "qapi/error.h"
 
 uint32_t connection_key_hash(const void *opaque)
 {
@@ -209,3 +210,94 @@ Connection *connection_get(GHashTable *connection_track_table,
 
     return conn;
 }
+
+static gboolean
+filter_notify_prepare(GSource *source, gint *timeout)
+{
+    *timeout = -1;
+
+    return FALSE;
+}
+
+static gboolean
+filter_notify_check(GSource *source)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+
+    return notify->pfd.revents & (G_IO_IN | G_IO_HUP | G_IO_ERR);
+}
+
+static gboolean
+filter_notify_dispatch(GSource *source,
+                       GSourceFunc callback,
+                       gpointer user_data)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+    int revents;
+    uint64_t value;
+
+    revents = notify->pfd.revents & notify->pfd.events;
+    if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) {
+        /* deliver the value posted by filter_notifier_set() */
+        if (read(notify->event.rfd, &value, sizeof(value)) > 0 && notify->cb) {
+            notify->cb(notify, value);
+        }
+    }
+    return TRUE;
+}
+
+static void
+filter_notify_finalize(GSource *source)
+{
+    FilterNotifier *notify = (FilterNotifier *)source;
+
+    event_notifier_cleanup(&notify->event);
+}
+
+static GSourceFuncs notifier_source_funcs = {
+    filter_notify_prepare,
+    filter_notify_check,
+    filter_notify_dispatch,
+    filter_notify_finalize,
+};
+
+FilterNotifier *filter_notifier_new(FilterNotifierCallback *cb,
+                    void *opaque, Error **errp)
+{
+    FilterNotifier *notify;
+    int ret;
+
+    notify = (FilterNotifier *)g_source_new(&notifier_source_funcs,
+                sizeof(FilterNotifier));
+    ret = event_notifier_init(&notify->event, false);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Failed to initialize event notifier");
+        goto fail;
+    }
+    notify->pfd.fd = event_notifier_get_fd(&notify->event);
+    notify->pfd.events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+    notify->cb = cb;
+    notify->opaque = opaque;
+    g_source_add_poll(&notify->source, &notify->pfd);
+
+    return notify;
+
+fail:
+    g_source_destroy(&notify->source);
+    return NULL;
+}
+
+int filter_notifier_set(FilterNotifier *notify, uint64_t value)
+{
+    ssize_t ret;
+
+    do {
+        ret = write(notify->event.wfd, &value, sizeof(value));
+    } while (ret < 0 && errno == EINTR);
+
+    /* EAGAIN is fine, a read must be pending.  */
+    if (ret < 0 && errno != EAGAIN) {
+        return -errno;
+    }
+    return 0;
+}
diff --git a/net/colo.h b/net/colo.h
index cd9027f..00f03b5 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -19,6 +19,7 @@
 #include "qemu/jhash.h"
 #include "qemu/timer.h"
 #include "slirp/tcp.h"
+#include "qemu/event_notifier.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -89,4 +90,21 @@ void connection_hashtable_reset(GHashTable *connection_track_table);
 Packet *packet_new(const void *data, int size);
 void packet_destroy(void *opaque, void *user_data);
 
+typedef void FilterNotifierCallback(void *opaque, int value);
+typedef struct FilterNotifier {
+    GSource source;
+    EventNotifier event;
+    GPollFD pfd;
+    FilterNotifierCallback *cb;
+    void *opaque;
+} FilterNotifier;
+
+FilterNotifier *filter_notifier_new(FilterNotifierCallback *cb,
+                    void *opaque, Error **errp);
+int filter_notifier_set(FilterNotifier *notify, uint64_t value);
+
+enum {
+    COLO_CHECKPOINT = 2,
+    COLO_FAILOVER,
+};
 #endif /* QEMU_COLO_PROXY_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter zhanghailiang
@ 2017-02-22  3:42 ` zhanghailiang
  2017-02-22  9:31   ` Zhang Chen
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 03/15] colo-compare: use notifier to notify packets comparing result zhanghailiang
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Jason Wang

While doing a checkpoint, we need to flush all the unhandled packets.
By using the filter notifier mechanism, we can easily notify every
compare object to do this; the handler runs inside the compare
threads, driven by each thread's GMainContext.
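The synchronisation between the COLO frame and the compare threads
boils down to the following pattern, distilled from the hunks below
(error handling omitted):

    /* COLO frame (migration thread): */
    qemu_mutex_lock(&event_mtx);
    QTAILQ_FOREACH(s, &net_compares, next) {
        filter_notifier_set(s->notifier, COLO_CHECKPOINT);
        event_unhandled_count++;
    }
    while (event_unhandled_count) {     /* wait for every compare thread */
        qemu_cond_wait(&event_complete_cond, &event_mtx);
    }
    qemu_mutex_unlock(&event_mtx);

    /* each compare thread, from its own GMainContext: */
    g_queue_foreach(&s->conn_list, colo_flush_packets, s);
    qemu_mutex_lock(&event_mtx);
    event_unhandled_count--;
    qemu_cond_broadcast(&event_complete_cond);
    qemu_mutex_unlock(&event_mtx);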

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-compare.h | 20 +++++++++++++++
 2 files changed, 92 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a6fc2ff..61a8ee4 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,17 +29,24 @@
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
 #include "net/colo.h"
+#include "net/colo-compare.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER };
+static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER };
+static int event_unhandled_count;
 /*
   + CompareState ++
   |               |
@@ -86,6 +93,10 @@ typedef struct CompareState {
 
     GMainContext *worker_context;
     GMainLoop *compare_loop;
+    /* Used for COLO to notify compare to do something */
+    FilterNotifier *notifier;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -375,6 +386,11 @@ static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
+        if (!pkt) {
+            error_report("colo-compare pop pkt failed");
+            return;
+        }
+
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -496,6 +512,52 @@ static gboolean check_old_packet_regular(void *opaque)
     return TRUE;
 }
 
+/* Public API, used by the COLO frame to notify compare objects of an event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+    CompareState *s;
+    int ret;
+
+    qemu_mutex_lock(&event_mtx);
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        ret = filter_notifier_set(s->notifier, event);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to write value to eventfd");
+            goto fail;
+        }
+        event_unhandled_count++;
+    }
+    /* Wait for all compare threads to finish handling this event */
+    while (event_unhandled_count) {
+        qemu_cond_wait(&event_complete_cond, &event_mtx);
+    }
+
+fail:
+    qemu_mutex_unlock(&event_mtx);
+}
+
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque, int event)
+{
+    FilterNotifier *notify = opaque;
+    CompareState *s = notify->opaque;
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+        break;
+    case COLO_FAILOVER:
+        break;
+    default:
+        break;
+    }
+    qemu_mutex_lock(&event_mtx);
+    event_unhandled_count--;
+    qemu_cond_broadcast(&event_complete_cond);
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void *colo_compare_thread(void *opaque)
 {
     CompareState *s = opaque;
@@ -516,8 +578,12 @@ static void *colo_compare_thread(void *opaque)
                           (GSourceFunc)check_old_packet_regular, s, NULL);
     g_source_attach(timeout_source, s->worker_context);
 
+    s->notifier = filter_notifier_new(colo_compare_handle_event, s, NULL);
+    g_source_attach(&s->notifier->source, s->worker_context);
+
     g_main_loop_run(s->compare_loop);
 
+    g_source_unref(&s->notifier->source);
     g_source_unref(timeout_source);
     g_main_loop_unref(s->compare_loop);
     g_main_context_unref(s->worker_context);
@@ -660,6 +726,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
+    QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
     g_queue_init(&s->conn_list);
 
     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
@@ -726,6 +794,10 @@ static void colo_compare_finalize(Object *obj)
     g_main_loop_quit(s->compare_loop);
     qemu_thread_join(&s->thread);
 
+    if (!QTAILQ_EMPTY(&net_compares)) {
+        QTAILQ_REMOVE(&net_compares, s, next);
+    }
+
 /* Release all unhandled packets after the compare thread has exited */
     g_queue_foreach(&s->conn_list, colo_flush_packets, s);
 
diff --git a/net/colo-compare.h b/net/colo-compare.h
new file mode 100644
index 0000000..ed823ed
--- /dev/null
+++ b/net/colo-compare.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_COMPARE_H
+#define QEMU_COLO_COMPARE_H
+
+void colo_notify_compares_event(void *opaque, int event, Error **errp);
+
+#endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1


* [Qemu-devel] [PATCH 03/15] colo-compare: use notifier to notify packets comparing result
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Jason Wang

It's a good idea to use a notifier to notify the COLO frame of an
inconsistent packet-comparison result.
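The consumer side is wired up by the next patch in this series; in
short, the COLO frame registers a notifier that turns an inconsistent
comparison into a checkpoint request:

    static Notifier packets_compare_notifier;

    static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
    {
        colo_checkpoint_notify(data);   /* data is the MigrationState */
    }

    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
    colo_compare_register_notifier(&packets_compare_notifier);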

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 32 ++++++++++++++++++++++++++++----
 net/colo-compare.h |  2 ++
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 61a8ee4..0886d7e 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -30,6 +30,7 @@
 #include "qapi-visit.h"
 #include "net/colo.h"
 #include "net/colo-compare.h"
+#include "migration/migration.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -38,6 +39,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
        QTAILQ_HEAD_INITIALIZER(net_compares);
 
+static NotifierList colo_compare_notifiers =
+    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -342,6 +346,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
     }
 }
 
+static void colo_compare_inconsistent_notify(void)
+{
+    notifier_list_notify(&colo_compare_notifiers,
+                migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+    notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+    notifier_remove(notify);
+}
+
 static void colo_old_packet_check_one_conn(void *opaque,
                                            void *user_data)
 {
@@ -355,7 +375,7 @@ static void colo_old_packet_check_one_conn(void *opaque,
 
     if (result) {
         /* do checkpoint will flush old packet */
-        /* TODO: colo_notify_checkpoint();*/
+        colo_compare_inconsistent_notify();
     }
 }
 
@@ -373,7 +393,10 @@ static void colo_old_packet_check(void *opaque)
 
 /*
  * Called from the compare thread on the primary
- * for compare connection
+ * for compare connection.
+ * TODO: Rework this function; we should record the maximum handled
+ * sequence number of the connection, and not trigger a checkpoint request
+ * if we only get packets from one side (primary or secondary).
  */
 static void colo_compare_connection(void *opaque, void *user_data)
 {
@@ -422,11 +445,12 @@ static void colo_compare_connection(void *opaque, void *user_data)
             /*
              * If one packet arrive late, the secondary_list or
              * primary_list will be empty, so we can't compare it
-             * until next comparison.
+             * until the next comparison. If the packets in the list
+             * time out, a checkpoint request will be triggered.
              */
             trace_colo_compare_main("packet different");
             g_queue_push_tail(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+            colo_compare_inconsistent_notify();
             break;
         }
     }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index ed823ed..dc797ec 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -16,5 +16,7 @@
 #define QEMU_COLO_COMPARE_H
 
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);
 
 #endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1


* [Qemu-devel] [PATCH 04/15] COLO: integrate colo compare with colo frame
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Jason Wang

For COLO FT, both the PVM and SVM run at the same time, and we only
synchronize their state when needed.

So here, let the SVM run while we are not doing a checkpoint. Besides,
change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200 * 100 (20000 ms).

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 migration/colo.c      | 25 +++++++++++++++++++++++++
 migration/migration.c |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index 712308e..fb8d8fd 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -19,8 +19,11 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "migration/failover.h"
+#include "net/colo-compare.h"
+#include "net/colo.h"
 
 static bool vmstate_loading;
+static Notifier packets_compare_notifier;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -263,6 +266,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     if (local_err) {
         goto out;
     }
+
     /* Reset channel-buffer directly */
     qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
     bioc->usage = 0;
@@ -283,6 +287,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
@@ -341,6 +350,11 @@ out:
     return ret;
 }
 
+static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
+{
+    colo_checkpoint_notify(data);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -357,6 +371,9 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
+    colo_compare_register_notifier(&packets_compare_notifier);
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -402,6 +419,7 @@ out:
         qemu_fclose(fb);
     }
 
+    colo_compare_unregister_notifier(&packets_compare_notifier);
     timer_del(s->colo_delay_timer);
 
     /* Hope this not to be too long to wait here */
@@ -518,6 +536,11 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        trace_colo_vm_state_change("run", "stop");
+        qemu_mutex_unlock_iothread();
+
         /* FIXME: This is unnecessary for periodic checkpoint mode */
         colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
                      &local_err);
@@ -571,6 +594,8 @@ void *colo_process_incoming_thread(void *opaque)
         }
 
         vmstate_loading = false;
+        vm_start();
+        trace_colo_vm_state_change("stop", "run");
         qemu_mutex_unlock_iothread();
 
         if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
diff --git a/migration/migration.c b/migration/migration.c
index c6ae69d..2339be7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -66,7 +66,7 @@
 /* The delay time (in ms) between two COLO checkpoints
  * Note: Please change this default value to 10000 when we support hybrid mode.
  */
-#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
-- 
1.8.3.1


* [Qemu-devel] [PATCH 05/15] COLO: Handle shutdown command for VM in COLO state
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Paolo Bonzini

If the VM is in the COLO FT state, we need to do some extra work before
starting the normal shutdown process.

The secondary VM will ignore the shutdown command if users issue it
directly to the secondary VM. After receiving a shutdown request from
the user, COLO will capture the shutdown command on the primary side
and forward it to the secondary VM at the next checkpoint.
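The resulting flow, distilled from the hunks below: a shutdown request
on the primary is deferred, forwarded at the next checkpoint, and then
performed on both sides.

    /* primary, any shutdown request while in COLO: */
    qemu_system_shutdown_request()
        -> colo_handle_shutdown()   /* sets colo_shutdown_requested */

    /* primary, at the next checkpoint: */
    colo_send_message(s->to_dst_file, COLO_MESSAGE_GUEST_SHUTDOWN, ...);
    qemu_system_shutdown_request_core();

    /* secondary, on receiving COLO_MESSAGE_GUEST_SHUTDOWN: */
    vm_stop_force_state(RUN_STATE_COLO);
    qemu_system_shutdown_request_core();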

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
v19:
- fix title and comment
v15:
- Go on the shutdown process even some error happened
  while sent 'SHUTDOWN' message to SVM.
- Add Reviewed-by tag
v14:
- Remove 'colo_shutdown' variable, use colo_shutdown_request directly
v13:
- Move COLO shutdown related codes to colo.c file (Dave's suggestion)
---
 include/migration/colo.h |  1 +
 include/sysemu/sysemu.h  |  3 +++
 migration/colo.c         | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json         |  4 +++-
 vl.c                     | 19 ++++++++++++++++---
 5 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2bbff9e..aadd040 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(MigrationState *s);
 
 void colo_checkpoint_notify(void *opaque);
+bool colo_handle_shutdown(void);
 #endif
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 576c7ce..7ed665a 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -49,6 +49,8 @@ typedef enum WakeupReason {
     QEMU_WAKEUP_REASON_OTHER,
 } WakeupReason;
 
+extern int colo_shutdown_requested;
+
 void qemu_system_reset_request(void);
 void qemu_system_suspend_request(void);
 void qemu_register_suspend_notifier(Notifier *notifier);
@@ -56,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason);
 void qemu_system_wakeup_enable(WakeupReason reason, bool enabled);
 void qemu_register_wakeup_notifier(Notifier *notifier);
 void qemu_system_shutdown_request(void);
+void qemu_system_shutdown_request_core(void);
 void qemu_system_powerdown_request(void);
 void qemu_register_powerdown_notifier(Notifier *notifier);
 void qemu_system_debug_request(void);
diff --git a/migration/colo.c b/migration/colo.c
index fb8d8fd..4626435 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -336,6 +336,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    if (colo_shutdown_requested) {
+        colo_send_message(s->to_dst_file, COLO_MESSAGE_GUEST_SHUTDOWN,
+                          &local_err);
+        if (local_err) {
+            error_free(local_err);
+            /* Carry on with the shutdown process and just report the error */
+            error_report("Failed to send shutdown message to SVM");
+        }
+        qemu_fflush(s->to_dst_file);
+        colo_shutdown_requested = 0;
+        qemu_system_shutdown_request_core();
+        /* FIXME: Is it enough to just let the COLO thread exit? */
+        qemu_thread_exit(0);
+    }
+
     ret = 0;
 
     qemu_mutex_lock_iothread();
@@ -401,7 +416,9 @@ static void colo_process_checkpoint(MigrationState *s)
             goto out;
         }
 
-        qemu_sem_wait(&s->colo_checkpoint_sem);
+        if (!colo_shutdown_requested) {
+            qemu_sem_wait(&s->colo_checkpoint_sem);
+        }
 
         ret = colo_do_checkpoint_transaction(s, bioc, fb);
         if (ret < 0) {
@@ -477,6 +494,16 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     case COLO_MESSAGE_CHECKPOINT_REQUEST:
         *checkpoint_request = 1;
         break;
+    case COLO_MESSAGE_GUEST_SHUTDOWN:
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        qemu_system_shutdown_request_core();
+        qemu_mutex_unlock_iothread();
+        /*
+         * The main thread will exit and terminate the whole
+         * process. Do we need some cleanup here?
+         */
+        qemu_thread_exit(0);
     default:
         *checkpoint_request = 0;
         error_setg(errp, "Got unknown COLO message: %d", msg);
@@ -634,3 +661,20 @@ out:
 
     return NULL;
 }
+
+bool colo_handle_shutdown(void)
+{
+    /*
+     * If the VM is in COLO-FT mode, we need to do some significant work
+     * before responding to the shutdown request. Besides, the secondary
+     * VM will ignore shutdown requests from users.
+     */
+    if (migration_incoming_in_colo_state()) {
+        return true;
+    }
+    if (migration_in_colo_state()) {
+        colo_shutdown_requested = 1;
+        return true;
+    }
+    return false;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index e9a6364..0521054 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1157,12 +1157,14 @@
 #
 # @vmstate-loaded: VM's state has been loaded by SVM.
 #
+# @guest-shutdown: shutdown request sent from the PVM to the SVM
+#
 # Since: 2.8
 ##
 { 'enum': 'COLOMessage',
   'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
             'vmstate-send', 'vmstate-size', 'vmstate-received',
-            'vmstate-loaded' ] }
+            'vmstate-loaded', 'guest-shutdown' ] }
 
 ##
 # @COLOMode:
diff --git a/vl.c b/vl.c
index b5d0a19..daad8df 100644
--- a/vl.c
+++ b/vl.c
@@ -1587,6 +1587,8 @@ static NotifierList wakeup_notifiers =
     NOTIFIER_LIST_INITIALIZER(wakeup_notifiers);
 static uint32_t wakeup_reason_mask = ~(1 << QEMU_WAKEUP_REASON_NONE);
 
+int colo_shutdown_requested;
+
 int qemu_shutdown_requested_get(void)
 {
     return shutdown_requested;
@@ -1713,7 +1715,10 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
 void qemu_system_reset_request(void)
 {
     if (no_reboot) {
-        shutdown_requested = 1;
+        qemu_system_shutdown_request();
+        if (!shutdown_requested) { /* did COLO handle it? */
+            return;
+        }
     } else {
         reset_requested = 1;
     }
@@ -1786,14 +1791,22 @@ void qemu_system_killed(int signal, pid_t pid)
     qemu_notify_event();
 }
 
-void qemu_system_shutdown_request(void)
+void qemu_system_shutdown_request_core(void)
 {
-    trace_qemu_system_shutdown_request();
     replay_shutdown_request();
     shutdown_requested = 1;
     qemu_notify_event();
 }
 
+void qemu_system_shutdown_request(void)
+{
+    trace_qemu_system_shutdown_request();
+    if (colo_handle_shutdown()) {
+        return;
+    }
+    qemu_system_shutdown_request_core();
+}
+
 static void qemu_system_powerdown(void)
 {
     qapi_event_send_powerdown(&error_abort);
-- 
1.8.3.1


* [Qemu-devel] [PATCH 06/15] COLO: Add block replication into colo process
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Wen Congyang,
	Stefan Hajnoczi, Kevin Wolf, Max Reitz

Make sure the master starts block replication only after the slave's
block replication has started.
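The ordering relies on the existing CHECKPOINT_READY handshake; the
flow added by this patch is, in outline (error paths omitted):

    /* secondary (colo_process_incoming_thread): */
    bdrv_invalidate_cache_all(&local_err);
    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
    colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                      &local_err);

    /* primary (colo_process_checkpoint), which only runs after
     * CHECKPOINT_READY has been received from the secondary: */
    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
    vm_start();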

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 migration/colo.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 4626435..1e3e975 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -21,6 +21,9 @@
 #include "migration/failover.h"
 #include "net/colo-compare.h"
 #include "net/colo.h"
+#include "qapi-event.h"
+#include "block/block.h"
+#include "replication.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -55,6 +58,7 @@ static void secondary_vm_do_failover(void)
 {
     int old_state;
     MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
 
     /* Can not do failover during the process of VM's loading VMstate, Or
      * it will break the secondary VM.
@@ -72,6 +76,11 @@ static void secondary_vm_do_failover(void)
     migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
+    replication_stop_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
         /* recover runstate to normal migration finish state */
@@ -109,6 +118,7 @@ static void primary_vm_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
     int old_state;
+    Error *local_err = NULL;
 
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
@@ -132,6 +142,12 @@ static void primary_vm_do_failover(void)
                      FailoverStatus_lookup[old_state]);
         return;
     }
+
+    replication_stop_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     /* Notify COLO thread that failover work is finished */
     qemu_sem_post(&s->colo_exit_sem);
 }
@@ -297,6 +313,15 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     s->params.shared = 0;
     qemu_savevm_state_header(fb);
     qemu_savevm_state_begin(fb, &s->params);
+
+    /* We call this API even though it may do nothing on the primary side. */
+    qemu_mutex_lock_iothread();
+    replication_do_checkpoint_all(&local_err);
+    qemu_mutex_unlock_iothread();
+    if (local_err) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     qemu_savevm_state_complete_precopy(fb, false);
     qemu_mutex_unlock_iothread();
@@ -403,6 +428,12 @@ static void colo_process_checkpoint(MigrationState *s)
     object_unref(OBJECT(bioc));
 
     qemu_mutex_lock_iothread();
+    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
+
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
@@ -497,6 +528,7 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     case COLO_MESSAGE_GUEST_SHUTDOWN:
         qemu_mutex_lock_iothread();
         vm_stop_force_state(RUN_STATE_COLO);
+        replication_stop_all(false, NULL);
         qemu_system_shutdown_request_core();
         qemu_mutex_unlock_iothread();
         /*
@@ -544,6 +576,18 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
 
+    qemu_mutex_lock_iothread();
+    bdrv_invalidate_cache_all(&local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    qemu_mutex_unlock_iothread();
+    if (local_err) {
+        goto out;
+    }
+
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -620,6 +664,18 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        replication_get_error_all(&local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+        /* discard colo disk buffer */
+        replication_do_checkpoint_all(&local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
-- 
1.8.3.1


* [Qemu-devel] [PATCH 07/15] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

We should not load the PVM's state directly into the SVM, because there
may be errors while the SVM is receiving data, which would break the SVM.

We need to ensure that all data has been received before we load the
state into the SVM, so we use extra memory to cache the data (the PVM's
RAM). The RAM cache on the secondary side is initially the same as the
SVM's/PVM's memory. During each checkpoint we first cache the PVM's
dirty pages in this RAM cache, so the RAM cache is always the same as
the PVM's memory at every checkpoint; then we flush this cached RAM
to the SVM after we receive all of the PVM's state.
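The lifecycle of the cache, distilled from the hunks below:

    /* Once, when the SVM enters COLO (after the initial full
     * migration): every RAMBlock gets a colo_cache that starts out
     * as an exact copy of its guest memory. */
    colo_init_ram_cache();

    /* Per checkpoint: ram_load() stores incoming pages into
     * block->colo_cache instead of block->host, so a half-received
     * checkpoint can never corrupt the running SVM. */

    /* On failover or COLO exit: */
    colo_release_ram_cache();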

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/exec/ram_addr.h       |  1 +
 include/migration/migration.h |  4 +++
 migration/colo.c              | 14 +++++++++
 migration/ram.c               | 73 ++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 3e79466..44e1190 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -27,6 +27,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *colo_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 1735d66..93c6148 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -379,4 +379,8 @@ int ram_save_queue_pages(MigrationState *ms, const char *rbname,
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
 PostcopyState postcopy_state_set(PostcopyState new_state);
+
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 1e3e975..edb7f00 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -551,6 +551,7 @@ void *colo_process_incoming_thread(void *opaque)
     uint64_t total_size;
     uint64_t value;
     Error *local_err = NULL;
+    int ret;
 
     qemu_sem_init(&mis->colo_incoming_sem, 0);
 
@@ -572,6 +573,12 @@ void *colo_process_incoming_thread(void *opaque)
      */
     qemu_file_set_blocking(mis->from_src_file, true);
 
+    ret = colo_init_ram_cache();
+    if (ret < 0) {
+        error_report("Failed to initialize ram cache");
+        goto out;
+    }
+
     bioc = qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE);
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
@@ -705,11 +712,18 @@ out:
     if (fb) {
         qemu_fclose(fb);
     }
+    /*
+     * We can ensure that the failover BH holds the global lock and will
+     * join the COLO incoming thread, so it is not necessary to take the
+     * lock again here, or there would be a deadlock.
+     */
+    colo_release_ram_cache();
 
     /* Hope this not to be too long to loop here */
     qemu_sem_wait(&mis->colo_incoming_sem);
     qemu_sem_destroy(&mis->colo_incoming_sem);
     /* Must be called after failover BH is completed */
+
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
diff --git a/migration/ram.c b/migration/ram.c
index f289fcd..b588990 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -219,6 +219,7 @@ static RAMBlock *last_sent_block;
 static ram_addr_t last_offset;
 static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
+static bool ram_cache_enable;
 static uint32_t last_version;
 static bool ram_bulk_stage;
 
@@ -2227,6 +2228,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
     return block->host + offset;
 }
 
+static inline void *colo_cache_from_block_offset(RAMBlock *block,
+                                                 ram_addr_t offset)
+{
+    if (!offset_in_ramblock(block, offset)) {
+        return NULL;
+    }
+    if (!block->colo_cache) {
+        error_report("%s: colo_cache is NULL in block :%s",
+                     __func__, block->idstr);
+        return NULL;
+    }
+    return block->colo_cache + offset;
+}
+
 /*
  * If a page (or a whole RDMA chunk) has been
  * determined to be zero, then zap it.
@@ -2542,7 +2557,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);
 
-            host = host_from_ram_block_offset(block, addr);
+            /* After going into COLO, we should load the Page into colo_cache */
+            if (ram_cache_enable) {
+                host = colo_cache_from_block_offset(block, addr);
+            } else {
+                host = host_from_ram_block_offset(block, addr);
+            }
             if (!host) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
@@ -2637,6 +2657,57 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
+/*
+ * colo cache: this is for secondary VM, we cache the whole
+ * memory of the secondary VM, it will be called after first migration.
+ */
+int colo_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+        if (!block->colo_cache) {
+            error_report("%s: Can't alloc memory for COLO cache of block %s,"
+                         "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+                         block->used_length);
+            goto out_locked;
+        }
+        memcpy(block->colo_cache, block->host, block->used_length);
+    }
+    rcu_read_unlock();
+    ram_cache_enable = true;
+    return 0;
+
+out_locked:
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+
+    rcu_read_unlock();
+    return -errno;
+}
+
+void colo_release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    ram_cache_enable = false;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+    rcu_read_unlock();
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
-- 
1.8.3.1


* [Qemu-devel] [PATCH 08/15] ram/COLO: Record the dirty pages that SVM received
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

We record the addresses of the dirty pages that were received;
this will help when flushing the cached pages into the SVM.
We record them by reusing the migration dirty bitmap.

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index b588990..ed3b606 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2231,6 +2231,9 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
 static inline void *colo_cache_from_block_offset(RAMBlock *block,
                                                  ram_addr_t offset)
 {
+    unsigned long *bitmap;
+    long k;
+
     if (!offset_in_ramblock(block, offset)) {
         return NULL;
     }
@@ -2239,6 +2242,17 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
                      __func__, block->idstr);
         return NULL;
     }
+
+    k = (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_BITS;
+    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
+    /*
+    * During a COLO checkpoint, we need a bitmap of these migrated pages.
+    * It helps us decide which pages in the RAM cache should be flushed
+    * into the SVM's RAM later.
+    */
+    if (!test_and_set_bit(k, bitmap)) {
+        migration_dirty_pages++;
+    }
     return block->colo_cache + offset;
 }
 
@@ -2664,6 +2678,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 int colo_init_ram_cache(void)
 {
     RAMBlock *block;
+    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
 
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
@@ -2678,6 +2693,15 @@ int colo_init_ram_cache(void)
     }
     rcu_read_unlock();
     ram_cache_enable = true;
+    /*
+    * Record the dirty pages sent by the PVM; we use this dirty bitmap to
+    * decide which pages in the cache should be flushed into the SVM's RAM.
+    * Here we reuse the same name 'migration_bitmap_rcu' as for migration.
+    */
+    migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
+    migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
+    migration_dirty_pages = 0;
+
     return 0;
 
 out_locked:
@@ -2695,9 +2719,15 @@ out_locked:
 void colo_release_ram_cache(void)
 {
     RAMBlock *block;
+    struct BitmapRcu *bitmap = migration_bitmap_rcu;
 
     ram_cache_enable = false;
 
+    atomic_rcu_set(&migration_bitmap_rcu, NULL);
+    if (bitmap) {
+        call_rcu(bitmap, migration_bitmap_free, rcu);
+    }
+
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (block->colo_cache) {
-- 
1.8.3.1


* [Qemu-devel] [PATCH 09/15] COLO: Flush PVM's cached RAM into SVM's memory
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

While the VM is running, the PVM may dirty some pages; we will transfer
the PVM's dirty pages to the SVM and store them in the SVM's RAM cache
at the next checkpoint. So the content of the SVM's RAM cache is always
the same as the PVM's memory after a checkpoint.

Instead of flushing the whole content of the PVM's RAM cache into the
SVM's memory, we do this in a more efficient way:
only flush the pages that the PVM has dirtied since the last checkpoint.
In this way, we can ensure the SVM's memory is the same as the PVM's.

Besides, we must make sure to flush the RAM cache before loading the
device state, as outlined below.
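The ordering requirement in the last paragraph is what the ram_load()
hook below implements; in outline:

    /* at the end of ram_load(), i.e. once a complete RAM section has
     * been received and staged into the colo_cache: */
    if (!ret && ram_cache_enable && need_flush) {
        colo_flush_ram_cache();    /* copy dirty pages: cache -> SVM RAM */
    }
    /* only afterwards does the caller load the device state, so the
     * device state is always applied on top of up-to-date RAM */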

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/migration.h |  1 +
 migration/ram.c               | 41 +++++++++++++++++++++++++++++++++++++++++
 migration/trace-events        |  2 ++
 3 files changed, 44 insertions(+)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 93c6148..ba5b97b 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -383,4 +383,5 @@ PostcopyState postcopy_state_set(PostcopyState new_state);
 /* ram cache */
 int colo_init_ram_cache(void);
 void colo_release_ram_cache(void);
+void colo_flush_ram_cache(void);
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index ed3b606..3f57fe0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2540,6 +2540,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
      * be atomic
      */
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
+    bool need_flush = false;
 
     seq_iter++;
 
@@ -2574,6 +2575,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (ram_cache_enable) {
                 host = colo_cache_from_block_offset(block, addr);
+                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2668,6 +2670,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     wait_for_decompress_done();
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
+
+    if (!ret && ram_cache_enable && need_flush) {
+        colo_flush_ram_cache();
+    }
     return ret;
 }
 
@@ -2738,6 +2744,41 @@ void colo_release_ram_cache(void)
     rcu_read_unlock();
 }
 
+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that were dirtied by the PVM, the SVM, or both.
+ */
+void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    ram_addr_t offset = 0;
+
+    trace_colo_flush_ram_cache_begin(migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+    while (block) {
+        ram_addr_t ram_addr_abs;
+        offset = migration_bitmap_find_dirty(block, offset, &ram_addr_abs);
+        migration_bitmap_clear_dirty(ram_addr_abs);
+
+        if (offset >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            dst_host = block->host + offset;
+            src_host = block->colo_cache + offset;
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+
+    rcu_read_unlock();
+    trace_colo_flush_ram_cache_end();
+    assert(migration_dirty_pages == 0);
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
diff --git a/migration/trace-events b/migration/trace-events
index fa660e3..5d4cf80 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -71,6 +71,8 @@ migration_throttle(void) ""
 ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
 ram_postcopy_send_discard_bitmap(void) ""
 ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: %zx len: %zx"
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""
 
 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
-- 
1.8.3.1


* [Qemu-devel] [PATCH 10/15] qmp event: Add COLO_EXIT event to notify users while exited from COLO
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Markus Armbruster, Michael Roth

If an error happens during the VM's COLO FT stage, it's important to
notify the users of this event. Together with 'x-colo-lost-heartbeat',
users can intervene in COLO's failover work immediately.
If users don't want to get involved in COLO's failover verdict,
it is still necessary to notify them that we have exited COLO mode.
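For example, a management client watching the QMP event stream might
react to an 'error' exit on the secondary by triggering failover. This
is a sketch only; the actual failover policy is up to the client:

    <- { "timestamp": { "seconds": 2032141960, "microseconds": 417172 },
         "event": "COLO_EXIT",
         "data": { "mode": "secondary", "reason": "error" } }
    -> { "execute": "x-colo-lost-heartbeat" }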

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/colo.c | 19 +++++++++++++++++++
 qapi-schema.json | 14 ++++++++++++++
 qapi/event.json  | 21 +++++++++++++++++++++
 3 files changed, 54 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index edb7f00..65d0802 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -468,6 +468,18 @@ out:
     }
 
     colo_compare_unregister_notifier(&packets_compare_notifier);
+    /*
+     * There are only two reasons we can get here: some error happened,
+     * or the user triggered a failover.
+     */
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
+
     timer_del(s->colo_delay_timer);
 
     /* Hope this not to be too long to wait here */
@@ -708,6 +720,13 @@ out:
     if (local_err) {
         error_report_err(local_err);
     }
+    if (failover_get_state() == FAILOVER_STATUS_NONE) {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+    } else {
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+    }
 
     if (fb) {
         qemu_fclose(fb);
diff --git a/qapi-schema.json b/qapi-schema.json
index 0521054..bb73e8f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1203,6 +1203,20 @@
   'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
 
 ##
+# @COLOExitReason:
+#
+# The reason for a COLO exit
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.9
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'request', 'error' ] }
+
+##
 # @x-colo-lost-heartbeat:
 #
 # Tell qemu that heartbeat is lost, request it to do takeover procedures.
diff --git a/qapi/event.json b/qapi/event.json
index 970ff02..fe33628 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -441,6 +441,27 @@
   'data': { 'pass': 'int' } }
 
 ##
+# @COLO_EXIT:
+#
+# Emitted when the VM leaves COLO mode, either because an error
+# happened or at the request of users.
+#
+# @mode: which COLO mode the VM was in when it exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.9
+#
+# Example:
+#
+# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
+#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+#
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
 # @ACPI_DEVICE_OST:
 #
 # Emitted when guest executes ACPI _OST method.
-- 
1.8.3.1


* [Qemu-devel] [PATCH 11/15] savevm: split save/find loadvm_handlers entry into two helper functions
  2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (9 preceding siblings ...)
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 10/15] qmp event: Add COLO_EXIT event to notify users while exited from COLO zhanghailiang
@ 2017-02-22  3:42 ` zhanghailiang
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm zhanghailiang
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 42+ messages in thread
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

COLO's checkpoint process is based on the migration process;
every time we do a checkpoint we repeat the savevm and loadvm process.

So qemu_loadvm_section_start_full() is called repeatedly. Each call
adds every migration section's information to the loadvm_handlers list
again, which leads to a memory leak.

To fix it, we split the process of saving and finding a section entry into two
helper functions, and check whether the section's info already exists in the
loadvm_handlers list before saving it.

These modifications have no side effect on normal migration.
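
A minimal sketch of the resulting find-or-add idiom (simplified from the
diff below; error handling elided):

    le = loadvm_find_section_entry(mis, section_id);
    if (!le) {
        /* First time this section is seen: remember its ids */
        le = loadvm_add_section_entry(mis, se, section_id, version_id);
    }
    ret = vmstate_load(f, se, version_id);

On a normal migration each section arrives once, so behaviour is unchanged;
under COLO, the second and later checkpoints find the existing entry instead
of leaking a new one.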

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/savevm.c | 55 +++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 5ecd264..9c2d239 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1821,6 +1821,37 @@ void loadvm_free_handlers(MigrationIncomingState *mis)
     }
 }
 
+static LoadStateEntry *loadvm_add_section_entry(MigrationIncomingState *mis,
+                                                 SaveStateEntry *se,
+                                                 uint32_t section_id,
+                                                 uint32_t version_id)
+{
+    LoadStateEntry *le;
+
+    /* Add entry */
+    le = g_malloc0(sizeof(*le));
+
+    le->se = se;
+    le->section_id = section_id;
+    le->version_id = version_id;
+    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
+    return le;
+}
+
+static LoadStateEntry *loadvm_find_section_entry(MigrationIncomingState *mis,
+                                                 uint32_t section_id)
+{
+    LoadStateEntry *le;
+
+    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
+        if (le->section_id == section_id) {
+            break;
+        }
+    }
+
+    return le;
+}
+
 static int
 qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
 {
@@ -1863,15 +1894,12 @@ qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis)
         return -EINVAL;
     }
 
-    /* Add entry */
-    le = g_malloc0(sizeof(*le));
-
-    le->se = se;
-    le->section_id = section_id;
-    le->version_id = version_id;
-    QLIST_INSERT_HEAD(&mis->loadvm_handlers, le, entry);
-
-    ret = vmstate_load(f, le->se, le->version_id);
+    /* Check if we have saved this section's info before; if not, save it */
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
+        le = loadvm_add_section_entry(mis, se, section_id, version_id);
+    }
+    ret = vmstate_load(f, se, version_id);
     if (ret < 0) {
         error_report("error while loading state for instance 0x%x of"
                      " device '%s'", instance_id, idstr);
@@ -1894,12 +1922,9 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     section_id = qemu_get_be32(f);
 
     trace_qemu_loadvm_state_section_partend(section_id);
-    QLIST_FOREACH(le, &mis->loadvm_handlers, entry) {
-        if (le->section_id == section_id) {
-            break;
-        }
-    }
-    if (le == NULL) {
+
+    le = loadvm_find_section_entry(mis, section_id);
+    if (!le) {
         error_report("Unknown savevm section %d", section_id);
         return -EINVAL;
     }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm
  2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (10 preceding siblings ...)
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 11/15] savevm: split save/find loadvm_handlers entry into two helper functions zhanghailiang
@ 2017-02-22  3:42 ` zhanghailiang
  2017-04-07 17:18   ` Dr. David Alan Gilbert
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 13/15] COLO: Separate the process of saving/loading ram and device state zhanghailiang
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 42+ messages in thread
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

There are several stages during the loadvm/savevm process, and in each
stage the migration incoming side processes different types of sections.
We want to control these stages more precisely, which will benefit COLO
performance: we don't have to save QEMU_VM_SECTION_START type
sections every time we do a checkpoint. Besides, we want to separate
the process of saving/loading memory from that of the device state.

So we add three new helper functions: qemu_loadvm_state_begin(),
qemu_load_device_state() and qemu_savevm_live_state() to handle
these different stages during migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public.
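
As a rough map of how the helpers pair up (a sketch; the section types
follow from the comments in the diff below):

    qemu_savevm_state_begin(f, params);  /* emits QEMU_VM_SECTION_START   */
    qemu_savevm_live_state(f);           /* emits QEMU_VM_SECTION_END+EOF */
    qemu_save_device_state(f);           /* emits QEMU_VM_SECTION_FULL    */

    qemu_loadvm_state_begin(f);          /* loads the START sections      */
    qemu_loadvm_state_main(f, mis);      /* loads a stream up to an EOF   */
    qemu_load_device_state(f);           /* loads the FULL device state   */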

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/sysemu/sysemu.h |  6 ++++++
 migration/savevm.c      | 55 ++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 7ed665a..95cae41 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *start_list,
                                            uint64_t *length_list);
 
+void qemu_savevm_live_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f);
+
 int qemu_loadvm_state(QEMUFile *f);
+int qemu_loadvm_state_begin(QEMUFile *f);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_load_device_state(QEMUFile *f);
 
 extern int autostart;
 
diff --git a/migration/savevm.c b/migration/savevm.c
index 9c2d239..dac478b 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -54,6 +54,7 @@
 #include "qemu/cutils.h"
 #include "io/channel-buffer.h"
 #include "io/channel-file.h"
+#include "migration/colo.h"
 
 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
@@ -1279,13 +1280,21 @@ done:
     return ret;
 }
 
-static int qemu_save_device_state(QEMUFile *f)
+void qemu_savevm_live_state(QEMUFile *f)
 {
-    SaveStateEntry *se;
+    /* save QEMU_VM_SECTION_END section */
+    qemu_savevm_state_complete_precopy(f, true);
+    qemu_put_byte(f, QEMU_VM_EOF);
+}
 
-    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+int qemu_save_device_state(QEMUFile *f)
+{
+    SaveStateEntry *se;
 
+    if (!migration_in_colo_state()) {
+        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
+        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+    }
     cpu_synchronize_all_states();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1336,8 +1345,6 @@ enum LoadVMExitCodes {
     LOADVM_QUIT     =  1,
 };
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-
 /* ------ incoming postcopy messages ------ */
 /* 'advise' arrives before any transfers just to tell us that a postcopy
  * *might* happen - it might be skipped if precopy transferred everything
@@ -1942,7 +1949,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
     return 0;
 }
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
 {
     uint8_t section_type;
     int ret = 0;
@@ -2080,6 +2087,40 @@ int qemu_loadvm_state(QEMUFile *f)
     return ret;
 }
 
+int qemu_loadvm_state_begin(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
+    int ret;
+
+    if (qemu_savevm_state_blocked(&local_err)) {
+        error_report_err(local_err);
+        return -EINVAL;
+    }
+    /* Load QEMU_VM_SECTION_START section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to loadvm begin work: %d", ret);
+    }
+    return ret;
+}
+
+int qemu_load_device_state(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret;
+
+    /* Load QEMU_VM_SECTION_FULL section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to load device state: %d", ret);
+        return ret;
+    }
+
+    cpu_synchronize_all_post_init();
+    return 0;
+}
+
 int save_vmstate(Monitor *mon, const char *name)
 {
     BlockDriverState *bs, *bs1;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [Qemu-devel] [PATCH 13/15] COLO: Separate the process of saving/loading ram and device state
  2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (11 preceding siblings ...)
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm zhanghailiang
@ 2017-02-22  3:42 ` zhanghailiang
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 14/15] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache zhanghailiang
  14 siblings, 0 replies; 42+ messages in thread
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

We separate the process of saving/loading RAM from that of the device state
when doing a checkpoint, adding new helpers to save/load the RAM and the
device state respectively. With this change, we can transfer RAM directly
from the primary side to the secondary side without using the channel-buffer
as an intermediary, which also reduces the amount of extra memory used
during a checkpoint.

Besides, we move colo_flush_ram_cache() to the proper position after the
above change.
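
In other words, a checkpoint now streams the big part and stages the small
part, roughly as follows (a simplified sketch of the flow in the diff below):

    /* Primary side */
    qemu_savevm_live_state(s->to_dst_file);  /* RAM goes straight to the SVM */
    ret = qemu_save_device_state(fb);        /* device state into the buffer */

    /* Secondary side: receive everything first, then commit */
    colo_flush_ram_cache();                  /* cached RAM -> SVM memory */
    ret = qemu_load_device_state(fb);        /* then the buffered device state */

Buffering only the (small) device state keeps the all-or-nothing property of
loading from a buffer, without paying for a RAM-sized channel-buffer.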

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c   | 48 ++++++++++++++++++++++++++++++++++++++----------
 migration/ram.c    |  5 -----
 migration/savevm.c |  4 ++++
 3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 65d0802..b17e8e3 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -308,11 +308,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     s->params.blk = 0;
     s->params.shared = 0;
-    qemu_savevm_state_header(fb);
-    qemu_savevm_state_begin(fb, &s->params);
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+        goto out;
+    }
 
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
@@ -323,15 +332,21 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     }
 
     qemu_mutex_lock_iothread();
-    qemu_savevm_state_complete_precopy(fb, false);
+    /*
+     * Only save the VM's live state, which does not include the device state.
+     * TODO: We may need a timeout mechanism to prevent the COLO process
+     * from being blocked here.
+     */
+    qemu_savevm_live_state(s->to_dst_file);
+    /* Note: device state is saved into buffer */
+    ret = qemu_save_device_state(fb);
     qemu_mutex_unlock_iothread();
-
-    qemu_fflush(fb);
-
-    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
-    if (local_err) {
+    if (ret < 0) {
+        error_report("Save device state error");
         goto out;
     }
+    qemu_fflush(fb);
+
     /*
      * We need the size of the VMstate data in Secondary side,
      * With which we can decide how much data should be read.
@@ -644,6 +659,17 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        ret = qemu_loadvm_state_begin(mis->from_src_file);
+        if (ret < 0) {
+            error_report("Load vm state begin error, ret=%d", ret);
+            goto out;
+        }
+        ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+        if (ret < 0) {
+            error_report("Load VM's live state (ram) error");
+            goto out;
+        }
+
         value = colo_receive_message_value(mis->from_src_file,
                                  COLO_MESSAGE_VMSTATE_SIZE, &local_err);
         if (local_err) {
@@ -677,8 +703,10 @@ void *colo_process_incoming_thread(void *opaque)
         qemu_mutex_lock_iothread();
         qemu_system_reset(VMRESET_SILENT);
         vmstate_loading = true;
-        if (qemu_loadvm_state(fb) < 0) {
-            error_report("COLO: loadvm failed");
+        colo_flush_ram_cache();
+        ret = qemu_load_device_state(fb);
+        if (ret < 0) {
+            error_report("COLO: load device state failed");
             qemu_mutex_unlock_iothread();
             goto out;
         }
diff --git a/migration/ram.c b/migration/ram.c
index 3f57fe0..6227b94 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2540,7 +2540,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
      * be atomic
      */
     bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
-    bool need_flush = false;
 
     seq_iter++;
 
@@ -2575,7 +2574,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
             /* After going into COLO, we should load the Page into colo_cache */
             if (ram_cache_enable) {
                 host = colo_cache_from_block_offset(block, addr);
-                need_flush = true;
             } else {
                 host = host_from_ram_block_offset(block, addr);
             }
@@ -2671,9 +2669,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
 
-    if (!ret  && ram_cache_enable && need_flush) {
-        colo_flush_ram_cache();
-    }
     return ret;
 }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index dac478b..67e4306 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1002,6 +1002,10 @@ void qemu_savevm_state_begin(QEMUFile *f,
             break;
         }
     }
+    if (migration_in_colo_state()) {
+        qemu_put_byte(f, QEMU_VM_EOF);
+        qemu_fflush(f);
+    }
 }
 
 /*
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [Qemu-devel] [PATCH 14/15] COLO: Split qemu_savevm_state_begin out of checkpoint process
  2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (12 preceding siblings ...)
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 13/15] COLO: Separate the process of saving/loading ram and device state zhanghailiang
@ 2017-02-22  3:42 ` zhanghailiang
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache zhanghailiang
  14 siblings, 0 replies; 42+ messages in thread
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

It is unnecessary to call qemu_savevm_state_begin() in every checkpoint; it
mainly sets up devices and does the first device state pass, and this data
will not change during the later checkpoints. So we split it out of
colo_do_checkpoint_transaction(); this way, we avoid re-transferring this
data in the subsequent checkpoints.
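
A sketch of the resulting control flow on the primary side (names as in this
patch; error handling omitted):

    static void colo_process_checkpoint(MigrationState *s)
    {
        ...
        colo_prepare_before_save(s);    /* qemu_savevm_state_begin(), once */
        while (s->state == MIGRATION_STATUS_COLO) {
            ...
            colo_do_checkpoint_transaction(s, bioc, fb); /* no _begin here */
        }
        ...
    }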

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 52 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index b17e8e3..ab2d700 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -313,16 +313,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
-    /* Disable block migration */
-    s->params.blk = 0;
-    s->params.shared = 0;
-    qemu_savevm_state_begin(s->to_dst_file, &s->params);
-    ret = qemu_file_get_error(s->to_dst_file);
-    if (ret < 0) {
-        error_report("Save VM state begin error");
-        goto out;
-    }
-
     /* We call this API although this may do nothing on primary side. */
     qemu_mutex_lock_iothread();
     replication_do_checkpoint_all(&local_err);
@@ -410,6 +400,21 @@ static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
     colo_checkpoint_notify(data);
 }
 
+static int colo_prepare_before_save(MigrationState *s)
+{
+    int ret;
+
+    /* Disable block migration */
+    s->params.blk = 0;
+    s->params.shared = 0;
+    qemu_savevm_state_begin(s->to_dst_file, &s->params);
+    ret = qemu_file_get_error(s->to_dst_file);
+    if (ret < 0) {
+        error_report("Save VM state begin error");
+    }
+    return ret;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -429,6 +434,11 @@ static void colo_process_checkpoint(MigrationState *s)
     packets_compare_notifier.notify = colo_compare_notify_checkpoint;
     colo_compare_register_notifier(&packets_compare_notifier);
 
+    ret = colo_prepare_before_save(s);
+    if (ret < 0) {
+        goto out;
+    }
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -570,6 +580,17 @@ static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
     }
 }
 
+static int colo_prepare_before_load(QEMUFile *f)
+{
+    int ret;
+
+    ret = qemu_loadvm_state_begin(f);
+    if (ret < 0) {
+        error_report("Load VM state begin error, ret = %d", ret);
+    }
+    return ret;
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
@@ -610,6 +631,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
 
+    ret = colo_prepare_before_load(mis->from_src_file);
+    if (ret < 0) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
@@ -621,7 +647,6 @@ void *colo_process_incoming_thread(void *opaque)
     if (local_err) {
         goto out;
     }
-
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -659,11 +684,6 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
-        ret = qemu_loadvm_state_begin(mis->from_src_file);
-        if (ret < 0) {
-            error_report("Load vm state begin error, ret=%d", ret);
-            goto out;
-        }
         ret = qemu_loadvm_state_main(mis->from_src_file, mis);
         if (ret < 0) {
             error_report("Load VM's live state (ram) error");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache
  2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
                   ` (13 preceding siblings ...)
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 14/15] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
@ 2017-02-22  3:42 ` zhanghailiang
  2017-04-07 17:39   ` Dr. David Alan Gilbert
  14 siblings, 1 reply; 42+ messages in thread
From: zhanghailiang @ 2017-02-22  3:42 UTC (permalink / raw)
  To: qemu-devel, dgilbert, zhangchen.fnst
  Cc: lizhijian, xiecl.fnst, zhanghailiang, Juan Quintela

There is no need to flush all of the VM's RAM from the cache; only
flush the pages dirtied since the last checkpoint.
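
The flush then only has to walk the pages marked in the re-synced bitmap,
along these lines (a simplified sketch; it ignores RAMBlock boundaries and
clearing the bits):

    unsigned long *bmap = migration_bitmap_rcu->bmap;
    unsigned long i = find_next_bit(bmap, ram_cache_pages, 0);

    while (i < ram_cache_pages) {
        /* copy the cached (PVM) page over the SVM's page */
        memcpy(block->host + (i << TARGET_PAGE_BITS),
               block->colo_cache + (i << TARGET_PAGE_BITS),
               TARGET_PAGE_SIZE);
        i = find_next_bit(bmap, ram_cache_pages, i + 1);
    }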

Cc: Juan Quintela <quintela@redhat.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 migration/ram.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 6227b94..e9ba740 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2702,6 +2702,7 @@ int colo_init_ram_cache(void)
     migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
     migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
     migration_dirty_pages = 0;
+    memory_global_dirty_log_start();
 
     return 0;
 
@@ -2750,6 +2751,15 @@ void colo_flush_ram_cache(void)
     void *src_host;
     ram_addr_t offset = 0;
 
+    memory_global_dirty_log_sync();
+    qemu_mutex_lock(&migration_bitmap_mutex);
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        migration_bitmap_sync_range(block->offset, block->used_length);
+    }
+    rcu_read_unlock();
+    qemu_mutex_unlock(&migration_bitmap_mutex);
+
     trace_colo_flush_ram_cache_begin(migration_dirty_pages);
     rcu_read_lock();
     block = QLIST_FIRST_RCU(&ram_list.blocks);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint zhanghailiang
@ 2017-02-22  9:31   ` Zhang Chen
  2017-02-23  1:02     ` Hailiang Zhang
  2017-04-14  5:57     ` Jason Wang
  0 siblings, 2 replies; 42+ messages in thread
From: Zhang Chen @ 2017-02-22  9:31 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel, Jason Wang
  Cc: dgilbert, zhangchen.fnst, lizhijian, xiecl.fnst



On 02/22/2017 11:42 AM, zhanghailiang wrote:
> While doing a checkpoint, we need to flush all the unhandled packets.
> By using the filter notifier mechanism, we can easily notify
> every compare object to do this process, which runs inside
> the compare threads as a coroutine.

Hi~ Jason and Hailiang.

I will send a patch set later adding a colo-compare notify mechanism for
Xen, similar to this patch.
I want to add a new chardev socket way in colo-compare to connect to Xen
COLO, for notifying checkpoint or failover, because we have no choice but
to use this channel to communicate with the Xen code.
That means we will have two notify mechanisms.
What do you think about this?


Thanks
Zhang Chen

>
> Cc: Jason Wang <jasowang@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/colo-compare.h | 20 +++++++++++++++
>   2 files changed, 92 insertions(+)
>   create mode 100644 net/colo-compare.h
>

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 05/15] COLO: Handle shutdown command for VM in COLO state
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 05/15] COLO: Handle shutdown command for VM in COLO state zhanghailiang
@ 2017-02-22 15:35   ` Eric Blake
  2017-02-23  1:15     ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Eric Blake @ 2017-02-22 15:35 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel, dgilbert, zhangchen.fnst
  Cc: Paolo Bonzini, xiecl.fnst, lizhijian

[-- Attachment #1: Type: text/plain, Size: 1302 bytes --]

On 02/21/2017 09:42 PM, zhanghailiang wrote:
> If the VM is in COLO FT state, we need to do some extra work before
> starting the normal shutdown process.
> 
> The Secondary VM will ignore the shutdown command if users issue it directly
> to the Secondary VM. COLO will capture the shutdown command and act on it
> after the shutdown request from the user.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> v19:
> - fix title and comment

Did you miss putting v19 in the subject line?


> +++ b/qapi-schema.json
> @@ -1157,12 +1157,14 @@
>  #
>  # @vmstate-loaded: VM's state has been loaded by SVM.
>  #
> +# @guest-shutdown: shutdown require from PVM to SVM

maybe s/require/requested/ ?

Missing '(since 2.9)'

> +#
>  # Since: 2.8
>  ##
>  { 'enum': 'COLOMessage',
>    'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
>              'vmstate-send', 'vmstate-size', 'vmstate-received',
> -            'vmstate-loaded' ] }
> +            'vmstate-loaded', 'guest-shutdown' ] }
>  

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-02-22  9:31   ` Zhang Chen
@ 2017-02-23  1:02     ` Hailiang Zhang
  2017-02-23  5:49       ` Zhang Chen
  2017-04-14  5:57     ` Jason Wang
  1 sibling, 1 reply; 42+ messages in thread
From: Hailiang Zhang @ 2017-02-23  1:02 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Jason Wang
  Cc: xuquan8, dgilbert, lizhijian, xiecl.fnst

Hi,

On 2017/2/22 17:31, Zhang Chen wrote:
>
>
> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>> While doing a checkpoint, we need to flush all the unhandled packets.
>> By using the filter notifier mechanism, we can easily notify
>> every compare object to do this process, which runs inside
>> the compare threads as a coroutine.
>
> Hi~ Jason and Hailiang.
>
> I will send a patch set later adding a colo-compare notify mechanism for
> Xen, similar to this patch.
> I want to add a new chardev socket way in colo-compare to connect to Xen
> COLO, for notifying checkpoint or failover, because we have no choice but
> to use this channel to communicate with the Xen code.
> That means we will have two notify mechanisms.
> What do you think about this?
>

I don't think you need another mechanism; what you need to do is
implement a QMP command which calls colo_notify_compares_event().
It will not return until the event (checkpoint or failover) has been
handled by all compares. Would this satisfy your requirement?
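
Something like this, purely as a sketch (the command name is made up, and
the QAPI wiring is omitted):

    /* Hypothetical QMP command for Xen to drive compare events */
    void qmp_x_colo_notify_compares(Error **errp)
    {
        /* Returns only after every compare object has handled the event */
        colo_notify_compares_event(NULL, COLO_CHECKPOINT, errp);
    }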

Thanks,
Hailiang

>
> Thanks
> Zhang Chen
>
>>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>    net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>    net/colo-compare.h | 20 +++++++++++++++
>>    2 files changed, 92 insertions(+)
>>    create mode 100644 net/colo-compare.h
>>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 05/15] COLO: Handle shutdown command for VM in COLO state
  2017-02-22 15:35   ` Eric Blake
@ 2017-02-23  1:15     ` Hailiang Zhang
  0 siblings, 0 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-02-23  1:15 UTC (permalink / raw)
  To: Eric Blake, qemu-devel, dgilbert, zhangchen.fnst
  Cc: xuquan8, Paolo Bonzini, xiecl.fnst, lizhijian

Hi Eric,

On 2017/2/22 23:35, Eric Blake wrote:
> On 02/21/2017 09:42 PM, zhanghailiang wrote:
>> If the VM is in COLO FT state, we need to do some extra work before
>> starting the normal shutdown process.
>>
>> The Secondary VM will ignore the shutdown command if users issue it directly
>> to the Secondary VM. COLO will capture the shutdown command and act on it
>> after the shutdown request from the user.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>> v19:
>> - fix title and comment
>
> Did you miss putting v19 in the subject line?
>

Er, some patches of this series were split out of the previous
v18 version, but most patches of that series have been merged
upstream, so that's why it has a tag here.
I think it is better to remove this comment than to use a
v19 tag for this series.

>
>> +++ b/qapi-schema.json
>> @@ -1157,12 +1157,14 @@
>>   #
>>   # @vmstate-loaded: VM's state has been loaded by SVM.
>>   #
>> +# @guest-shutdown: shutdown require from PVM to SVM
>
> maybe s/require/requested/ ?
>

Yes.

> Missing '(since 2.9)'
>

Will fix this in next version, thanks.

>> +#
>>   # Since: 2.8
>>   ##
>>   { 'enum': 'COLOMessage',
>>     'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
>>               'vmstate-send', 'vmstate-size', 'vmstate-received',
>> -            'vmstate-loaded' ] }
>> +            'vmstate-loaded', 'guest-shutdown' ] }
>>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-02-23  1:02     ` Hailiang Zhang
@ 2017-02-23  5:49       ` Zhang Chen
  0 siblings, 0 replies; 42+ messages in thread
From: Zhang Chen @ 2017-02-23  5:49 UTC (permalink / raw)
  To: Hailiang Zhang, qemu-devel, Jason Wang
  Cc: zhangchen.fnst, xuquan8, dgilbert, lizhijian, xiecl.fnst



On 02/23/2017 09:02 AM, Hailiang Zhang wrote:
> Hi,
>
> On 2017/2/22 17:31, Zhang Chen wrote:
>>
>>
>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>> By using the filter notifier mechanism, we can easily notify
>>> every compare object to do this process, which runs inside
>>> the compare threads as a coroutine.
>>
>> Hi~ Jason and Hailiang.
>>
>> I will send a patch set later adding a colo-compare notify mechanism for
>> Xen, similar to this patch.
>> I want to add a new chardev socket way in colo-compare to connect to Xen
>> COLO, for notifying checkpoint or failover, because we have no choice but
>> to use this channel to communicate with the Xen code.
>> That means we will have two notify mechanisms.
>> What do you think about this?
>>
>
> I don't think you need another mechanism; what you need to do is
> implement a QMP command which calls colo_notify_compares_event().
> It will not return until the event (checkpoint or failover) has been
> handled by all compares. Would this satisfy your requirement?

No, the colo-frame notifying colo-compare by calling
colo_notify_compares_event() is OK, but colo-compare notifying the
colo-frame is a problem under Xen: Xen's colo-frame needs an API that
blocks, with a timeout, to read colo-compare's notification, where the
timeout is the period of the periodic checkpoint. In this patch set,
colo-compare just calls colo_compare_inconsistent_notify(), which is a
non-blocking notification.
We cannot implement a QMP command that Xen polls all the time to get the
status of the notification; QEMU also cannot accept QMP commands being
used for polling.
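
For reference, the Xen-side consumer we have in mind would look roughly like
this (a sketch with assumed names, not real Xen code):

    /* Block on the notify channel, with the checkpoint period as timeout */
    struct pollfd pfd = { .fd = notify_fd, .events = POLLIN };
    int n = poll(&pfd, 1, checkpoint_period_ms);

    if (n > 0) {
        /* colo-compare found inconsistent packets: checkpoint now */
    } else if (n == 0) {
        /* timeout expired: do the periodic checkpoint */
    }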


Thanks
Zhang Chen

>
> Thanks,
> Hailiang
>
>>
>> Thanks
>> Zhang Chen
>>
>>>
>>> Cc: Jason Wang <jasowang@redhat.com>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>> ---
>>>    net/colo-compare.c | 72 
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    net/colo-compare.h | 20 +++++++++++++++
>>>    2 files changed, 92 insertions(+)
>>>    create mode 100644 net/colo-compare.h
>>>
>>
>
>
>
> .
>

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 08/15] ram/COLO: Record the dirty pages that SVM received
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 08/15] ram/COLO: Record the dirty pages that SVM received zhanghailiang
@ 2017-02-23 18:44   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-02-23 18:44 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We record the addresses of the dirty pages that are received;
> this will help with flushing the pages cached in the SVM.
> We record them by re-using the migration dirty bitmap.
> 
> Cc: Juan Quintela <quintela@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/ram.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index b588990..ed3b606 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2231,6 +2231,9 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
>  static inline void *colo_cache_from_block_offset(RAMBlock *block,
>                                                   ram_addr_t offset)
>  {
> +    unsigned long *bitmap;
> +    long k;

You could use a better name than 'k'.

Dave

>      if (!offset_in_ramblock(block, offset)) {
>          return NULL;
>      }
> @@ -2239,6 +2242,17 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
>                       __func__, block->idstr);
>          return NULL;
>      }
> +
> +    k = (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_BITS;
> +    bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
> +    /*
> +    * During a colo checkpoint, we need the bitmap of these migrated pages.
> +    * It helps us decide which pages in the ram cache should be flushed
> +    * into the VM's RAM later.
> +    */
> +    if (!test_and_set_bit(k, bitmap)) {
> +        migration_dirty_pages++;
> +    }
>      return block->colo_cache + offset;
>  }
>  
> @@ -2664,6 +2678,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  int colo_init_ram_cache(void)
>  {
>      RAMBlock *block;
> +    int64_t ram_cache_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>  
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> @@ -2678,6 +2693,15 @@ int colo_init_ram_cache(void)
>      }
>      rcu_read_unlock();
>      ram_cache_enable = true;
> +    /*
> +    * Record the dirty pages sent by the PVM; we use this dirty bitmap
> +    * to decide which pages in the cache should be flushed into the SVM's
> +    * RAM. Here we use the same name 'migration_bitmap_rcu' as for migration.
> +    */
> +    migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
> +    migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
> +    migration_dirty_pages = 0;
> +
>      return 0;
>  
>  out_locked:
> @@ -2695,9 +2719,15 @@ out_locked:
>  void colo_release_ram_cache(void)
>  {
>      RAMBlock *block;
> +    struct BitmapRcu *bitmap = migration_bitmap_rcu;
>  
>      ram_cache_enable = false;
>  
> +    atomic_rcu_set(&migration_bitmap_rcu, NULL);
> +    if (bitmap) {
> +        call_rcu(bitmap, migration_bitmap_free, rcu);
> +    }
> +
>      rcu_read_lock();
>      QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>          if (block->colo_cache) {
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter zhanghailiang
@ 2017-04-07 15:46   ` Dr. David Alan Gilbert
  2017-04-10  7:26     ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-07 15:46 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Jason Wang

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We will use this notifier to help COLO notify the filter objects
> to do something, like performing a checkpoint or processing a failover event.
> 
> Cc: Jason Wang <jasowang@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>  net/colo.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  net/colo.h | 18 ++++++++++++
>  2 files changed, 110 insertions(+)
> 

<..>

> +FilterNotifier *filter_noitifier_new(FilterNotifierCallback *cb,
                          ^^^^^^^^^ Typo - no*i*tifier

(I've not looked at this patch much, I'll leave networking stuff to Jason)

Dave

> +                    void *opaque, Error **errp)
> +{
> +    FilterNotifier *notify;
> +    int ret;
> +
> +    notify = (FilterNotifier *)g_source_new(&notifier_source_funcs,
> +                sizeof(FilterNotifier));
> +    ret = event_notifier_init(&notify->event, false);
> +    if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Failed to initialize event notifier");
> +        goto fail;
> +    }
> +    notify->pfd.fd = event_notifier_get_fd(&notify->event);
> +    notify->pfd.events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> +    notify->cb = cb;
> +    notify->opaque = opaque;
> +    g_source_add_poll(&notify->source, &notify->pfd);
> +
> +    return notify;
> +
> +fail:
> +    g_source_destroy(&notify->source);
> +    return NULL;
> +}
> +
> +int filter_notifier_set(FilterNotifier *notify, uint64_t value)
> +{
> +    ssize_t ret;
> +
> +    do {
> +        ret = write(notify->event.wfd, &value, sizeof(value));
> +    } while (ret < 0 && errno == EINTR);
> +
> +    /* EAGAIN is fine, a read must be pending.  */
> +    if (ret < 0 && errno != EAGAIN) {
> +        return -errno;
> +    }
> +    return 0;
> +}
> diff --git a/net/colo.h b/net/colo.h
> index cd9027f..00f03b5 100644
> --- a/net/colo.h
> +++ b/net/colo.h
> @@ -19,6 +19,7 @@
>  #include "qemu/jhash.h"
>  #include "qemu/timer.h"
>  #include "slirp/tcp.h"
> +#include "qemu/event_notifier.h"
>  
>  #define HASHTABLE_MAX_SIZE 16384
>  
> @@ -89,4 +90,21 @@ void connection_hashtable_reset(GHashTable *connection_track_table);
>  Packet *packet_new(const void *data, int size);
>  void packet_destroy(void *opaque, void *user_data);
>  
> +typedef void FilterNotifierCallback(void *opaque, int value);
> +typedef struct FilterNotifier {
> +    GSource source;
> +    EventNotifier event;
> +    GPollFD pfd;
> +    FilterNotifierCallback *cb;
> +    void *opaque;
> +} FilterNotifier;
> +
> +FilterNotifier *filter_noitifier_new(FilterNotifierCallback *cb,
> +                    void *opaque, Error **errp);
> +int filter_notifier_set(FilterNotifier *notify, uint64_t value);
> +
> +enum {
> +    COLO_CHECKPOINT = 2,
> +    COLO_FAILOVER,
> +};
>  #endif /* QEMU_COLO_PROXY_H */
> -- 
> 1.8.3.1
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 04/15] COLO: integrate colo compare with colo frame
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 04/15] COLO: integrate colo compare with colo frame zhanghailiang
@ 2017-04-07 15:59   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-07 15:59 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Jason Wang

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> For COLO FT, both the PVM and SVM run at the same time,
> and the state is only synced when needed.
> 
> So here, let the SVM run while not doing a checkpoint.
> Besides, change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100.
> 
> Cc: Jason Wang <jasowang@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/colo.c      | 25 +++++++++++++++++++++++++
>  migration/migration.c |  2 +-
>  2 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 712308e..fb8d8fd 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -19,8 +19,11 @@
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "migration/failover.h"
> +#include "net/colo-compare.h"
> +#include "net/colo.h"
>  
>  static bool vmstate_loading;
> +static Notifier packets_compare_notifier;
>  
>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>  
> @@ -263,6 +266,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>      if (local_err) {
>          goto out;
>      }
> +
>      /* Reset channel-buffer directly */
>      qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL);
>      bioc->usage = 0;
> @@ -283,6 +287,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>          goto out;
>      }
>  
> +    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
>      /* Disable block migration */
>      s->params.blk = 0;
>      s->params.shared = 0;
> @@ -341,6 +350,11 @@ out:
>      return ret;
>  }
>  
> +static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
> +{
> +    colo_checkpoint_notify(data);
> +}
> +
>  static void colo_process_checkpoint(MigrationState *s)
>  {
>      QIOChannelBuffer *bioc;
> @@ -357,6 +371,9 @@ static void colo_process_checkpoint(MigrationState *s)
>          goto out;
>      }
>  
> +    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
> +    colo_compare_register_notifier(&packets_compare_notifier);
> +
>      /*
>       * Wait for Secondary finish loading VM states and enter COLO
>       * restore.
> @@ -402,6 +419,7 @@ out:
>          qemu_fclose(fb);
>      }
>  
> +    colo_compare_unregister_notifier(&packets_compare_notifier);
>      timer_del(s->colo_delay_timer);
>  
>      /* Hope this not to be too long to wait here */
> @@ -518,6 +536,11 @@ void *colo_process_incoming_thread(void *opaque)
>              goto out;
>          }
>  
> +        qemu_mutex_lock_iothread();
> +        vm_stop_force_state(RUN_STATE_COLO);
> +        trace_colo_vm_state_change("run", "stop");
> +        qemu_mutex_unlock_iothread();
> +
>          /* FIXME: This is unnecessary for periodic checkpoint mode */
>          colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
>                       &local_err);
> @@ -571,6 +594,8 @@ void *colo_process_incoming_thread(void *opaque)
>          }
>  
>          vmstate_loading = false;
> +        vm_start();
> +        trace_colo_vm_state_change("stop", "run");
>          qemu_mutex_unlock_iothread();
>  
>          if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
> diff --git a/migration/migration.c b/migration/migration.c
> index c6ae69d..2339be7 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -66,7 +66,7 @@
>  /* The delay time (in ms) between two COLO checkpoints
>   * Note: Please change this default value to 10000 when we support hybrid mode.
>   */
> -#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
> +#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>  
>  static NotifierList migration_state_notifiers =
>      NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 07/15] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 07/15] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
@ 2017-04-07 17:06   ` Dr. David Alan Gilbert
  2017-04-10  7:31     ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-07 17:06 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We should not load the PVM's state directly into the SVM, because errors may
> happen while the SVM is receiving data, which would break the SVM.
> 
> We need to ensure all data has been received before we load the state into
> the SVM, so we use extra memory to cache this data (the PVM's RAM). The ram
> cache on the secondary side is initially the same as the SVM's/PVM's memory.
> In the process of a checkpoint, we first cache the PVM's dirty pages into
> this ram cache, so the ram cache is always the same as the PVM's memory at
> every checkpoint; then we flush this cached ram to the SVM after we receive
> all of the PVM's state.

You're probably going to find it interesting to merge this with Juan's recent RAM block series.
Probably not too hard, but he's touching a lot of the same code and rearranging things.

Dave


> Cc: Juan Quintela <quintela@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  include/exec/ram_addr.h       |  1 +
>  include/migration/migration.h |  4 +++
>  migration/colo.c              | 14 +++++++++
>  migration/ram.c               | 73 ++++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 91 insertions(+), 1 deletion(-)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 3e79466..44e1190 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -27,6 +27,7 @@ struct RAMBlock {
>      struct rcu_head rcu;
>      struct MemoryRegion *mr;
>      uint8_t *host;
> +    uint8_t *colo_cache; /* For colo, VM's ram cache */
>      ram_addr_t offset;
>      ram_addr_t used_length;
>      ram_addr_t max_length;
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 1735d66..93c6148 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -379,4 +379,8 @@ int ram_save_queue_pages(MigrationState *ms, const char *rbname,
>  PostcopyState postcopy_state_get(void);
>  /* Set the state and return the old state */
>  PostcopyState postcopy_state_set(PostcopyState new_state);
> +
> +/* ram cache */
> +int colo_init_ram_cache(void);
> +void colo_release_ram_cache(void);
>  #endif
> diff --git a/migration/colo.c b/migration/colo.c
> index 1e3e975..edb7f00 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -551,6 +551,7 @@ void *colo_process_incoming_thread(void *opaque)
>      uint64_t total_size;
>      uint64_t value;
>      Error *local_err = NULL;
> +    int ret;
>  
>      qemu_sem_init(&mis->colo_incoming_sem, 0);
>  
> @@ -572,6 +573,12 @@ void *colo_process_incoming_thread(void *opaque)
>       */
>      qemu_file_set_blocking(mis->from_src_file, true);
>  
> +    ret = colo_init_ram_cache();
> +    if (ret < 0) {
> +        error_report("Failed to initialize ram cache");
> +        goto out;
> +    }
> +
>      bioc = qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE);
>      fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
>      object_unref(OBJECT(bioc));
> @@ -705,11 +712,18 @@ out:
>      if (fb) {
>          qemu_fclose(fb);
>      }
> +    /*
> +     * We can be sure the BH holds the global lock, and will join the COLO
> +     * incoming thread, so it is not necessary to take the lock here again,
> +     * or there would be a deadlock error.
> +     */
> +    colo_release_ram_cache();
>  
>      /* Hope this not to be too long to loop here */
>      qemu_sem_wait(&mis->colo_incoming_sem);
>      qemu_sem_destroy(&mis->colo_incoming_sem);
>      /* Must be called after failover BH is completed */
> +
>      if (mis->to_src_file) {
>          qemu_fclose(mis->to_src_file);
>      }
> diff --git a/migration/ram.c b/migration/ram.c
> index f289fcd..b588990 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -219,6 +219,7 @@ static RAMBlock *last_sent_block;
>  static ram_addr_t last_offset;
>  static QemuMutex migration_bitmap_mutex;
>  static uint64_t migration_dirty_pages;
> +static bool ram_cache_enable;
>  static uint32_t last_version;
>  static bool ram_bulk_stage;
>  
> @@ -2227,6 +2228,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
>      return block->host + offset;
>  }
>  
> +static inline void *colo_cache_from_block_offset(RAMBlock *block,
> +                                                 ram_addr_t offset)
> +{
> +    if (!offset_in_ramblock(block, offset)) {
> +        return NULL;
> +    }
> +    if (!block->colo_cache) {
> +        error_report("%s: colo_cache is NULL in block :%s",
> +                     __func__, block->idstr);
> +        return NULL;
> +    }
> +    return block->colo_cache + offset;
> +}
> +
>  /*
>   * If a page (or a whole RDMA chunk) has been
>   * determined to be zero, then zap it.
> @@ -2542,7 +2557,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                       RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
>              RAMBlock *block = ram_block_from_stream(f, flags);
>  
> -            host = host_from_ram_block_offset(block, addr);
> +            /* After going into COLO, we should load the Page into colo_cache */
> +            if (ram_cache_enable) {
> +                host = colo_cache_from_block_offset(block, addr);
> +            } else {
> +                host = host_from_ram_block_offset(block, addr);
> +            }
>              if (!host) {
>                  error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
>                  ret = -EINVAL;
> @@ -2637,6 +2657,57 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>      return ret;
>  }
>  
> +/*
> + * colo cache: this is for the secondary VM; we cache the whole
> + * memory of the secondary VM. This is called after the first migration.
> + */
> +int colo_init_ram_cache(void)
> +{
> +    RAMBlock *block;
> +
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL);
> +        if (!block->colo_cache) {
> +            error_report("%s: Can't alloc memory for COLO cache of block %s,"
> +                         "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
> +                         block->used_length);
> +            goto out_locked;
> +        }
> +        memcpy(block->colo_cache, block->host, block->used_length);
> +    }
> +    rcu_read_unlock();
> +    ram_cache_enable = true;
> +    return 0;
> +
> +out_locked:
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (block->colo_cache) {
> +            qemu_anon_ram_free(block->colo_cache, block->used_length);
> +            block->colo_cache = NULL;
> +        }
> +    }
> +
> +    rcu_read_unlock();
> +    return -errno;
> +}
> +
> +void colo_release_ram_cache(void)
> +{
> +    RAMBlock *block;
> +
> +    ram_cache_enable = false;
> +
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        if (block->colo_cache) {
> +            qemu_anon_ram_free(block->colo_cache, block->used_length);
> +            block->colo_cache = NULL;
> +        }
> +    }
> +    rcu_read_unlock();
> +}
> +
>  static SaveVMHandlers savevm_ram_handlers = {
>      .save_live_setup = ram_save_setup,
>      .save_live_iterate = ram_save_iterate,
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm zhanghailiang
@ 2017-04-07 17:18   ` Dr. David Alan Gilbert
  2017-04-10  8:26     ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-07 17:18 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> There are several stages during the loadvm/savevm process, and in each
> stage the migration incoming side processes different types of sections.
> We want to control these stages more precisely, which will benefit COLO
> performance: we don't have to save QEMU_VM_SECTION_START type
> sections every time we do a checkpoint. Besides, we want to separate
> the process of saving/loading memory from that of the device state.
> 
> So we add three new helper functions: qemu_loadvm_state_begin(),
> qemu_load_device_state() and qemu_savevm_live_state() to handle
> these different stages during migration.
> 
> Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
> public.
> 
> Cc: Juan Quintela <quintela@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  include/sysemu/sysemu.h |  6 ++++++
>  migration/savevm.c      | 55 ++++++++++++++++++++++++++++++++++++++++++-------
>  2 files changed, 54 insertions(+), 7 deletions(-)
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 7ed665a..95cae41 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
>                                             uint64_t *start_list,
>                                             uint64_t *length_list);
>  
> +void qemu_savevm_live_state(QEMUFile *f);
> +int qemu_save_device_state(QEMUFile *f);
> +
>  int qemu_loadvm_state(QEMUFile *f);
> +int qemu_loadvm_state_begin(QEMUFile *f);
> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> +int qemu_load_device_state(QEMUFile *f);
>  
>  extern int autostart;
>  
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 9c2d239..dac478b 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -54,6 +54,7 @@
>  #include "qemu/cutils.h"
>  #include "io/channel-buffer.h"
>  #include "io/channel-file.h"
> +#include "migration/colo.h"
>  
>  #ifndef ETH_P_RARP
>  #define ETH_P_RARP 0x8035
> @@ -1279,13 +1280,21 @@ done:
>      return ret;
>  }
>  
> -static int qemu_save_device_state(QEMUFile *f)
> +void qemu_savevm_live_state(QEMUFile *f)
>  {
> -    SaveStateEntry *se;
> +    /* save QEMU_VM_SECTION_END section */
> +    qemu_savevm_state_complete_precopy(f, true);
> +    qemu_put_byte(f, QEMU_VM_EOF);
> +}
>  
> -    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
> -    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
> +int qemu_save_device_state(QEMUFile *f)
> +{
> +    SaveStateEntry *se;
>  
> +    if (!migration_in_colo_state()) {
> +        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
> +        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
> +    }

Note that got split out into qemu_savevm_state_header() at some point.

Dave

>      cpu_synchronize_all_states();
>  
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> @@ -1336,8 +1345,6 @@ enum LoadVMExitCodes {
>      LOADVM_QUIT     =  1,
>  };
>  
> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> -
>  /* ------ incoming postcopy messages ------ */
>  /* 'advise' arrives before any transfers just to tell us that a postcopy
>   * *might* happen - it might be skipped if precopy transferred everything
> @@ -1942,7 +1949,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
>      return 0;
>  }
>  
> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
>  {
>      uint8_t section_type;
>      int ret = 0;
> @@ -2080,6 +2087,40 @@ int qemu_loadvm_state(QEMUFile *f)
>      return ret;
>  }
>  
> +int qemu_loadvm_state_begin(QEMUFile *f)
> +{
> +    MigrationIncomingState *mis = migration_incoming_get_current();
> +    Error *local_err = NULL;
> +    int ret;
> +
> +    if (qemu_savevm_state_blocked(&local_err)) {
> +        error_report_err(local_err);
> +        return -EINVAL;
> +    }
> +    /* Load QEMU_VM_SECTION_START section */
> +    ret = qemu_loadvm_state_main(f, mis);
> +    if (ret < 0) {
> +        error_report("Failed to loadvm begin work: %d", ret);
> +    }
> +    return ret;
> +}
> +
> +int qemu_load_device_state(QEMUFile *f)
> +{
> +    MigrationIncomingState *mis = migration_incoming_get_current();
> +    int ret;
> +
> +    /* Load QEMU_VM_SECTION_FULL section */
> +    ret = qemu_loadvm_state_main(f, mis);
> +    if (ret < 0) {
> +        error_report("Failed to load device state: %d", ret);
> +        return ret;
> +    }
> +
> +    cpu_synchronize_all_post_init();
> +    return 0;
> +}
> +
>  int save_vmstate(Monitor *mon, const char *name)
>  {
>      BlockDriverState *bs, *bs1;
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache
  2017-02-22  3:42 ` [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache zhanghailiang
@ 2017-04-07 17:39   ` Dr. David Alan Gilbert
  2017-04-10  7:13     ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-07 17:39 UTC (permalink / raw)
  To: zhanghailiang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> We don't need to flush all of the VM's RAM from the cache, only
> the pages dirtied since the last checkpoint.
> 
> Cc: Juan Quintela <quintela@redhat.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>  migration/ram.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 6227b94..e9ba740 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2702,6 +2702,7 @@ int colo_init_ram_cache(void)
>      migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>      migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
>      migration_dirty_pages = 0;
> +    memory_global_dirty_log_start();

Shouldn't there be a stop somewhere?
(Probably if you failover to the secondary and colo stops?)

>      return 0;
>  
> @@ -2750,6 +2751,15 @@ void colo_flush_ram_cache(void)
>      void *src_host;
>      ram_addr_t offset = 0;
>  
> +    memory_global_dirty_log_sync();
> +    qemu_mutex_lock(&migration_bitmap_mutex);
> +    rcu_read_lock();
> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +        migration_bitmap_sync_range(block->offset, block->used_length);
> +    }
> +    rcu_read_unlock();
> +    qemu_mutex_unlock(&migration_bitmap_mutex);

Again this might have some fun merging with Juan's recent changes - what's
really unusual about your set is that you're using this bitmap on the destination;
I suspect Juan's recent changes make that trickier.
Check 'Creating RAMState for migration' and 'Split migration bitmaps by ramblock'.

Dave
>      trace_colo_flush_ram_cache_begin(migration_dirty_pages);
>      rcu_read_lock();
>      block = QLIST_FIRST_RCU(&ram_list.blocks);
> -- 
> 1.8.3.1
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache
  2017-04-07 17:39   ` Dr. David Alan Gilbert
@ 2017-04-10  7:13     ` Hailiang Zhang
  0 siblings, 0 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-10  7:13 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: xuquan8, qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst,
	Juan Quintela

On 2017/4/8 1:39, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We don't need to flush all of the VM's RAM from the cache, only
>> the pages dirtied since the last checkpoint.
>>
>> Cc: Juan Quintela <quintela@redhat.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   migration/ram.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 6227b94..e9ba740 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -2702,6 +2702,7 @@ int colo_init_ram_cache(void)
>>       migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>>       migration_bitmap_rcu->bmap = bitmap_new(ram_cache_pages);
>>       migration_dirty_pages = 0;
>> +    memory_global_dirty_log_start();
> Shouldn't there be a stop somewhere?
> (Probably if you failover to the secondary and colo stops?)

Ha, good catch, I forgot to stop the dirty log on the secondary side.
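For reference, a minimal sketch of where the matching stop could go, reusing colo_release_ram_cache() from patch 07 (the exact placement is an assumption, not the posted fix):

void colo_release_ram_cache(void)
{
    RAMBlock *block;

    ram_cache_enable = false;
    /* Pair with the memory_global_dirty_log_start() in colo_init_ram_cache():
     * stop dirty-page tracking once COLO stops, e.g. after a failover to
     * the secondary. */
    memory_global_dirty_log_stop();

    rcu_read_lock();
    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
        if (block->colo_cache) {
            qemu_anon_ram_free(block->colo_cache, block->used_length);
            block->colo_cache = NULL;
        }
    }
    rcu_read_unlock();
}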

>>       return 0;
>>   
>> @@ -2750,6 +2751,15 @@ void colo_flush_ram_cache(void)
>>       void *src_host;
>>       ram_addr_t offset = 0;
>>   
>> +    memory_global_dirty_log_sync();
>> +    qemu_mutex_lock(&migration_bitmap_mutex);
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        migration_bitmap_sync_range(block->offset, block->used_length);
>> +    }
>> +    rcu_read_unlock();
>> +    qemu_mutex_unlock(&migration_bitmap_mutex);
> Again this might have some fun merging with Juan's recent changes - what's
> really unusual about your set is that you're using this bitmap on the destination;
> I suspect Juan's recent changes make that trickier.
> Check 'Creating RAMState for migration' and 'Split migration bitmaps by ramblock'.

I have reviewed these two series, and I think it's not a big problem
for COLO here; we can still re-use most of the code.

Thanks,
Hailiang

> Dave
>>       trace_colo_flush_ram_cache_begin(migration_dirty_pages);
>>       rcu_read_lock();
>>       block = QLIST_FIRST_RCU(&ram_list.blocks);
>> -- 
>> 1.8.3.1
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter
  2017-04-07 15:46   ` Dr. David Alan Gilbert
@ 2017-04-10  7:26     ` Hailiang Zhang
  0 siblings, 0 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-10  7:26 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: xuquan8, qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Jason Wang

On 2017/4/7 23:46, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We will use this notifier to let COLO notify filter objects
>> to do something, like performing a checkpoint or processing a failover event.
>>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   net/colo.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/colo.h | 18 ++++++++++++
>>   2 files changed, 110 insertions(+)
>>
> <..>
>
>> +FilterNotifier *filter_noitifier_new(FilterNotifierCallback *cb,
>                            ^^^^^^^^^ Typo - no*i*tifier

Good catch, I will fix it in the next version.
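For context, a minimal usage sketch of the helper pair once the name is fixed; checkpoint_cb and the filter state s are hypothetical placeholders, not code from this series:

Error *local_err = NULL;

/* Filter side: create the notifier. checkpoint_cb(opaque, value) runs
 * whenever somebody kicks the underlying eventfd. */
FilterNotifier *notify = filter_notifier_new(checkpoint_cb, s, &local_err);
if (!notify) {
    error_report_err(local_err);
    return;
}

/* COLO frame side: wake the filter, e.g. when a checkpoint starts. */
filter_notifier_set(notify, COLO_CHECKPOINT);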

> (I've not looked at this patch much, I'll leave networking stuff to Jason)

OK, thanks.

> Dave
>
>> +                    void *opaque, Error **errp)
>> +{
>> +    FilterNotifier *notify;
>> +    int ret;
>> +
>> +    notify = (FilterNotifier *)g_source_new(&notifier_source_funcs,
>> +                sizeof(FilterNotifier));
>> +    ret = event_notifier_init(&notify->event, false);
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "Failed to initialize event notifier");
>> +        goto fail;
>> +    }
>> +    notify->pfd.fd = event_notifier_get_fd(&notify->event);
>> +    notify->pfd.events = G_IO_IN | G_IO_HUP | G_IO_ERR;
>> +    notify->cb = cb;
>> +    notify->opaque = opaque;
>> +    g_source_add_poll(&notify->source, &notify->pfd);
>> +
>> +    return notify;
>> +
>> +fail:
>> +    g_source_destroy(&notify->source);
>> +    return NULL;
>> +}
>> +
>> +int filter_notifier_set(FilterNotifier *notify, uint64_t value)
>> +{
>> +    ssize_t ret;
>> +
>> +    do {
>> +        ret = write(notify->event.wfd, &value, sizeof(value));
>> +    } while (ret < 0 && errno == EINTR);
>> +
>> +    /* EAGAIN is fine, a read must be pending.  */
>> +    if (ret < 0 && errno != EAGAIN) {
>> +        return -errno;
>> +    }
>> +    return 0;
>> +}
>> diff --git a/net/colo.h b/net/colo.h
>> index cd9027f..00f03b5 100644
>> --- a/net/colo.h
>> +++ b/net/colo.h
>> @@ -19,6 +19,7 @@
>>   #include "qemu/jhash.h"
>>   #include "qemu/timer.h"
>>   #include "slirp/tcp.h"
>> +#include "qemu/event_notifier.h"
>>   
>>   #define HASHTABLE_MAX_SIZE 16384
>>   
>> @@ -89,4 +90,21 @@ void connection_hashtable_reset(GHashTable *connection_track_table);
>>   Packet *packet_new(const void *data, int size);
>>   void packet_destroy(void *opaque, void *user_data);
>>   
>> +typedef void FilterNotifierCallback(void *opaque, int value);
>> +typedef struct FilterNotifier {
>> +    GSource source;
>> +    EventNotifier event;
>> +    GPollFD pfd;
>> +    FilterNotifierCallback *cb;
>> +    void *opaque;
>> +} FilterNotifier;
>> +
>> +FilterNotifier *filter_noitifier_new(FilterNotifierCallback *cb,
>> +                    void *opaque, Error **errp);
>> +int filter_notifier_set(FilterNotifier *notify, uint64_t value);
>> +
>> +enum {
>> +    COLO_CHECKPOINT = 2,
>> +    COLO_FAILOVER,
>> +};
>>   #endif /* QEMU_COLO_PROXY_H */
>> -- 
>> 1.8.3.1
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 07/15] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
  2017-04-07 17:06   ` Dr. David Alan Gilbert
@ 2017-04-10  7:31     ` Hailiang Zhang
  0 siblings, 0 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-10  7:31 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: xuquan8, qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst,
	Juan Quintela

On 2017/4/8 1:06, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> We should not load the PVM's state directly into the SVM, because errors
>> may happen while the SVM is receiving data, which would break the SVM.
>>
>> We need to ensure we have received all the data before loading the state
>> into the SVM. We use extra memory to cache this data (the PVM's RAM). The
>> RAM cache on the secondary side is initially the same as the SVM's/PVM's
>> memory. During each checkpoint, we first cache the PVM's dirty pages in
>> this RAM cache, so the cache matches the PVM's memory at every checkpoint;
>> then we flush this cached RAM into the SVM after we receive all of the
>> PVM's state.
> You're probably going to find this interesting to merge with Juan's recent RAM block series.
> Probably not too hard, but he's touching a lot of the same code and rearranging things.

Yes, I'll update this series on top of his; it's better to send the next version after his series has been merged.

Thanks,
Hailiang

> Dave
>
>
>> Cc: Juan Quintela <quintela@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>>   include/exec/ram_addr.h       |  1 +
>>   include/migration/migration.h |  4 +++
>>   migration/colo.c              | 14 +++++++++
>>   migration/ram.c               | 73 ++++++++++++++++++++++++++++++++++++++++++-
>>   4 files changed, 91 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>> index 3e79466..44e1190 100644
>> --- a/include/exec/ram_addr.h
>> +++ b/include/exec/ram_addr.h
>> @@ -27,6 +27,7 @@ struct RAMBlock {
>>       struct rcu_head rcu;
>>       struct MemoryRegion *mr;
>>       uint8_t *host;
>> +    uint8_t *colo_cache; /* For colo, VM's ram cache */
>>       ram_addr_t offset;
>>       ram_addr_t used_length;
>>       ram_addr_t max_length;
>> diff --git a/include/migration/migration.h b/include/migration/migration.h
>> index 1735d66..93c6148 100644
>> --- a/include/migration/migration.h
>> +++ b/include/migration/migration.h
>> @@ -379,4 +379,8 @@ int ram_save_queue_pages(MigrationState *ms, const char *rbname,
>>   PostcopyState postcopy_state_get(void);
>>   /* Set the state and return the old state */
>>   PostcopyState postcopy_state_set(PostcopyState new_state);
>> +
>> +/* ram cache */
>> +int colo_init_ram_cache(void);
>> +void colo_release_ram_cache(void);
>>   #endif
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 1e3e975..edb7f00 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -551,6 +551,7 @@ void *colo_process_incoming_thread(void *opaque)
>>       uint64_t total_size;
>>       uint64_t value;
>>       Error *local_err = NULL;
>> +    int ret;
>>   
>>       qemu_sem_init(&mis->colo_incoming_sem, 0);
>>   
>> @@ -572,6 +573,12 @@ void *colo_process_incoming_thread(void *opaque)
>>        */
>>       qemu_file_set_blocking(mis->from_src_file, true);
>>   
>> +    ret = colo_init_ram_cache();
>> +    if (ret < 0) {
>> +        error_report("Failed to initialize ram cache");
>> +        goto out;
>> +    }
>> +
>>       bioc = qio_channel_buffer_new(COLO_BUFFER_BASE_SIZE);
>>       fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
>>       object_unref(OBJECT(bioc));
>> @@ -705,11 +712,18 @@ out:
>>       if (fb) {
>>           qemu_fclose(fb);
>>       }
>> +    /*
>> +     * We can ensure the BH holds the global lock and will join the COLO
>> +     * incoming thread, so it is not necessary to take the lock again here,
>> +     * or there would be a deadlock.
>> +     */
>> +    colo_release_ram_cache();
>>   
>>       /* Hope this not to be too long to loop here */
>>       qemu_sem_wait(&mis->colo_incoming_sem);
>>       qemu_sem_destroy(&mis->colo_incoming_sem);
>>       /* Must be called after failover BH is completed */
>> +
>>       if (mis->to_src_file) {
>>           qemu_fclose(mis->to_src_file);
>>       }
>> diff --git a/migration/ram.c b/migration/ram.c
>> index f289fcd..b588990 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -219,6 +219,7 @@ static RAMBlock *last_sent_block;
>>   static ram_addr_t last_offset;
>>   static QemuMutex migration_bitmap_mutex;
>>   static uint64_t migration_dirty_pages;
>> +static bool ram_cache_enable;
>>   static uint32_t last_version;
>>   static bool ram_bulk_stage;
>>   
>> @@ -2227,6 +2228,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
>>       return block->host + offset;
>>   }
>>   
>> +static inline void *colo_cache_from_block_offset(RAMBlock *block,
>> +                                                 ram_addr_t offset)
>> +{
>> +    if (!offset_in_ramblock(block, offset)) {
>> +        return NULL;
>> +    }
>> +    if (!block->colo_cache) {
>> +        error_report("%s: colo_cache is NULL in block :%s",
>> +                     __func__, block->idstr);
>> +        return NULL;
>> +    }
>> +    return block->colo_cache + offset;
>> +}
>> +
>>   /*
>>    * If a page (or a whole RDMA chunk) has been
>>    * determined to be zero, then zap it.
>> @@ -2542,7 +2557,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                        RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
>>               RAMBlock *block = ram_block_from_stream(f, flags);
>>   
>> -            host = host_from_ram_block_offset(block, addr);
>> +            /* After going into COLO, we should load the Page into colo_cache */
>> +            if (ram_cache_enable) {
>> +                host = colo_cache_from_block_offset(block, addr);
>> +            } else {
>> +                host = host_from_ram_block_offset(block, addr);
>> +            }
>>               if (!host) {
>>                   error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
>>                   ret = -EINVAL;
>> @@ -2637,6 +2657,57 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>       return ret;
>>   }
>>   
>> +/*
>> + * colo cache: this is for secondary VM, we cache the whole
>> + * memory of the secondary VM, it will be called after first migration.
>> + */
>> +int colo_init_ram_cache(void)
>> +{
>> +    RAMBlock *block;
>> +
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL);
>> +        if (!block->colo_cache) {
>> +            error_report("%s: Can't alloc memory for COLO cache of block %s,"
>> +                         "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
>> +                         block->used_length);
>> +            goto out_locked;
>> +        }
>> +        memcpy(block->colo_cache, block->host, block->used_length);
>> +    }
>> +    rcu_read_unlock();
>> +    ram_cache_enable = true;
>> +    return 0;
>> +
>> +out_locked:
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (block->colo_cache) {
>> +            qemu_anon_ram_free(block->colo_cache, block->used_length);
>> +            block->colo_cache = NULL;
>> +        }
>> +    }
>> +
>> +    rcu_read_unlock();
>> +    return -errno;
>> +}
>> +
>> +void colo_release_ram_cache(void)
>> +{
>> +    RAMBlock *block;
>> +
>> +    ram_cache_enable = false;
>> +
>> +    rcu_read_lock();
>> +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>> +        if (block->colo_cache) {
>> +            qemu_anon_ram_free(block->colo_cache, block->used_length);
>> +            block->colo_cache = NULL;
>> +        }
>> +    }
>> +    rcu_read_unlock();
>> +}
>> +
>>   static SaveVMHandlers savevm_ram_handlers = {
>>       .save_live_setup = ram_save_setup,
>>       .save_live_iterate = ram_save_iterate,
>> -- 
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm
  2017-04-07 17:18   ` Dr. David Alan Gilbert
@ 2017-04-10  8:26     ` Hailiang Zhang
  2017-04-20  9:09       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-10  8:26 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

On 2017/4/8 1:18, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> There are several stages in the loadvm/savevm process, and in different
>> stages the incoming migration processes different types of sections.
>> We want to control these stages more accurately; it will benefit COLO
>> performance, since we don't have to save QEMU_VM_SECTION_START sections
>> every time we take a checkpoint. Besides, we want to separate the
>> process of saving/loading memory and device state.
>>
>> So we add three new helper functions: qemu_loadvm_state_begin(),
>> qemu_load_device_state() and qemu_savevm_live_state() to handle the
>> different stages during migration.
>>
>> Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
>> public.
>>
>> Cc: Juan Quintela <quintela@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>>   include/sysemu/sysemu.h |  6 ++++++
>>   migration/savevm.c      | 55 ++++++++++++++++++++++++++++++++++++++++++-------
>>   2 files changed, 54 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>> index 7ed665a..95cae41 100644
>> --- a/include/sysemu/sysemu.h
>> +++ b/include/sysemu/sysemu.h
>> @@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
>>                                              uint64_t *start_list,
>>                                              uint64_t *length_list);
>>   
>> +void qemu_savevm_live_state(QEMUFile *f);
>> +int qemu_save_device_state(QEMUFile *f);
>> +
>>   int qemu_loadvm_state(QEMUFile *f);
>> +int qemu_loadvm_state_begin(QEMUFile *f);
>> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>> +int qemu_load_device_state(QEMUFile *f);
>>   
>>   extern int autostart;
>>   
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 9c2d239..dac478b 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -54,6 +54,7 @@
>>   #include "qemu/cutils.h"
>>   #include "io/channel-buffer.h"
>>   #include "io/channel-file.h"
>> +#include "migration/colo.h"
>>   
>>   #ifndef ETH_P_RARP
>>   #define ETH_P_RARP 0x8035
>> @@ -1279,13 +1280,21 @@ done:
>>       return ret;
>>   }
>>   
>> -static int qemu_save_device_state(QEMUFile *f)
>> +void qemu_savevm_live_state(QEMUFile *f)
>>   {
>> -    SaveStateEntry *se;
>> +    /* save QEMU_VM_SECTION_END section */
>> +    qemu_savevm_state_complete_precopy(f, true);
>> +    qemu_put_byte(f, QEMU_VM_EOF);
>> +}
>>   
>> -    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
>> -    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
>> +int qemu_save_device_state(QEMUFile *f)
>> +{
>> +    SaveStateEntry *se;
>>   
>> +    if (!migration_in_colo_state()) {
>> +        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
>> +        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
>> +    }
> Note that this got split out into qemu_savevm_state_header() at some point.

Do you mean I should use the wrapper qemu_savevm_state_header() here?

> Dave
>
>>       cpu_synchronize_all_states();
>>   
>>       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>> @@ -1336,8 +1345,6 @@ enum LoadVMExitCodes {
>>       LOADVM_QUIT     =  1,
>>   };
>>   
>> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>> -
>>   /* ------ incoming postcopy messages ------ */
>>   /* 'advise' arrives before any transfers just to tell us that a postcopy
>>    * *might* happen - it might be skipped if precopy transferred everything
>> @@ -1942,7 +1949,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
>>       return 0;
>>   }
>>   
>> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
>> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
>>   {
>>       uint8_t section_type;
>>       int ret = 0;
>> @@ -2080,6 +2087,40 @@ int qemu_loadvm_state(QEMUFile *f)
>>       return ret;
>>   }
>>   
>> +int qemu_loadvm_state_begin(QEMUFile *f)
>> +{
>> +    MigrationIncomingState *mis = migration_incoming_get_current();
>> +    Error *local_err = NULL;
>> +    int ret;
>> +
>> +    if (qemu_savevm_state_blocked(&local_err)) {
>> +        error_report_err(local_err);
>> +        return -EINVAL;
>> +    }
>> +    /* Load QEMU_VM_SECTION_START section */
>> +    ret = qemu_loadvm_state_main(f, mis);
>> +    if (ret < 0) {
>> +        error_report("Failed to loadvm begin work: %d", ret);
>> +    }
>> +    return ret;
>> +}
>> +
>> +int qemu_load_device_state(QEMUFile *f)
>> +{
>> +    MigrationIncomingState *mis = migration_incoming_get_current();
>> +    int ret;
>> +
>> +    /* Load QEMU_VM_SECTION_FULL section */
>> +    ret = qemu_loadvm_state_main(f, mis);
>> +    if (ret < 0) {
>> +        error_report("Failed to load device state: %d", ret);
>> +        return ret;
>> +    }
>> +
>> +    cpu_synchronize_all_post_init();
>> +    return 0;
>> +}
>> +
>>   int save_vmstate(Monitor *mon, const char *name)
>>   {
>>       BlockDriverState *bs, *bs1;
>> -- 
>> 1.8.3.1
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-02-22  9:31   ` Zhang Chen
  2017-02-23  1:02     ` Hailiang Zhang
@ 2017-04-14  5:57     ` Jason Wang
  2017-04-14  6:22       ` Hailiang Zhang
  1 sibling, 1 reply; 42+ messages in thread
From: Jason Wang @ 2017-04-14  5:57 UTC (permalink / raw)
  To: Zhang Chen, zhanghailiang, qemu-devel; +Cc: dgilbert, lizhijian, xiecl.fnst



On 2017年02月22日 17:31, Zhang Chen wrote:
>
>
> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>> While doing a checkpoint, we need to flush all the unhandled packets.
>> By using the filter notifier mechanism, we can easily notify
>> every compare object to do this process, which runs inside
>> the compare threads as a coroutine.
>
> Hi~ Jason and Hailiang.
>
> I will send a patch set later about a colo-compare notify mechanism for
> Xen, like this patch.
> I want to add a new chardev socket in colo-compare that connects to Xen
> COLO, to notify
> checkpoint or failover events, because we have no other choice but to
> communicate with the Xen code this way.
> That means we will have two notify mechanisms.
> What do you think about this?
>
>
> Thanks
> Zhang Chen 

I was thinking about the possibility of using a similar way for colo compare.
E.g. can we use a socket? This can save duplicated code, more or less.

Thanks

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-14  5:57     ` Jason Wang
@ 2017-04-14  6:22       ` Hailiang Zhang
  2017-04-14  6:38         ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-14  6:22 UTC (permalink / raw)
  To: Jason Wang, Zhang Chen, qemu-devel
  Cc: xuquan8, dgilbert, lizhijian, xiecl.fnst

Hi Jason,

On 2017/4/14 13:57, Jason Wang wrote:
>
> On 2017年02月22日 17:31, Zhang Chen wrote:
>>
>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>> By using the filter notifier mechanism, we can easily notify
>>> every compare object to do this process, which runs inside
>>> the compare threads as a coroutine.
>> Hi~ Jason and Hailiang.
>>
>> I will send a patch set later about a colo-compare notify mechanism for
>> Xen, like this patch.
>> I want to add a new chardev socket in colo-compare that connects to Xen
>> COLO, to notify
>> checkpoint or failover events, because we have no other choice but to
>> communicate with the Xen code this way.
>> That means we will have two notify mechanisms.
>> What do you think about this?
>>
>>
>> Thanks
>> Zhang Chen
> I was thinking about the possibility of using a similar way for colo compare.
> E.g. can we use a socket? This can save duplicated code, more or less.

Since there are too many sockets used by the filters and COLO already (two unix
sockets and two TCP sockets for each vNIC), I don't want to introduce more ;)
But I'm not sure if it is possible to make it more flexible and optional:
abstract the duplicated code and pass an already-opened fd (whether an eventfd
or a socket fd) as a parameter, for example.
Is this way acceptable?

Thanks,
Hailiang

> Thanks
>
>
> .
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-14  6:22       ` Hailiang Zhang
@ 2017-04-14  6:38         ` Jason Wang
  2017-04-17 11:04           ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2017-04-14  6:38 UTC (permalink / raw)
  To: Hailiang Zhang, Zhang Chen, qemu-devel
  Cc: xuquan8, dgilbert, lizhijian, xiecl.fnst



On 2017年04月14日 14:22, Hailiang Zhang wrote:
> Hi Jason,
>
> On 2017/4/14 13:57, Jason Wang wrote:
>>
>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>
>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>> By using the filter notifier mechanism, we can easily notify
>>>> every compare object to do this process, which runs inside
>>>> the compare threads as a coroutine.
>>> Hi~ Jason and Hailiang.
>>>
>>> I will send a patch set later about a colo-compare notify mechanism for
>>> Xen, like this patch.
>>> I want to add a new chardev socket in colo-compare that connects to Xen
>>> COLO, to notify
>>> checkpoint or failover events, because we have no other choice but to
>>> communicate with the Xen code this way.
>>> That means we will have two notify mechanisms.
>>> What do you think about this?
>>>
>>>
>>> Thanks
>>> Zhang Chen
>> I was thinking about the possibility of using a similar way for colo compare.
>> E.g. can we use a socket? This can save duplicated code, more or less.
>
> Since there are too many sockets used by the filters and COLO already
> (two unix sockets and two TCP sockets for each vNIC), I don't want to
> introduce more ;) But I'm not sure if it is possible to make it more
> flexible and optional: abstract the duplicated code and pass an
> already-opened fd (whether an eventfd or a socket fd) as a parameter,
> for example.
> Is this way acceptable?
>
> Thanks,
> Hailiang

Yes, that's kind of what I want. We don't want to use two message
formats. Passing an opened fd needs management support, and we still
need a fallback if there's no management on top. For qemu/kvm, we can
do all of this transparently to the CLI by e.g. socketpair() or
similar, but the key is to have a unified message format.
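To illustrate the fallback idea (a sketch only; nothing like this exists in the series), a connected pair could be created internally when management does not hand an fd in:

/* Hypothetical in-process fallback channel: one end for the colo-compare
 * thread, the other for the COLO frame / migration thread. */
int fds[2];

if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
    error_setg_errno(errp, errno, "failed to create COLO notify channel");
    return;
}
qemu_set_nonblock(fds[0]);
qemu_set_nonblock(fds[1]);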

Thoughts?

Thanks

>
>> Thanks
>>
>>
>> .
>>
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-14  6:38         ` Jason Wang
@ 2017-04-17 11:04           ` Hailiang Zhang
  2017-04-18  1:32             ` Zhang Chen
  2017-04-18  3:55             ` Jason Wang
  0 siblings, 2 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-17 11:04 UTC (permalink / raw)
  To: Jason Wang, Zhang Chen, qemu-devel
  Cc: xuquan8, dgilbert, lizhijian, xiecl.fnst

Hi Jason,

On 2017/4/14 14:38, Jason Wang wrote:
>
> On 2017年04月14日 14:22, Hailiang Zhang wrote:
>> Hi Jason,
>>
>> On 2017/4/14 13:57, Jason Wang wrote:
>>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>>> By using the filter notifier mechanism, we can easily notify
>>>>> every compare object to do this process, which runs inside
>>>>> the compare threads as a coroutine.
>>>> Hi~ Jason and Hailiang.
>>>>
>>>> I will send a patch set later about a colo-compare notify mechanism for
>>>> Xen, like this patch.
>>>> I want to add a new chardev socket in colo-compare that connects to Xen
>>>> COLO, to notify
>>>> checkpoint or failover events, because we have no other choice but to
>>>> communicate with the Xen code this way.
>>>> That means we will have two notify mechanisms.
>>>> What do you think about this?
>>>>
>>>>
>>>> Thanks
>>>> Zhang Chen
>>> I was thinking about the possibility of using a similar way for colo compare.
>>> E.g. can we use a socket? This can save duplicated code, more or less.
>> Since there are too many sockets used by the filters and COLO already
>> (two unix sockets and two TCP sockets for each vNIC), I don't want to
>> introduce more ;) But I'm not sure if it is possible to make it more
>> flexible and optional: abstract the duplicated code and pass an
>> already-opened fd (whether an eventfd or a socket fd) as a parameter,
>> for example.
>> Is this way acceptable?
>>
>> Thanks,
>> Hailiang
> Yes, that's kind of what I want. We don't want to use two message
> formats. Passing an opened fd needs management support, and we still
> need a fallback if there's no management on top. For qemu/kvm, we can
> do all of this transparently to the CLI by e.g. socketpair() or
> similar, but the key is to have a unified message format.

After a deeper investigation, I think we can re-use most of the code. Since there is no
existing way to notify Xen (is there?), we still need the notify chardev socket (used to notify Xen; it is optional)
(http://patchwork.ozlabs.org/patch/733431/ "COLO-compare: Add Xen notify chardev socket handler frame").

Besides, there is an existing QMP command 'xen-colo-do-checkpoint'; we can re-use it to notify
colo-compare objects and other filter objects to do a checkpoint. For the opposite direction, we use
the notify chardev socket (only for Xen).

So the code will look like this:
diff --git a/migration/colo.c b/migration/colo.c
index 91da936..813c281 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -224,7 +224,19 @@ ReplicationStatus *qmp_query_xen_replication_status(Error **errp)

  void qmp_xen_colo_do_checkpoint(Error **errp)
  {
+    Error *local_err = NULL;
+
      replication_do_checkpoint_all(errp);
+    /* Notify colo-compare and other filters to do checkpoint */
+    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
  }

  static void colo_send_message(QEMUFile *f, COLOMessage msg,
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 24e13f0..de975c5 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -391,6 +391,9 @@ static void colo_compare_inconsistent_notify(void)
  {
      notifier_list_notify(&colo_compare_notifiers,
                  migrate_get_current());
+    if (s->notify_dev) {
+       /* Do something, notify remote side through notify dev */
+    }
  }

  void colo_compare_register_notifier(Notifier *notify)

How about this scenario?

> Thoughts?
>
> Thanks
>
>>> Thanks
>>>
>>>
>>> .
>>>
>>
>
> .
>

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-17 11:04           ` Hailiang Zhang
@ 2017-04-18  1:32             ` Zhang Chen
  2017-04-18  3:55             ` Jason Wang
  1 sibling, 0 replies; 42+ messages in thread
From: Zhang Chen @ 2017-04-18  1:32 UTC (permalink / raw)
  To: Hailiang Zhang, Jason Wang, qemu-devel
  Cc: zhangchen.fnst, xuquan8, dgilbert, lizhijian, xiecl.fnst



On 04/17/2017 07:04 PM, Hailiang Zhang wrote:
> Hi Jason,
>
> On 2017/4/14 14:38, Jason Wang wrote:
>>
>> On 2017年04月14日 14:22, Hailiang Zhang wrote:
>>> Hi Jason,
>>>
>>> On 2017/4/14 13:57, Jason Wang wrote:
>>>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>>>> By using the filter notifier mechanism, we can easily notify
>>>>>> every compare object to do this process, which runs inside
>>>>>> the compare threads as a coroutine.
>>>>> Hi~ Jason and Hailiang.
>>>>>
>>>>> I will send a patch set later about a colo-compare notify mechanism for
>>>>> Xen, like this patch.
>>>>> I want to add a new chardev socket in colo-compare that connects to Xen
>>>>> COLO, to notify
>>>>> checkpoint or failover events, because we have no other choice but to
>>>>> communicate with the Xen code this way.
>>>>> That means we will have two notify mechanisms.
>>>>> What do you think about this?
>>>>>
>>>>>
>>>>> Thanks
>>>>> Zhang Chen
>>>> I was thinking about the possibility of using a similar way for colo
>>>> compare.
>>>> E.g. can we use a socket? This can save duplicated code, more or less.
>>> Since there are too many sockets used by the filters and COLO already
>>> (two unix sockets and two TCP sockets for each vNIC), I don't want to
>>> introduce more ;) But I'm not sure if it is possible to make it more
>>> flexible and optional: abstract the duplicated code and pass an
>>> already-opened fd (whether an eventfd or a socket fd) as a parameter,
>>> for example.
>>> Is this way acceptable?
>>>
>>> Thanks,
>>> Hailiang
>> Yes, that's kind of what I want. We don't want to use two message
>> formats. Passing an opened fd needs management support, and we still
>> need a fallback if there's no management on top. For qemu/kvm, we can
>> do all of this transparently to the CLI by e.g. socketpair() or
>> similar, but the key is to have a unified message format.
>
> After a deeper investigation, I think we can re-use most of the code.
> Since there is no
> existing way to notify Xen (is there?), we still need the notify chardev
> socket (used to notify Xen; it is optional)
> (http://patchwork.ozlabs.org/patch/733431/ "COLO-compare: Add Xen
> notify chardev socket handler frame").
>
> Besides, there is an existing QMP command 'xen-colo-do-checkpoint'; we
> can re-use it to notify
> colo-compare objects and other filter objects to do a checkpoint; for
> the opposite direction, we use
> the notify chardev socket (only for Xen).
>
> So the code will look like this:
> diff --git a/migration/colo.c b/migration/colo.c
> index 91da936..813c281 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -224,7 +224,19 @@ ReplicationStatus 
> *qmp_query_xen_replication_status(Error **errp)
>
>  void qmp_xen_colo_do_checkpoint(Error **errp)
>  {
> +    Error *local_err = NULL;
> +
>      replication_do_checkpoint_all(errp);
> +    /* Notify colo-compare and other filters to do checkpoint */
> +    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
>  }
>
>  static void colo_send_message(QEMUFile *f, COLOMessage msg,
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 24e13f0..de975c5 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -391,6 +391,9 @@ static void colo_compare_inconsistent_notify(void)
>  {
>      notifier_list_notify(&colo_compare_notifiers,
>                  migrate_get_current());
> +    if (s->notify_dev) {
> +       /* Do something, notify remote side through notify dev */
> +    }
>  }
>
>  void colo_compare_register_notifier(Notifier *notify)
>
> How about this scenario?

I agree with this approach; maybe renaming qmp_xen_colo_do_checkpoint()
to qmp_remote_colo_do_checkpoint() would be more generic.

Thanks
Zhang Chen

>
>> Thoughts?
>>
>> Thanks
>>
>>>> Thanks
>>>>
>>>>
>>>> .
>>>>
>>>
>>
>> .
>>
>
>
>
>
> .
>

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-17 11:04           ` Hailiang Zhang
  2017-04-18  1:32             ` Zhang Chen
@ 2017-04-18  3:55             ` Jason Wang
  2017-04-18  6:58               ` Hailiang Zhang
  1 sibling, 1 reply; 42+ messages in thread
From: Jason Wang @ 2017-04-18  3:55 UTC (permalink / raw)
  To: Hailiang Zhang, Zhang Chen, qemu-devel
  Cc: xuquan8, xiecl.fnst, dgilbert, lizhijian



On 2017年04月17日 19:04, Hailiang Zhang wrote:
> Hi Jason,
>
> On 2017/4/14 14:38, Jason Wang wrote:
>>
>> On 2017年04月14日 14:22, Hailiang Zhang wrote:
>>> Hi Jason,
>>>
>>> On 2017/4/14 13:57, Jason Wang wrote:
>>>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>>>> By using the filter notifier mechanism, we can easily notify
>>>>>> every compare object to do this process, which runs inside
>>>>>> the compare threads as a coroutine.
>>>>> Hi~ Jason and Hailiang.
>>>>>
>>>>> I will send a patch set later about a colo-compare notify mechanism for
>>>>> Xen, like this patch.
>>>>> I want to add a new chardev socket in colo-compare that connects to Xen
>>>>> COLO, to notify
>>>>> checkpoint or failover events, because we have no other choice but to
>>>>> communicate with the Xen code this way.
>>>>> That means we will have two notify mechanisms.
>>>>> What do you think about this?
>>>>>
>>>>>
>>>>> Thanks
>>>>> Zhang Chen
>>>> I was thinking about the possibility of using a similar way for colo
>>>> compare.
>>>> E.g. can we use a socket? This can save duplicated code, more or less.
>>> Since there are too many sockets used by the filters and COLO already
>>> (two unix sockets and two TCP sockets for each vNIC), I don't want to
>>> introduce more ;) But I'm not sure if it is possible to make it more
>>> flexible and optional: abstract the duplicated code and pass an
>>> already-opened fd (whether an eventfd or a socket fd) as a parameter,
>>> for example.
>>> Is this way acceptable?
>>>
>>> Thanks,
>>> Hailiang
>> Yes, that's kind of what I want. We don't want to use two message
>> formats. Passing an opened fd needs management support, and we still
>> need a fallback if there's no management on top. For qemu/kvm, we can
>> do all of this transparently to the CLI by e.g. socketpair() or
>> similar, but the key is to have a unified message format.
>
> After a deeper investigation, I think we can re-use most of the code.
> Since there is no
> existing way to notify Xen (is there?), we still need the notify chardev
> socket (used to notify Xen; it is optional)
> (http://patchwork.ozlabs.org/patch/733431/ "COLO-compare: Add Xen
> notify chardev socket handler frame").

Yes, and actually you can use this for bi-directional communication. The
only difference is the implementation of the comparing.

>
> Besides, there is an existing QMP command 'xen-colo-do-checkpoint',

I don't see this in master?

> we can re-use it to notify
> colo-compare objects and other filter objects to do a checkpoint; for
> the opposite direction, we use
> the notify chardev socket (only for Xen).

Just want to make sure I understand the design: who will trigger this
command? Management?

Can we just use the socket?

>
> So the codes will be like:
> diff --git a/migration/colo.c b/migration/colo.c
> index 91da936..813c281 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -224,7 +224,19 @@ ReplicationStatus 
> *qmp_query_xen_replication_status(Error **errp)
>
>  void qmp_xen_colo_do_checkpoint(Error **errp)
>  {
> +    Error *local_err = NULL;
> +
>      replication_do_checkpoint_all(errp);
> +    /* Notify colo-compare and other filters to do checkpoint */
> +    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +    }
>  }
>
>  static void colo_send_message(QEMUFile *f, COLOMessage msg,
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 24e13f0..de975c5 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -391,6 +391,9 @@ static void colo_compare_inconsistent_notify(void)
>  {
>      notifier_list_notify(&colo_compare_notifiers,
>                  migrate_get_current());
> +    if (s->notify_dev) {
> +       /* Do something, notify remote side through notify dev */
> +    }
>  }
>
>  void colo_compare_register_notifier(Notifier *notify)
>
> How about this scenario?

See my reply above; we need to unify the message format too. A raw string
is OK, but we'd better have something like TLV or similar.

Thanks

>
>> Thoughts?
>>
>> Thanks
>>
>>>> Thanks
>>>>
>>>>
>>>> .
>>>>
>>>
>>
>> .
>>
>
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-18  3:55             ` Jason Wang
@ 2017-04-18  6:58               ` Hailiang Zhang
  2017-04-20  5:15                 ` Jason Wang
  0 siblings, 1 reply; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-18  6:58 UTC (permalink / raw)
  To: Jason Wang, Zhang Chen, qemu-devel
  Cc: xuquan8, xiecl.fnst, dgilbert, lizhijian

On 2017/4/18 11:55, Jason Wang wrote:
>
> On 2017年04月17日 19:04, Hailiang Zhang wrote:
>> Hi Jason,
>>
>> On 2017/4/14 14:38, Jason Wang wrote:
>>> On 2017年04月14日 14:22, Hailiang Zhang wrote:
>>>> Hi Jason,
>>>>
>>>> On 2017/4/14 13:57, Jason Wang wrote:
>>>>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>>>>> By using the filter notifier mechanism, we can easily notify
>>>>>>> every compare object to do this process, which runs inside
>>>>>>> the compare threads as a coroutine.
>>>>>> Hi~ Jason and Hailiang.
>>>>>>
>>>>>> I will send a patch set later about a colo-compare notify mechanism for
>>>>>> Xen, like this patch.
>>>>>> I want to add a new chardev socket in colo-compare that connects to Xen
>>>>>> COLO, to notify
>>>>>> checkpoint or failover events, because we have no other choice but to
>>>>>> communicate with the Xen code this way.
>>>>>> That means we will have two notify mechanisms.
>>>>>> What do you think about this?
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Zhang Chen
>>>>> I was thinking about the possibility of using a similar way for colo
>>>>> compare.
>>>>> E.g. can we use a socket? This can save duplicated code, more or less.
>>>> Since there are too many sockets used by the filters and COLO already
>>>> (two unix sockets and two TCP sockets for each vNIC), I don't want to
>>>> introduce more ;) But I'm not sure if it is possible to make it more
>>>> flexible and optional: abstract the duplicated code and pass an
>>>> already-opened fd (whether an eventfd or a socket fd) as a parameter,
>>>> for example.
>>>> Is this way acceptable?
>>>>
>>>> Thanks,
>>>> Hailiang
>>> Yes, that's kind of what I want. We don't want to use two message
>>> formats. Passing an opened fd needs management support, and we still
>>> need a fallback if there's no management on top. For qemu/kvm, we can
>>> do all of this transparently to the CLI by e.g. socketpair() or
>>> similar, but the key is to have a unified message format.
>> After a deeper investigation, I think we can re-use most of the code.
>> Since there is no
>> existing way to notify Xen (is there?), we still need the notify chardev
>> socket (used to notify Xen; it is optional)
>> (http://patchwork.ozlabs.org/patch/733431/ "COLO-compare: Add Xen
>> notify chardev socket handler frame").
> Yes, and actually you can use this for bi-directional communication. The
> only difference is the implementation of the comparing.
>
>> Besides, there is an existing QMP command 'xen-colo-do-checkpoint',
> I don't see this in master?

Er, it has been merged already; please see migration/colo.c: void qmp_xen_colo_do_checkpoint(Error **errp);

>> we can re-use it to notify
>> colo-compare objects and other filter objects to do a checkpoint; for
>> the opposite direction, we use
>> the notify chardev socket (only for Xen).
> Just want to make sure I understand the design: who will trigger this
> command? Management?

The command will be issued by Xen (xc_save?). The existing xen-colo-do-checkpoint
command is currently only used to notify block replication to do a checkpoint; we can re-use it for the filters too.

> Can we just use the socket?

I don't quite understand...
As the code below shows, in this scenario
Xen notifies colo-compare and the filters to do a checkpoint by using the QMP command,
and colo-compare notifies Xen about a network inconsistency event by using the new socket.

>> So the codes will be like:
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 91da936..813c281 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -224,7 +224,19 @@ ReplicationStatus
>> *qmp_query_xen_replication_status(Error **errp)
>>
>>   void qmp_xen_colo_do_checkpoint(Error **errp)
>>   {
>> +    Error *local_err = NULL;
>> +
>>       replication_do_checkpoint_all(errp);
>> +    /* Notify colo-compare and other filters to do checkpoint */
>> +    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +        return;
>> +    }
>> +    colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
>> +    if (local_err) {
>> +        error_propagate(errp, local_err);
>> +    }
>>   }
>>
>>   static void colo_send_message(QEMUFile *f, COLOMessage msg,
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 24e13f0..de975c5 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -391,6 +391,9 @@ static void colo_compare_inconsistent_notify(void)
>>   {
>>       notifier_list_notify(&colo_compare_notifiers,
>>                   migrate_get_current());

KVM will use this notifier/callback approach, and in this way we can avoid a redundant socket.

>> +    if (s->notify_dev) {
>> +       /* Do something, notify remote side through notify dev */
>> +    }
>>   }

If we have a notify socket configured, we will send the message about the network inconsistency event.

>>
>>   void colo_compare_register_notifier(Notifier *notify)
>>
>> How about this scenario?
> See my reply above; we need to unify the message format too. A raw string
> is OK, but we'd better have something like TLV or similar.

Agreed, we need it to be more standard.

Thanks,
Hailiang
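For the record, the TLV framing could be as small as the struct below; this is a hypothetical sketch to make the point, not a format anyone has agreed on:

/* Hypothetical TLV message for compare <-> frame notifications.
 * type:   the event, e.g. COLO_CHECKPOINT or COLO_FAILOVER
 * length: byte length of value[], in network byte order
 * value:  optional payload, empty for simple events */
typedef struct COLONotifyMsg {
    uint8_t  type;
    uint32_t length;
    uint8_t  value[];
} QEMU_PACKED COLONotifyMsg;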

> Thanks
>
>>> Thoughts?
>>>
>>> Thanks
>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> .
>>>>>
>>> .
>>>
>>
>>
>
> .
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-18  6:58               ` Hailiang Zhang
@ 2017-04-20  5:15                 ` Jason Wang
  2017-04-21  8:10                   ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Wang @ 2017-04-20  5:15 UTC (permalink / raw)
  To: Hailiang Zhang, Zhang Chen, qemu-devel
  Cc: xuquan8, xiecl.fnst, dgilbert, lizhijian



On 2017年04月18日 14:58, Hailiang Zhang wrote:
> On 2017/4/18 11:55, Jason Wang wrote:
>>
>> On 2017年04月17日 19:04, Hailiang Zhang wrote:
>>> Hi Jason,
>>>
>>> On 2017/4/14 14:38, Jason Wang wrote:
>>>> On 2017年04月14日 14:22, Hailiang Zhang wrote:
>>>>> Hi Jason,
>>>>>
>>>>> On 2017/4/14 13:57, Jason Wang wrote:
>>>>>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>>>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>>>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>>>>>> By using the filter notifier mechanism, we can easily notify
>>>>>>>> every compare object to do this process, which runs inside
>>>>>>>> the compare threads as a coroutine.
>>>>>>> Hi~ Jason and Hailiang.
>>>>>>>
>>>>>>> I will send a patch set later about a colo-compare notify
>>>>>>> mechanism for
>>>>>>> Xen, like this patch.
>>>>>>> I want to add a new chardev socket in colo-compare that connects
>>>>>>> to Xen
>>>>>>> COLO, to notify
>>>>>>> checkpoint or failover events, because we have no other choice but
>>>>>>> to communicate with the Xen code this way.
>>>>>>> That means we will have two notify mechanisms.
>>>>>>> What do you think about this?
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> Zhang Chen
>>>>>> I was thinking about the possibility of using a similar way for colo
>>>>>> compare.
>>>>>> E.g. can we use a socket? This can save duplicated code, more or less.
>>>>> Since there are too many sockets used by the filters and COLO already
>>>>> (two unix sockets and two TCP sockets for each vNIC), I don't want to
>>>>> introduce more ;) But I'm not sure if it is possible to make it more
>>>>> flexible and optional: abstract the duplicated code and pass an
>>>>> already-opened fd (whether an eventfd or a socket fd) as a parameter,
>>>>> for example.
>>>>> Is this way acceptable?
>>>>>
>>>>> Thanks,
>>>>> Hailiang
>>>> Yes, that's kind of what I want. We don't want to use two message
>>>> formats. Passing an opened fd needs management support, and we still
>>>> need a fallback if there's no management on top. For qemu/kvm, we can
>>>> do all of this transparently to the CLI by e.g. socketpair() or
>>>> similar, but the key is to have a unified message format.
>>> After a deeper investigation, I think we can re-use most of the code.
>>> Since there is no
>>> existing way to notify Xen (is there?), we still need the notify chardev
>>> socket (used to notify Xen; it is optional)
>>> (http://patchwork.ozlabs.org/patch/733431/ "COLO-compare: Add Xen
>>> notify chardev socket handler frame").
>> Yes, and actually you can use this for bi-directional communication. The
>> only difference is the implementation of the comparing.
>>
>>> Besides, there is an existing QMP command 'xen-colo-do-checkpoint',
>> I don't see this in master?
>
> Er, it has been merged already; please see migration/colo.c: void
> qmp_xen_colo_do_checkpoint(Error **errp);

Aha, I see. Thanks.

>
>>> we can re-use it to notify
>>> colo-compare objects and other filter objects to do a checkpoint; for
>>> the opposite direction, we use
>>> the notify chardev socket (only for Xen).
>> Just want to make sure I understand the design: who will trigger this
>> command? Management?
>
> The command will be issued by Xen (xc_save?). The existing
> xen-colo-do-checkpoint
> command is currently only used to notify block replication to do a
> checkpoint; we can re-use it for the filters too.

So it is called by management. For the KVM case, we probably don't need
this, since the comparing threads are under the control of qemu.

>
>> Can we just use the socket?
>
> I don't quite understand...
> As the code below shows, in this scenario
> Xen notifies colo-compare and the filters to do a checkpoint by using the QMP command,

Yes, that's just what I mean. Technically, Xen could use a socket to do this too.

Thanks

> and colo-compare notifies Xen about a network inconsistency event by using
> the new socket.
>
>>> So the codes will be like:
>>> diff --git a/migration/colo.c b/migration/colo.c
>>> index 91da936..813c281 100644
>>> --- a/migration/colo.c
>>> +++ b/migration/colo.c
>>> @@ -224,7 +224,19 @@ ReplicationStatus
>>> *qmp_query_xen_replication_status(Error **errp)
>>>
>>>   void qmp_xen_colo_do_checkpoint(Error **errp)
>>>   {
>>> +    Error *local_err = NULL;
>>> +
>>>       replication_do_checkpoint_all(errp);
>>> +    /* Notify colo-compare and other filters to do checkpoint */
>>> +    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
>>> +    if (local_err) {
>>> +        error_propagate(errp, local_err);
>>> +        return;
>>> +    }
>>> +    colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
>>> +    if (local_err) {
>>> +        error_propagate(errp, local_err);
>>> +    }
>>>   }
>>>
>>>   static void colo_send_message(QEMUFile *f, COLOMessage msg,
>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>> index 24e13f0..de975c5 100644
>>> --- a/net/colo-compare.c
>>> +++ b/net/colo-compare.c
>>> @@ -391,6 +391,9 @@ static void colo_compare_inconsistent_notify(void)
>>>   {
>>>       notifier_list_notify(&colo_compare_notifiers,
>>>                   migrate_get_current());
>
> KVM will use this notifier/callback way, and in this way, we can avoid 
> the redundant socket.
>
>>> +    if (s->notify_dev) {
>>> +       /* Do something, notify remote side through notify dev */
>>> +    }
>>>   }
>
> If we have a notify socket configured, we will send the message about 
> net inconsistent event.
>
>>>
>>>   void colo_compare_register_notifier(Notifier *notify)
>>>
>>> How about this scenario?
>> See my reply above, and we need unify the message format too. Raw string
>> is ok but we'd better have something like TLV or others.
>
> Agreed, we need it to be more standard.
>
> Thanks,
> Hailiang
>
>> Thanks
>>
>>>> Thoughts?
>>>>
>>>> Thanks
>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>> .
>>>>
>>>
>>>
>>
>> .
>>
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm
  2017-04-10  8:26     ` Hailiang Zhang
@ 2017-04-20  9:09       ` Dr. David Alan Gilbert
  2017-04-21  6:50         ` Hailiang Zhang
  0 siblings, 1 reply; 42+ messages in thread
From: Dr. David Alan Gilbert @ 2017-04-20  9:09 UTC (permalink / raw)
  To: Hailiang Zhang
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> On 2017/4/8 1:18, Dr. David Alan Gilbert wrote:
> > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> > > There are several stages in the loadvm/savevm process, and in different
> > > stages the incoming migration processes different types of sections.
> > > We want to control these stages more accurately; it will benefit COLO
> > > performance, since we don't have to save QEMU_VM_SECTION_START sections
> > > every time we take a checkpoint. Besides, we want to separate the
> > > process of saving/loading memory and device state.
> > > 
> > > So we add three new helper functions: qemu_loadvm_state_begin(),
> > > qemu_load_device_state() and qemu_savevm_live_state() to handle the
> > > different stages during migration.
> > > 
> > > Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
> > > public.
> > > 
> > > Cc: Juan Quintela <quintela@redhat.com>
> > > Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> > > Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > ---
> > >   include/sysemu/sysemu.h |  6 ++++++
> > >   migration/savevm.c      | 55 ++++++++++++++++++++++++++++++++++++++++++-------
> > >   2 files changed, 54 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > > index 7ed665a..95cae41 100644
> > > --- a/include/sysemu/sysemu.h
> > > +++ b/include/sysemu/sysemu.h
> > > @@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
> > >                                              uint64_t *start_list,
> > >                                              uint64_t *length_list);
> > > +void qemu_savevm_live_state(QEMUFile *f);
> > > +int qemu_save_device_state(QEMUFile *f);
> > > +
> > >   int qemu_loadvm_state(QEMUFile *f);
> > > +int qemu_loadvm_state_begin(QEMUFile *f);
> > > +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> > > +int qemu_load_device_state(QEMUFile *f);
> > >   extern int autostart;
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index 9c2d239..dac478b 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -54,6 +54,7 @@
> > >   #include "qemu/cutils.h"
> > >   #include "io/channel-buffer.h"
> > >   #include "io/channel-file.h"
> > > +#include "migration/colo.h"
> > >   #ifndef ETH_P_RARP
> > >   #define ETH_P_RARP 0x8035
> > > @@ -1279,13 +1280,21 @@ done:
> > >       return ret;
> > >   }
> > > -static int qemu_save_device_state(QEMUFile *f)
> > > +void qemu_savevm_live_state(QEMUFile *f)
> > >   {
> > > -    SaveStateEntry *se;
> > > +    /* save QEMU_VM_SECTION_END section */
> > > +    qemu_savevm_state_complete_precopy(f, true);
> > > +    qemu_put_byte(f, QEMU_VM_EOF);
> > > +}
> > > -    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
> > > -    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
> > > +int qemu_save_device_state(QEMUFile *f)
> > > +{
> > > +    SaveStateEntry *se;
> > > +    if (!migration_in_colo_state()) {
> > > +        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
> > > +        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
> > > +    }
> > Note that this got split out into qemu_savevm_state_header() at some point.
> 
> Do you mean I should use the wrapper qemu_savevm_state_header() here?

Yes, I think so; best to keep the code that writes the file headers in one place.
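For reference, the helper could then shrink to something like this sketch
(assuming qemu_savevm_state_header() keeps its current void (QEMUFile *)
signature; it may also emit the configuration section, which is exactly the
kind of detail that is better kept in one place):

    int qemu_save_device_state(QEMUFile *f)
    {
        SaveStateEntry *se;

        if (!migration_in_colo_state()) {
            /* Writes QEMU_VM_FILE_MAGIC and QEMU_VM_FILE_VERSION for us. */
            qemu_savevm_state_header(f);
        }
        cpu_synchronize_all_states();
        QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
            /* ... per-device vmstate saving stays unchanged ... */
        }
        qemu_put_byte(f, QEMU_VM_EOF);
        return 0;
    }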

Dave

> > Dave
> > 
> > >       cpu_synchronize_all_states();
> > >       QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> > > @@ -1336,8 +1345,6 @@ enum LoadVMExitCodes {
> > >       LOADVM_QUIT     =  1,
> > >   };
> > > -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> > > -
> > >   /* ------ incoming postcopy messages ------ */
> > >   /* 'advise' arrives before any transfers just to tell us that a postcopy
> > >    * *might* happen - it might be skipped if precopy transferred everything
> > > @@ -1942,7 +1949,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
> > >       return 0;
> > >   }
> > > -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> > > +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> > >   {
> > >       uint8_t section_type;
> > >       int ret = 0;
> > > @@ -2080,6 +2087,40 @@ int qemu_loadvm_state(QEMUFile *f)
> > >       return ret;
> > >   }
> > > +int qemu_loadvm_state_begin(QEMUFile *f)
> > > +{
> > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > +    Error *local_err = NULL;
> > > +    int ret;
> > > +
> > > +    if (qemu_savevm_state_blocked(&local_err)) {
> > > +        error_report_err(local_err);
> > > +        return -EINVAL;
> > > +    }
> > > +    /* Load QEMU_VM_SECTION_START section */
> > > +    ret = qemu_loadvm_state_main(f, mis);
> > > +    if (ret < 0) {
> > > +        error_report("Failed to loadvm begin work: %d", ret);
> > > +    }
> > > +    return ret;
> > > +}
> > > +
> > > +int qemu_load_device_state(QEMUFile *f)
> > > +{
> > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > +    int ret;
> > > +
> > > +    /* Load QEMU_VM_SECTION_FULL section */
> > > +    ret = qemu_loadvm_state_main(f, mis);
> > > +    if (ret < 0) {
> > > +        error_report("Failed to load device state: %d", ret);
> > > +        return ret;
> > > +    }
> > > +
> > > +    cpu_synchronize_all_post_init();
> > > +    return 0;
> > > +}
> > > +
> > >   int save_vmstate(Monitor *mon, const char *name)
> > >   {
> > >       BlockDriverState *bs, *bs1;
> > > -- 
> > > 1.8.3.1
> > > 
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm
  2017-04-20  9:09       ` Dr. David Alan Gilbert
@ 2017-04-21  6:50         ` Hailiang Zhang
  0 siblings, 0 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-21  6:50 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, zhangchen.fnst, lizhijian, xiecl.fnst, Juan Quintela

On 2017/4/20 17:09, Dr. David Alan Gilbert wrote:
> * Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2017/4/8 1:18, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> There are several stages during the loadvm/savevm process. In different
>>>> stages, the migration incoming side processes different types of sections.
>>>> We want to control these stages more accurately; this will benefit COLO
>>>> performance, since we no longer have to save QEMU_VM_SECTION_START
>>>> sections every time we do a checkpoint. Besides, we want to separate
>>>> the process of saving/loading memory and device state.
>>>>
>>>> So we add three new helper functions: qemu_loadvm_state_begin(),
>>>> qemu_load_device_state() and qemu_savevm_live_state() to handle the
>>>> different stages during migration.
>>>>
>>>> Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
>>>> public.
>>>>
>>>> Cc: Juan Quintela <quintela@redhat.com>
>>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> ---
>>>>    include/sysemu/sysemu.h |  6 ++++++
>>>>    migration/savevm.c      | 55 ++++++++++++++++++++++++++++++++++++++++++-------
>>>>    2 files changed, 54 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
>>>> index 7ed665a..95cae41 100644
>>>> --- a/include/sysemu/sysemu.h
>>>> +++ b/include/sysemu/sysemu.h
>>>> @@ -132,7 +132,13 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
>>>>                                               uint64_t *start_list,
>>>>                                               uint64_t *length_list);
>>>> +void qemu_savevm_live_state(QEMUFile *f);
>>>> +int qemu_save_device_state(QEMUFile *f);
>>>> +
>>>>    int qemu_loadvm_state(QEMUFile *f);
>>>> +int qemu_loadvm_state_begin(QEMUFile *f);
>>>> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>>>> +int qemu_load_device_state(QEMUFile *f);
>>>>    extern int autostart;
>>>> diff --git a/migration/savevm.c b/migration/savevm.c
>>>> index 9c2d239..dac478b 100644
>>>> --- a/migration/savevm.c
>>>> +++ b/migration/savevm.c
>>>> @@ -54,6 +54,7 @@
>>>>    #include "qemu/cutils.h"
>>>>    #include "io/channel-buffer.h"
>>>>    #include "io/channel-file.h"
>>>> +#include "migration/colo.h"
>>>>    #ifndef ETH_P_RARP
>>>>    #define ETH_P_RARP 0x8035
>>>> @@ -1279,13 +1280,21 @@ done:
>>>>        return ret;
>>>>    }
>>>> -static int qemu_save_device_state(QEMUFile *f)
>>>> +void qemu_savevm_live_state(QEMUFile *f)
>>>>    {
>>>> -    SaveStateEntry *se;
>>>> +    /* save QEMU_VM_SECTION_END section */
>>>> +    qemu_savevm_state_complete_precopy(f, true);
>>>> +    qemu_put_byte(f, QEMU_VM_EOF);
>>>> +}
>>>> -    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
>>>> -    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
>>>> +int qemu_save_device_state(QEMUFile *f)
>>>> +{
>>>> +    SaveStateEntry *se;
>>>> +    if (!migration_in_colo_state()) {
>>>> +        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
>>>> +        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
>>>> +    }
>>> Note that this got split out into qemu_savevm_state_header() at some point.
>> Do you mean I should use the wrapper qemu_savevm_state_header() here?
> Yes, I think so; best to keep the code that writes the file headers in one place.

OK, will fix in the next version, thanks.

> Dave
>
>>> Dave
>>>
>>>>        cpu_synchronize_all_states();
>>>>        QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>>>> @@ -1336,8 +1345,6 @@ enum LoadVMExitCodes {
>>>>        LOADVM_QUIT     =  1,
>>>>    };
>>>> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>>>> -
>>>>    /* ------ incoming postcopy messages ------ */
>>>>    /* 'advise' arrives before any transfers just to tell us that a postcopy
>>>>     * *might* happen - it might be skipped if precopy transferred everything
>>>> @@ -1942,7 +1949,7 @@ qemu_loadvm_section_part_end(QEMUFile *f, MigrationIncomingState *mis)
>>>>        return 0;
>>>>    }
>>>> -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
>>>> +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
>>>>    {
>>>>        uint8_t section_type;
>>>>        int ret = 0;
>>>> @@ -2080,6 +2087,40 @@ int qemu_loadvm_state(QEMUFile *f)
>>>>        return ret;
>>>>    }
>>>> +int qemu_loadvm_state_begin(QEMUFile *f)
>>>> +{
>>>> +    MigrationIncomingState *mis = migration_incoming_get_current();
>>>> +    Error *local_err = NULL;
>>>> +    int ret;
>>>> +
>>>> +    if (qemu_savevm_state_blocked(&local_err)) {
>>>> +        error_report_err(local_err);
>>>> +        return -EINVAL;
>>>> +    }
>>>> +    /* Load QEMU_VM_SECTION_START section */
>>>> +    ret = qemu_loadvm_state_main(f, mis);
>>>> +    if (ret < 0) {
>>>> +        error_report("Failed to loadvm begin work: %d", ret);
>>>> +    }
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +int qemu_load_device_state(QEMUFile *f)
>>>> +{
>>>> +    MigrationIncomingState *mis = migration_incoming_get_current();
>>>> +    int ret;
>>>> +
>>>> +    /* Load QEMU_VM_SECTION_FULL section */
>>>> +    ret = qemu_loadvm_state_main(f, mis);
>>>> +    if (ret < 0) {
>>>> +        error_report("Failed to load device state: %d", ret);
>>>> +        return ret;
>>>> +    }
>>>> +
>>>> +    cpu_synchronize_all_post_init();
>>>> +    return 0;
>>>> +}
>>>> +
>>>>    int save_vmstate(Monitor *mon, const char *name)
>>>>    {
>>>>        BlockDriverState *bs, *bs1;
>>>> -- 
>>>> 1.8.3.1
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint
  2017-04-20  5:15                 ` Jason Wang
@ 2017-04-21  8:10                   ` Hailiang Zhang
  0 siblings, 0 replies; 42+ messages in thread
From: Hailiang Zhang @ 2017-04-21  8:10 UTC (permalink / raw)
  To: Jason Wang, Zhang Chen, qemu-devel
  Cc: xuquan8, xiecl.fnst, dgilbert, lizhijian

On 2017/4/20 13:15, Jason Wang wrote:
>
> On 2017-04-18 14:58, Hailiang Zhang wrote:
>> On 2017/4/18 11:55, Jason Wang wrote:
>>> On 2017-04-17 19:04, Hailiang Zhang wrote:
>>>> Hi Jason,
>>>>
>>>>> On 2017-04-14 14:38, Jason Wang wrote:
>>>>> On 2017年04月14日 14:22, Hailiang Zhang wrote:
>>>>>> Hi Jason,
>>>>>>
>>>>>>> On 2017-02-22 17:31, Zhang Chen wrote:
>>>>>>> On 2017年02月22日 17:31, Zhang Chen wrote:
>>>>>>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>>>>>>> While doing a checkpoint, we need to flush all the unhandled packets.
>>>>>>>>> By using the filter notifier mechanism, we can easily notify
>>>>>>>>> every compare object to do this process, which runs inside
>>>>>>>>> the compare threads as a coroutine.
>>>>>>>> Hi~ Jason and Hailiang.
>>>>>>>>
>>>>>>>> I will send a patch set later that adds a colo-compare notify
>>>>>>>> mechanism for Xen, like this patch.
>>>>>>>> I want to add a new chardev-socket path in colo-compare that
>>>>>>>> connects to Xen COLO, for notifying checkpoint or failover,
>>>>>>>> because we have no other way to communicate with the Xen code.
>>>>>>>> That means we will have two notify mechanisms.
>>>>>>>> What do you think about this?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhang Chen
>>>>>>> I was thinking about the possibility of using a similar way for
>>>>>>> colo-compare.
>>>>>>> E.g. can we use a socket? That would save some duplicated code.
>>>>>> Since there are already many sockets used by the filters and COLO
>>>>>> (two unix sockets and two tcp sockets for each vNIC), I don't want
>>>>>> to introduce more ;) , but I'm not sure whether it is possible to
>>>>>> make this more flexible and optional: abstract the duplicated code
>>>>>> and pass the opened fd (whether an eventfd or a socket fd) as a
>>>>>> parameter, for example.
>>>>>> Is this way acceptable?
>>>>>>
>>>>>> Thanks,
>>>>>> Hailiang
>>>>> Yes, that's kind of what I want. We don't want to use two message
>>>>> formats. Passing an opened fd needs management support, and we still
>>>>> need a fallback if there's no management layer on top. For qemu/kvm,
>>>>> we can do all of this transparently to the CLI by e.g. socketpair()
>>>>> or similar, but the key is to have a unified message format.
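A minimal sketch of the socketpair() idea (plain POSIX; how the second fd
actually reaches the compare thread is left open here):

    #include <errno.h>
    #include <sys/socket.h>

    /* Create an in-process channel without adding anything to the CLI:
     * fds[0] stays with the COLO frame, fds[1] is handed to colo-compare. */
    static int colo_notify_channel_new(int fds[2])
    {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
            return -errno;
        }
        return 0;
    }
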
>>>> After a deeper investigation, I think we can re-use most of the code.
>>>> Since there is no existing way to notify Xen (is there?), we still
>>>> need the notify chardev socket (used to notify Xen; it is optional).
>>>> (http://patchwork.ozlabs.org/patch/733431/ "COLO-compare: Add Xen
>>>> notify chardev socket handler frame")
>>> Yes, and actually you can use this for bi-directional communication.
>>> The only difference is the implementation of the comparison.
>>>
>>>> Besides, there is an existing qmp command 'xen-colo-do-checkpoint',
>>> I don't see this in master?
>> Er, it has been merged already; please see migration/colo.c: void
>> qmp_xen_colo_do_checkpoint(Error **errp);
> Aha, I see. Thanks.

;)

>>>> we can re-use it to notify
>>>> colo-compare objects and other filter objects to do a checkpoint; for
>>>> the opposite direction, we use
>>>> the notify chardev socket (only for Xen).
>>> Just want to make sure I understand the design, who will trigger this
>>> command? Management?
>> The command will be issued by Xen (xc_save?); the existing
>> xen-colo-do-checkpoint
>> command is currently only used to notify block replication to do a
>> checkpoint, and we can re-use it for the filters too.
> So it is called by management. For the KVM case, we probably don't need
> this, since the comparing threads are under the control of qemu.

Yes, you are right.

>>> Can we just use the socket?
>> I don't quite understand ...
>> Just as the code below shows, in this scenario,
>> Xen notifies colo-compare and the filters to do a checkpoint by using
>> the qmp command,
> Yes, that's just what I mean. Technically, Xen can use a socket to do this too.

Yes, great. Since we have come to an agreement on the scenario, I'll update this series.

Thanks,
Hailiang.

> Thanks
>
>> and colo-compare notifies Xen about the net-inconsistency event by
>> using the new socket.
>>
>>>> So the codes will be like:
>>>> diff --git a/migration/colo.c b/migration/colo.c
>>>> index 91da936..813c281 100644
>>>> --- a/migration/colo.c
>>>> +++ b/migration/colo.c
>>>> @@ -224,7 +224,19 @@ ReplicationStatus
>>>> *qmp_query_xen_replication_status(Error **errp)
>>>>
>>>>    void qmp_xen_colo_do_checkpoint(Error **errp)
>>>>    {
>>>> +    Error *local_err = NULL;
>>>> +
>>>>        replication_do_checkpoint_all(errp);
>>>> +    /* Notify colo-compare and other filters to do checkpoint */
>>>> +    colo_notify_compares_event(NULL, COLO_CHECKPOINT, &local_err);
>>>> +    if (local_err) {
>>>> +        error_propagate(errp, local_err);
>>>> +        return;
>>>> +    }
>>>> +    colo_notify_filters_event(COLO_CHECKPOINT, &local_err);
>>>> +    if (local_err) {
>>>> +        error_propagate(errp, local_err);
>>>> +    }
>>>>    }
>>>>
>>>>    static void colo_send_message(QEMUFile *f, COLOMessage msg,
>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>>> index 24e13f0..de975c5 100644
>>>> --- a/net/colo-compare.c
>>>> +++ b/net/colo-compare.c
>>>> @@ -391,6 +391,9 @@ static void colo_compare_inconsistent_notify(void)
>>>>    {
>>>>        notifier_list_notify(&colo_compare_notifiers,
>>>>                    migrate_get_current());
>> KVM will use this notifier/callback mechanism, and in this way we can
>> avoid a redundant socket.
>>
>>>> +    if (s->notify_dev) {
>>>> +       /* Do something, notify remote side through notify dev */
>>>> +    }
>>>>    }
>> If we have a notify socket configured, we will send a message about
>> the net-inconsistency event.
>>
>>>>    void colo_compare_register_notifier(Notifier *notify)
>>>>
>>>> How about this scenario?
>>> See my reply above; and we need to unify the message format too. A raw
>>> string is OK, but we'd better have something like TLV or similar.
>> Agreed, we need it to be more standard.
>>
>> Thanks,
>> Hailiang
>>
>>> Thanks
>>>
>>>>> Thoughts?
>>>>>
>>>>> Thanks
>>>>>
>>>>>>> Thanks


end of thread

Thread overview: 42+ messages
2017-02-22  3:42 [Qemu-devel] [PATCH 00/15] COLO: integrate colo frame with block replication and net compare zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 01/15] net/colo: Add notifier/callback related helpers for filter zhanghailiang
2017-04-07 15:46   ` Dr. David Alan Gilbert
2017-04-10  7:26     ` Hailiang Zhang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 02/15] colo-compare: implement the process of checkpoint zhanghailiang
2017-02-22  9:31   ` Zhang Chen
2017-02-23  1:02     ` Hailiang Zhang
2017-02-23  5:49       ` Zhang Chen
2017-04-14  5:57     ` Jason Wang
2017-04-14  6:22       ` Hailiang Zhang
2017-04-14  6:38         ` Jason Wang
2017-04-17 11:04           ` Hailiang Zhang
2017-04-18  1:32             ` Zhang Chen
2017-04-18  3:55             ` Jason Wang
2017-04-18  6:58               ` Hailiang Zhang
2017-04-20  5:15                 ` Jason Wang
2017-04-21  8:10                   ` Hailiang Zhang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 03/15] colo-compare: use notifier to notify packets comparing result zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 04/15] COLO: integrate colo compare with colo frame zhanghailiang
2017-04-07 15:59   ` Dr. David Alan Gilbert
2017-02-22  3:42 ` [Qemu-devel] [PATCH 05/15] COLO: Handle shutdown command for VM in COLO state zhanghailiang
2017-02-22 15:35   ` Eric Blake
2017-02-23  1:15     ` Hailiang Zhang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 06/15] COLO: Add block replication into colo process zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 07/15] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily zhanghailiang
2017-04-07 17:06   ` Dr. David Alan Gilbert
2017-04-10  7:31     ` Hailiang Zhang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 08/15] ram/COLO: Record the dirty pages that SVM received zhanghailiang
2017-02-23 18:44   ` Dr. David Alan Gilbert
2017-02-22  3:42 ` [Qemu-devel] [PATCH 09/15] COLO: Flush PVM's cached RAM into SVM's memory zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 10/15] qmp event: Add COLO_EXIT event to notify users while exited from COLO zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 11/15] savevm: split save/find loadvm_handlers entry into two helper functions zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 12/15] savevm: split the process of different stages for loadvm/savevm zhanghailiang
2017-04-07 17:18   ` Dr. David Alan Gilbert
2017-04-10  8:26     ` Hailiang Zhang
2017-04-20  9:09       ` Dr. David Alan Gilbert
2017-04-21  6:50         ` Hailiang Zhang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 13/15] COLO: Separate the process of saving/loading ram and device state zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 14/15] COLO: Split qemu_savevm_state_begin out of checkpoint process zhanghailiang
2017-02-22  3:42 ` [Qemu-devel] [PATCH 15/15] COLO: flush host dirty ram from cache zhanghailiang
2017-04-07 17:39   ` Dr. David Alan Gilbert
2017-04-10  7:13     ` Hailiang Zhang
