All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy
@ 2018-06-03  5:05 Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table Zhang Chen
                   ` (16 more replies)
  0 siblings, 17 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

Hi~ All~

COLO Frame, block replication and COLO proxy(colo-compare,filter-mirror,
filter-redirector,filter-rewriter) have been exist in qemu
for long time, it's time to integrate these three parts to make COLO really works.

In this series, we have some optimizations for COLO frame, including separating the
process of saving ram and device state, using an COLO_EXIT event to notify users that
VM exits COLO, for these parts, most of them have been reviewed long time ago in old version,
but since this series have just rebased on upstream which had merged a new series of migration,
parts of pathes in this series deserve review again.

We use notifier/callback method for COLO compare to notify COLO frame about
net packets inconsistent event, and add a handle_event method for NetFilterClass to
help COLO frame to notify filters and colo-compare about checkpoint/failover event,
it is flexible.

For the neweset version, please refer to:
https://github.com/zhangckid/qemu/tree/qemu-colo-18jun1

Please review, thanks.

V8:
 - Rebased on upstream codes.
 - Addressed Markus's comments in patch 10/17.
 - Addressed Markus's comments in patch 11/17.
 - Removed some comments in patch 4/17.
 - Moved the "migration_bitmap_clear_dirty()" to suitable position in
   patch 9/17.
 - Rewrote the patch 07/17 to address Davie's comments.
 - Moved the "qemu_savevm_live_state" out of the
   qemu_mutex_lock_iothread.
 - Fixed the bug that in some status COLO vm crash with segmentation fault.

V7:
 - Addressed Markus's comments in 11/17.
 - Rebased on upstream.

V6:
 - Addressed Eric Blake's comments, use the enum to feedback in patch 11/17.
 - Fixed QAPI command separator problem in patch 11/17.


Zhang Chen (13):
  filter-rewriter: fix memory leak for connection in
    connection_track_table
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Add block replication into colo process
  COLO: Remove colo_state migration struct
  COLO: Load dirty pages into SVM's RAM cache firstly
  ram/COLO: Record the dirty pages that SVM received
  COLO: Flush memory data from ram cache
  qapi: Add new command to query colo status
  savevm: split the process of different stages for loadvm/savevm
  filter: Add handle_event method for NetFilterClass
  filter-rewriter: handle checkpoint and failover event

zhanghailiang (4):
  qmp event: Add COLO_EXIT event to notify users while exited COLO
  COLO: flush host dirty ram from cache
  COLO: notify net filters about checkpoint/failover event
  COLO: quick failover process by kick COLO thread

 include/exec/ram_addr.h  |   1 +
 include/migration/colo.h |  11 +-
 include/net/filter.h     |   5 +
 migration/Makefile.objs  |   2 +-
 migration/colo-comm.c    |  76 -------------
 migration/colo.c         | 239 +++++++++++++++++++++++++++++++++++++--
 migration/migration.c    |  44 ++++++-
 migration/ram.c          | 163 +++++++++++++++++++++++++-
 migration/ram.h          |   4 +
 migration/savevm.c       |  53 +++++++--
 migration/savevm.h       |   5 +
 migration/trace-events   |   3 +
 net/colo-compare.c       | 108 ++++++++++++++++--
 net/colo-compare.h       |  24 ++++
 net/colo.h               |   4 +
 net/filter-rewriter.c    | 109 ++++++++++++++++--
 net/filter.c             |  17 +++
 net/net.c                |  28 +++++
 qapi/migration.json      |  72 ++++++++++++
 vl.c                     |   2 -
 20 files changed, 851 insertions(+), 119 deletions(-)
 delete mode 100644 migration/colo-comm.c
 create mode 100644 net/colo-compare.h

-- 
2.17.GIT

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04  5:51   ` Jason Wang
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint Zhang Chen
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

After a net connection is closed, we didn't clear its releated resources
in connection_track_table, which will lead to memory leak.

Let't track the state of net connection, if it is closed, its related
resources will be cleared up.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
---
 net/colo.h            |  4 +++
 net/filter-rewriter.c | 69 ++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/net/colo.h b/net/colo.h
index da6c36dcf7..cd118510c5 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -18,6 +18,7 @@
 #include "slirp/slirp.h"
 #include "qemu/jhash.h"
 #include "qemu/timer.h"
+#include "slirp/tcp.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -86,6 +87,9 @@ typedef struct Connection {
      * run once in independent tcp connection
      */
     int syn_flag;
+
+    int tcp_state; /* TCP FSM state */
+    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
 } Connection;
 
 uint32_t connection_key_hash(const void *opaque);
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 62dad2d773..0909a9a8af 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -59,9 +59,9 @@ static int is_tcp_packet(Packet *pkt)
 }
 
 /* handle tcp packet from primary guest */
-static int handle_primary_tcp_pkt(NetFilterState *nf,
+static int handle_primary_tcp_pkt(RewriterState *rf,
                                   Connection *conn,
-                                  Packet *pkt)
+                                  Packet *pkt, ConnectionKey *key)
 {
     struct tcphdr *tcp_pkt;
 
@@ -99,15 +99,44 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
             net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len,
                                    pkt->size - pkt->vnet_hdr_len);
         }
+        /*
+         * Case 1:
+         * The *server* side of this connect is VM, *client* tries to close
+         * the connection.
+         *
+         * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
+         * packet from server side. From this point, we can ensure that there
+         * will be no packets in the connection, except that, some errors
+         * happen between the path of 'filter object' and vNIC, if this rare
+         * case really happen, we can still create a new connection,
+         * So it is safe to remove the connection from connection_track_table.
+         *
+         */
+        if ((conn->tcp_state == TCPS_LAST_ACK) &&
+            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
+            g_hash_table_remove(rf->connection_track_table, key);
+        }
+    }
+    /*
+     * Case 2:
+     * The *server* side of this connect is VM, *server* tries to close
+     * the connection.
+     *
+     * We got 'fin=1, ack=1' packet from client side, we need to
+     * record the seq of 'fin=1, ack=1' packet.
+     */
+    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
+        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
+        conn->tcp_state = TCPS_LAST_ACK;
     }
 
     return 0;
 }
 
 /* handle tcp packet from secondary guest */
-static int handle_secondary_tcp_pkt(NetFilterState *nf,
+static int handle_secondary_tcp_pkt(RewriterState *rf,
                                     Connection *conn,
-                                    Packet *pkt)
+                                    Packet *pkt, ConnectionKey *key)
 {
     struct tcphdr *tcp_pkt;
 
@@ -139,8 +168,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
             net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len,
                                    pkt->size - pkt->vnet_hdr_len);
         }
+        /*
+         * Case 2:
+         * The *server* side of this connect is VM, *server* tries to close
+         * the connection.
+         *
+         * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
+         * packet from client side. Like Case 1, there should be no packets
+         * in the connection from now know, But the difference here is
+         * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
+         * TODO: Fix above case.
+         */
+        if ((conn->tcp_state == TCPS_LAST_ACK) &&
+            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
+            g_hash_table_remove(rf->connection_track_table, key);
+        }
+    }
+    /*
+     * Case 1:
+     * The *server* side of this connect is VM, *client* tries to close
+     * the connection.
+     *
+     * We got 'fin=1, ack=1' packet from server side, we need to
+     * record the seq of 'fin=1, ack=1' packet.
+     */
+    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
+        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
+        conn->tcp_state = TCPS_LAST_ACK;
     }
-
     return 0;
 }
 
@@ -190,7 +245,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 
         if (sender == nf->netdev) {
             /* NET_FILTER_DIRECTION_TX */
-            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
+            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
                 qemu_net_queue_send(s->incoming_queue, sender, 0,
                 (const uint8_t *)pkt->data, pkt->size, NULL);
                 packet_destroy(pkt, NULL);
@@ -203,7 +258,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
             }
         } else {
             /* NET_FILTER_DIRECTION_RX */
-            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
+            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
                 qemu_net_queue_send(s->incoming_queue, sender, 0,
                 (const uint8_t *)pkt->data, pkt->size, NULL);
                 packet_destroy(pkt, NULL);
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04  6:31   ` Jason Wang
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result Zhang Chen
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

While do checkpoint, we need to flush all the unhandled packets,
By using the filter notifier mechanism, we can easily to notify
every compare object to do this process, which runs inside
of compare threads as a coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
---
 include/migration/colo.h |  6 ++++
 net/colo-compare.c       | 76 ++++++++++++++++++++++++++++++++++++++++
 net/colo-compare.h       | 22 ++++++++++++
 3 files changed, 104 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2fe48ad353..fefb2fcf4c 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -16,6 +16,12 @@
 #include "qemu-common.h"
 #include "qapi/qapi-types-migration.h"
 
+enum colo_event {
+    COLO_EVENT_NONE,
+    COLO_EVENT_CHECKPOINT,
+    COLO_EVENT_FAILOVER,
+};
+
 void colo_info_init(void);
 
 void migrate_start_colo_process(MigrationState *s);
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 23b2d2c4cc..7ff3ae8904 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -27,11 +27,16 @@
 #include "qemu/sockets.h"
 #include "net/colo.h"
 #include "sysemu/iothread.h"
+#include "net/colo-compare.h"
+#include "migration/colo.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -41,6 +46,10 @@
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+static QemuMutex event_mtx;
+static QemuCond event_complete_cond;
+static int event_unhandled_count;
+
 /*
  *  + CompareState ++
  *  |               |
@@ -87,6 +96,11 @@ typedef struct CompareState {
     IOThread *iothread;
     GMainContext *worker_context;
     QEMUTimer *packet_check_timer;
+
+    QEMUBH *event_bh;
+    enum colo_event event;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -736,6 +750,25 @@ static void check_old_packet_regular(void *opaque)
                 REGULAR_PACKET_CHECK_MS);
 }
 
+/* Public API, Used for COLO frame to notify compare event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+    CompareState *s;
+
+    qemu_mutex_lock(&event_mtx);
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        s->event = event;
+        qemu_bh_schedule(s->event_bh);
+        event_unhandled_count++;
+    }
+    /* Wait all compare threads to finish handling this event */
+    while (event_unhandled_count > 0) {
+        qemu_cond_wait(&event_complete_cond, &event_mtx);
+    }
+
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void colo_compare_timer_init(CompareState *s)
 {
     AioContext *ctx = iothread_get_aio_context(s->iothread);
@@ -756,6 +789,28 @@ static void colo_compare_timer_del(CompareState *s)
     }
  }
 
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque)
+{
+    CompareState *s = opaque;
+
+    switch (s->event) {
+    case COLO_EVENT_CHECKPOINT:
+        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+        break;
+    case COLO_EVENT_FAILOVER:
+        break;
+    default:
+        break;
+    }
+    qemu_mutex_lock(&event_mtx);
+    assert(event_unhandled_count > 0);
+    event_unhandled_count--;
+    qemu_cond_broadcast(&event_complete_cond);
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void colo_compare_iothread(CompareState *s)
 {
     object_ref(OBJECT(s->iothread));
@@ -769,6 +824,7 @@ static void colo_compare_iothread(CompareState *s)
                              s, s->worker_context, true);
 
     colo_compare_timer_init(s);
+    s->event_bh = qemu_bh_new(colo_compare_handle_event, s);
 }
 
 static char *compare_get_pri_indev(Object *obj, Error **errp)
@@ -926,8 +982,13 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize, s->vnet_hdr);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize, s->vnet_hdr);
 
+    QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
     g_queue_init(&s->conn_list);
 
+    qemu_mutex_init(&event_mtx);
+    qemu_cond_init(&event_complete_cond);
+
     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
                                                       connection_key_equal,
                                                       g_free,
@@ -990,6 +1051,7 @@ static void colo_compare_init(Object *obj)
 static void colo_compare_finalize(Object *obj)
 {
     CompareState *s = COLO_COMPARE(obj);
+    CompareState *tmp = NULL;
 
     qemu_chr_fe_deinit(&s->chr_pri_in, false);
     qemu_chr_fe_deinit(&s->chr_sec_in, false);
@@ -997,6 +1059,16 @@ static void colo_compare_finalize(Object *obj)
     if (s->iothread) {
         colo_compare_timer_del(s);
     }
+
+    qemu_bh_delete(s->event_bh);
+
+    QTAILQ_FOREACH(tmp, &net_compares, next) {
+        if (!strcmp(tmp->outdev, s->outdev)) {
+            QTAILQ_REMOVE(&net_compares, s, next);
+            break;
+        }
+    }
+
     /* Release all unhandled packets after compare thead exited */
     g_queue_foreach(&s->conn_list, colo_flush_packets, s);
 
@@ -1009,6 +1081,10 @@ static void colo_compare_finalize(Object *obj)
     if (s->iothread) {
         object_unref(OBJECT(s->iothread));
     }
+
+    qemu_mutex_destroy(&event_mtx);
+    qemu_cond_destroy(&event_complete_cond);
+
     g_free(s->pri_indev);
     g_free(s->sec_indev);
     g_free(s->outdev);
diff --git a/net/colo-compare.h b/net/colo-compare.h
new file mode 100644
index 0000000000..1b1ce76aea
--- /dev/null
+++ b/net/colo-compare.h
@@ -0,0 +1,22 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2017 FUJITSU LIMITED
+ * Copyright (c) 2017 Intel Corporation
+ *
+ * Authors:
+ *    zhanghailiang <zhang.zhanghailiang@huawei.com>
+ *    Zhang Chen <zhangckid@gmail.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_COMPARE_H
+#define QEMU_COLO_COMPARE_H
+
+void colo_notify_compares_event(void *opaque, int event, Error **errp);
+
+#endif /* QEMU_COLO_COMPARE_H */
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04  6:36   ` Jason Wang
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 04/17] COLO: integrate colo compare with colo frame Zhang Chen
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

It's a good idea to use notifier to notify COLO frame of
inconsistent packets comparing.

Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 32 +++++++++++++++++++++++++-------
 net/colo-compare.h |  2 ++
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 7ff3ae8904..05061cd1c4 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,6 +29,7 @@
 #include "sysemu/iothread.h"
 #include "net/colo-compare.h"
 #include "migration/colo.h"
+#include "migration/migration.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -37,6 +38,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
        QTAILQ_HEAD_INITIALIZER(net_compares);
 
+static NotifierList colo_compare_notifiers =
+    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -561,8 +565,24 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
     }
 }
 
+static void colo_compare_inconsistent_notify(void)
+{
+    notifier_list_notify(&colo_compare_notifiers,
+                migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+    notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+    notifier_remove(notify);
+}
+
 static int colo_old_packet_check_one_conn(Connection *conn,
-                                          void *user_data)
+                                           void *user_data)
 {
     GList *result = NULL;
     int64_t check_time = REGULAR_PACKET_CHECK_MS;
@@ -573,10 +593,7 @@ static int colo_old_packet_check_one_conn(Connection *conn,
 
     if (result) {
         /* Do checkpoint will flush old packet */
-        /*
-         * TODO: Notify colo frame to do checkpoint.
-         * colo_compare_inconsistent_notify();
-         */
+        colo_compare_inconsistent_notify();
         return 0;
     }
 
@@ -620,11 +637,12 @@ static void colo_compare_packet(CompareState *s, Connection *conn,
             /*
              * If one packet arrive late, the secondary_list or
              * primary_list will be empty, so we can't compare it
-             * until next comparison.
+             * until next comparison. If the packets in the list are
+             * timeout, it will trigger a checkpoint request.
              */
             trace_colo_compare_main("packet different");
             g_queue_push_head(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+            colo_compare_inconsistent_notify();
             break;
         }
     }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index 1b1ce76aea..22ddd512e2 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -18,5 +18,7 @@
 #define QEMU_COLO_COMPARE_H
 
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);
 
 #endif /* QEMU_COLO_COMPARE_H */
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 04/17] COLO: integrate colo compare with colo frame
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (2 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 05/17] COLO: Add block replication into colo process Zhang Chen
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

For COLO FT, both the PVM and SVM run at the same time,
only sync the state while it needs.

So here, let SVM runs while not doing checkpoint, change
DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100.

Besides, we forgot to release colo_checkpoint_semd and
colo_delay_timer, fix them here.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c      | 42 ++++++++++++++++++++++++++++++++++++++++--
 migration/migration.c |  6 ++----
 2 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4381067ed4..081df1835f 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -25,8 +25,11 @@
 #include "qemu/error-report.h"
 #include "migration/failover.h"
 #include "replication.h"
+#include "net/colo-compare.h"
+#include "net/colo.h"
 
 static bool vmstate_loading;
+static Notifier packets_compare_notifier;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
@@ -343,6 +346,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     /* Disable block migration */
     migrate_set_block_enabled(false, &local_err);
     qemu_savevm_state_header(fb);
@@ -400,6 +408,11 @@ out:
     return ret;
 }
 
+static void colo_compare_notify_checkpoint(Notifier *notifier, void *data)
+{
+    colo_checkpoint_notify(data);
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
@@ -416,6 +429,9 @@ static void colo_process_checkpoint(MigrationState *s)
         goto out;
     }
 
+    packets_compare_notifier.notify = colo_compare_notify_checkpoint;
+    colo_compare_register_notifier(&packets_compare_notifier);
+
     /*
      * Wait for Secondary finish loading VM states and enter COLO
      * restore.
@@ -461,11 +477,21 @@ out:
         qemu_fclose(fb);
     }
 
-    timer_del(s->colo_delay_timer);
-
     /* Hope this not to be too long to wait here */
     qemu_sem_wait(&s->colo_exit_sem);
     qemu_sem_destroy(&s->colo_exit_sem);
+
+    /*
+     * It is safe to unregister notifier after failover finished.
+     * Besides, colo_delay_timer and colo_checkpoint_sem can't be
+     * released befor unregister notifier, or there will be use-after-free
+     * error.
+     */
+    colo_compare_unregister_notifier(&packets_compare_notifier);
+    timer_del(s->colo_delay_timer);
+    timer_free(s->colo_delay_timer);
+    qemu_sem_destroy(&s->colo_checkpoint_sem);
+
     /*
      * Must be called after failover BH is completed,
      * Or the failover BH may shutdown the wrong fd that
@@ -558,6 +584,11 @@ void *colo_process_incoming_thread(void *opaque)
     fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc));
     object_unref(OBJECT(bioc));
 
+    qemu_mutex_lock_iothread();
+    vm_start();
+    trace_colo_vm_state_change("stop", "run");
+    qemu_mutex_unlock_iothread();
+
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -577,6 +608,11 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        qemu_mutex_lock_iothread();
+        vm_stop_force_state(RUN_STATE_COLO);
+        trace_colo_vm_state_change("run", "stop");
+        qemu_mutex_unlock_iothread();
+
         /* FIXME: This is unnecessary for periodic checkpoint mode */
         colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
                      &local_err);
@@ -630,6 +666,8 @@ void *colo_process_incoming_thread(void *opaque)
         }
 
         vmstate_loading = false;
+        vm_start();
+        trace_colo_vm_state_change("stop", "run");
         qemu_mutex_unlock_iothread();
 
         if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
diff --git a/migration/migration.c b/migration/migration.c
index 05aec2c905..59aab11f4a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -75,10 +75,8 @@
 /* Migration XBZRLE default cache size */
 #define DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE (64 * 1024 * 1024)
 
-/* The delay time (in ms) between two COLO checkpoints
- * Note: Please change this default value to 10000 when we support hybrid mode.
- */
-#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
+/* The delay time (in ms) between two COLO checkpoints */
+#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
 #define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16
 
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 05/17] COLO: Add block replication into colo process
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (3 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 04/17] COLO: integrate colo compare with colo frame Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 06/17] COLO: Remove colo_state migration struct Zhang Chen
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

Make sure master start block replication after slave's block
replication started.

Besides, we need to activate VM's blocks before goes into
COLO state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
---
 migration/colo.c      | 43 +++++++++++++++++++++++++++++++++++++++++++
 migration/migration.c |  9 +++++++++
 2 files changed, 52 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 081df1835f..e06640c3d6 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -27,6 +27,7 @@
 #include "replication.h"
 #include "net/colo-compare.h"
 #include "net/colo.h"
+#include "block/block.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -56,6 +57,7 @@ static void secondary_vm_do_failover(void)
 {
     int old_state;
     MigrationIncomingState *mis = migration_incoming_get_current();
+    Error *local_err = NULL;
 
     /* Can not do failover during the process of VM's loading VMstate, Or
      * it will break the secondary VM.
@@ -73,6 +75,11 @@ static void secondary_vm_do_failover(void)
     migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
+    replication_stop_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
         /* recover runstate to normal migration finish state */
@@ -110,6 +117,7 @@ static void primary_vm_do_failover(void)
 {
     MigrationState *s = migrate_get_current();
     int old_state;
+    Error *local_err = NULL;
 
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
@@ -133,6 +141,13 @@ static void primary_vm_do_failover(void)
                      FailoverStatus_str(old_state));
         return;
     }
+
+    replication_stop_all(true, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        local_err = NULL;
+    }
+
     /* Notify COLO thread that failover work is finished */
     qemu_sem_post(&s->colo_exit_sem);
 }
@@ -356,6 +371,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     qemu_savevm_state_header(fb);
     qemu_savevm_state_setup(fb);
     qemu_mutex_lock_iothread();
+    replication_do_checkpoint_all(&local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
     qemu_savevm_state_complete_precopy(fb, false, false);
     qemu_mutex_unlock_iothread();
 
@@ -446,6 +466,12 @@ static void colo_process_checkpoint(MigrationState *s)
     object_unref(OBJECT(bioc));
 
     qemu_mutex_lock_iothread();
+    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
+
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
@@ -585,6 +611,11 @@ void *colo_process_incoming_thread(void *opaque)
     object_unref(OBJECT(bioc));
 
     qemu_mutex_lock_iothread();
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
     vm_start();
     trace_colo_vm_state_change("stop", "run");
     qemu_mutex_unlock_iothread();
@@ -665,6 +696,18 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        replication_get_error_all(&local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+        /* discard colo disk buffer */
+        replication_do_checkpoint_all(&local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
diff --git a/migration/migration.c b/migration/migration.c
index 59aab11f4a..0dfdeecf0f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -360,6 +360,7 @@ static void process_incoming_migration_co(void *opaque)
     MigrationIncomingState *mis = migration_incoming_get_current();
     PostcopyState ps;
     int ret;
+    Error *local_err = NULL;
 
     assert(mis->from_src_file);
     mis->largest_page_size = qemu_ram_pagesize_largest();
@@ -391,6 +392,14 @@ static void process_incoming_migration_co(void *opaque)
 
     /* we get COLO info, and know if we are in COLO mode */
     if (!ret && migration_incoming_enable_colo()) {
+        /* Make sure all file formats flush their mutable metadata */
+        bdrv_invalidate_cache_all(&local_err);
+        if (local_err) {
+            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                    MIGRATION_STATUS_FAILED);
+            error_report_err(local_err);
+            exit(EXIT_FAILURE);
+        }
         mis->migration_incoming_co = qemu_coroutine_self();
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
              colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 06/17] COLO: Remove colo_state migration struct
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (4 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 05/17] COLO: Add block replication into colo process Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 07/17] COLO: Load dirty pages into SVM's RAM cache firstly Zhang Chen
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

We need to know if migration is going into COLO state for
incoming side before start normal migration.

Instead by using the VMStateDescription to send colo_state
from source side to destination side, we use MIG_CMD_ENABLE_COLO
to indicate whether COLO is enabled or not.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/colo.h |  5 +--
 migration/Makefile.objs  |  2 +-
 migration/colo-comm.c    | 76 ----------------------------------------
 migration/colo.c         | 13 ++++++-
 migration/migration.c    | 23 +++++++++++-
 migration/savevm.c       | 18 ++++++++++
 migration/savevm.h       |  1 +
 migration/trace-events   |  1 +
 vl.c                     |  2 --
 9 files changed, 58 insertions(+), 83 deletions(-)
 delete mode 100644 migration/colo-comm.c

diff --git a/include/migration/colo.h b/include/migration/colo.h
index fefb2fcf4c..99ce17aca7 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -28,8 +28,9 @@ void migrate_start_colo_process(MigrationState *s);
 bool migration_in_colo_state(void);
 
 /* loadvm */
-bool migration_incoming_enable_colo(void);
-void migration_incoming_exit_colo(void);
+void migration_incoming_enable_colo(void);
+void migration_incoming_disable_colo(void);
+bool migration_incoming_colo_enabled(void);
 void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
 
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index c83ec47ba8..a4f3bafd86 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y += migration.o socket.o fd.o exec.o
 common-obj-y += tls.o channel.o savevm.o
-common-obj-y += colo-comm.o colo.o colo-failover.o
+common-obj-y += colo.o colo-failover.o
 common-obj-y += vmstate.o vmstate-types.o page_cache.o
 common-obj-y += qemu-file.o global_state.o
 common-obj-y += qemu-file-channel.o
diff --git a/migration/colo-comm.c b/migration/colo-comm.c
deleted file mode 100644
index df26e4dfe7..0000000000
--- a/migration/colo-comm.c
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
- * (a.k.a. Fault Tolerance or Continuous Replication)
- *
- * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
- * Copyright (c) 2016 FUJITSU LIMITED
- * Copyright (c) 2016 Intel Corporation
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or
- * later. See the COPYING file in the top-level directory.
- *
- */
-
-#include "qemu/osdep.h"
-#include "migration.h"
-#include "migration/colo.h"
-#include "migration/vmstate.h"
-#include "trace.h"
-
-typedef struct {
-     bool colo_requested;
-} COLOInfo;
-
-static COLOInfo colo_info;
-
-COLOMode get_colo_mode(void)
-{
-    if (migration_in_colo_state()) {
-        return COLO_MODE_PRIMARY;
-    } else if (migration_incoming_in_colo_state()) {
-        return COLO_MODE_SECONDARY;
-    } else {
-        return COLO_MODE_UNKNOWN;
-    }
-}
-
-static int colo_info_pre_save(void *opaque)
-{
-    COLOInfo *s = opaque;
-
-    s->colo_requested = migrate_colo_enabled();
-
-    return 0;
-}
-
-static bool colo_info_need(void *opaque)
-{
-   return migrate_colo_enabled();
-}
-
-static const VMStateDescription colo_state = {
-    .name = "COLOState",
-    .version_id = 1,
-    .minimum_version_id = 1,
-    .pre_save = colo_info_pre_save,
-    .needed = colo_info_need,
-    .fields = (VMStateField[]) {
-        VMSTATE_BOOL(colo_requested, COLOInfo),
-        VMSTATE_END_OF_LIST()
-    },
-};
-
-void colo_info_init(void)
-{
-    vmstate_register(NULL, 0, &colo_state, &colo_info);
-}
-
-bool migration_incoming_enable_colo(void)
-{
-    return colo_info.colo_requested;
-}
-
-void migration_incoming_exit_colo(void)
-{
-    colo_info.colo_requested = false;
-}
diff --git a/migration/colo.c b/migration/colo.c
index e06640c3d6..c083d3696f 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -152,6 +152,17 @@ static void primary_vm_do_failover(void)
     qemu_sem_post(&s->colo_exit_sem);
 }
 
+COLOMode get_colo_mode(void)
+{
+    if (migration_in_colo_state()) {
+        return COLO_MODE_PRIMARY;
+    } else if (migration_incoming_in_colo_state()) {
+        return COLO_MODE_SECONDARY;
+    } else {
+        return COLO_MODE_UNKNOWN;
+    }
+}
+
 void colo_do_failover(MigrationState *s)
 {
     /* Make sure VM stopped while failover happened. */
@@ -745,7 +756,7 @@ out:
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
-    migration_incoming_exit_colo();
+    migration_incoming_disable_colo();
 
     return NULL;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 0dfdeecf0f..48e183a54e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -280,6 +280,22 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
     return migrate_send_rp_message(mis, msg_type, msglen, bufc);
 }
 
+static bool migration_colo_enabled;
+bool migration_incoming_colo_enabled(void)
+{
+    return migration_colo_enabled;
+}
+
+void migration_incoming_disable_colo(void)
+{
+    migration_colo_enabled = false;
+}
+
+void migration_incoming_enable_colo(void)
+{
+    migration_colo_enabled = true;
+}
+
 void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
     const char *p;
@@ -391,7 +407,7 @@ static void process_incoming_migration_co(void *opaque)
     }
 
     /* we get COLO info, and know if we are in COLO mode */
-    if (!ret && migration_incoming_enable_colo()) {
+    if (!ret && migration_incoming_colo_enabled()) {
         /* Make sure all file formats flush their mutable metadata */
         bdrv_invalidate_cache_all(&local_err);
         if (local_err) {
@@ -2847,6 +2863,11 @@ static void *migration_thread(void *opaque)
         qemu_savevm_send_postcopy_advise(s->to_dst_file);
     }
 
+    if (migrate_colo_enabled()) {
+        /* Notify migration destination that we enable COLO */
+        qemu_savevm_send_colo_enable(s->to_dst_file);
+    }
+
     qemu_savevm_state_setup(s->to_dst_file);
 
     s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
diff --git a/migration/savevm.c b/migration/savevm.c
index 4251125831..308f753013 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -55,6 +55,8 @@
 #include "io/channel-buffer.h"
 #include "io/channel-file.h"
 #include "sysemu/replay.h"
+#include "migration/colo.h"
+
 
 #ifndef ETH_P_RARP
 #define ETH_P_RARP 0x8035
@@ -82,6 +84,7 @@ enum qemu_vm_cmd {
                                       precopy but are dirty. */
     MIG_CMD_POSTCOPY_RESUME,       /* resume postcopy on dest */
     MIG_CMD_PACKAGED,          /* Send a wrapped stream within this stream */
+    MIG_CMD_ENABLE_COLO,       /* Enable COLO */
     MIG_CMD_RECV_BITMAP,       /* Request for recved bitmap on dst */
     MIG_CMD_MAX
 };
@@ -840,6 +843,12 @@ static void qemu_savevm_command_send(QEMUFile *f,
     qemu_fflush(f);
 }
 
+void qemu_savevm_send_colo_enable(QEMUFile *f)
+{
+    trace_savevm_send_colo_enable();
+    qemu_savevm_command_send(f, MIG_CMD_ENABLE_COLO, 0, NULL);
+}
+
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value)
 {
     uint32_t buf;
@@ -1917,6 +1926,12 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
     return 0;
 }
 
+static int loadvm_process_enable_colo(MigrationIncomingState *mis)
+{
+    migration_incoming_enable_colo();
+    return 0;
+}
+
 /*
  * Process an incoming 'QEMU_VM_COMMAND'
  * 0           just a normal return
@@ -1996,6 +2011,9 @@ static int loadvm_process_command(QEMUFile *f)
 
     case MIG_CMD_RECV_BITMAP:
         return loadvm_handle_recv_bitmap(mis, len);
+
+    case MIG_CMD_ENABLE_COLO:
+        return loadvm_process_enable_colo(mis);
     }
 
     return 0;
diff --git a/migration/savevm.h b/migration/savevm.h
index a5e65b8ae3..8373c2f6bd 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -55,6 +55,7 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint16_t len,
                                            uint64_t *start_list,
                                            uint64_t *length_list);
+void qemu_savevm_send_colo_enable(QEMUFile *f);
 
 int qemu_loadvm_state(QEMUFile *f);
 void qemu_loadvm_state_cleanup(void);
diff --git a/migration/trace-events b/migration/trace-events
index 3c798ddd11..20accb5b80 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -37,6 +37,7 @@ savevm_send_ping(uint32_t val) "0x%x"
 savevm_send_postcopy_listen(void) ""
 savevm_send_postcopy_run(void) ""
 savevm_send_postcopy_resume(void) ""
+savevm_send_colo_enable(void) ""
 savevm_send_recv_bitmap(char *name) "%s"
 savevm_state_setup(void) ""
 savevm_state_resume_prepare(void) ""
diff --git a/vl.c b/vl.c
index 70f090c823..e00ca5b0c2 100644
--- a/vl.c
+++ b/vl.c
@@ -4346,8 +4346,6 @@ int main(int argc, char **argv, char **envp)
 #endif
     }
 
-    colo_info_init();
-
     if (net_init_clients(&err) < 0) {
         error_report_err(err);
         exit(1);
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 07/17] COLO: Load dirty pages into SVM's RAM cache firstly
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (5 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 06/17] COLO: Remove colo_state migration struct Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 08/17] ram/COLO: Record the dirty pages that SVM received Zhang Chen
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

We should not load PVM's state directly into SVM, because there maybe some
errors happen when SVM is receving data, which will break SVM.

We need to ensure receving all data before load the state into SVM. We use
an extra memory to cache these data (PVM's ram). The ram cache in secondary side
is initially the same as SVM/PVM's memory. And in the process of checkpoint,
we cache the dirty pages of PVM into this ram cache firstly, so this ram cache
always the same as PVM's memory at every checkpoint, then we flush this cached ram
to SVM after we receive all PVM's state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
---
 include/exec/ram_addr.h |  1 +
 migration/migration.c   |  6 +++
 migration/ram.c         | 83 ++++++++++++++++++++++++++++++++++++++++-
 migration/ram.h         |  4 ++
 migration/savevm.c      |  2 +-
 5 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index cf2446a176..51ec153a57 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -27,6 +27,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *colo_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/migration/migration.c b/migration/migration.c
index 48e183a54e..0d3e2e6d66 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -416,6 +416,10 @@ static void process_incoming_migration_co(void *opaque)
             error_report_err(local_err);
             exit(EXIT_FAILURE);
         }
+        if (colo_init_ram_cache() < 0) {
+            error_report("Init ram cache failed");
+            exit(EXIT_FAILURE);
+        }
         mis->migration_incoming_co = qemu_coroutine_self();
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
              colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
@@ -424,6 +428,8 @@ static void process_incoming_migration_co(void *opaque)
 
         /* Wait checkpoint incoming thread exit before free resource */
         qemu_thread_join(&mis->colo_incoming_thread);
+        /* We hold the global iothread lock, so it is safe here */
+        colo_release_ram_cache();
     }
 
     if (ret < 0) {
diff --git a/migration/ram.c b/migration/ram.c
index c53e8369a3..2bcd70659f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2820,6 +2820,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
     return block->host + offset;
 }
 
+static inline void *colo_cache_from_block_offset(RAMBlock *block,
+                                                 ram_addr_t offset)
+{
+    if (!offset_in_ramblock(block, offset)) {
+        return NULL;
+    }
+    if (!block->colo_cache) {
+        error_report("%s: colo_cache is NULL in block :%s",
+                     __func__, block->idstr);
+        return NULL;
+    }
+    return block->colo_cache + offset;
+}
+
 /**
  * ram_handle_compressed: handle the zero page case
  *
@@ -3024,6 +3038,58 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
     qemu_mutex_unlock(&decomp_done_lock);
 }
 
+/*
+ * colo cache: this is for secondary VM, we cache the whole
+ * memory of the secondary VM, it is need to hold the global lock
+ * to call this helper.
+ */
+int colo_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        block->colo_cache = qemu_anon_ram_alloc(block->used_length,
+                                                NULL,
+                                                false);
+        if (!block->colo_cache) {
+            error_report("%s: Can't alloc memory for COLO cache of block %s,"
+                         "size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+                         block->used_length);
+            goto out_locked;
+        }
+        memcpy(block->colo_cache, block->host, block->used_length);
+    }
+    rcu_read_unlock();
+    return 0;
+
+out_locked:
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+
+    rcu_read_unlock();
+    return -errno;
+}
+
+/* It is need to hold the global lock to call this helper */
+void colo_release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+    rcu_read_unlock();
+}
+
 /**
  * ram_load_setup: Setup RAM for migration incoming side
  *
@@ -3040,6 +3106,7 @@ static int ram_load_setup(QEMUFile *f, void *opaque)
 
     xbzrle_load_setup();
     ramblock_recv_map_init();
+
     return 0;
 }
 
@@ -3053,6 +3120,7 @@ static int ram_load_cleanup(void *opaque)
         g_free(rb->receivedmap);
         rb->receivedmap = NULL;
     }
+
     return 0;
 }
 
@@ -3286,13 +3354,24 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);
 
-            host = host_from_ram_block_offset(block, addr);
+            /*
+             * After going into COLO, we should load the Page into colo_cache.
+             */
+            if (migration_incoming_in_colo_state()) {
+                host = colo_cache_from_block_offset(block, addr);
+            } else {
+                host = host_from_ram_block_offset(block, addr);
+            }
             if (!host) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
                 break;
             }
-            ramblock_recv_bitmap_set(block, host);
+
+            if (!migration_incoming_in_colo_state()) {
+                ramblock_recv_bitmap_set(block, host);
+            }
+
             trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host);
         }
 
diff --git a/migration/ram.h b/migration/ram.h
index d386f4d641..d5e81d4d48 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -70,4 +70,8 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file,
                                   const char *block_name);
 int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
+
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 308f753013..4a789eb4c9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1929,7 +1929,7 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
 static int loadvm_process_enable_colo(MigrationIncomingState *mis)
 {
     migration_incoming_enable_colo();
-    return 0;
+    return colo_init_ram_cache();
 }
 
 /*
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 08/17] ram/COLO: Record the dirty pages that SVM received
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (6 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 07/17] COLO: Load dirty pages into SVM's RAM cache firstly Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 09/17] COLO: Flush memory data from ram cache Zhang Chen
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

We record the address of the dirty pages that received,
it will help flushing pages that cached into SVM.

Here, it is a trick, we record dirty pages by re-using migration
dirty bitmap. In the later patch, we will start the dirty log
for SVM, just like migration, in this way, we can record both
the dirty pages caused by PVM and SVM, we only flush those dirty
pages from RAM cache while do checkpoint.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 2bcd70659f..dd86eeba87 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2831,6 +2831,15 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
                      __func__, block->idstr);
         return NULL;
     }
+
+    /*
+    * During colo checkpoint, we need bitmap of these migrated pages.
+    * It help us to decide which pages in ram cache should be flushed
+    * into VM's RAM later.
+    */
+    if (!test_and_set_bit(offset >> TARGET_PAGE_BITS, block->bmap)) {
+        ram_state->migration_dirty_pages++;
+    }
     return block->colo_cache + offset;
 }
 
@@ -3061,6 +3070,24 @@ int colo_init_ram_cache(void)
         memcpy(block->colo_cache, block->host, block->used_length);
     }
     rcu_read_unlock();
+    /*
+    * Record the dirty pages that sent by PVM, we use this dirty bitmap together
+    * with to decide which page in cache should be flushed into SVM's RAM. Here
+    * we use the same name 'ram_bitmap' as for migration.
+    */
+    if (ram_bytes_total()) {
+        RAMBlock *block;
+
+        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+            unsigned long pages = block->max_length >> TARGET_PAGE_BITS;
+
+            block->bmap = bitmap_new(pages);
+            bitmap_set(block->bmap, 0, pages);
+         }
+    }
+    ram_state = g_new0(RAMState, 1);
+    ram_state->migration_dirty_pages = 0;
+
     return 0;
 
 out_locked:
@@ -3080,6 +3107,10 @@ void colo_release_ram_cache(void)
 {
     RAMBlock *block;
 
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        g_free(block->bmap);
+        block->bmap = NULL;
+    }
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (block->colo_cache) {
@@ -3088,6 +3119,8 @@ void colo_release_ram_cache(void)
         }
     }
     rcu_read_unlock();
+    g_free(ram_state);
+    ram_state = NULL;
 }
 
 /**
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 09/17] COLO: Flush memory data from ram cache
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (7 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 08/17] ram/COLO: Record the dirty pages that SVM received Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO Zhang Chen
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

During the time of VM's running, PVM may dirty some pages, we will transfer
PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkpoint
time. So, the content of SVM's RAM cache will always be same with PVM's memory
after checkpoint.

Instead of flushing all content of PVM's RAM cache into SVM's MEMORY,
we do this in a more efficient way:
Only flush any page that dirtied by PVM since last checkpoint.
In this way, we can ensure SVM's memory same with PVM's.

Besides, we must ensure flush RAM cache before load device state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c        | 37 +++++++++++++++++++++++++++++++++++++
 migration/trace-events |  2 ++
 2 files changed, 39 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index dd86eeba87..927436bd12 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3332,6 +3332,39 @@ static bool postcopy_is_running(void)
     return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
 }
 
+/*
+ * Flush content of RAM cache into SVM's memory.
+ * Only flush the pages that be dirtied by PVM or SVM or both.
+ */
+static void colo_flush_ram_cache(void)
+{
+    RAMBlock *block = NULL;
+    void *dst_host;
+    void *src_host;
+    unsigned long offset = 0;
+
+    trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages);
+    rcu_read_lock();
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+
+    while (block) {
+        offset = migration_bitmap_find_dirty(ram_state, block, offset);
+
+        if (offset << TARGET_PAGE_BITS >= block->used_length) {
+            offset = 0;
+            block = QLIST_NEXT_RCU(block, next);
+        } else {
+            migration_bitmap_clear_dirty(ram_state, block, offset);
+            dst_host = block->host + (offset << TARGET_PAGE_BITS);
+            src_host = block->colo_cache + (offset << TARGET_PAGE_BITS);
+            memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+        }
+    }
+
+    rcu_read_unlock();
+    trace_colo_flush_ram_cache_end();
+}
+
 static int ram_load(QEMUFile *f, void *opaque, int version_id)
 {
     int flags = 0, ret = 0, invalid_flags = 0;
@@ -3504,6 +3537,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     ret |= wait_for_decompress_done();
     rcu_read_unlock();
     trace_ram_load_complete(ret, seq_iter);
+
+    if (!ret  && migration_incoming_in_colo_state()) {
+        colo_flush_ram_cache();
+    }
     return ret;
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index 20accb5b80..9cc80075ca 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -90,6 +90,8 @@ ram_dirty_bitmap_sync_start(void) ""
 ram_dirty_bitmap_sync_wait(void) ""
 ram_dirty_bitmap_sync_complete(void) ""
 ram_state_resume_prepare(uint64_t v) "%" PRId64
+colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64
+colo_flush_ram_cache_end(void) ""
 
 # migration/migration.c
 await_return_path_close_on_source_close(void) ""
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (8 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 09/17] COLO: Flush memory data from ram cache Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04 22:23   ` Eric Blake
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status Zhang Chen
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

If some errors happen during VM's COLO FT stage, it's important to
notify the users of this event. Together with 'x-colo-lost-heartbeat',
Users can intervene in COLO's failover work immediately.
If users don't want to get involved in COLO's failover verdict,
it is still necessary to notify users that we exited COLO mode.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 migration/colo.c    | 31 +++++++++++++++++++++++++++++++
 qapi/migration.json | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index c083d3696f..bedb677788 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -28,6 +28,7 @@
 #include "net/colo-compare.h"
 #include "net/colo.h"
 #include "block/block.h"
+#include "qapi/qapi-events-migration.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -514,6 +515,23 @@ out:
         qemu_fclose(fb);
     }
 
+    /*
+     * There are only two reasons we can go here, some error happened.
+     * Or the user triggered failover.
+     */
+    switch (failover_get_state()) {
+    case FAILOVER_STATUS_NONE:
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+        break;
+    case FAILOVER_STATUS_REQUIRE:
+        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+        break;
+    default:
+        abort();
+    }
+
     /* Hope this not to be too long to wait here */
     qemu_sem_wait(&s->colo_exit_sem);
     qemu_sem_destroy(&s->colo_exit_sem);
@@ -745,6 +763,19 @@ out:
         error_report_err(local_err);
     }
 
+    switch (failover_get_state()) {
+    case FAILOVER_STATUS_NONE:
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_ERROR, NULL);
+        break;
+    case FAILOVER_STATUS_REQUIRE:
+        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
+                                  COLO_EXIT_REASON_REQUEST, NULL);
+        break;
+    default:
+        abort();
+    }
+
     if (fb) {
         qemu_fclose(fb);
     }
diff --git a/qapi/migration.json b/qapi/migration.json
index dc9cc85545..93136ce5a0 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -880,6 +880,44 @@
 { 'enum': 'FailoverStatus',
   'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
 
+##
+# @COLO_EXIT:
+#
+# Emitted when VM finishes COLO mode due to some errors happening or
+# at the request of users.
+#
+# @mode: report COLO mode when COLO exited.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.13
+#
+# Example:
+#
+# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
+#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
+#
+##
+{ 'event': 'COLO_EXIT',
+  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
+
+##
+# @COLOExitReason:
+#
+# The reason for a COLO exit
+#
+# @none: no failover has ever happened, This can't occur in the COLO_EXIT event,
+# only in the result of query-colo-status.
+#
+# @request: COLO exit is due to an external request
+#
+# @error: COLO exit is due to an internal error
+#
+# Since: 2.13
+##
+{ 'enum': 'COLOExitReason',
+  'data': [ 'none', 'request', 'error' ] }
+
 ##
 # @x-colo-lost-heartbeat:
 #
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (9 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04 22:23   ` Eric Blake
  2018-06-07 12:59   ` Markus Armbruster
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 12/17] savevm: split the process of different stages for loadvm/savevm Zhang Chen
                   ` (5 subsequent siblings)
  16 siblings, 2 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

Libvirt or other high level software can use this command query colo status.
You can test this command like that:
{'execute':'query-colo-status'}

Signed-off-by: Zhang Chen <zhangckid@gmail.com>
---
 migration/colo.c    | 39 +++++++++++++++++++++++++++++++++++++++
 qapi/migration.json | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index bedb677788..8c6b8e9a4e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -29,6 +29,7 @@
 #include "net/colo.h"
 #include "block/block.h"
 #include "qapi/qapi-events-migration.h"
+#include "qapi/qmp/qerror.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -237,6 +238,44 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
 #endif
 }
 
+COLOStatus *qmp_query_colo_status(Error **errp)
+{
+    int state;
+    COLOStatus *s = g_new0(COLOStatus, 1);
+
+    s->mode = get_colo_mode();
+
+    switch (s->mode) {
+    case COLO_MODE_UNKNOWN:
+        error_setg(errp, "COLO is disabled");
+        state = MIGRATION_STATUS_NONE;
+        break;
+    case COLO_MODE_PRIMARY:
+        state = migrate_get_current()->state;
+        break;
+    case COLO_MODE_SECONDARY:
+        state = migration_incoming_get_current()->state;
+        break;
+    default:
+        abort();
+    }
+
+    s->colo_running = state == MIGRATION_STATUS_COLO;
+
+    switch (failover_get_state()) {
+    case FAILOVER_STATUS_NONE:
+        s->reason = COLO_EXIT_REASON_NONE;
+        break;
+    case FAILOVER_STATUS_REQUIRE:
+        s->reason = COLO_EXIT_REASON_REQUEST;
+        break;
+    default:
+        s->reason = COLO_EXIT_REASON_ERROR;
+    }
+
+    return s;
+}
+
 static void colo_send_message(QEMUFile *f, COLOMessage msg,
                               Error **errp)
 {
diff --git a/qapi/migration.json b/qapi/migration.json
index 93136ce5a0..356a370949 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1231,6 +1231,40 @@
 ##
 { 'command': 'xen-colo-do-checkpoint' }
 
+##
+# @COLOStatus:
+#
+# The result format for 'query-colo-status'.
+#
+# @mode: COLO running mode. If COLO is running, this field will return
+#        'primary' or 'secodary'.
+#
+# @colo-running: true if COLO is running.
+#
+# @reason: describes the reason for the COLO exit.
+#
+# Since: 2.13
+##
+{ 'struct': 'COLOStatus',
+  'data': { 'mode': 'COLOMode', 'colo-running': 'bool', 'reason': 'COLOExitReason' } }
+
+##
+# @query-colo-status:
+#
+# Query COLO status while the vm is running.
+#
+# Returns: A @COLOStatus object showing the status.
+#
+# Example:
+#
+# -> { "execute": "query-colo-status" }
+# <- { "return": { "mode": "primary", "colo-running": true, "reason": "request" } }
+#
+# Since: 2.13
+##
+{ 'command': 'query-colo-status',
+  'returns': 'COLOStatus' }
+
 ##
 # @migrate-recover:
 #
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 12/17] savevm: split the process of different stages for loadvm/savevm
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (10 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 13/17] COLO: flush host dirty ram from cache Zhang Chen
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

There are several stages during loadvm/savevm process. In different stage,
migration incoming processes different types of sections.
We want to control these stages more accuracy, it will benefit COLO
performance, we don't have to save type of QEMU_VM_SECTION_START
sections everytime while do checkpoint, besides, we want to separate
the process of saving/loading memory and devices state.

So we add three new helper functions: qemu_load_device_state() and
qemu_savevm_live_state() to achieve different process during migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public, and simplify the codes of qemu_save_device_state() by calling the
wrapper qemu_savevm_state_header().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c   | 40 ++++++++++++++++++++++++++++++++--------
 migration/savevm.c | 35 ++++++++++++++++++++++++++++-------
 migration/savevm.h |  4 ++++
 3 files changed, 64 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 8c6b8e9a4e..442471e088 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -30,6 +30,7 @@
 #include "block/block.h"
 #include "qapi/qapi-events-migration.h"
 #include "qapi/qmp/qerror.h"
+#include "sysemu/cpus.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -419,23 +420,34 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
 
     /* Disable block migration */
     migrate_set_block_enabled(false, &local_err);
-    qemu_savevm_state_header(fb);
-    qemu_savevm_state_setup(fb);
     qemu_mutex_lock_iothread();
     replication_do_checkpoint_all(&local_err);
     if (local_err) {
         qemu_mutex_unlock_iothread();
         goto out;
     }
-    qemu_savevm_state_complete_precopy(fb, false, false);
-    qemu_mutex_unlock_iothread();
-
-    qemu_fflush(fb);
 
     colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
     if (local_err) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
+    /* Note: device state is saved into buffer */
+    ret = qemu_save_device_state(fb);
+
+    qemu_mutex_unlock_iothread();
+    if (ret < 0) {
         goto out;
     }
+    /*
+     * Only save VM's live state, which not including device state.
+     * TODO: We may need a timeout mechanism to prevent COLO process
+     * to be blocked here.
+     */
+    qemu_savevm_live_state(s->to_dst_file);
+
+    qemu_fflush(fb);
+
     /*
      * We need the size of the VMstate data in Secondary side,
      * With which we can decide how much data should be read.
@@ -653,6 +665,7 @@ void *colo_process_incoming_thread(void *opaque)
     uint64_t total_size;
     uint64_t value;
     Error *local_err = NULL;
+    int ret;
 
     qemu_sem_init(&mis->colo_incoming_sem, 0);
 
@@ -725,6 +738,16 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        qemu_mutex_lock_iothread();
+        cpu_synchronize_all_pre_loadvm();
+        ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+        qemu_mutex_unlock_iothread();
+
+        if (ret < 0) {
+            error_report("Load VM's live state (ram) error");
+            goto out;
+        }
+
         value = colo_receive_message_value(mis->from_src_file,
                                  COLO_MESSAGE_VMSTATE_SIZE, &local_err);
         if (local_err) {
@@ -758,8 +781,9 @@ void *colo_process_incoming_thread(void *opaque)
         qemu_mutex_lock_iothread();
         qemu_system_reset(SHUTDOWN_CAUSE_NONE);
         vmstate_loading = true;
-        if (qemu_loadvm_state(fb) < 0) {
-            error_report("COLO: loadvm failed");
+        ret = qemu_load_device_state(fb);
+        if (ret < 0) {
+            error_report("COLO: load device state failed");
             qemu_mutex_unlock_iothread();
             goto out;
         }
diff --git a/migration/savevm.c b/migration/savevm.c
index 4a789eb4c9..24be7c75e5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1378,13 +1378,20 @@ done:
     return ret;
 }
 
-static int qemu_save_device_state(QEMUFile *f)
+void qemu_savevm_live_state(QEMUFile *f)
 {
-    SaveStateEntry *se;
+    /* save QEMU_VM_SECTION_END section */
+    qemu_savevm_state_complete_precopy(f, true, false);
+    qemu_put_byte(f, QEMU_VM_EOF);
+}
 
-    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
-    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
+int qemu_save_device_state(QEMUFile *f)
+{
+    SaveStateEntry *se;
 
+    if (!migration_in_colo_state()) {
+        qemu_savevm_state_header(f);
+    }
     cpu_synchronize_all_states();
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1440,8 +1447,6 @@ enum LoadVMExitCodes {
     LOADVM_QUIT     =  1,
 };
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
-
 /* ------ incoming postcopy messages ------ */
 /* 'advise' arrives before any transfers just to tell us that a postcopy
  * *might* happen - it might be skipped if precopy transferred everything
@@ -2241,7 +2246,7 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
     return true;
 }
 
-static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
 {
     uint8_t section_type;
     int ret = 0;
@@ -2414,6 +2419,22 @@ int qemu_loadvm_state(QEMUFile *f)
     return ret;
 }
 
+int qemu_load_device_state(QEMUFile *f)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    int ret;
+
+    /* Load QEMU_VM_SECTION_FULL section */
+    ret = qemu_loadvm_state_main(f, mis);
+    if (ret < 0) {
+        error_report("Failed to load device state: %d", ret);
+        return ret;
+    }
+
+    cpu_synchronize_all_post_init();
+    return 0;
+}
+
 int save_snapshot(const char *name, Error **errp)
 {
     BlockDriverState *bs, *bs1;
diff --git a/migration/savevm.h b/migration/savevm.h
index 8373c2f6bd..51a4b9caa8 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -56,8 +56,12 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
                                            uint64_t *start_list,
                                            uint64_t *length_list);
 void qemu_savevm_send_colo_enable(QEMUFile *f);
+void qemu_savevm_live_state(QEMUFile *f);
+int qemu_save_device_state(QEMUFile *f);
 
 int qemu_loadvm_state(QEMUFile *f);
 void qemu_loadvm_state_cleanup(void);
+int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
+int qemu_load_device_state(QEMUFile *f);
 
 #endif
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 13/17] COLO: flush host dirty ram from cache
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (11 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 12/17] savevm: split the process of different stages for loadvm/savevm Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass Zhang Chen
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Don't need to flush all VM's ram from cache, only
flush the dirty pages since last checkpoint

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 927436bd12..e34a015785 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3087,6 +3087,7 @@ int colo_init_ram_cache(void)
     }
     ram_state = g_new0(RAMState, 1);
     ram_state->migration_dirty_pages = 0;
+    memory_global_dirty_log_start();
 
     return 0;
 
@@ -3107,10 +3108,12 @@ void colo_release_ram_cache(void)
 {
     RAMBlock *block;
 
+    memory_global_dirty_log_stop();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         g_free(block->bmap);
         block->bmap = NULL;
     }
+
     rcu_read_lock();
     QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
         if (block->colo_cache) {
@@ -3343,6 +3346,13 @@ static void colo_flush_ram_cache(void)
     void *src_host;
     unsigned long offset = 0;
 
+    memory_global_dirty_log_sync();
+    rcu_read_lock();
+    RAMBLOCK_FOREACH(block) {
+        migration_bitmap_sync_range(ram_state, block, 0, block->used_length);
+    }
+    rcu_read_unlock();
+
     trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages);
     rcu_read_lock();
     block = QLIST_FIRST_RCU(&ram_list.blocks);
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (12 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 13/17] COLO: flush host dirty ram from cache Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04  6:57   ` Jason Wang
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event Zhang Chen
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

Filter needs to process the event of checkpoint/failover or
other event passed by COLO frame.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 include/net/filter.h |  5 +++++
 net/filter.c         | 17 +++++++++++++++++
 net/net.c            | 28 ++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 435acd6f82..49da666ac0 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -38,6 +38,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
 
 typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
 
+typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp);
+
 typedef struct NetFilterClass {
     ObjectClass parent_class;
 
@@ -45,6 +47,7 @@ typedef struct NetFilterClass {
     FilterSetup *setup;
     FilterCleanup *cleanup;
     FilterStatusChanged *status_changed;
+    FilterHandleEvent *handle_event;
     /* mandatory */
     FilterReceiveIOV *receive_iov;
 } NetFilterClass;
@@ -77,4 +80,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
                                     int iovcnt,
                                     void *opaque);
 
+void colo_notify_filters_event(int event, Error **errp);
+
 #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter.c b/net/filter.c
index 2fd7d7d663..0f17eba143 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -17,6 +17,8 @@
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
+#include "net/colo.h"
+#include "migration/colo.h"
 
 static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
 {
@@ -245,11 +247,26 @@ static void netfilter_finalize(Object *obj)
     g_free(nf->netdev_id);
 }
 
+static void dummy_handle_event(NetFilterState *nf, int event, Error **errp)
+{
+    switch (event) {
+    case COLO_EVENT_CHECKPOINT:
+        break;
+    case COLO_EVENT_FAILOVER:
+        object_property_set_str(OBJECT(nf), "off", "status", errp);
+        break;
+    default:
+        break;
+    }
+}
+
 static void netfilter_class_init(ObjectClass *oc, void *data)
 {
     UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+    NetFilterClass *nfc = NETFILTER_CLASS(oc);
 
     ucc->complete = netfilter_complete;
+    nfc->handle_event = dummy_handle_event;
 }
 
 static const TypeInfo netfilter_info = {
diff --git a/net/net.c b/net/net.c
index efb9eaf779..378cd2f9ec 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1329,6 +1329,34 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
     }
 }
 
+void colo_notify_filters_event(int event, Error **errp)
+{
+    NetClientState *nc, *peer;
+    NetClientDriver type;
+    NetFilterState *nf;
+    NetFilterClass *nfc = NULL;
+    Error *local_err = NULL;
+
+    QTAILQ_FOREACH(nc, &net_clients, next) {
+        peer = nc->peer;
+        type = nc->info->type;
+        if (!peer || type != NET_CLIENT_DRIVER_TAP) {
+            continue;
+        }
+        QTAILQ_FOREACH(nf, &nc->filters, next) {
+            nfc =  NETFILTER_GET_CLASS(OBJECT(nf));
+            if (!nfc->handle_event) {
+                continue;
+            }
+            nfc->handle_event(nf, event, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                return;
+            }
+        }
+    }
+}
+
 void qmp_set_link(const char *name, bool up, Error **errp)
 {
     NetClientState *ncs[MAX_QUEUE_NUM];
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (13 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-04  7:42   ` Jason Wang
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 16/17] COLO: notify net filters about checkpoint/failover event Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 17/17] COLO: quick failover process by kick COLO thread Zhang Chen
  16 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

After one round of checkpoint, the states between PVM and SVM
become consistent, so it is unnecessary to adjust the sequence
of net packets for old connections, besides, while failover
happens, filter-rewriter needs to check if it still needs to
adjust sequence of net packets.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
---
 migration/colo.c      | 13 +++++++++++++
 net/filter-rewriter.c | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 442471e088..0bff21d9e5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -31,6 +31,7 @@
 #include "qapi/qapi-events-migration.h"
 #include "qapi/qmp/qerror.h"
 #include "sysemu/cpus.h"
+#include "net/filter.h"
 
 static bool vmstate_loading;
 static Notifier packets_compare_notifier;
@@ -82,6 +83,11 @@ static void secondary_vm_do_failover(void)
     if (local_err) {
         error_report_err(local_err);
     }
+    /* Notify all filters of all NIC to do checkpoint */
+    colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
 
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
@@ -800,6 +806,13 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        /* Notify all filters of all NIC to do checkpoint */
+        colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 0909a9a8af..f3c306cc89 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -20,6 +20,8 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "net/colo.h"
+#include "migration/colo.h"
 
 #define FILTER_COLO_REWRITER(obj) \
     OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -277,6 +279,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
     return 0;
 }
 
+static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    conn->offset = 0;
+}
+
+static gboolean offset_is_nonzero(gpointer key,
+                                  gpointer value,
+                                  gpointer user_data)
+{
+    Connection *conn = (Connection *)value;
+
+    return conn->offset ? true : false;
+}
+
+static void colo_rewriter_handle_event(NetFilterState *nf, int event,
+                                       Error **errp)
+{
+    RewriterState *rs = FILTER_COLO_REWRITER(nf);
+
+    switch (event) {
+    case COLO_EVENT_CHECKPOINT:
+        g_hash_table_foreach(rs->connection_track_table,
+                            reset_seq_offset, NULL);
+        break;
+    case COLO_EVENT_FAILOVER:
+        if (!g_hash_table_find(rs->connection_track_table,
+                              offset_is_nonzero, NULL)) {
+            object_property_set_str(OBJECT(nf), "off", "status", errp);
+        }
+        break;
+    default:
+        break;
+    }
+}
+
 static void colo_rewriter_cleanup(NetFilterState *nf)
 {
     RewriterState *s = FILTER_COLO_REWRITER(nf);
@@ -332,6 +371,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, void *data)
     nfc->setup = colo_rewriter_setup;
     nfc->cleanup = colo_rewriter_cleanup;
     nfc->receive_iov = colo_rewriter_receive_iov;
+    nfc->handle_event = colo_rewriter_handle_event;
 }
 
 static const TypeInfo colo_rewriter_info = {
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 16/17] COLO: notify net filters about checkpoint/failover event
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (14 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 17/17] COLO: quick failover process by kick COLO thread Zhang Chen
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

Notify all net filters about the checkpoint and failover event.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index 0bff21d9e5..e3824d139a 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -88,6 +88,11 @@ static void secondary_vm_do_failover(void)
     if (local_err) {
         error_report_err(local_err);
     }
+    /* Notify all filters of all NIC to do checkpoint */
+    colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
 
     if (!autostart) {
         error_report("\"-S\" qemu option will be ignored in secondary side");
@@ -813,6 +818,13 @@ void *colo_process_incoming_thread(void *opaque)
             goto out;
         }
 
+        /* Notify all filters of all NIC to do checkpoint */
+        colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
+        if (local_err) {
+            qemu_mutex_unlock_iothread();
+            goto out;
+        }
+
         vmstate_loading = false;
         vm_start();
         trace_colo_vm_state_change("stop", "run");
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH V8 17/17] COLO: quick failover process by kick COLO thread
  2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
                   ` (15 preceding siblings ...)
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 16/17] COLO: notify net filters about checkpoint/failover event Zhang Chen
@ 2018-06-03  5:05 ` Zhang Chen
  16 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-03  5:05 UTC (permalink / raw)
  To: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake,
	Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian

From: zhanghailiang <zhang.zhanghailiang@huawei.com>

COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem),
while failover works begin, It's better to wakeup it to quick
the process.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index e3824d139a..1d01f5ba08 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -135,6 +135,11 @@ static void primary_vm_do_failover(void)
 
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
+    /*
+     * kick COLO thread which might wait at
+     * qemu_sem_wait(&s->colo_checkpoint_sem).
+     */
+    colo_checkpoint_notify(migrate_get_current());
 
     /*
      * Wake up COLO thread which may blocked in recv() or send(),
@@ -561,6 +566,9 @@ static void colo_process_checkpoint(MigrationState *s)
 
         qemu_sem_wait(&s->colo_checkpoint_sem);
 
+        if (s->state != MIGRATION_STATUS_COLO) {
+            goto out;
+        }
         ret = colo_do_checkpoint_transaction(s, bioc, fb);
         if (ret < 0) {
             goto out;
-- 
2.17.GIT

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table Zhang Chen
@ 2018-06-04  5:51   ` Jason Wang
  2018-06-10 14:08     ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-04  5:51 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian



On 2018年06月03日 13:05, Zhang Chen wrote:
> After a net connection is closed, we didn't clear its releated resources
> in connection_track_table, which will lead to memory leak.
>
> Let't track the state of net connection, if it is closed, its related
> resources will be cleared up.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---
>   net/colo.h            |  4 +++
>   net/filter-rewriter.c | 69 ++++++++++++++++++++++++++++++++++++++-----
>   2 files changed, 66 insertions(+), 7 deletions(-)
>
> diff --git a/net/colo.h b/net/colo.h
> index da6c36dcf7..cd118510c5 100644
> --- a/net/colo.h
> +++ b/net/colo.h
> @@ -18,6 +18,7 @@
>   #include "slirp/slirp.h"
>   #include "qemu/jhash.h"
>   #include "qemu/timer.h"
> +#include "slirp/tcp.h"
>   
>   #define HASHTABLE_MAX_SIZE 16384
>   
> @@ -86,6 +87,9 @@ typedef struct Connection {
>        * run once in independent tcp connection
>        */
>       int syn_flag;
> +
> +    int tcp_state; /* TCP FSM state */
> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */

So the question is, the state machine is not complete. I suspect there 
will be corner cases that will be left because of lacking sufficient 
states. LAST_ACK happens only for passive close. How about active close?

So I think we need either maintain a full state machine or not instead 
of a partial one. We don't want endless bugs.

Thanks

>   } Connection;
>   
>   uint32_t connection_key_hash(const void *opaque);
> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
> index 62dad2d773..0909a9a8af 100644
> --- a/net/filter-rewriter.c
> +++ b/net/filter-rewriter.c
> @@ -59,9 +59,9 @@ static int is_tcp_packet(Packet *pkt)
>   }
>   
>   /* handle tcp packet from primary guest */
> -static int handle_primary_tcp_pkt(NetFilterState *nf,
> +static int handle_primary_tcp_pkt(RewriterState *rf,
>                                     Connection *conn,
> -                                  Packet *pkt)
> +                                  Packet *pkt, ConnectionKey *key)
>   {
>       struct tcphdr *tcp_pkt;
>   
> @@ -99,15 +99,44 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
>               net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len,
>                                      pkt->size - pkt->vnet_hdr_len);
>           }
> +        /*
> +         * Case 1:
> +         * The *server* side of this connect is VM, *client* tries to close
> +         * the connection.
> +         *
> +         * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
> +         * packet from server side. From this point, we can ensure that there
> +         * will be no packets in the connection, except that, some errors
> +         * happen between the path of 'filter object' and vNIC, if this rare
> +         * case really happen, we can still create a new connection,
> +         * So it is safe to remove the connection from connection_track_table.
> +         *
> +         */
> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
> +            g_hash_table_remove(rf->connection_track_table, key);
> +        }
> +    }
> +    /*
> +     * Case 2:
> +     * The *server* side of this connect is VM, *server* tries to close
> +     * the connection.
> +     *
> +     * We got 'fin=1, ack=1' packet from client side, we need to
> +     * record the seq of 'fin=1, ack=1' packet.
> +     */
> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
> +        conn->tcp_state = TCPS_LAST_ACK;
>       }
>   
>       return 0;
>   }
>   
>   /* handle tcp packet from secondary guest */
> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>                                       Connection *conn,
> -                                    Packet *pkt)
> +                                    Packet *pkt, ConnectionKey *key)
>   {
>       struct tcphdr *tcp_pkt;
>   
> @@ -139,8 +168,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
>               net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len,
>                                      pkt->size - pkt->vnet_hdr_len);
>           }
> +        /*
> +         * Case 2:
> +         * The *server* side of this connect is VM, *server* tries to close
> +         * the connection.
> +         *
> +         * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
> +         * packet from client side. Like Case 1, there should be no packets
> +         * in the connection from now know, But the difference here is
> +         * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
> +         * TODO: Fix above case.
> +         */
> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
> +            g_hash_table_remove(rf->connection_track_table, key);
> +        }
> +    }
> +    /*
> +     * Case 1:
> +     * The *server* side of this connect is VM, *client* tries to close
> +     * the connection.
> +     *
> +     * We got 'fin=1, ack=1' packet from server side, we need to
> +     * record the seq of 'fin=1, ack=1' packet.
> +     */
> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
> +        conn->tcp_state = TCPS_LAST_ACK;
>       }
> -
>       return 0;
>   }
>   
> @@ -190,7 +245,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>   
>           if (sender == nf->netdev) {
>               /* NET_FILTER_DIRECTION_TX */
> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>                   qemu_net_queue_send(s->incoming_queue, sender, 0,
>                   (const uint8_t *)pkt->data, pkt->size, NULL);
>                   packet_destroy(pkt, NULL);
> @@ -203,7 +258,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>               }
>           } else {
>               /* NET_FILTER_DIRECTION_RX */
> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>                   qemu_net_queue_send(s->incoming_queue, sender, 0,
>                   (const uint8_t *)pkt->data, pkt->size, NULL);
>                   packet_destroy(pkt, NULL);

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint Zhang Chen
@ 2018-06-04  6:31   ` Jason Wang
  2018-06-10 14:08     ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-04  6:31 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian



On 2018年06月03日 13:05, Zhang Chen wrote:
> While do checkpoint, we need to flush all the unhandled packets,
> By using the filter notifier mechanism, we can easily to notify
> every compare object to do this process, which runs inside
> of compare threads as a coroutine.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---
>   include/migration/colo.h |  6 ++++
>   net/colo-compare.c       | 76 ++++++++++++++++++++++++++++++++++++++++
>   net/colo-compare.h       | 22 ++++++++++++
>   3 files changed, 104 insertions(+)
>   create mode 100644 net/colo-compare.h
>
> diff --git a/include/migration/colo.h b/include/migration/colo.h
> index 2fe48ad353..fefb2fcf4c 100644
> --- a/include/migration/colo.h
> +++ b/include/migration/colo.h
> @@ -16,6 +16,12 @@
>   #include "qemu-common.h"
>   #include "qapi/qapi-types-migration.h"
>   
> +enum colo_event {
> +    COLO_EVENT_NONE,
> +    COLO_EVENT_CHECKPOINT,
> +    COLO_EVENT_FAILOVER,
> +};
> +
>   void colo_info_init(void);
>   
>   void migrate_start_colo_process(MigrationState *s);
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 23b2d2c4cc..7ff3ae8904 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -27,11 +27,16 @@
>   #include "qemu/sockets.h"
>   #include "net/colo.h"
>   #include "sysemu/iothread.h"
> +#include "net/colo-compare.h"
> +#include "migration/colo.h"
>   
>   #define TYPE_COLO_COMPARE "colo-compare"
>   #define COLO_COMPARE(obj) \
>       OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>   
> +static QTAILQ_HEAD(, CompareState) net_compares =
> +       QTAILQ_HEAD_INITIALIZER(net_compares);
> +
>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>   #define MAX_QUEUE_SIZE 1024
>   
> @@ -41,6 +46,10 @@
>   /* TODO: Should be configurable */
>   #define REGULAR_PACKET_CHECK_MS 3000
>   
> +static QemuMutex event_mtx;
> +static QemuCond event_complete_cond;
> +static int event_unhandled_count;
> +
>   /*
>    *  + CompareState ++
>    *  |               |
> @@ -87,6 +96,11 @@ typedef struct CompareState {
>       IOThread *iothread;
>       GMainContext *worker_context;
>       QEMUTimer *packet_check_timer;
> +
> +    QEMUBH *event_bh;
> +    enum colo_event event;
> +
> +    QTAILQ_ENTRY(CompareState) next;
>   } CompareState;
>   
>   typedef struct CompareClass {
> @@ -736,6 +750,25 @@ static void check_old_packet_regular(void *opaque)
>                   REGULAR_PACKET_CHECK_MS);
>   }
>   
> +/* Public API, Used for COLO frame to notify compare event */
> +void colo_notify_compares_event(void *opaque, int event, Error **errp)
> +{
> +    CompareState *s;
> +
> +    qemu_mutex_lock(&event_mtx);
> +    QTAILQ_FOREACH(s, &net_compares, next) {
> +        s->event = event;
> +        qemu_bh_schedule(s->event_bh);
> +        event_unhandled_count++;
> +    }
> +    /* Wait all compare threads to finish handling this event */
> +    while (event_unhandled_count > 0) {
> +        qemu_cond_wait(&event_complete_cond, &event_mtx);
> +    }
> +
> +    qemu_mutex_unlock(&event_mtx);
> +}
> +
>   static void colo_compare_timer_init(CompareState *s)
>   {
>       AioContext *ctx = iothread_get_aio_context(s->iothread);
> @@ -756,6 +789,28 @@ static void colo_compare_timer_del(CompareState *s)
>       }
>    }
>   
> +static void colo_flush_packets(void *opaque, void *user_data);
> +
> +static void colo_compare_handle_event(void *opaque)
> +{
> +    CompareState *s = opaque;
> +
> +    switch (s->event) {
> +    case COLO_EVENT_CHECKPOINT:
> +        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
> +        break;
> +    case COLO_EVENT_FAILOVER:
> +        break;
> +    default:
> +        break;
> +    }
> +    qemu_mutex_lock(&event_mtx);

Isn't this a deadlock? Since colo_notify_compares_event() won't release 
event_mtx until event_unhandled_count reaches zero.

> +    assert(event_unhandled_count > 0);
> +    event_unhandled_count--;
> +    qemu_cond_broadcast(&event_complete_cond);
> +    qemu_mutex_unlock(&event_mtx);
> +}
> +
>   static void colo_compare_iothread(CompareState *s)
>   {
>       object_ref(OBJECT(s->iothread));
> @@ -769,6 +824,7 @@ static void colo_compare_iothread(CompareState *s)
>                                s, s->worker_context, true);
>   
>       colo_compare_timer_init(s);
> +    s->event_bh = qemu_bh_new(colo_compare_handle_event, s);
>   }
>   
>   static char *compare_get_pri_indev(Object *obj, Error **errp)
> @@ -926,8 +982,13 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>       net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize, s->vnet_hdr);
>       net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize, s->vnet_hdr);
>   
> +    QTAILQ_INSERT_TAIL(&net_compares, s, next);
> +
>       g_queue_init(&s->conn_list);
>   
> +    qemu_mutex_init(&event_mtx);
> +    qemu_cond_init(&event_complete_cond);
> +
>       s->connection_track_table = g_hash_table_new_full(connection_key_hash,
>                                                         connection_key_equal,
>                                                         g_free,
> @@ -990,6 +1051,7 @@ static void colo_compare_init(Object *obj)
>   static void colo_compare_finalize(Object *obj)
>   {
>       CompareState *s = COLO_COMPARE(obj);
> +    CompareState *tmp = NULL;
>   
>       qemu_chr_fe_deinit(&s->chr_pri_in, false);
>       qemu_chr_fe_deinit(&s->chr_sec_in, false);
> @@ -997,6 +1059,16 @@ static void colo_compare_finalize(Object *obj)
>       if (s->iothread) {
>           colo_compare_timer_del(s);
>       }
> +
> +    qemu_bh_delete(s->event_bh);
> +
> +    QTAILQ_FOREACH(tmp, &net_compares, next) {
> +        if (!strcmp(tmp->outdev, s->outdev)) {

This looks not elegant, can we compare by address or just use QLIST?

Thanks

> +            QTAILQ_REMOVE(&net_compares, s, next);
> +            break;
> +        }
> +    }
> +
>       /* Release all unhandled packets after compare thead exited */
>       g_queue_foreach(&s->conn_list, colo_flush_packets, s);
>   
> @@ -1009,6 +1081,10 @@ static void colo_compare_finalize(Object *obj)
>       if (s->iothread) {
>           object_unref(OBJECT(s->iothread));
>       }
> +
> +    qemu_mutex_destroy(&event_mtx);
> +    qemu_cond_destroy(&event_complete_cond);
> +
>       g_free(s->pri_indev);
>       g_free(s->sec_indev);
>       g_free(s->outdev);
> diff --git a/net/colo-compare.h b/net/colo-compare.h
> new file mode 100644
> index 0000000000..1b1ce76aea
> --- /dev/null
> +++ b/net/colo-compare.h
> @@ -0,0 +1,22 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2017 FUJITSU LIMITED
> + * Copyright (c) 2017 Intel Corporation
> + *
> + * Authors:
> + *    zhanghailiang <zhang.zhanghailiang@huawei.com>
> + *    Zhang Chen <zhangckid@gmail.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_COLO_COMPARE_H
> +#define QEMU_COLO_COMPARE_H
> +
> +void colo_notify_compares_event(void *opaque, int event, Error **errp);
> +
> +#endif /* QEMU_COLO_COMPARE_H */

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result Zhang Chen
@ 2018-06-04  6:36   ` Jason Wang
  2018-06-10 14:09     ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-04  6:36 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian



On 2018年06月03日 13:05, Zhang Chen wrote:
> It's a good idea to use notifier to notify COLO frame of
> inconsistent packets comparing.
>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   net/colo-compare.c | 32 +++++++++++++++++++++++++-------
>   net/colo-compare.h |  2 ++
>   2 files changed, 27 insertions(+), 7 deletions(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 7ff3ae8904..05061cd1c4 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -29,6 +29,7 @@
>   #include "sysemu/iothread.h"
>   #include "net/colo-compare.h"
>   #include "migration/colo.h"
> +#include "migration/migration.h"
>   
>   #define TYPE_COLO_COMPARE "colo-compare"
>   #define COLO_COMPARE(obj) \
> @@ -37,6 +38,9 @@
>   static QTAILQ_HEAD(, CompareState) net_compares =
>          QTAILQ_HEAD_INITIALIZER(net_compares);
>   
> +static NotifierList colo_compare_notifiers =
> +    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
> +
>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>   #define MAX_QUEUE_SIZE 1024
>   
> @@ -561,8 +565,24 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
>       }
>   }
>   
> +static void colo_compare_inconsistent_notify(void)
> +{

Not good at English but inconsistency sounds better here.

Thanks

> +    notifier_list_notify(&colo_compare_notifiers,
> +                migrate_get_current());
> +}
> +
> +void colo_compare_register_notifier(Notifier *notify)
> +{
> +    notifier_list_add(&colo_compare_notifiers, notify);
> +}
> +
> +void colo_compare_unregister_notifier(Notifier *notify)
> +{
> +    notifier_remove(notify);
> +}
> +
>   static int colo_old_packet_check_one_conn(Connection *conn,
> -                                          void *user_data)
> +                                           void *user_data)
>   {
>       GList *result = NULL;
>       int64_t check_time = REGULAR_PACKET_CHECK_MS;
> @@ -573,10 +593,7 @@ static int colo_old_packet_check_one_conn(Connection *conn,
>   
>       if (result) {
>           /* Do checkpoint will flush old packet */
> -        /*
> -         * TODO: Notify colo frame to do checkpoint.
> -         * colo_compare_inconsistent_notify();
> -         */
> +        colo_compare_inconsistent_notify();
>           return 0;
>       }
>   
> @@ -620,11 +637,12 @@ static void colo_compare_packet(CompareState *s, Connection *conn,
>               /*
>                * If one packet arrive late, the secondary_list or
>                * primary_list will be empty, so we can't compare it
> -             * until next comparison.
> +             * until next comparison. If the packets in the list are
> +             * timeout, it will trigger a checkpoint request.
>                */
>               trace_colo_compare_main("packet different");
>               g_queue_push_head(&conn->primary_list, pkt);
> -            /* TODO: colo_notify_checkpoint();*/
> +            colo_compare_inconsistent_notify();
>               break;
>           }
>       }
> diff --git a/net/colo-compare.h b/net/colo-compare.h
> index 1b1ce76aea..22ddd512e2 100644
> --- a/net/colo-compare.h
> +++ b/net/colo-compare.h
> @@ -18,5 +18,7 @@
>   #define QEMU_COLO_COMPARE_H
>   
>   void colo_notify_compares_event(void *opaque, int event, Error **errp);
> +void colo_compare_register_notifier(Notifier *notify);
> +void colo_compare_unregister_notifier(Notifier *notify);
>   
>   #endif /* QEMU_COLO_COMPARE_H */

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass Zhang Chen
@ 2018-06-04  6:57   ` Jason Wang
  2018-06-10 14:09     ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-04  6:57 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian



On 2018年06月03日 13:05, Zhang Chen wrote:
> Filter needs to process the event of checkpoint/failover or
> other event passed by COLO frame.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   include/net/filter.h |  5 +++++
>   net/filter.c         | 17 +++++++++++++++++
>   net/net.c            | 28 ++++++++++++++++++++++++++++
>   3 files changed, 50 insertions(+)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index 435acd6f82..49da666ac0 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -38,6 +38,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
>   
>   typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
>   
> +typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp);
> +
>   typedef struct NetFilterClass {
>       ObjectClass parent_class;
>   
> @@ -45,6 +47,7 @@ typedef struct NetFilterClass {
>       FilterSetup *setup;
>       FilterCleanup *cleanup;
>       FilterStatusChanged *status_changed;
> +    FilterHandleEvent *handle_event;
>       /* mandatory */
>       FilterReceiveIOV *receive_iov;
>   } NetFilterClass;
> @@ -77,4 +80,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
>                                       int iovcnt,
>                                       void *opaque);
>   
> +void colo_notify_filters_event(int event, Error **errp);
> +
>   #endif /* QEMU_NET_FILTER_H */
> diff --git a/net/filter.c b/net/filter.c
> index 2fd7d7d663..0f17eba143 100644
> --- a/net/filter.c
> +++ b/net/filter.c
> @@ -17,6 +17,8 @@
>   #include "net/vhost_net.h"
>   #include "qom/object_interfaces.h"
>   #include "qemu/iov.h"
> +#include "net/colo.h"
> +#include "migration/colo.h"
>   
>   static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
>   {
> @@ -245,11 +247,26 @@ static void netfilter_finalize(Object *obj)
>       g_free(nf->netdev_id);
>   }
>   
> +static void dummy_handle_event(NetFilterState *nf, int event, Error **errp)
> +{
> +    switch (event) {
> +    case COLO_EVENT_CHECKPOINT:
> +        break;
> +    case COLO_EVENT_FAILOVER:
> +        object_property_set_str(OBJECT(nf), "off", "status", errp);
> +        break;
> +    default:
> +        break;
> +    }
> +}
> +
>   static void netfilter_class_init(ObjectClass *oc, void *data)
>   {
>       UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +    NetFilterClass *nfc = NETFILTER_CLASS(oc);
>   
>       ucc->complete = netfilter_complete;
> +    nfc->handle_event = dummy_handle_event;
>   }
>   
>   static const TypeInfo netfilter_info = {
> diff --git a/net/net.c b/net/net.c
> index efb9eaf779..378cd2f9ec 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -1329,6 +1329,34 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
>       }
>   }
>   
> +void colo_notify_filters_event(int event, Error **errp)
> +{
> +    NetClientState *nc, *peer;
> +    NetClientDriver type;
> +    NetFilterState *nf;
> +    NetFilterClass *nfc = NULL;
> +    Error *local_err = NULL;
> +
> +    QTAILQ_FOREACH(nc, &net_clients, next) {
> +        peer = nc->peer;
> +        type = nc->info->type;
> +        if (!peer || type != NET_CLIENT_DRIVER_TAP) {
> +            continue;

I think when COLO is enabled, qemu should reject all other types of 
backends.

> +        }
> +        QTAILQ_FOREACH(nf, &nc->filters, next) {
> +            nfc =  NETFILTER_GET_CLASS(OBJECT(nf));
> +            if (!nfc->handle_event) {
> +                continue;
> +            }

Is this still necessary?

Thanks

> +            nfc->handle_event(nf, event, &local_err);
> +            if (local_err) {
> +                error_propagate(errp, local_err);
> +                return;
> +            }
> +        }
> +    }
> +}
> +
>   void qmp_set_link(const char *name, bool up, Error **errp)
>   {
>       NetClientState *ncs[MAX_QUEUE_NUM];

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event Zhang Chen
@ 2018-06-04  7:42   ` Jason Wang
  2018-06-10 17:20     ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-04  7:42 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian



On 2018年06月03日 13:05, Zhang Chen wrote:
> After one round of checkpoint, the states between PVM and SVM
> become consistent, so it is unnecessary to adjust the sequence
> of net packets for old connections, besides, while failover
> happens, filter-rewriter needs to check if it still needs to
> adjust sequence of net packets.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---
>   migration/colo.c      | 13 +++++++++++++
>   net/filter-rewriter.c | 40 ++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 53 insertions(+)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index 442471e088..0bff21d9e5 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -31,6 +31,7 @@
>   #include "qapi/qapi-events-migration.h"
>   #include "qapi/qmp/qerror.h"
>   #include "sysemu/cpus.h"
> +#include "net/filter.h"
>   
>   static bool vmstate_loading;
>   static Notifier packets_compare_notifier;
> @@ -82,6 +83,11 @@ static void secondary_vm_do_failover(void)
>       if (local_err) {
>           error_report_err(local_err);
>       }
> +    /* Notify all filters of all NIC to do checkpoint */
> +    colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
> +    if (local_err) {
> +        error_report_err(local_err);
> +    }
>   
>       if (!autostart) {
>           error_report("\"-S\" qemu option will be ignored in secondary side");
> @@ -800,6 +806,13 @@ void *colo_process_incoming_thread(void *opaque)
>               goto out;
>           }
>   
> +        /* Notify all filters of all NIC to do checkpoint */
> +        colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
> +        if (local_err) {
> +            qemu_mutex_unlock_iothread();
> +            goto out;
> +        }
> +

I think the above should belong to another patch.

>           vmstate_loading = false;
>           vm_start();
>           trace_colo_vm_state_change("stop", "run");
> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
> index 0909a9a8af..f3c306cc89 100644
> --- a/net/filter-rewriter.c
> +++ b/net/filter-rewriter.c
> @@ -20,6 +20,8 @@
>   #include "qemu/main-loop.h"
>   #include "qemu/iov.h"
>   #include "net/checksum.h"
> +#include "net/colo.h"
> +#include "migration/colo.h"
>   
>   #define FILTER_COLO_REWRITER(obj) \
>       OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
> @@ -277,6 +279,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>       return 0;
>   }
>   
> +static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data)
> +{
> +    Connection *conn = (Connection *)value;
> +
> +    conn->offset = 0;
> +}
> +
> +static gboolean offset_is_nonzero(gpointer key,
> +                                  gpointer value,
> +                                  gpointer user_data)
> +{
> +    Connection *conn = (Connection *)value;
> +
> +    return conn->offset ? true : false;
> +}
> +
> +static void colo_rewriter_handle_event(NetFilterState *nf, int event,
> +                                       Error **errp)
> +{
> +    RewriterState *rs = FILTER_COLO_REWRITER(nf);
> +
> +    switch (event) {
> +    case COLO_EVENT_CHECKPOINT:
> +        g_hash_table_foreach(rs->connection_track_table,
> +                            reset_seq_offset, NULL);
> +        break;
> +    case COLO_EVENT_FAILOVER:
> +        if (!g_hash_table_find(rs->connection_track_table,
> +                              offset_is_nonzero, NULL)) {
> +            object_property_set_str(OBJECT(nf), "off", "status", errp);
> +        }

I may miss something but I think rewriter should not be turned off even 
after failover?

Thanks

> +        break;
> +    default:
> +        break;
> +    }
> +}
> +
>   static void colo_rewriter_cleanup(NetFilterState *nf)
>   {
>       RewriterState *s = FILTER_COLO_REWRITER(nf);
> @@ -332,6 +371,7 @@ static void colo_rewriter_class_init(ObjectClass *oc, void *data)
>       nfc->setup = colo_rewriter_setup;
>       nfc->cleanup = colo_rewriter_cleanup;
>       nfc->receive_iov = colo_rewriter_receive_iov;
> +    nfc->handle_event = colo_rewriter_handle_event;
>   }
>   
>   static const TypeInfo colo_rewriter_info = {

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO Zhang Chen
@ 2018-06-04 22:23   ` Eric Blake
  2018-06-07 12:54     ` Markus Armbruster
  0 siblings, 1 reply; 46+ messages in thread
From: Eric Blake @ 2018-06-04 22:23 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian

On 06/03/2018 12:05 AM, Zhang Chen wrote:
> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
> 
> If some errors happen during VM's COLO FT stage, it's important to
> notify the users of this event. Together with 'x-colo-lost-heartbeat',
> Users can intervene in COLO's failover work immediately.
> If users don't want to get involved in COLO's failover verdict,
> it is still necessary to notify users that we exited COLO mode.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>   migration/colo.c    | 31 +++++++++++++++++++++++++++++++
>   qapi/migration.json | 38 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 69 insertions(+)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index c083d3696f..bedb677788 100644
> --- a/migration/colo.c

>   
> +    /*
> +     * There are only two reasons we can go here, some error happened.
> +     * Or the user triggered failover.

s/go here/get here/
s/happened. Or/happened, or/

> +++ b/qapi/migration.json
> @@ -880,6 +880,44 @@
>   { 'enum': 'FailoverStatus',
>     'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
>   
> +##
> +# @COLO_EXIT:
> +#
> +# Emitted when VM finishes COLO mode due to some errors happening or
> +# at the request of users.
> +#
> +# @mode: report COLO mode when COLO exited.
> +#
> +# @reason: describes the reason for the COLO exit.
> +#
> +# Since: 2.13

s/2.13/3.0/

> +#
> +# Example:
> +#
> +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
> +#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
> +#
> +##
> +{ 'event': 'COLO_EXIT',
> +  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
> +
> +##
> +# @COLOExitReason:
> +#
> +# The reason for a COLO exit
> +#
> +# @none: no failover has ever happened, This can't occur in the COLO_EXIT event,

s/happened,/happened./

> +# only in the result of query-colo-status.
> +#
> +# @request: COLO exit is due to an external request
> +#
> +# @error: COLO exit is due to an internal error
> +#
> +# Since: 2.13

s/2.13/3.0/

> +##
> +{ 'enum': 'COLOExitReason',
> +  'data': [ 'none', 'request', 'error' ] }
> +
>   ##
>   # @x-colo-lost-heartbeat:
>   #
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status Zhang Chen
@ 2018-06-04 22:23   ` Eric Blake
  2018-06-10 17:42     ` Zhang Chen
  2018-06-07 12:59   ` Markus Armbruster
  1 sibling, 1 reply; 46+ messages in thread
From: Eric Blake @ 2018-06-04 22:23 UTC (permalink / raw)
  To: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Markus Armbruster
  Cc: zhanghailiang, Li Zhijian

On 06/03/2018 12:05 AM, Zhang Chen wrote:
> Libvirt or other high level software can use this command query colo status.
> You can test this command like that:
> {'execute':'query-colo-status'}
> 
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---

> +++ b/qapi/migration.json
> @@ -1231,6 +1231,40 @@
>   ##
>   { 'command': 'xen-colo-do-checkpoint' }
>   
> +##
> +# @COLOStatus:
> +#
> +# The result format for 'query-colo-status'.
> +#
> +# @mode: COLO running mode. If COLO is running, this field will return
> +#        'primary' or 'secodary'.

s/secodary/secondary/

> +#
> +# @colo-running: true if COLO is running.
> +#
> +# @reason: describes the reason for the COLO exit.
> +#
> +# Since: 2.13

3.0

> +##
> +{ 'struct': 'COLOStatus',
> +  'data': { 'mode': 'COLOMode', 'colo-running': 'bool', 'reason': 'COLOExitReason' } }
> +
> +##
> +# @query-colo-status:
> +#
> +# Query COLO status while the vm is running.
> +#
> +# Returns: A @COLOStatus object showing the status.
> +#
> +# Example:
> +#
> +# -> { "execute": "query-colo-status" }
> +# <- { "return": { "mode": "primary", "colo-running": true, "reason": "request" } }
> +#
> +# Since: 2.13

3.0

> +##
> +{ 'command': 'query-colo-status',
> +  'returns': 'COLOStatus' }
> +
>   ##
>   # @migrate-recover:
>   #
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO
  2018-06-04 22:23   ` Eric Blake
@ 2018-06-07 12:54     ` Markus Armbruster
  2018-06-10 17:24       ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2018-06-07 12:54 UTC (permalink / raw)
  To: Eric Blake
  Cc: Zhang Chen, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, zhanghailiang, Li Zhijian

Eric Blake <eblake@redhat.com> writes:

> On 06/03/2018 12:05 AM, Zhang Chen wrote:
>> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>
>> If some errors happen during VM's COLO FT stage, it's important to
>> notify the users of this event. Together with 'x-colo-lost-heartbeat',
>> Users can intervene in COLO's failover work immediately.
>> If users don't want to get involved in COLO's failover verdict,
>> it is still necessary to notify users that we exited COLO mode.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>>   migration/colo.c    | 31 +++++++++++++++++++++++++++++++
>>   qapi/migration.json | 38 ++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 69 insertions(+)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index c083d3696f..bedb677788 100644
>> --- a/migration/colo.c
>
>>   +    /*
>> +     * There are only two reasons we can go here, some error happened.
>> +     * Or the user triggered failover.
>
> s/go here/get here/
> s/happened. Or/happened, or/
>
>> +++ b/qapi/migration.json
>> @@ -880,6 +880,44 @@
>>   { 'enum': 'FailoverStatus',
>>     'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
>> +##
>> +# @COLO_EXIT:
>> +#
>> +# Emitted when VM finishes COLO mode due to some errors happening or
>> +# at the request of users.
>> +#
>> +# @mode: report COLO mode when COLO exited.
>> +#
>> +# @reason: describes the reason for the COLO exit.
>> +#
>> +# Since: 2.13
>
> s/2.13/3.0/
>
>> +#
>> +# Example:
>> +#
>> +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
>> +#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
>> +#
>> +##
>> +{ 'event': 'COLO_EXIT',
>> +  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
>> +
>> +##
>> +# @COLOExitReason:
>> +#
>> +# The reason for a COLO exit
>> +#
>> +# @none: no failover has ever happened, This can't occur in the COLO_EXIT event,
>
> s/happened,/happened./

While there, wrap the line around column 70, please.

>> +# only in the result of query-colo-status.
>> +#
>> +# @request: COLO exit is due to an external request
>> +#
>> +# @error: COLO exit is due to an internal error
>> +#
>> +# Since: 2.13
>
> s/2.13/3.0/
>
>> +##
>> +{ 'enum': 'COLOExitReason',
>> +  'data': [ 'none', 'request', 'error' ] }
>> +
>>   ##
>>   # @x-colo-lost-heartbeat:
>>   #
>>

With these touch-ups:
Reviewed-by: Markus Armbruster <armbru@redhat.com>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status Zhang Chen
  2018-06-04 22:23   ` Eric Blake
@ 2018-06-07 12:59   ` Markus Armbruster
  2018-06-10 17:39     ` Zhang Chen
  1 sibling, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2018-06-07 12:59 UTC (permalink / raw)
  To: Zhang Chen
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake, zhanghailiang,
	Li Zhijian

Zhang Chen <zhangckid@gmail.com> writes:

> Libvirt or other high level software can use this command query colo status.
> You can test this command like that:
> {'execute':'query-colo-status'}
>
> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> ---
>  migration/colo.c    | 39 +++++++++++++++++++++++++++++++++++++++
>  qapi/migration.json | 34 ++++++++++++++++++++++++++++++++++
>  2 files changed, 73 insertions(+)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index bedb677788..8c6b8e9a4e 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -29,6 +29,7 @@
>  #include "net/colo.h"
>  #include "block/block.h"
>  #include "qapi/qapi-events-migration.h"
> +#include "qapi/qmp/qerror.h"
>  
>  static bool vmstate_loading;
>  static Notifier packets_compare_notifier;
> @@ -237,6 +238,44 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
>  #endif
>  }
>  
> +COLOStatus *qmp_query_colo_status(Error **errp)
> +{
> +    int state;
> +    COLOStatus *s = g_new0(COLOStatus, 1);
> +
> +    s->mode = get_colo_mode();
> +
> +    switch (s->mode) {
> +    case COLO_MODE_UNKNOWN:
> +        error_setg(errp, "COLO is disabled");
> +        state = MIGRATION_STATUS_NONE;
> +        break;
> +    case COLO_MODE_PRIMARY:
> +        state = migrate_get_current()->state;
> +        break;
> +    case COLO_MODE_SECONDARY:
> +        state = migration_incoming_get_current()->state;
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    s->colo_running = state == MIGRATION_STATUS_COLO;
> +
> +    switch (failover_get_state()) {
> +    case FAILOVER_STATUS_NONE:
> +        s->reason = COLO_EXIT_REASON_NONE;
> +        break;
> +    case FAILOVER_STATUS_REQUIRE:
> +        s->reason = COLO_EXIT_REASON_REQUEST;
> +        break;
> +    default:
> +        s->reason = COLO_EXIT_REASON_ERROR;
> +    }
> +
> +    return s;
> +}
> +
>  static void colo_send_message(QEMUFile *f, COLOMessage msg,
>                                Error **errp)
>  {
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 93136ce5a0..356a370949 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1231,6 +1231,40 @@
>  ##
>  { 'command': 'xen-colo-do-checkpoint' }
>  
> +##
> +# @COLOStatus:
> +#
> +# The result format for 'query-colo-status'.
> +#
> +# @mode: COLO running mode. If COLO is running, this field will return
> +#        'primary' or 'secodary'.
> +#
> +# @colo-running: true if COLO is running.
> +#
> +# @reason: describes the reason for the COLO exit.

What's the value of @reason before a "COLO exit"?

> +#
> +# Since: 2.13
> +##
> +{ 'struct': 'COLOStatus',
> +  'data': { 'mode': 'COLOMode', 'colo-running': 'bool', 'reason': 'COLOExitReason' } }
> +
> +##
> +# @query-colo-status:
> +#
> +# Query COLO status while the vm is running.
> +#
> +# Returns: A @COLOStatus object showing the status.
> +#
> +# Example:
> +#
> +# -> { "execute": "query-colo-status" }
> +# <- { "return": { "mode": "primary", "colo-running": true, "reason": "request" } }
> +#
> +# Since: 2.13
> +##
> +{ 'command': 'query-colo-status',
> +  'returns': 'COLOStatus' }
> +
>  ##
>  # @migrate-recover:
>  #

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table
  2018-06-04  5:51   ` Jason Wang
@ 2018-06-10 14:08     ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 14:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Mon, Jun 4, 2018 at 1:51 PM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月03日 13:05, Zhang Chen wrote:
>
>> After a net connection is closed, we didn't clear its releated resources
>> in connection_track_table, which will lead to memory leak.
>>
>> Let't track the state of net connection, if it is closed, its related
>> resources will be cleared up.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> ---
>>   net/colo.h            |  4 +++
>>   net/filter-rewriter.c | 69 ++++++++++++++++++++++++++++++++++++++-----
>>   2 files changed, 66 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/colo.h b/net/colo.h
>> index da6c36dcf7..cd118510c5 100644
>> --- a/net/colo.h
>> +++ b/net/colo.h
>> @@ -18,6 +18,7 @@
>>   #include "slirp/slirp.h"
>>   #include "qemu/jhash.h"
>>   #include "qemu/timer.h"
>> +#include "slirp/tcp.h"
>>     #define HASHTABLE_MAX_SIZE 16384
>>   @@ -86,6 +87,9 @@ typedef struct Connection {
>>        * run once in independent tcp connection
>>        */
>>       int syn_flag;
>> +
>> +    int tcp_state; /* TCP FSM state */
>> +    tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>>
>
> So the question is, the state machine is not complete. I suspect there
> will be corner cases that will be left because of lacking sufficient
> states. LAST_ACK happens only for passive close. How about active close?
>
> So I think we need either maintain a full state machine or not instead of
> a partial one. We don't want endless bugs.
>

OK, I got it.
I will add a full state machine here in next version.

Thanks
Zhang Chen


>
> Thanks
>
>
>   } Connection;
>>     uint32_t connection_key_hash(const void *opaque);
>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>> index 62dad2d773..0909a9a8af 100644
>> --- a/net/filter-rewriter.c
>> +++ b/net/filter-rewriter.c
>> @@ -59,9 +59,9 @@ static int is_tcp_packet(Packet *pkt)
>>   }
>>     /* handle tcp packet from primary guest */
>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>>                                     Connection *conn,
>> -                                  Packet *pkt)
>> +                                  Packet *pkt, ConnectionKey *key)
>>   {
>>       struct tcphdr *tcp_pkt;
>>   @@ -99,15 +99,44 @@ static int handle_primary_tcp_pkt(NetFilterState
>> *nf,
>>               net_checksum_calculate((uint8_t *)pkt->data +
>> pkt->vnet_hdr_len,
>>                                      pkt->size - pkt->vnet_hdr_len);
>>           }
>> +        /*
>> +         * Case 1:
>> +         * The *server* side of this connect is VM, *client* tries to
>> close
>> +         * the connection.
>> +         *
>> +         * We got 'ack=1' packets from client side, it acks 'fin=1,
>> ack=1'
>> +         * packet from server side. From this point, we can ensure that
>> there
>> +         * will be no packets in the connection, except that, some errors
>> +         * happen between the path of 'filter object' and vNIC, if this
>> rare
>> +         * case really happen, we can still create a new connection,
>> +         * So it is safe to remove the connection from
>> connection_track_table.
>> +         *
>> +         */
>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> +            g_hash_table_remove(rf->connection_track_table, key);
>> +        }
>> +    }
>> +    /*
>> +     * Case 2:
>> +     * The *server* side of this connect is VM, *server* tries to close
>> +     * the connection.
>> +     *
>> +     * We got 'fin=1, ack=1' packet from client side, we need to
>> +     * record the seq of 'fin=1, ack=1' packet.
>> +     */
>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>> +        conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>> +        conn->tcp_state = TCPS_LAST_ACK;
>>       }
>>         return 0;
>>   }
>>     /* handle tcp packet from secondary guest */
>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>>                                       Connection *conn,
>> -                                    Packet *pkt)
>> +                                    Packet *pkt, ConnectionKey *key)
>>   {
>>       struct tcphdr *tcp_pkt;
>>   @@ -139,8 +168,34 @@ static int handle_secondary_tcp_pkt(NetFilterState
>> *nf,
>>               net_checksum_calculate((uint8_t *)pkt->data +
>> pkt->vnet_hdr_len,
>>                                      pkt->size - pkt->vnet_hdr_len);
>>           }
>> +        /*
>> +         * Case 2:
>> +         * The *server* side of this connect is VM, *server* tries to
>> close
>> +         * the connection.
>> +         *
>> +         * We got 'ack=1' packets from server side, it acks 'fin=1,
>> ack=1'
>> +         * packet from client side. Like Case 1, there should be no
>> packets
>> +         * in the connection from now know, But the difference here is
>> +         * if the packet is lost, We will get the resent 'fin=1,ack=1'
>> packet.
>> +         * TODO: Fix above case.
>> +         */
>> +        if ((conn->tcp_state == TCPS_LAST_ACK) &&
>> +            (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> +            g_hash_table_remove(rf->connection_track_table, key);
>> +        }
>> +    }
>> +    /*
>> +     * Case 1:
>> +     * The *server* side of this connect is VM, *client* tries to close
>> +     * the connection.
>> +     *
>> +     * We got 'fin=1, ack=1' packet from server side, we need to
>> +     * record the seq of 'fin=1, ack=1' packet.
>> +     */
>> +    if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>> +        conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>> +        conn->tcp_state = TCPS_LAST_ACK;
>>       }
>> -
>>       return 0;
>>   }
>>   @@ -190,7 +245,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState
>> *nf,
>>             if (sender == nf->netdev) {
>>               /* NET_FILTER_DIRECTION_TX */
>> -            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>> +            if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>>                   qemu_net_queue_send(s->incoming_queue, sender, 0,
>>                   (const uint8_t *)pkt->data, pkt->size, NULL);
>>                   packet_destroy(pkt, NULL);
>> @@ -203,7 +258,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState
>> *nf,
>>               }
>>           } else {
>>               /* NET_FILTER_DIRECTION_RX */
>> -            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>> +            if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>>                   qemu_net_queue_send(s->incoming_queue, sender, 0,
>>                   (const uint8_t *)pkt->data, pkt->size, NULL);
>>                   packet_destroy(pkt, NULL);
>>
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint
  2018-06-04  6:31   ` Jason Wang
@ 2018-06-10 14:08     ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 14:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Mon, Jun 4, 2018 at 2:31 PM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月03日 13:05, Zhang Chen wrote:
>
>> While do checkpoint, we need to flush all the unhandled packets,
>> By using the filter notifier mechanism, we can easily to notify
>> every compare object to do this process, which runs inside
>> of compare threads as a coroutine.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> ---
>>   include/migration/colo.h |  6 ++++
>>   net/colo-compare.c       | 76 ++++++++++++++++++++++++++++++++++++++++
>>   net/colo-compare.h       | 22 ++++++++++++
>>   3 files changed, 104 insertions(+)
>>   create mode 100644 net/colo-compare.h
>>
>> diff --git a/include/migration/colo.h b/include/migration/colo.h
>> index 2fe48ad353..fefb2fcf4c 100644
>> --- a/include/migration/colo.h
>> +++ b/include/migration/colo.h
>> @@ -16,6 +16,12 @@
>>   #include "qemu-common.h"
>>   #include "qapi/qapi-types-migration.h"
>>   +enum colo_event {
>> +    COLO_EVENT_NONE,
>> +    COLO_EVENT_CHECKPOINT,
>> +    COLO_EVENT_FAILOVER,
>> +};
>> +
>>   void colo_info_init(void);
>>     void migrate_start_colo_process(MigrationState *s);
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 23b2d2c4cc..7ff3ae8904 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -27,11 +27,16 @@
>>   #include "qemu/sockets.h"
>>   #include "net/colo.h"
>>   #include "sysemu/iothread.h"
>> +#include "net/colo-compare.h"
>> +#include "migration/colo.h"
>>     #define TYPE_COLO_COMPARE "colo-compare"
>>   #define COLO_COMPARE(obj) \
>>       OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>>   +static QTAILQ_HEAD(, CompareState) net_compares =
>> +       QTAILQ_HEAD_INITIALIZER(net_compares);
>> +
>>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>   #define MAX_QUEUE_SIZE 1024
>>   @@ -41,6 +46,10 @@
>>   /* TODO: Should be configurable */
>>   #define REGULAR_PACKET_CHECK_MS 3000
>>   +static QemuMutex event_mtx;
>> +static QemuCond event_complete_cond;
>> +static int event_unhandled_count;
>> +
>>   /*
>>    *  + CompareState ++
>>    *  |               |
>> @@ -87,6 +96,11 @@ typedef struct CompareState {
>>       IOThread *iothread;
>>       GMainContext *worker_context;
>>       QEMUTimer *packet_check_timer;
>> +
>> +    QEMUBH *event_bh;
>> +    enum colo_event event;
>> +
>> +    QTAILQ_ENTRY(CompareState) next;
>>   } CompareState;
>>     typedef struct CompareClass {
>> @@ -736,6 +750,25 @@ static void check_old_packet_regular(void *opaque)
>>                   REGULAR_PACKET_CHECK_MS);
>>   }
>>   +/* Public API, Used for COLO frame to notify compare event */
>> +void colo_notify_compares_event(void *opaque, int event, Error **errp)
>> +{
>> +    CompareState *s;
>> +
>> +    qemu_mutex_lock(&event_mtx);
>> +    QTAILQ_FOREACH(s, &net_compares, next) {
>> +        s->event = event;
>> +        qemu_bh_schedule(s->event_bh);
>> +        event_unhandled_count++;
>> +    }
>> +    /* Wait all compare threads to finish handling this event */
>> +    while (event_unhandled_count > 0) {
>> +        qemu_cond_wait(&event_complete_cond, &event_mtx);
>> +    }
>> +
>> +    qemu_mutex_unlock(&event_mtx);
>> +}
>> +
>>   static void colo_compare_timer_init(CompareState *s)
>>   {
>>       AioContext *ctx = iothread_get_aio_context(s->iothread);
>> @@ -756,6 +789,28 @@ static void colo_compare_timer_del(CompareState *s)
>>       }
>>    }
>>   +static void colo_flush_packets(void *opaque, void *user_data);
>> +
>> +static void colo_compare_handle_event(void *opaque)
>> +{
>> +    CompareState *s = opaque;
>> +
>> +    switch (s->event) {
>> +    case COLO_EVENT_CHECKPOINT:
>> +        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
>> +        break;
>> +    case COLO_EVENT_FAILOVER:
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +    qemu_mutex_lock(&event_mtx);
>>
>
> Isn't this a deadlock? Since colo_notify_compares_event() won't release
> event_mtx until event_unhandled_count reaches zero.
>
>
Good catch!
I will fix it in next version.


>
> +    assert(event_unhandled_count > 0);
>> +    event_unhandled_count--;
>> +    qemu_cond_broadcast(&event_complete_cond);
>> +    qemu_mutex_unlock(&event_mtx);
>> +}
>> +
>>   static void colo_compare_iothread(CompareState *s)
>>   {
>>       object_ref(OBJECT(s->iothread));
>> @@ -769,6 +824,7 @@ static void colo_compare_iothread(CompareState *s)
>>                                s, s->worker_context, true);
>>         colo_compare_timer_init(s);
>> +    s->event_bh = qemu_bh_new(colo_compare_handle_event, s);
>>   }
>>     static char *compare_get_pri_indev(Object *obj, Error **errp)
>> @@ -926,8 +982,13 @@ static void colo_compare_complete(UserCreatable
>> *uc, Error **errp)
>>       net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize,
>> s->vnet_hdr);
>>       net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize,
>> s->vnet_hdr);
>>   +    QTAILQ_INSERT_TAIL(&net_compares, s, next);
>> +
>>       g_queue_init(&s->conn_list);
>>   +    qemu_mutex_init(&event_mtx);
>> +    qemu_cond_init(&event_complete_cond);
>> +
>>       s->connection_track_table = g_hash_table_new_full(connecti
>> on_key_hash,
>>
>> connection_key_equal,
>>                                                         g_free,
>> @@ -990,6 +1051,7 @@ static void colo_compare_init(Object *obj)
>>   static void colo_compare_finalize(Object *obj)
>>   {
>>       CompareState *s = COLO_COMPARE(obj);
>> +    CompareState *tmp = NULL;
>>         qemu_chr_fe_deinit(&s->chr_pri_in, false);
>>       qemu_chr_fe_deinit(&s->chr_sec_in, false);
>> @@ -997,6 +1059,16 @@ static void colo_compare_finalize(Object *obj)
>>       if (s->iothread) {
>>           colo_compare_timer_del(s);
>>       }
>> +
>> +    qemu_bh_delete(s->event_bh);
>> +
>> +    QTAILQ_FOREACH(tmp, &net_compares, next) {
>> +        if (!strcmp(tmp->outdev, s->outdev)) {
>>
>
> This looks not elegant, can we compare by address or just use QLIST?
>
>
OK, I will compare by address in next version.

Thanks
Zhang Chen


> Thanks
>
>
> +            QTAILQ_REMOVE(&net_compares, s, next);
>> +            break;
>> +        }
>> +    }
>> +
>>       /* Release all unhandled packets after compare thead exited */
>>       g_queue_foreach(&s->conn_list, colo_flush_packets, s);
>>   @@ -1009,6 +1081,10 @@ static void colo_compare_finalize(Object *obj)
>>       if (s->iothread) {
>>           object_unref(OBJECT(s->iothread));
>>       }
>> +
>> +    qemu_mutex_destroy(&event_mtx);
>> +    qemu_cond_destroy(&event_complete_cond);
>> +
>>       g_free(s->pri_indev);
>>       g_free(s->sec_indev);
>>       g_free(s->outdev);
>> diff --git a/net/colo-compare.h b/net/colo-compare.h
>> new file mode 100644
>> index 0000000000..1b1ce76aea
>> --- /dev/null
>> +++ b/net/colo-compare.h
>> @@ -0,0 +1,22 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service
>> (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2017 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2017 FUJITSU LIMITED
>> + * Copyright (c) 2017 Intel Corporation
>> + *
>> + * Authors:
>> + *    zhanghailiang <zhang.zhanghailiang@huawei.com>
>> + *    Zhang Chen <zhangckid@gmail.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef QEMU_COLO_COMPARE_H
>> +#define QEMU_COLO_COMPARE_H
>> +
>> +void colo_notify_compares_event(void *opaque, int event, Error **errp);
>> +
>> +#endif /* QEMU_COLO_COMPARE_H */
>>
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result
  2018-06-04  6:36   ` Jason Wang
@ 2018-06-10 14:09     ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 14:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Mon, Jun 4, 2018 at 2:36 PM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月03日 13:05, Zhang Chen wrote:
>
>> It's a good idea to use notifier to notify COLO frame of
>> inconsistent packets comparing.
>>
>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   net/colo-compare.c | 32 +++++++++++++++++++++++++-------
>>   net/colo-compare.h |  2 ++
>>   2 files changed, 27 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 7ff3ae8904..05061cd1c4 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -29,6 +29,7 @@
>>   #include "sysemu/iothread.h"
>>   #include "net/colo-compare.h"
>>   #include "migration/colo.h"
>> +#include "migration/migration.h"
>>     #define TYPE_COLO_COMPARE "colo-compare"
>>   #define COLO_COMPARE(obj) \
>> @@ -37,6 +38,9 @@
>>   static QTAILQ_HEAD(, CompareState) net_compares =
>>          QTAILQ_HEAD_INITIALIZER(net_compares);
>>   +static NotifierList colo_compare_notifiers =
>> +    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
>> +
>>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>   #define MAX_QUEUE_SIZE 1024
>>   @@ -561,8 +565,24 @@ static int colo_old_packet_check_one(Packet *pkt,
>> int64_t *check_time)
>>       }
>>   }
>>   +static void colo_compare_inconsistent_notify(void)
>> +{
>>
>
> Not good at English but inconsistency sounds better here.
>
>
Sorry, I'm not good at English too...
I will change to "inconsistency" in next version.

Thanks
Zhang Chen


> Thanks
>
>
> +    notifier_list_notify(&colo_compare_notifiers,
>> +                migrate_get_current());
>> +}
>> +
>> +void colo_compare_register_notifier(Notifier *notify)
>> +{
>> +    notifier_list_add(&colo_compare_notifiers, notify);
>> +}
>> +
>> +void colo_compare_unregister_notifier(Notifier *notify)
>> +{
>> +    notifier_remove(notify);
>> +}
>> +
>>   static int colo_old_packet_check_one_conn(Connection *conn,
>> -                                          void *user_data)
>> +                                           void *user_data)
>>   {
>>       GList *result = NULL;
>>       int64_t check_time = REGULAR_PACKET_CHECK_MS;
>> @@ -573,10 +593,7 @@ static int colo_old_packet_check_one_conn(Connection
>> *conn,
>>         if (result) {
>>           /* Do checkpoint will flush old packet */
>> -        /*
>> -         * TODO: Notify colo frame to do checkpoint.
>> -         * colo_compare_inconsistent_notify();
>> -         */
>> +        colo_compare_inconsistent_notify();
>>           return 0;
>>       }
>>   @@ -620,11 +637,12 @@ static void colo_compare_packet(CompareState *s,
>> Connection *conn,
>>               /*
>>                * If one packet arrive late, the secondary_list or
>>                * primary_list will be empty, so we can't compare it
>> -             * until next comparison.
>> +             * until next comparison. If the packets in the list are
>> +             * timeout, it will trigger a checkpoint request.
>>                */
>>               trace_colo_compare_main("packet different");
>>               g_queue_push_head(&conn->primary_list, pkt);
>> -            /* TODO: colo_notify_checkpoint();*/
>> +            colo_compare_inconsistent_notify();
>>               break;
>>           }
>>       }
>> diff --git a/net/colo-compare.h b/net/colo-compare.h
>> index 1b1ce76aea..22ddd512e2 100644
>> --- a/net/colo-compare.h
>> +++ b/net/colo-compare.h
>> @@ -18,5 +18,7 @@
>>   #define QEMU_COLO_COMPARE_H
>>     void colo_notify_compares_event(void *opaque, int event, Error
>> **errp);
>> +void colo_compare_register_notifier(Notifier *notify);
>> +void colo_compare_unregister_notifier(Notifier *notify);
>>     #endif /* QEMU_COLO_COMPARE_H */
>>
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-04  6:57   ` Jason Wang
@ 2018-06-10 14:09     ` Zhang Chen
  2018-06-11  1:56       ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 14:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Mon, Jun 4, 2018 at 2:57 PM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月03日 13:05, Zhang Chen wrote:
>
>> Filter needs to process the event of checkpoint/failover or
>> other event passed by COLO frame.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   include/net/filter.h |  5 +++++
>>   net/filter.c         | 17 +++++++++++++++++
>>   net/net.c            | 28 ++++++++++++++++++++++++++++
>>   3 files changed, 50 insertions(+)
>>
>> diff --git a/include/net/filter.h b/include/net/filter.h
>> index 435acd6f82..49da666ac0 100644
>> --- a/include/net/filter.h
>> +++ b/include/net/filter.h
>> @@ -38,6 +38,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
>>     typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
>>   +typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error
>> **errp);
>> +
>>   typedef struct NetFilterClass {
>>       ObjectClass parent_class;
>>   @@ -45,6 +47,7 @@ typedef struct NetFilterClass {
>>       FilterSetup *setup;
>>       FilterCleanup *cleanup;
>>       FilterStatusChanged *status_changed;
>> +    FilterHandleEvent *handle_event;
>>       /* mandatory */
>>       FilterReceiveIOV *receive_iov;
>>   } NetFilterClass;
>> @@ -77,4 +80,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState
>> *sender,
>>                                       int iovcnt,
>>                                       void *opaque);
>>   +void colo_notify_filters_event(int event, Error **errp);
>> +
>>   #endif /* QEMU_NET_FILTER_H */
>> diff --git a/net/filter.c b/net/filter.c
>> index 2fd7d7d663..0f17eba143 100644
>> --- a/net/filter.c
>> +++ b/net/filter.c
>> @@ -17,6 +17,8 @@
>>   #include "net/vhost_net.h"
>>   #include "qom/object_interfaces.h"
>>   #include "qemu/iov.h"
>> +#include "net/colo.h"
>> +#include "migration/colo.h"
>>     static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
>>   {
>> @@ -245,11 +247,26 @@ static void netfilter_finalize(Object *obj)
>>       g_free(nf->netdev_id);
>>   }
>>   +static void dummy_handle_event(NetFilterState *nf, int event, Error
>> **errp)
>> +{
>> +    switch (event) {
>> +    case COLO_EVENT_CHECKPOINT:
>> +        break;
>> +    case COLO_EVENT_FAILOVER:
>> +        object_property_set_str(OBJECT(nf), "off", "status", errp);
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +}
>> +
>>   static void netfilter_class_init(ObjectClass *oc, void *data)
>>   {
>>       UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
>> +    NetFilterClass *nfc = NETFILTER_CLASS(oc);
>>         ucc->complete = netfilter_complete;
>> +    nfc->handle_event = dummy_handle_event;
>>   }
>>     static const TypeInfo netfilter_info = {
>> diff --git a/net/net.c b/net/net.c
>> index efb9eaf779..378cd2f9ec 100644
>> --- a/net/net.c
>> +++ b/net/net.c
>> @@ -1329,6 +1329,34 @@ void hmp_info_network(Monitor *mon, const QDict
>> *qdict)
>>       }
>>   }
>>   +void colo_notify_filters_event(int event, Error **errp)
>> +{
>> +    NetClientState *nc, *peer;
>> +    NetClientDriver type;
>> +    NetFilterState *nf;
>> +    NetFilterClass *nfc = NULL;
>> +    Error *local_err = NULL;
>> +
>> +    QTAILQ_FOREACH(nc, &net_clients, next) {
>> +        peer = nc->peer;
>> +        type = nc->info->type;
>> +        if (!peer || type != NET_CLIENT_DRIVER_TAP) {
>> +            continue;
>>
>
> I think when COLO is enabled, qemu should reject all other types of
> backends.


Yes, you are right.
May be we should add a assert here? Because we can not get backend info when
colo-compare start up.
Have any idea?


>
>
> +        }
>> +        QTAILQ_FOREACH(nf, &nc->filters, next) {
>> +            nfc =  NETFILTER_GET_CLASS(OBJECT(nf));
>> +            if (!nfc->handle_event) {
>> +                continue;
>> +            }
>>
>
> Is this still necessary?
>
>
Yes, may be user will use other filter(like filter-buffer) with COLO at
same time.

Thanks
Zhang Chen



> Thanks
>
>
> +            nfc->handle_event(nf, event, &local_err);
>> +            if (local_err) {
>> +                error_propagate(errp, local_err);
>> +                return;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>>   void qmp_set_link(const char *name, bool up, Error **errp)
>>   {
>>       NetClientState *ncs[MAX_QUEUE_NUM];
>>
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event
  2018-06-04  7:42   ` Jason Wang
@ 2018-06-10 17:20     ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 17:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Eric Blake, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Mon, Jun 4, 2018 at 3:42 PM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月03日 13:05, Zhang Chen wrote:
>
>> After one round of checkpoint, the states between PVM and SVM
>> become consistent, so it is unnecessary to adjust the sequence
>> of net packets for old connections, besides, while failover
>> happens, filter-rewriter needs to check if it still needs to
>> adjust sequence of net packets.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> ---
>>   migration/colo.c      | 13 +++++++++++++
>>   net/filter-rewriter.c | 40 ++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 53 insertions(+)
>>
>> diff --git a/migration/colo.c b/migration/colo.c
>> index 442471e088..0bff21d9e5 100644
>> --- a/migration/colo.c
>> +++ b/migration/colo.c
>> @@ -31,6 +31,7 @@
>>   #include "qapi/qapi-events-migration.h"
>>   #include "qapi/qmp/qerror.h"
>>   #include "sysemu/cpus.h"
>> +#include "net/filter.h"
>>     static bool vmstate_loading;
>>   static Notifier packets_compare_notifier;
>> @@ -82,6 +83,11 @@ static void secondary_vm_do_failover(void)
>>       if (local_err) {
>>           error_report_err(local_err);
>>       }
>> +    /* Notify all filters of all NIC to do checkpoint */
>> +    colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
>> +    if (local_err) {
>> +        error_report_err(local_err);
>> +    }
>>         if (!autostart) {
>>           error_report("\"-S\" qemu option will be ignored in secondary
>> side");
>> @@ -800,6 +806,13 @@ void *colo_process_incoming_thread(void *opaque)
>>               goto out;
>>           }
>>   +        /* Notify all filters of all NIC to do checkpoint */
>> +        colo_notify_filters_event(COLO_EVENT_CHECKPOINT, &local_err);
>> +        if (local_err) {
>> +            qemu_mutex_unlock_iothread();
>> +            goto out;
>> +        }
>> +
>>
>
> I think the above should belong to another patch.
>
>
>
Yes, I will relocate above code.



>           vmstate_loading = false;
>>           vm_start();
>>           trace_colo_vm_state_change("stop", "run");
>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>> index 0909a9a8af..f3c306cc89 100644
>> --- a/net/filter-rewriter.c
>> +++ b/net/filter-rewriter.c
>> @@ -20,6 +20,8 @@
>>   #include "qemu/main-loop.h"
>>   #include "qemu/iov.h"
>>   #include "net/checksum.h"
>> +#include "net/colo.h"
>> +#include "migration/colo.h"
>>     #define FILTER_COLO_REWRITER(obj) \
>>       OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
>> @@ -277,6 +279,43 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState
>> *nf,
>>       return 0;
>>   }
>>   +static void reset_seq_offset(gpointer key, gpointer value, gpointer
>> user_data)
>> +{
>> +    Connection *conn = (Connection *)value;
>> +
>> +    conn->offset = 0;
>> +}
>> +
>> +static gboolean offset_is_nonzero(gpointer key,
>> +                                  gpointer value,
>> +                                  gpointer user_data)
>> +{
>> +    Connection *conn = (Connection *)value;
>> +
>> +    return conn->offset ? true : false;
>> +}
>> +
>> +static void colo_rewriter_handle_event(NetFilterState *nf, int event,
>> +                                       Error **errp)
>> +{
>> +    RewriterState *rs = FILTER_COLO_REWRITER(nf);
>> +
>> +    switch (event) {
>> +    case COLO_EVENT_CHECKPOINT:
>> +        g_hash_table_foreach(rs->connection_track_table,
>> +                            reset_seq_offset, NULL);
>> +        break;
>> +    case COLO_EVENT_FAILOVER:
>> +        if (!g_hash_table_find(rs->connection_track_table,
>> +                              offset_is_nonzero, NULL)) {
>> +            object_property_set_str(OBJECT(nf), "off", "status", errp);
>> +        }
>>
>
> I may miss something but I think rewriter should not be turned off even
> after failover?
>
>
Yes, after failover, rewriter should maintain the connection that created
before failover and turned off for other connections.
I will fix it in next version.
Very thanks your comments Jason~~~


Thanks
Zhang Chen




> Thanks
>
>
> +        break;
>> +    default:
>> +        break;
>> +    }
>> +}
>> +
>>   static void colo_rewriter_cleanup(NetFilterState *nf)
>>   {
>>       RewriterState *s = FILTER_COLO_REWRITER(nf);
>> @@ -332,6 +371,7 @@ static void colo_rewriter_class_init(ObjectClass
>> *oc, void *data)
>>       nfc->setup = colo_rewriter_setup;
>>       nfc->cleanup = colo_rewriter_cleanup;
>>       nfc->receive_iov = colo_rewriter_receive_iov;
>> +    nfc->handle_event = colo_rewriter_handle_event;
>>   }
>>     static const TypeInfo colo_rewriter_info = {
>>
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO
  2018-06-07 12:54     ` Markus Armbruster
@ 2018-06-10 17:24       ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 17:24 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Eric Blake, qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, zhanghailiang, Li Zhijian

On Thu, Jun 7, 2018 at 8:54 PM, Markus Armbruster <armbru@redhat.com> wrote:

> Eric Blake <eblake@redhat.com> writes:
>
> > On 06/03/2018 12:05 AM, Zhang Chen wrote:
> >> From: zhanghailiang <zhang.zhanghailiang@huawei.com>
> >>
> >> If some errors happen during VM's COLO FT stage, it's important to
> >> notify the users of this event. Together with 'x-colo-lost-heartbeat',
> >> Users can intervene in COLO's failover work immediately.
> >> If users don't want to get involved in COLO's failover verdict,
> >> it is still necessary to notify users that we exited COLO mode.
> >>
> >> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> >> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> >> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> >> Reviewed-by: Eric Blake <eblake@redhat.com>
> >> ---
> >>   migration/colo.c    | 31 +++++++++++++++++++++++++++++++
> >>   qapi/migration.json | 38 ++++++++++++++++++++++++++++++++++++++
> >>   2 files changed, 69 insertions(+)
> >>
> >> diff --git a/migration/colo.c b/migration/colo.c
> >> index c083d3696f..bedb677788 100644
> >> --- a/migration/colo.c
> >
> >>   +    /*
> >> +     * There are only two reasons we can go here, some error happened.
> >> +     * Or the user triggered failover.
> >
> > s/go here/get here/
> > s/happened. Or/happened, or/
> >
> >> +++ b/qapi/migration.json
> >> @@ -880,6 +880,44 @@
> >>   { 'enum': 'FailoverStatus',
> >>     'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] }
> >> +##
> >> +# @COLO_EXIT:
> >> +#
> >> +# Emitted when VM finishes COLO mode due to some errors happening or
> >> +# at the request of users.
> >> +#
> >> +# @mode: report COLO mode when COLO exited.
> >> +#
> >> +# @reason: describes the reason for the COLO exit.
> >> +#
> >> +# Since: 2.13
> >
> > s/2.13/3.0/
> >
> >> +#
> >> +# Example:
> >> +#
> >> +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172},
> >> +#      "event": "COLO_EXIT", "data": {"mode": "primary", "reason":
> "request" } }
> >> +#
> >> +##
> >> +{ 'event': 'COLO_EXIT',
> >> +  'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } }
> >> +
> >> +##
> >> +# @COLOExitReason:
> >> +#
> >> +# The reason for a COLO exit
> >> +#
> >> +# @none: no failover has ever happened, This can't occur in the
> COLO_EXIT event,
> >
> > s/happened,/happened./
>
> While there, wrap the line around column 70, please.
>
> >> +# only in the result of query-colo-status.
> >> +#
> >> +# @request: COLO exit is due to an external request
> >> +#
> >> +# @error: COLO exit is due to an internal error
> >> +#
> >> +# Since: 2.13
> >
> > s/2.13/3.0/
> >
> >> +##
> >> +{ 'enum': 'COLOExitReason',
> >> +  'data': [ 'none', 'request', 'error' ] }
> >> +
> >>   ##
> >>   # @x-colo-lost-heartbeat:
> >>   #
> >>
>
> With these touch-ups:
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
>


I will fixed the typo and addressed comments in next version.
Thanks Markus and Eric.

Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-07 12:59   ` Markus Armbruster
@ 2018-06-10 17:39     ` Zhang Chen
  2018-06-11  6:48       ` Markus Armbruster
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 17:39 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Eric Blake, zhanghailiang,
	Li Zhijian

On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <armbru@redhat.com> wrote:

> Zhang Chen <zhangckid@gmail.com> writes:
>
> > Libvirt or other high level software can use this command query colo
> status.
> > You can test this command like that:
> > {'execute':'query-colo-status'}
> >
> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> > ---
> >  migration/colo.c    | 39 +++++++++++++++++++++++++++++++++++++++
> >  qapi/migration.json | 34 ++++++++++++++++++++++++++++++++++
> >  2 files changed, 73 insertions(+)
> >
> > diff --git a/migration/colo.c b/migration/colo.c
> > index bedb677788..8c6b8e9a4e 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -29,6 +29,7 @@
> >  #include "net/colo.h"
> >  #include "block/block.h"
> >  #include "qapi/qapi-events-migration.h"
> > +#include "qapi/qmp/qerror.h"
> >
> >  static bool vmstate_loading;
> >  static Notifier packets_compare_notifier;
> > @@ -237,6 +238,44 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
> >  #endif
> >  }
> >
> > +COLOStatus *qmp_query_colo_status(Error **errp)
> > +{
> > +    int state;
> > +    COLOStatus *s = g_new0(COLOStatus, 1);
> > +
> > +    s->mode = get_colo_mode();
> > +
> > +    switch (s->mode) {
> > +    case COLO_MODE_UNKNOWN:
> > +        error_setg(errp, "COLO is disabled");
> > +        state = MIGRATION_STATUS_NONE;
> > +        break;
> > +    case COLO_MODE_PRIMARY:
> > +        state = migrate_get_current()->state;
> > +        break;
> > +    case COLO_MODE_SECONDARY:
> > +        state = migration_incoming_get_current()->state;
> > +        break;
> > +    default:
> > +        abort();
> > +    }
> > +
> > +    s->colo_running = state == MIGRATION_STATUS_COLO;
> > +
> > +    switch (failover_get_state()) {
> > +    case FAILOVER_STATUS_NONE:
> > +        s->reason = COLO_EXIT_REASON_NONE;
> > +        break;
> > +    case FAILOVER_STATUS_REQUIRE:
> > +        s->reason = COLO_EXIT_REASON_REQUEST;
> > +        break;
> > +    default:
> > +        s->reason = COLO_EXIT_REASON_ERROR;
> > +    }
> > +
> > +    return s;
> > +}
> > +
> >  static void colo_send_message(QEMUFile *f, COLOMessage msg,
> >                                Error **errp)
> >  {
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 93136ce5a0..356a370949 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -1231,6 +1231,40 @@
> >  ##
> >  { 'command': 'xen-colo-do-checkpoint' }
> >
> > +##
> > +# @COLOStatus:
> > +#
> > +# The result format for 'query-colo-status'.
> > +#
> > +# @mode: COLO running mode. If COLO is running, this field will return
> > +#        'primary' or 'secodary'.
> > +#
> > +# @colo-running: true if COLO is running.
> > +#
> > +# @reason: describes the reason for the COLO exit.
>
> What's the value of @reason before a "COLO exit"?
>

Before a "COLO exit", we just return 'none' in this field.

Thanks
Zhang Chen


>
> > +#
> > +# Since: 2.13
> > +##
> > +{ 'struct': 'COLOStatus',
> > +  'data': { 'mode': 'COLOMode', 'colo-running': 'bool', 'reason':
> 'COLOExitReason' } }
> > +
> > +##
> > +# @query-colo-status:
> > +#
> > +# Query COLO status while the vm is running.
> > +#
> > +# Returns: A @COLOStatus object showing the status.
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "query-colo-status" }
> > +# <- { "return": { "mode": "primary", "colo-running": true, "reason":
> "request" } }
> > +#
> > +# Since: 2.13
> > +##
> > +{ 'command': 'query-colo-status',
> > +  'returns': 'COLOStatus' }
> > +
> >  ##
> >  # @migrate-recover:
> >  #
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-04 22:23   ` Eric Blake
@ 2018-06-10 17:42     ` Zhang Chen
  2018-06-10 17:53       ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 17:42 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Tue, Jun 5, 2018 at 6:23 AM, Eric Blake <eblake@redhat.com> wrote:

> On 06/03/2018 12:05 AM, Zhang Chen wrote:
>
>> Libvirt or other high level software can use this command query colo
>> status.
>> You can test this command like that:
>> {'execute':'query-colo-status'}
>>
>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> ---
>>
>
> +++ b/qapi/migration.json
>> @@ -1231,6 +1231,40 @@
>>   ##
>>   { 'command': 'xen-colo-do-checkpoint' }
>>   +##
>> +# @COLOStatus:
>> +#
>> +# The result format for 'query-colo-status'.
>> +#
>> +# @mode: COLO running mode. If COLO is running, this field will return
>> +#        'primary' or 'secodary'.
>>
>
> s/secodary/secondary/
>
> +#
>> +# @colo-running: true if COLO is running.
>> +#
>> +# @reason: describes the reason for the COLO exit.
>> +#
>> +# Since: 2.13
>>
>
> 3.0
>
> +##
>> +{ 'struct': 'COLOStatus',
>> +  'data': { 'mode': 'COLOMode', 'colo-running': 'bool', 'reason':
>> 'COLOExitReason' } }
>> +
>> +##
>> +# @query-colo-status:
>> +#
>> +# Query COLO status while the vm is running.
>> +#
>> +# Returns: A @COLOStatus object showing the status.
>> +#
>> +# Example:
>> +#
>> +# -> { "execute": "query-colo-status" }
>> +# <- { "return": { "mode": "primary", "colo-running": true, "reason":
>> "request" } }
>> +#
>> +# Since: 2.13
>>
>
> 3.0


Oh, I can't see the new Qemu plan...

Thank you for the reminder.
Zhang Chen



>
>
> +##
>> +{ 'command': 'query-colo-status',
>> +  'returns': 'COLOStatus' }
>> +
>>   ##
>>   # @migrate-recover:
>>   #
>>
>>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-10 17:42     ` Zhang Chen
@ 2018-06-10 17:53       ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-10 17:53 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, Paolo Bonzini, Juan Quintela,
	Dr . David Alan Gilbert, Jason Wang, Markus Armbruster,
	zhanghailiang, Li Zhijian

On Mon, Jun 11, 2018 at 1:42 AM, Zhang Chen <zhangckid@gmail.com> wrote:

>
>
> On Tue, Jun 5, 2018 at 6:23 AM, Eric Blake <eblake@redhat.com> wrote:
>
>> On 06/03/2018 12:05 AM, Zhang Chen wrote:
>>
>>> Libvirt or other high level software can use this command query colo
>>> status.
>>> You can test this command like that:
>>> {'execute':'query-colo-status'}
>>>
>>> Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>>> ---
>>>
>>
>> +++ b/qapi/migration.json
>>> @@ -1231,6 +1231,40 @@
>>>   ##
>>>   { 'command': 'xen-colo-do-checkpoint' }
>>>   +##
>>> +# @COLOStatus:
>>> +#
>>> +# The result format for 'query-colo-status'.
>>> +#
>>> +# @mode: COLO running mode. If COLO is running, this field will return
>>> +#        'primary' or 'secodary'.
>>>
>>
>> s/secodary/secondary/
>>
>> +#
>>> +# @colo-running: true if COLO is running.
>>> +#
>>> +# @reason: describes the reason for the COLO exit.
>>> +#
>>> +# Since: 2.13
>>>
>>
>> 3.0
>>
>> +##
>>> +{ 'struct': 'COLOStatus',
>>> +  'data': { 'mode': 'COLOMode', 'colo-running': 'bool', 'reason':
>>> 'COLOExitReason' } }
>>> +
>>> +##
>>> +# @query-colo-status:
>>> +#
>>> +# Query COLO status while the vm is running.
>>> +#
>>> +# Returns: A @COLOStatus object showing the status.
>>> +#
>>> +# Example:
>>> +#
>>> +# -> { "execute": "query-colo-status" }
>>> +# <- { "return": { "mode": "primary", "colo-running": true, "reason":
>>> "request" } }
>>> +#
>>> +# Since: 2.13
>>>
>>
>> 3.0
>
>
> Oh, I can't see the new Qemu plan...
>

Typo: Sorry, I just forgot to see the new plan....


>
> Thank you for the reminder.
> Zhang Chen
>
>
>
>>
>>
>> +##
>>> +{ 'command': 'query-colo-status',
>>> +  'returns': 'COLOStatus' }
>>> +
>>>   ##
>>>   # @migrate-recover:
>>>   #
>>>
>>>
>> --
>> Eric Blake, Principal Software Engineer
>> Red Hat, Inc.           +1-919-301-3266
>> Virtualization:  qemu.org | libvirt.org
>>
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-10 14:09     ` Zhang Chen
@ 2018-06-11  1:56       ` Jason Wang
  2018-06-11  6:46         ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-11  1:56 UTC (permalink / raw)
  To: Zhang Chen
  Cc: zhanghailiang, Li Zhijian, Juan Quintela, Markus Armbruster,
	qemu-devel, Paolo Bonzini, Dr . David Alan Gilbert



On 2018年06月10日 22:09, Zhang Chen wrote:
>> I think when COLO is enabled, qemu should reject all other types of
>> backends.
> Yes, you are right.
> May be we should add a assert here? Because we can not get backend info when
> colo-compare start up.
> Have any idea?
>
>

Is there a global variable or function to detect the COLO existence? If 
yes, maybe we could use that.

Thanks

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-11  1:56       ` Jason Wang
@ 2018-06-11  6:46         ` Zhang Chen
  2018-06-11  7:02           ` Jason Wang
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-11  6:46 UTC (permalink / raw)
  To: Jason Wang
  Cc: zhanghailiang, Li Zhijian, Juan Quintela, Markus Armbruster,
	qemu-devel, Paolo Bonzini, Dr . David Alan Gilbert

On Mon, Jun 11, 2018 at 9:56 AM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月10日 22:09, Zhang Chen wrote:
>
>> I think when COLO is enabled, qemu should reject all other types of
>>> backends.
>>>
>> Yes, you are right.
>> May be we should add a assert here? Because we can not get backend info
>> when
>> colo-compare start up.
>> Have any idea?
>>
>>
>>
> Is there a global variable or function to detect the COLO existence? If
> yes, maybe we could use that.
>

May be we can reuse the "qmp_query_colo_status" in 11/17 patch.
I will add this check in next version.


Thanks
Zhang Chen


>
> Thanks
>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-10 17:39     ` Zhang Chen
@ 2018-06-11  6:48       ` Markus Armbruster
  2018-06-11 15:34         ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2018-06-11  6:48 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Markus Armbruster, zhanghailiang, Li Zhijian, Juan Quintela,
	Jason Wang, qemu-devel, Dr . David Alan Gilbert, Paolo Bonzini

Zhang Chen <zhangckid@gmail.com> writes:

> On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <armbru@redhat.com> wrote:
>
>> Zhang Chen <zhangckid@gmail.com> writes:
>>
>> > Libvirt or other high level software can use this command query colo
>> status.
>> > You can test this command like that:
>> > {'execute':'query-colo-status'}
>> >
>> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
>> > ---
>> >  migration/colo.c    | 39 +++++++++++++++++++++++++++++++++++++++
>> >  qapi/migration.json | 34 ++++++++++++++++++++++++++++++++++
>> >  2 files changed, 73 insertions(+)
>> >
>> > diff --git a/migration/colo.c b/migration/colo.c
>> > index bedb677788..8c6b8e9a4e 100644
>> > --- a/migration/colo.c
>> > +++ b/migration/colo.c
>> > @@ -29,6 +29,7 @@
>> >  #include "net/colo.h"
>> >  #include "block/block.h"
>> >  #include "qapi/qapi-events-migration.h"
>> > +#include "qapi/qmp/qerror.h"
>> >
>> >  static bool vmstate_loading;
>> >  static Notifier packets_compare_notifier;
>> > @@ -237,6 +238,44 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
>> >  #endif
>> >  }
>> >
>> > +COLOStatus *qmp_query_colo_status(Error **errp)
>> > +{
>> > +    int state;
>> > +    COLOStatus *s = g_new0(COLOStatus, 1);
>> > +
>> > +    s->mode = get_colo_mode();
>> > +
>> > +    switch (s->mode) {
>> > +    case COLO_MODE_UNKNOWN:
>> > +        error_setg(errp, "COLO is disabled");
>> > +        state = MIGRATION_STATUS_NONE;
>> > +        break;
>> > +    case COLO_MODE_PRIMARY:
>> > +        state = migrate_get_current()->state;
>> > +        break;
>> > +    case COLO_MODE_SECONDARY:
>> > +        state = migration_incoming_get_current()->state;
>> > +        break;
>> > +    default:
>> > +        abort();
>> > +    }
>> > +
>> > +    s->colo_running = state == MIGRATION_STATUS_COLO;
>> > +
>> > +    switch (failover_get_state()) {
>> > +    case FAILOVER_STATUS_NONE:
>> > +        s->reason = COLO_EXIT_REASON_NONE;
>> > +        break;
>> > +    case FAILOVER_STATUS_REQUIRE:
>> > +        s->reason = COLO_EXIT_REASON_REQUEST;
>> > +        break;
>> > +    default:
>> > +        s->reason = COLO_EXIT_REASON_ERROR;
>> > +    }
>> > +
>> > +    return s;
>> > +}
>> > +
>> >  static void colo_send_message(QEMUFile *f, COLOMessage msg,
>> >                                Error **errp)
>> >  {
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index 93136ce5a0..356a370949 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -1231,6 +1231,40 @@
>> >  ##
>> >  { 'command': 'xen-colo-do-checkpoint' }
>> >
>> > +##
>> > +# @COLOStatus:
>> > +#
>> > +# The result format for 'query-colo-status'.
>> > +#
>> > +# @mode: COLO running mode. If COLO is running, this field will return
>> > +#        'primary' or 'secodary'.
>> > +#
>> > +# @colo-running: true if COLO is running.
>> > +#
>> > +# @reason: describes the reason for the COLO exit.
>>
>> What's the value of @reason before a "COLO exit"?
>>
>
> Before a "COLO exit", we just return 'none' in this field.

Please add that to the documentation.

Please excuse my ignorance on COLO...  I'm still not sure I fully
understand how the three members are related, or even how the COLO state
machine works and how its related to / embedded in RunState.  I searched
docs/ for a state diagram, but couldn't find one.

According to runstate_transitions_def[], the part of the RunState state
machine that's directly connected to state "colo" looks like this:

    inmigrate  -+
                |
    paused  ----+
                |
    migrate  ---+->  colo  <------>  running
                |
    suspended  -+
                |
    watchdog  --+

For each of the seven state transitions: how is the state transition
triggered (e.g. by QMP command, spontaneously when a certain condition
is detected, ...), and what events (if any) are emitted then?

How is @colo-running related to the run state?

Which run states are considered to be "before a COLO exit"?  If "before
a COLO exit" doesn't map to run states, the state machine is too coarse
to fully describe COLO, and I'd like to see a suitably refined one.

If @colo-running is true, then @mode is either "primary" or "secondary".
What are the possible values when @colo-running is false?

[...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-11  6:46         ` Zhang Chen
@ 2018-06-11  7:02           ` Jason Wang
  2018-06-11 15:36             ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Jason Wang @ 2018-06-11  7:02 UTC (permalink / raw)
  To: Zhang Chen
  Cc: zhanghailiang, Li Zhijian, Juan Quintela, qemu-devel,
	Markus Armbruster, Paolo Bonzini, Dr . David Alan Gilbert



On 2018年06月11日 14:46, Zhang Chen wrote:
> On Mon, Jun 11, 2018 at 9:56 AM, Jason Wang <jasowang@redhat.com> wrote:
>
>>
>> On 2018年06月10日 22:09, Zhang Chen wrote:
>>
>>> I think when COLO is enabled, qemu should reject all other types of
>>>> backends.
>>>>
>>> Yes, you are right.
>>> May be we should add a assert here? Because we can not get backend info
>>> when
>>> colo-compare start up.
>>> Have any idea?
>>>
>>>
>>>
>> Is there a global variable or function to detect the COLO existence? If
>> yes, maybe we could use that.
>>
> May be we can reuse the "qmp_query_colo_status" in 11/17 patch.
> I will add this check in next version.

I think it's better to have an internal helper. Calling qmp function 
from net is odd.

Thanks

>
> Thanks
> Zhang Chen
>
>
>> Thanks
>>
>>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-11  6:48       ` Markus Armbruster
@ 2018-06-11 15:34         ` Zhang Chen
  2018-06-13 16:50           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 46+ messages in thread
From: Zhang Chen @ 2018-06-11 15:34 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: zhanghailiang, Li Zhijian, Juan Quintela, Jason Wang, qemu-devel,
	Dr . David Alan Gilbert, Paolo Bonzini

On Mon, Jun 11, 2018 at 2:48 PM, Markus Armbruster <armbru@redhat.com>
wrote:

> Zhang Chen <zhangckid@gmail.com> writes:
>
> > On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <armbru@redhat.com>
> wrote:
> >
> >> Zhang Chen <zhangckid@gmail.com> writes:
> >>
> >> > Libvirt or other high level software can use this command query colo
> >> status.
> >> > You can test this command like that:
> >> > {'execute':'query-colo-status'}
> >> >
> >> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> >> > ---
> >> >  migration/colo.c    | 39 +++++++++++++++++++++++++++++++++++++++
> >> >  qapi/migration.json | 34 ++++++++++++++++++++++++++++++++++
> >> >  2 files changed, 73 insertions(+)
> >> >
> >> > diff --git a/migration/colo.c b/migration/colo.c
> >> > index bedb677788..8c6b8e9a4e 100644
> >> > --- a/migration/colo.c
> >> > +++ b/migration/colo.c
> >> > @@ -29,6 +29,7 @@
> >> >  #include "net/colo.h"
> >> >  #include "block/block.h"
> >> >  #include "qapi/qapi-events-migration.h"
> >> > +#include "qapi/qmp/qerror.h"
> >> >
> >> >  static bool vmstate_loading;
> >> >  static Notifier packets_compare_notifier;
> >> > @@ -237,6 +238,44 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
> >> >  #endif
> >> >  }
> >> >
> >> > +COLOStatus *qmp_query_colo_status(Error **errp)
> >> > +{
> >> > +    int state;
> >> > +    COLOStatus *s = g_new0(COLOStatus, 1);
> >> > +
> >> > +    s->mode = get_colo_mode();
> >> > +
> >> > +    switch (s->mode) {
> >> > +    case COLO_MODE_UNKNOWN:
> >> > +        error_setg(errp, "COLO is disabled");
> >> > +        state = MIGRATION_STATUS_NONE;
> >> > +        break;
> >> > +    case COLO_MODE_PRIMARY:
> >> > +        state = migrate_get_current()->state;
> >> > +        break;
> >> > +    case COLO_MODE_SECONDARY:
> >> > +        state = migration_incoming_get_current()->state;
> >> > +        break;
> >> > +    default:
> >> > +        abort();
> >> > +    }
> >> > +
> >> > +    s->colo_running = state == MIGRATION_STATUS_COLO;
> >> > +
> >> > +    switch (failover_get_state()) {
> >> > +    case FAILOVER_STATUS_NONE:
> >> > +        s->reason = COLO_EXIT_REASON_NONE;
> >> > +        break;
> >> > +    case FAILOVER_STATUS_REQUIRE:
> >> > +        s->reason = COLO_EXIT_REASON_REQUEST;
> >> > +        break;
> >> > +    default:
> >> > +        s->reason = COLO_EXIT_REASON_ERROR;
> >> > +    }
> >> > +
> >> > +    return s;
> >> > +}
> >> > +
> >> >  static void colo_send_message(QEMUFile *f, COLOMessage msg,
> >> >                                Error **errp)
> >> >  {
> >> > diff --git a/qapi/migration.json b/qapi/migration.json
> >> > index 93136ce5a0..356a370949 100644
> >> > --- a/qapi/migration.json
> >> > +++ b/qapi/migration.json
> >> > @@ -1231,6 +1231,40 @@
> >> >  ##
> >> >  { 'command': 'xen-colo-do-checkpoint' }
> >> >
> >> > +##
> >> > +# @COLOStatus:
> >> > +#
> >> > +# The result format for 'query-colo-status'.
> >> > +#
> >> > +# @mode: COLO running mode. If COLO is running, this field will
> return
> >> > +#        'primary' or 'secodary'.
> >> > +#
> >> > +# @colo-running: true if COLO is running.
> >> > +#
> >> > +# @reason: describes the reason for the COLO exit.
> >>
> >> What's the value of @reason before a "COLO exit"?
> >>
> >
> > Before a "COLO exit", we just return 'none' in this field.
>
> Please add that to the documentation.
>

OK.


>
> Please excuse my ignorance on COLO...  I'm still not sure I fully
> understand how the three members are related, or even how the COLO state
> machine works and how its related to / embedded in RunState.  I searched
> docs/ for a state diagram, but couldn't find one.
>
> According to runstate_transitions_def[], the part of the RunState state
> machine that's directly connected to state "colo" looks like this:
>
>     inmigrate  -+
>                 |
>     paused  ----+
>                 |
>     migrate  ---+->  colo  <------>  running
>                 |
>     suspended  -+
>                 |
>     watchdog  --+
>
> For each of the seven state transitions: how is the state transition
> triggered (e.g. by QMP command, spontaneously when a certain condition
> is detected, ...), and what events (if any) are emitted then?
>
>
When you start COLO, the VM always running in "MIGRATION_STATUS_COLO" still
occur failover.
And in the flow diagram, you can think COLO always running in migrate state.
Because into COLO mode, we will control VM state in COLO code itself, for
example:
When we start COLO, it will do the first migration as normal live
migration, after that we will enter
the COLO process, at that time COLO think the primary VM state is same with
secondary VM(the first checkpoint),
so we will use vm_start() start the primary VM(unlike to normal migration)
and secondary VM.
In this time, primary VM and secondary VM will parallel running, and if
COLO found two VM state are
not same, it will trigger checkpoint(like another migration). Finally, if
occurred some fault that will trigger
failover, after that primary VM maybe return to normal running
mode(secondary dead).
So, if we just see the primary VM state, may be it has out of the RunState
state
machine or it still in migrate state.




> How is @colo-running related to the run state?
>

Not related, as I say above.


>
> Which run states are considered to be "before a COLO exit"?  If "before
> a COLO exit" doesn't map to run states, the state machine is too coarse
> to fully describe COLO, and I'd like to see a suitably refined one.
>
>
COLO just is a special case. It's worthy to refined one?
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Any comments?



> If @colo-running is true, then @mode is either "primary" or "secondary".
> What are the possible values when @colo-running is false?
>

The @mode will in "unknown" state.


Thanks
Zhang Chen



>
> [...]
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass
  2018-06-11  7:02           ` Jason Wang
@ 2018-06-11 15:36             ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-11 15:36 UTC (permalink / raw)
  To: Jason Wang
  Cc: zhanghailiang, Li Zhijian, Juan Quintela, qemu-devel,
	Markus Armbruster, Paolo Bonzini, Dr . David Alan Gilbert

On Mon, Jun 11, 2018 at 3:02 PM, Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018年06月11日 14:46, Zhang Chen wrote:
>
>> On Mon, Jun 11, 2018 at 9:56 AM, Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>>> On 2018年06月10日 22:09, Zhang Chen wrote:
>>>
>>> I think when COLO is enabled, qemu should reject all other types of
>>>>
>>>>> backends.
>>>>>
>>>>> Yes, you are right.
>>>> May be we should add a assert here? Because we can not get backend info
>>>> when
>>>> colo-compare start up.
>>>> Have any idea?
>>>>
>>>>
>>>>
>>>> Is there a global variable or function to detect the COLO existence? If
>>> yes, maybe we could use that.
>>>
>>> May be we can reuse the "qmp_query_colo_status" in 11/17 patch.
>> I will add this check in next version.
>>
>
> I think it's better to have an internal helper. Calling qmp function from
> net is odd.
>
>
OK, I got it.

Thanks
Zhang Chen



> Thanks
>
>
>> Thanks
>> Zhang Chen
>>
>>
>> Thanks
>>>
>>>
>>>
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-11 15:34         ` Zhang Chen
@ 2018-06-13 16:50           ` Dr. David Alan Gilbert
  2018-06-14  8:42             ` Markus Armbruster
  0 siblings, 1 reply; 46+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-13 16:50 UTC (permalink / raw)
  To: Zhang Chen
  Cc: Markus Armbruster, zhanghailiang, Li Zhijian, Juan Quintela,
	Jason Wang, qemu-devel, Paolo Bonzini

* Zhang Chen (zhangckid@gmail.com) wrote:
> On Mon, Jun 11, 2018 at 2:48 PM, Markus Armbruster <armbru@redhat.com>
> wrote:
> 
> > Zhang Chen <zhangckid@gmail.com> writes:
> >
> > > On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <armbru@redhat.com>
> > wrote:
> > >
> > >> Zhang Chen <zhangckid@gmail.com> writes:
> > >>
> > >> > Libvirt or other high level software can use this command query colo
> > >> status.
> > >> > You can test this command like that:
> > >> > {'execute':'query-colo-status'}
> > >> >
> > >> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> > >> > ---
> > >> >  migration/colo.c    | 39 +++++++++++++++++++++++++++++++++++++++
> > >> >  qapi/migration.json | 34 ++++++++++++++++++++++++++++++++++
> > >> >  2 files changed, 73 insertions(+)
> > >> >
> > >> > diff --git a/migration/colo.c b/migration/colo.c
> > >> > index bedb677788..8c6b8e9a4e 100644
> > >> > --- a/migration/colo.c
> > >> > +++ b/migration/colo.c
> > >> > @@ -29,6 +29,7 @@
> > >> >  #include "net/colo.h"
> > >> >  #include "block/block.h"
> > >> >  #include "qapi/qapi-events-migration.h"
> > >> > +#include "qapi/qmp/qerror.h"
> > >> >
> > >> >  static bool vmstate_loading;
> > >> >  static Notifier packets_compare_notifier;
> > >> > @@ -237,6 +238,44 @@ void qmp_xen_colo_do_checkpoint(Error **errp)
> > >> >  #endif
> > >> >  }
> > >> >
> > >> > +COLOStatus *qmp_query_colo_status(Error **errp)
> > >> > +{
> > >> > +    int state;
> > >> > +    COLOStatus *s = g_new0(COLOStatus, 1);
> > >> > +
> > >> > +    s->mode = get_colo_mode();
> > >> > +
> > >> > +    switch (s->mode) {
> > >> > +    case COLO_MODE_UNKNOWN:
> > >> > +        error_setg(errp, "COLO is disabled");
> > >> > +        state = MIGRATION_STATUS_NONE;
> > >> > +        break;
> > >> > +    case COLO_MODE_PRIMARY:
> > >> > +        state = migrate_get_current()->state;
> > >> > +        break;
> > >> > +    case COLO_MODE_SECONDARY:
> > >> > +        state = migration_incoming_get_current()->state;
> > >> > +        break;
> > >> > +    default:
> > >> > +        abort();
> > >> > +    }
> > >> > +
> > >> > +    s->colo_running = state == MIGRATION_STATUS_COLO;
> > >> > +
> > >> > +    switch (failover_get_state()) {
> > >> > +    case FAILOVER_STATUS_NONE:
> > >> > +        s->reason = COLO_EXIT_REASON_NONE;
> > >> > +        break;
> > >> > +    case FAILOVER_STATUS_REQUIRE:
> > >> > +        s->reason = COLO_EXIT_REASON_REQUEST;
> > >> > +        break;
> > >> > +    default:
> > >> > +        s->reason = COLO_EXIT_REASON_ERROR;
> > >> > +    }
> > >> > +
> > >> > +    return s;
> > >> > +}
> > >> > +
> > >> >  static void colo_send_message(QEMUFile *f, COLOMessage msg,
> > >> >                                Error **errp)
> > >> >  {
> > >> > diff --git a/qapi/migration.json b/qapi/migration.json
> > >> > index 93136ce5a0..356a370949 100644
> > >> > --- a/qapi/migration.json
> > >> > +++ b/qapi/migration.json
> > >> > @@ -1231,6 +1231,40 @@
> > >> >  ##
> > >> >  { 'command': 'xen-colo-do-checkpoint' }
> > >> >
> > >> > +##
> > >> > +# @COLOStatus:
> > >> > +#
> > >> > +# The result format for 'query-colo-status'.
> > >> > +#
> > >> > +# @mode: COLO running mode. If COLO is running, this field will
> > return
> > >> > +#        'primary' or 'secodary'.
> > >> > +#
> > >> > +# @colo-running: true if COLO is running.
> > >> > +#
> > >> > +# @reason: describes the reason for the COLO exit.
> > >>
> > >> What's the value of @reason before a "COLO exit"?
> > >>
> > >
> > > Before a "COLO exit", we just return 'none' in this field.
> >
> > Please add that to the documentation.
> >
> 
> OK.
> 
> 
> >
> > Please excuse my ignorance on COLO...  I'm still not sure I fully
> > understand how the three members are related, or even how the COLO state
> > machine works and how its related to / embedded in RunState.  I searched
> > docs/ for a state diagram, but couldn't find one.
> >
> > According to runstate_transitions_def[], the part of the RunState state
> > machine that's directly connected to state "colo" looks like this:
> >
> >     inmigrate  -+
> >                 |
> >     paused  ----+
> >                 |
> >     migrate  ---+->  colo  <------>  running
> >                 |
> >     suspended  -+
> >                 |
> >     watchdog  --+
> >
> > For each of the seven state transitions: how is the state transition
> > triggered (e.g. by QMP command, spontaneously when a certain condition
> > is detected, ...), and what events (if any) are emitted then?
> >
> >
> When you start COLO, the VM always running in "MIGRATION_STATUS_COLO" still
> occur failover.
> And in the flow diagram, you can think COLO always running in migrate state.
> Because into COLO mode, we will control VM state in COLO code itself, for
> example:
> When we start COLO, it will do the first migration as normal live
> migration, after that we will enter
> the COLO process, at that time COLO think the primary VM state is same with
> secondary VM(the first checkpoint),
> so we will use vm_start() start the primary VM(unlike to normal migration)
> and secondary VM.
> In this time, primary VM and secondary VM will parallel running, and if
> COLO found two VM state are
> not same, it will trigger checkpoint(like another migration). Finally, if
> occurred some fault that will trigger
> failover, after that primary VM maybe return to normal running
> mode(secondary dead).
> So, if we just see the primary VM state, may be it has out of the RunState
> state
> machine or it still in migrate state.
> 
> 
> 
> 
> > How is @colo-running related to the run state?
> >
> 
> Not related, as I say above.

Right; this is a different type of 'running' - it might be better to say
'active' rather than running.

  COLO has a pair of VMs in sync with a constant stream of migrations
between them.
The 'mode' is whether it's the source (primary) or destination (secondary) VM.
(Also sometimes written PVM/SVM)

If COLO fails for some reason (e.g. the
secondary host fails) then I think this is saying the 'colo-running'
would be false.

Some monitoring tool would be watching this to make sure you
really do have a redundent pair of VMs, and if one of them failed
you'd want to know and alert.

Dave

> > Which run states are considered to be "before a COLO exit"?  If "before
> > a COLO exit" doesn't map to run states, the state machine is too coarse
> > to fully describe COLO, and I'd like to see a suitably refined one.
> >
> >
> COLO just is a special case. It's worthy to refined one?
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Any comments?
> 
> 
> 
> > If @colo-running is true, then @mode is either "primary" or "secondary".
> > What are the possible values when @colo-running is false?
> >
> 
> The @mode will in "unknown" state.
> 
> 
> Thanks
> Zhang Chen
> 
> 
> 
> >
> > [...]
> >
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-13 16:50           ` Dr. David Alan Gilbert
@ 2018-06-14  8:42             ` Markus Armbruster
  2018-06-14  9:25               ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Armbruster @ 2018-06-14  8:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Zhang Chen, zhanghailiang, Li Zhijian, Juan Quintela, Jason Wang,
	qemu-devel, Markus Armbruster, Paolo Bonzini

"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:

> * Zhang Chen (zhangckid@gmail.com) wrote:
>> On Mon, Jun 11, 2018 at 2:48 PM, Markus Armbruster <armbru@redhat.com>
>> wrote:
>> 
>> > Zhang Chen <zhangckid@gmail.com> writes:
>> >
>> > > On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <armbru@redhat.com> wrote:
>> > >
>> > >> Zhang Chen <zhangckid@gmail.com> writes:
>> > >>
>> > >> > Libvirt or other high level software can use this command query colo status.
>> > >> > You can test this command like that:
>> > >> > {'execute':'query-colo-status'}
>> > >> >
>> > >> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
[...]
>> > >> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > >> > index 93136ce5a0..356a370949 100644
>> > >> > --- a/qapi/migration.json
>> > >> > +++ b/qapi/migration.json
>> > >> > @@ -1231,6 +1231,40 @@
>> > >> >  ##
>> > >> >  { 'command': 'xen-colo-do-checkpoint' }
>> > >> >
>> > >> > +##
>> > >> > +# @COLOStatus:
>> > >> > +#
>> > >> > +# The result format for 'query-colo-status'.
>> > >> > +#
>> > >> > +# @mode: COLO running mode. If COLO is running, this field will return
>> > >> > +#        'primary' or 'secodary'.
>> > >> > +#
>> > >> > +# @colo-running: true if COLO is running.
>> > >> > +#
>> > >> > +# @reason: describes the reason for the COLO exit.
>> > >>
>> > >> What's the value of @reason before a "COLO exit"?
>> > >>
>> > >
>> > > Before a "COLO exit", we just return 'none' in this field.
>> >
>> > Please add that to the documentation.
>> >
>> 
>> OK.
>> 
>> 
>> >
>> > Please excuse my ignorance on COLO...  I'm still not sure I fully
>> > understand how the three members are related, or even how the COLO state
>> > machine works and how its related to / embedded in RunState.  I searched
>> > docs/ for a state diagram, but couldn't find one.
>> >
>> > According to runstate_transitions_def[], the part of the RunState state
>> > machine that's directly connected to state "colo" looks like this:
>> >
>> >     inmigrate  -+
>> >                 |
>> >     paused  ----+
>> >                 |
>> >     migrate  ---+->  colo  <------>  running
>> >                 |
>> >     suspended  -+
>> >                 |
>> >     watchdog  --+
>> >
>> > For each of the seven state transitions: how is the state transition
>> > triggered (e.g. by QMP command, spontaneously when a certain condition
>> > is detected, ...), and what events (if any) are emitted then?
>> >
>> >
>> When you start COLO, the VM always running in "MIGRATION_STATUS_COLO" still
>> occur failover.
>> And in the flow diagram, you can think COLO always running in migrate state.
>> Because into COLO mode, we will control VM state in COLO code itself, for
>> example:
>> When we start COLO, it will do the first migration as normal live
>> migration, after that we will enter
>> the COLO process, at that time COLO think the primary VM state is same with
>> secondary VM(the first checkpoint),
>> so we will use vm_start() start the primary VM(unlike to normal migration)
>> and secondary VM.
>> In this time, primary VM and secondary VM will parallel running, and if
>> COLO found two VM state are
>> not same, it will trigger checkpoint(like another migration). Finally, if
>> occurred some fault that will trigger
>> failover, after that primary VM maybe return to normal running
>> mode(secondary dead).
>> So, if we just see the primary VM state, may be it has out of the RunState
>> state
>> machine or it still in migrate state.
>> 
>> 
>> 
>> 
>> > How is @colo-running related to the run state?
>> >
>> 
>> Not related, as I say above.
>
> Right; this is a different type of 'running' - it might be better to say
> 'active' rather than running.

Rename?

>   COLO has a pair of VMs in sync with a constant stream of migrations
> between them.
> The 'mode' is whether it's the source (primary) or destination (secondary) VM.
> (Also sometimes written PVM/SVM)
>
> If COLO fails for some reason (e.g. the
> secondary host fails) then I think this is saying the 'colo-running'
> would be false.
>
> Some monitoring tool would be watching this to make sure you
> really do have a redundent pair of VMs, and if one of them failed
> you'd want to know and alert.

Let me try to explain what I learned in my own words, so you can correct
my misunderstandings.

A VM doing COLO is either the primary or the secondary of a pair.  A
monitoring process watches them.

At some time, it enters MigrationStatus 'colo'.  Peeking at the code, it
looks like it enters it from state 'active', and never leaves it.  This
happens after we successfully created the secondary by migrating the
primary.

Aside: migrate_set_state() appears to do nothing when @old_state doesn't
match @state, yet callers appear to assume it works.  Feels brittle.  Am
I confused?

The monitoring process orchestrates fault tolerance:

* It initially creates the secondary by migrating the primary.  This is
  called the first checkpoint.

* If the primary goes down, the monitor sends x-colo-lost-heartbeat to
  the secondary.  The secondary becomes the primary, and we create a new
  secondary by live-migrating the primary.

* If the secondary goes down or out of sync, we abandon it and send
  x-colo-lost-heartbeat to the primary.  We can then create a new
  secondary by live-migrating the primary.  This is called another
  checkpoint.

x-colo-lost-heartbeat's doc comment:

# Tell qemu that heartbeat is lost, request it to do takeover procedures.
# If this command is sent to the PVM, the Primary side will exit COLO mode.

What does "exiting COLO mode" mean, and how is it reflected in
ColoStatus member mode?  Do we reenter COLO mode eventually?  How?

# If sent to the Secondary, the Secondary side will run failover work,
# then takes over server operation to become the service VM.

Undefined term "service VM".  Do you mean primary VM?

Cases:

(1) This VM isn't doing COLO.  ColoStatus:

    { "mode": "unknown",
      "running": false,
      "reason": "none" }

(2) This VM is a COLO primary

(2a) and it hasn't received x-colo-lost-heartbeat since it last became
     primary.  ColoStatus:

    { "mode": "primary",
      "running": true,          # I guess
      "reason": "none" }

(2b) and it has received x-colo-lost-heartbeat since it last became
     primary

    { "mode": "primary",
      "running": true,          # I guess
      "reason": "request" }

(2c) and it has run into some error condition I don't understand (but
     probably should)

    { "mode": "primary",
      "running": true,          # I guess
      "reason": "error" }

(3) This VM is a COLO secondary

(3a-c) like (2a-c)

If that's correct (and I doubt it), then @running is entirely redundant:
it's false if and only if @mode is "unknown".

Speaking of mode "unknown": that's a bad name.  "none" would be better.
Or maybe query-colo-status should fail in case (1), to get rid of it at
the interface entirely.

We really, really, really need a state diagram complete with QMP
commands and events.  COLO-FT.txt covers architecture and provides an
example, but it's entirely inadequate at explaining how the QMP commands
and events fit in, and their doc comments don't really help.  I feel
this is the reason why we're at v8 and I'm still groping in the dark,
unable to pass judgement on the proposed QAPI schema changes.

[...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-14  8:42             ` Markus Armbruster
@ 2018-06-14  9:25               ` Dr. David Alan Gilbert
  2018-06-19  4:00                 ` Zhang Chen
  0 siblings, 1 reply; 46+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-14  9:25 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Zhang Chen, zhanghailiang, Li Zhijian, Juan Quintela, Jason Wang,
	qemu-devel, Paolo Bonzini

* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Zhang Chen (zhangckid@gmail.com) wrote:
> >> On Mon, Jun 11, 2018 at 2:48 PM, Markus Armbruster <armbru@redhat.com>
> >> wrote:
> >> 
> >> > Zhang Chen <zhangckid@gmail.com> writes:
> >> >
> >> > > On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <armbru@redhat.com> wrote:
> >> > >
> >> > >> Zhang Chen <zhangckid@gmail.com> writes:
> >> > >>
> >> > >> > Libvirt or other high level software can use this command query colo status.
> >> > >> > You can test this command like that:
> >> > >> > {'execute':'query-colo-status'}
> >> > >> >
> >> > >> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> [...]
> >> > >> > diff --git a/qapi/migration.json b/qapi/migration.json
> >> > >> > index 93136ce5a0..356a370949 100644
> >> > >> > --- a/qapi/migration.json
> >> > >> > +++ b/qapi/migration.json
> >> > >> > @@ -1231,6 +1231,40 @@
> >> > >> >  ##
> >> > >> >  { 'command': 'xen-colo-do-checkpoint' }
> >> > >> >
> >> > >> > +##
> >> > >> > +# @COLOStatus:
> >> > >> > +#
> >> > >> > +# The result format for 'query-colo-status'.
> >> > >> > +#
> >> > >> > +# @mode: COLO running mode. If COLO is running, this field will return
> >> > >> > +#        'primary' or 'secodary'.
> >> > >> > +#
> >> > >> > +# @colo-running: true if COLO is running.
> >> > >> > +#
> >> > >> > +# @reason: describes the reason for the COLO exit.
> >> > >>
> >> > >> What's the value of @reason before a "COLO exit"?
> >> > >>
> >> > >
> >> > > Before a "COLO exit", we just return 'none' in this field.
> >> >
> >> > Please add that to the documentation.
> >> >
> >> 
> >> OK.
> >> 
> >> 
> >> >
> >> > Please excuse my ignorance on COLO...  I'm still not sure I fully
> >> > understand how the three members are related, or even how the COLO state
> >> > machine works and how its related to / embedded in RunState.  I searched
> >> > docs/ for a state diagram, but couldn't find one.
> >> >
> >> > According to runstate_transitions_def[], the part of the RunState state
> >> > machine that's directly connected to state "colo" looks like this:
> >> >
> >> >     inmigrate  -+
> >> >                 |
> >> >     paused  ----+
> >> >                 |
> >> >     migrate  ---+->  colo  <------>  running
> >> >                 |
> >> >     suspended  -+
> >> >                 |
> >> >     watchdog  --+
> >> >
> >> > For each of the seven state transitions: how is the state transition
> >> > triggered (e.g. by QMP command, spontaneously when a certain condition
> >> > is detected, ...), and what events (if any) are emitted then?
> >> >
> >> >
> >> When you start COLO, the VM always running in "MIGRATION_STATUS_COLO" still
> >> occur failover.
> >> And in the flow diagram, you can think COLO always running in migrate state.
> >> Because into COLO mode, we will control VM state in COLO code itself, for
> >> example:
> >> When we start COLO, it will do the first migration as normal live
> >> migration, after that we will enter
> >> the COLO process, at that time COLO think the primary VM state is same with
> >> secondary VM(the first checkpoint),
> >> so we will use vm_start() start the primary VM(unlike to normal migration)
> >> and secondary VM.
> >> In this time, primary VM and secondary VM will parallel running, and if
> >> COLO found two VM state are
> >> not same, it will trigger checkpoint(like another migration). Finally, if
> >> occurred some fault that will trigger
> >> failover, after that primary VM maybe return to normal running
> >> mode(secondary dead).
> >> So, if we just see the primary VM state, may be it has out of the RunState
> >> state
> >> machine or it still in migrate state.
> >> 
> >> 
> >> 
> >> 
> >> > How is @colo-running related to the run state?
> >> >
> >> 
> >> Not related, as I say above.
> >
> > Right; this is a different type of 'running' - it might be better to say
> > 'active' rather than running.
> 
> Rename?
> 
> >   COLO has a pair of VMs in sync with a constant stream of migrations
> > between them.
> > The 'mode' is whether it's the source (primary) or destination (secondary) VM.
> > (Also sometimes written PVM/SVM)
> >
> > If COLO fails for some reason (e.g. the
> > secondary host fails) then I think this is saying the 'colo-running'
> > would be false.
> >
> > Some monitoring tool would be watching this to make sure you
> > really do have a redundent pair of VMs, and if one of them failed
> > you'd want to know and alert.
> 
> Let me try to explain what I learned in my own words, so you can correct
> my misunderstandings.
> 
> A VM doing COLO is either the primary or the secondary of a pair.  A
> monitoring process watches them.

Right

> At some time, it enters MigrationStatus 'colo'.  Peeking at the code, it
> looks like it enters it from state 'active', and never leaves it.  This
> happens after we successfully created the secondary by migrating the
> primary.

Yes, I think that's right.

> Aside: migrate_set_state() appears to do nothing when @old_state doesn't
> match @state, yet callers appear to assume it works.  Feels brittle.  Am
> I confused?

It's an atomic-compare-exchange used to set the state; most of the time you only
care about the fact it's atomic and you know the state you expect to be
coming from; normally the cases where this isn't right are failure
paths, but those are explicitly checked by checking error states.
There are some places where we explicitly check the exchanged value but
they're pretty rare, and are normally special cases (e.g. when forcing a
cancel).

> The monitoring process orchestrates fault tolerance:
> 
> * It initially creates the secondary by migrating the primary.  This is
>   called the first checkpoint.

Right.

(And the step you haven't mentioned; that we keep sending checkpoints)

> * If the primary goes down, the monitor sends x-colo-lost-heartbeat to
>   the secondary.  The secondary becomes the primary, and we create a new
>   secondary by live-migrating the primary.

I don't think there's mechanisms yet for resyncing to bring a failed
pair back into a new pair - so you survive one failure at the moment.
(I might be wrong, that was the case previously)

> * If the secondary goes down or out of sync, we abandon it and send
>   x-colo-lost-heartbeat to the primary.  We can then create a new
>   secondary by live-migrating the primary.  This is called another
>   checkpoint.

Yes

> x-colo-lost-heartbeat's doc comment:
> 
> # Tell qemu that heartbeat is lost, request it to do takeover procedures.
> # If this command is sent to the PVM, the Primary side will exit COLO mode.
> 
> What does "exiting COLO mode" mean

The VM is running unprotected - there's no migration/checkpointing.  At
that point it's pretty much just a normal VM.

> and how is it reflected in
> ColoStatus member mode?  Do we reenter COLO mode eventually?  How?

I'm not sure of the status in that case (I'll leave that to Zhang Chen)
but at that point it's just a normal VM; so I think we go through the
startup-like path of having to do that first migration again.

> # If sent to the Secondary, the Secondary side will run failover work,
> # then takes over server operation to become the service VM.
> 
> Undefined term "service VM".  Do you mean primary VM?

I think that means the VM that's actually running the workload; at that
point there is no primary/secondary any more because COLO isn't
synchronising.

> Cases:
> 
> (1) This VM isn't doing COLO.  ColoStatus:
> 
>     { "mode": "unknown",
>       "running": false,
>       "reason": "none" }
> 
> (2) This VM is a COLO primary
> 
> (2a) and it hasn't received x-colo-lost-heartbeat since it last became
>      primary.  ColoStatus:
> 
>     { "mode": "primary",
>       "running": true,          # I guess
>       "reason": "none" }
> 
> (2b) and it has received x-colo-lost-heartbeat since it last became
>      primary
> 
>     { "mode": "primary",
>       "running": true,          # I guess
>       "reason": "request" }
> 
> (2c) and it has run into some error condition I don't understand (but
>      probably should)
> 
>     { "mode": "primary",
>       "running": true,          # I guess
>       "reason": "error" }
> 
> (3) This VM is a COLO secondary
> 
> (3a-c) like (2a-c)
> 
> If that's correct (and I doubt it), then @running is entirely redundant:
> it's false if and only if @mode is "unknown".

That's probably true; both fields do derive from the migration state;
I think the mode is primary if you're outgoing migration state is COLO,
it's secondary if you're incoming state is COLO, and unknown if neither
state is COLO.  And 'running' is the OR of those.  
Note that there's one other piece of state, the 'colo' migration
capability (that is displayed in the normal capabilities stuff).

So for example, if you're in the process of starting COLO up,
your colo capability is set, your migration mode is still normal
migration setup/active/complete - so these would still show
'unknown/false/none' which probably could be better.

> Speaking of mode "unknown": that's a bad name.  "none" would be better.
> Or maybe query-colo-status should fail in case (1), to get rid of it at
> the interface entirely.
> 
> We really, really, really need a state diagram complete with QMP
> commands and events.  COLO-FT.txt covers architecture and provides an
> example, but it's entirely inadequate at explaining how the QMP commands
> and events fit in, and their doc comments don't really help.  I feel
> this is the reason why we're at v8 and I'm still groping in the dark,
> unable to pass judgement on the proposed QAPI schema changes.

COLO is a big series that touches lots of bits of QEMU (and has bounced
through the hands of a few people); most of the iterations haven't been
that much about the interface.

Dave

> [...]
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status
  2018-06-14  9:25               ` Dr. David Alan Gilbert
@ 2018-06-19  4:00                 ` Zhang Chen
  0 siblings, 0 replies; 46+ messages in thread
From: Zhang Chen @ 2018-06-19  4:00 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Markus Armbruster, zhanghailiang, Li Zhijian, Juan Quintela,
	Jason Wang, qemu-devel, Paolo Bonzini

On Thu, Jun 14, 2018 at 5:25 PM, Dr. David Alan Gilbert <dgilbert@redhat.com
> wrote:

> * Markus Armbruster (armbru@redhat.com) wrote:
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> >
> > > * Zhang Chen (zhangckid@gmail.com) wrote:
> > >> On Mon, Jun 11, 2018 at 2:48 PM, Markus Armbruster <armbru@redhat.com
> >
> > >> wrote:
> > >>
> > >> > Zhang Chen <zhangckid@gmail.com> writes:
> > >> >
> > >> > > On Thu, Jun 7, 2018 at 8:59 PM, Markus Armbruster <
> armbru@redhat.com> wrote:
> > >> > >
> > >> > >> Zhang Chen <zhangckid@gmail.com> writes:
> > >> > >>
> > >> > >> > Libvirt or other high level software can use this command
> query colo status.
> > >> > >> > You can test this command like that:
> > >> > >> > {'execute':'query-colo-status'}
> > >> > >> >
> > >> > >> > Signed-off-by: Zhang Chen <zhangckid@gmail.com>
> > [...]
> > >> > >> > diff --git a/qapi/migration.json b/qapi/migration.json
> > >> > >> > index 93136ce5a0..356a370949 100644
> > >> > >> > --- a/qapi/migration.json
> > >> > >> > +++ b/qapi/migration.json
> > >> > >> > @@ -1231,6 +1231,40 @@
> > >> > >> >  ##
> > >> > >> >  { 'command': 'xen-colo-do-checkpoint' }
> > >> > >> >
> > >> > >> > +##
> > >> > >> > +# @COLOStatus:
> > >> > >> > +#
> > >> > >> > +# The result format for 'query-colo-status'.
> > >> > >> > +#
> > >> > >> > +# @mode: COLO running mode. If COLO is running, this field
> will return
> > >> > >> > +#        'primary' or 'secodary'.
> > >> > >> > +#
> > >> > >> > +# @colo-running: true if COLO is running.
> > >> > >> > +#
> > >> > >> > +# @reason: describes the reason for the COLO exit.
> > >> > >>
> > >> > >> What's the value of @reason before a "COLO exit"?
> > >> > >>
> > >> > >
> > >> > > Before a "COLO exit", we just return 'none' in this field.
> > >> >
> > >> > Please add that to the documentation.
> > >> >
> > >>
> > >> OK.
> > >>
> > >>
> > >> >
> > >> > Please excuse my ignorance on COLO...  I'm still not sure I fully
> > >> > understand how the three members are related, or even how the COLO
> state
> > >> > machine works and how its related to / embedded in RunState.  I
> searched
> > >> > docs/ for a state diagram, but couldn't find one.
> > >> >
> > >> > According to runstate_transitions_def[], the part of the RunState
> state
> > >> > machine that's directly connected to state "colo" looks like this:
> > >> >
> > >> >     inmigrate  -+
> > >> >                 |
> > >> >     paused  ----+
> > >> >                 |
> > >> >     migrate  ---+->  colo  <------>  running
> > >> >                 |
> > >> >     suspended  -+
> > >> >                 |
> > >> >     watchdog  --+
> > >> >
> > >> > For each of the seven state transitions: how is the state transition
> > >> > triggered (e.g. by QMP command, spontaneously when a certain
> condition
> > >> > is detected, ...), and what events (if any) are emitted then?
> > >> >
> > >> >
> > >> When you start COLO, the VM always running in "MIGRATION_STATUS_COLO"
> still
> > >> occur failover.
> > >> And in the flow diagram, you can think COLO always running in migrate
> state.
> > >> Because into COLO mode, we will control VM state in COLO code itself,
> for
> > >> example:
> > >> When we start COLO, it will do the first migration as normal live
> > >> migration, after that we will enter
> > >> the COLO process, at that time COLO think the primary VM state is
> same with
> > >> secondary VM(the first checkpoint),
> > >> so we will use vm_start() start the primary VM(unlike to normal
> migration)
> > >> and secondary VM.
> > >> In this time, primary VM and secondary VM will parallel running, and
> if
> > >> COLO found two VM state are
> > >> not same, it will trigger checkpoint(like another migration).
> Finally, if
> > >> occurred some fault that will trigger
> > >> failover, after that primary VM maybe return to normal running
> > >> mode(secondary dead).
> > >> So, if we just see the primary VM state, may be it has out of the
> RunState
> > >> state
> > >> machine or it still in migrate state.
> > >>
> > >>
> > >>
> > >>
> > >> > How is @colo-running related to the run state?
> > >> >
> > >>
> > >> Not related, as I say above.
> > >
> > > Right; this is a different type of 'running' - it might be better to
> say
> > > 'active' rather than running.
> >
> > Rename?
>

OK, I will rename it in next version.



> >
> > >   COLO has a pair of VMs in sync with a constant stream of migrations
> > > between them.
> > > The 'mode' is whether it's the source (primary) or destination
> (secondary) VM.
> > > (Also sometimes written PVM/SVM)
> > >
> > > If COLO fails for some reason (e.g. the
> > > secondary host fails) then I think this is saying the 'colo-running'
> > > would be false.
> > >
> > > Some monitoring tool would be watching this to make sure you
> > > really do have a redundent pair of VMs, and if one of them failed
> > > you'd want to know and alert.
> >
> > Let me try to explain what I learned in my own words, so you can correct
> > my misunderstandings.
> >
> > A VM doing COLO is either the primary or the secondary of a pair.  A
> > monitoring process watches them.
>
> Right
>
> > At some time, it enters MigrationStatus 'colo'.  Peeking at the code, it
> > looks like it enters it from state 'active', and never leaves it.  This
> > happens after we successfully created the secondary by migrating the
> > primary.
>
> Yes, I think that's right.
>
> > Aside: migrate_set_state() appears to do nothing when @old_state doesn't
> > match @state, yet callers appear to assume it works.  Feels brittle.  Am
> > I confused?
>
> It's an atomic-compare-exchange used to set the state; most of the time
> you only
> care about the fact it's atomic and you know the state you expect to be
> coming from; normally the cases where this isn't right are failure
> paths, but those are explicitly checked by checking error states.
> There are some places where we explicitly check the exchanged value but
> they're pretty rare, and are normally special cases (e.g. when forcing a
> cancel).
>
> > The monitoring process orchestrates fault tolerance:
> >
> > * It initially creates the secondary by migrating the primary.  This is
> >   called the first checkpoint.
>
> Right.
>
> (And the step you haven't mentioned; that we keep sending checkpoints)
>
> > * If the primary goes down, the monitor sends x-colo-lost-heartbeat to
> >   the secondary.  The secondary becomes the primary, and we create a new
> >   secondary by live-migrating the primary.
>
> I don't think there's mechanisms yet for resyncing to bring a failed
> pair back into a new pair - so you survive one failure at the moment.
> (I might be wrong, that was the case previously)
>


Dave is right. In qemu side, we just provide some qmp command to upper
layer software,
Like openstack or libvirt, user can implement some policy on high level
then call the qmp
command to use COLO.



>
> > * If the secondary goes down or out of sync, we abandon it and send
> >   x-colo-lost-heartbeat to the primary.  We can then create a new
> >   secondary by live-migrating the primary.  This is called another
> >   checkpoint.
>
> Yes
>
> > x-colo-lost-heartbeat's doc comment:
> >
> > # Tell qemu that heartbeat is lost, request it to do takeover procedures.
> > # If this command is sent to the PVM, the Primary side will exit COLO
> mode.
> >
> > What does "exiting COLO mode" mean
>
> The VM is running unprotected - there's no migration/checkpointing.  At
> that point it's pretty much just a normal VM.
>
> > and how is it reflected in
> > ColoStatus member mode?  Do we reenter COLO mode eventually?  How?
>
> I'm not sure of the status in that case (I'll leave that to Zhang Chen)
> but at that point it's just a normal VM; so I think we go through the
> startup-like path of having to do that first migration again.
>
> > # If sent to the Secondary, the Secondary side will run failover work,
> > # then takes over server operation to become the service VM.
> >
> > Undefined term "service VM".  Do you mean primary VM?
>
> I think that means the VM that's actually running the workload; at that
> point there is no primary/secondary any more because COLO isn't
> synchronising.
>
> > Cases:
> >
> > (1) This VM isn't doing COLO.  ColoStatus:
> >
> >     { "mode": "unknown",
> >       "running": false,
> >       "reason": "none" }
> >
> > (2) This VM is a COLO primary
> >
> > (2a) and it hasn't received x-colo-lost-heartbeat since it last became
> >      primary.  ColoStatus:
> >
> >     { "mode": "primary",
> >       "running": true,          # I guess
> >       "reason": "none" }
> >
> > (2b) and it has received x-colo-lost-heartbeat since it last became
> >      primary
> >
> >     { "mode": "primary",
> >       "running": true,          # I guess
> >       "reason": "request" }
> >
> > (2c) and it has run into some error condition I don't understand (but
> >      probably should)
> >
> >     { "mode": "primary",
> >       "running": true,          # I guess
> >       "reason": "error" }
> >
> > (3) This VM is a COLO secondary
> >
> > (3a-c) like (2a-c)
> >
> > If that's correct (and I doubt it), then @running is entirely redundant:
> > it's false if and only if @mode is "unknown".
>
> That's probably true; both fields do derive from the migration state;
> I think the mode is primary if you're outgoing migration state is COLO,
> it's secondary if you're incoming state is COLO, and unknown if neither
> state is COLO.  And 'running' is the OR of those.
> Note that there's one other piece of state, the 'colo' migration
> capability (that is displayed in the normal capabilities stuff).
>
> So for example, if you're in the process of starting COLO up,
> your colo capability is set, your migration mode is still normal
> migration setup/active/complete - so these would still show
> 'unknown/false/none' which probably could be better.
>
> > Speaking of mode "unknown": that's a bad name.  "none" would be better.
> > Or maybe query-colo-status should fail in case (1), to get rid of it at
> > the interface entirely.
>

Currently in case (1):

{'execute':'query-colo-status'}{"return": {}}

{"error": {"class": "GenericError", "desc": "COLO is disabled"}}



> >
> > We really, really, really need a state diagram complete with QMP
> > commands and events.  COLO-FT.txt covers architecture and provides an
> > example, but it's entirely inadequate at explaining how the QMP commands
> > and events fit in, and their doc comments don't really help.  I feel
> > this is the reason why we're at v8 and I'm still groping in the dark,
> > unable to pass judgement on the proposed QAPI schema changes.
>
>
OK, I got it, I will try to add new patch about the state diagram complete
with QMP commands and events in the COLO-FT.txt.



> COLO is a big series that touches lots of bits of QEMU (and has bounced
> through the hands of a few people); most of the iterations haven't been
> that much about the interface.
>


Yes, I will add the state diagram as Markus said.

Thanks
Zhang Chen


>
> Dave
>
> > [...]
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2018-06-19  4:00 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-03  5:05 [Qemu-devel] [PATCH V8 00/17] COLO: integrate colo frame with block replication and COLO proxy Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 01/17] filter-rewriter: fix memory leak for connection in connection_track_table Zhang Chen
2018-06-04  5:51   ` Jason Wang
2018-06-10 14:08     ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 02/17] colo-compare: implement the process of checkpoint Zhang Chen
2018-06-04  6:31   ` Jason Wang
2018-06-10 14:08     ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 03/17] colo-compare: use notifier to notify packets comparing result Zhang Chen
2018-06-04  6:36   ` Jason Wang
2018-06-10 14:09     ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 04/17] COLO: integrate colo compare with colo frame Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 05/17] COLO: Add block replication into colo process Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 06/17] COLO: Remove colo_state migration struct Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 07/17] COLO: Load dirty pages into SVM's RAM cache firstly Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 08/17] ram/COLO: Record the dirty pages that SVM received Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 09/17] COLO: Flush memory data from ram cache Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 10/17] qmp event: Add COLO_EXIT event to notify users while exited COLO Zhang Chen
2018-06-04 22:23   ` Eric Blake
2018-06-07 12:54     ` Markus Armbruster
2018-06-10 17:24       ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 11/17] qapi: Add new command to query colo status Zhang Chen
2018-06-04 22:23   ` Eric Blake
2018-06-10 17:42     ` Zhang Chen
2018-06-10 17:53       ` Zhang Chen
2018-06-07 12:59   ` Markus Armbruster
2018-06-10 17:39     ` Zhang Chen
2018-06-11  6:48       ` Markus Armbruster
2018-06-11 15:34         ` Zhang Chen
2018-06-13 16:50           ` Dr. David Alan Gilbert
2018-06-14  8:42             ` Markus Armbruster
2018-06-14  9:25               ` Dr. David Alan Gilbert
2018-06-19  4:00                 ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 12/17] savevm: split the process of different stages for loadvm/savevm Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 13/17] COLO: flush host dirty ram from cache Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 14/17] filter: Add handle_event method for NetFilterClass Zhang Chen
2018-06-04  6:57   ` Jason Wang
2018-06-10 14:09     ` Zhang Chen
2018-06-11  1:56       ` Jason Wang
2018-06-11  6:46         ` Zhang Chen
2018-06-11  7:02           ` Jason Wang
2018-06-11 15:36             ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 15/17] filter-rewriter: handle checkpoint and failover event Zhang Chen
2018-06-04  7:42   ` Jason Wang
2018-06-10 17:20     ` Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 16/17] COLO: notify net filters about checkpoint/failover event Zhang Chen
2018-06-03  5:05 ` [Qemu-devel] [PATCH V8 17/17] COLO: quick failover process by kick COLO thread Zhang Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.