All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/3] colo-compare: Preparing work for combining with COLO frame
@ 2017-01-24 14:05 zhanghailiang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage zhanghailiang
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: zhanghailiang @ 2017-01-24 14:05 UTC (permalink / raw)
  To: qemu-devel, jasowang
  Cc: zhangchen.fnst, lizhijian, zhanghailiang, Juan Quintela,
	Amit Shah, Dr . David Alan Gilbert, eddie.dong, Xu Quan

Hi,

This series is some preparing works for integrating colo-compare into COLO
frame.

It is based on patch '[PATCH] colo-compare: sort TCP packet queue by sequence number'
http://patchwork.ozlabs.org/patch/718955/ 

Please review and any commits are welcomed.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Dr. David Alan Gilbert (git) <dgilbert@redhat.com>
Cc: eddie.dong@intel.com
Cc: Xu Quan <xuquan8@huawei.com>


Zhang Chen (1):
  colo-compare: add API to flush all queued packets while do checkpoint

zhanghailiang (2):
  colo-compare: reconstruct the mutex lock usage
  colo-compare: use notifier to notify inconsistent packets comparing

 net/colo-compare.c | 105 +++++++++++++++++++++++++++++++++++++++++++++++------
 net/colo-compare.h |  22 +++++++++++
 net/colo.c         |   2 +
 net/colo.h         |   2 +
 4 files changed, 119 insertions(+), 12 deletions(-)
 create mode 100644 net/colo-compare.h

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-01-24 14:05 [Qemu-devel] [PATCH 0/3] colo-compare: Preparing work for combining with COLO frame zhanghailiang
@ 2017-01-24 14:05 ` zhanghailiang
  2017-02-03  3:47   ` Jason Wang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 2/3] colo-compare: add API to flush all queued packets while do checkpoint zhanghailiang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing zhanghailiang
  2 siblings, 1 reply; 22+ messages in thread
From: zhanghailiang @ 2017-01-24 14:05 UTC (permalink / raw)
  To: qemu-devel, jasowang; +Cc: zhangchen.fnst, lizhijian, zhanghailiang

The original 'timer_check_lock' mutex lock of struct CompareState
is used to protect the 'conn_list' queue and its child queues which
are 'primary_list' and 'secondary_list', which is a little abused
and confusing

To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
which is used to protect 'conn_list' queue, use another 'conn_lock'
to protect 'primary_list' and 'secondary_list'.

Besides, fix some missing places which need these mutex lock.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 33 +++++++++++++++++++++++----------
 net/colo.c         |  2 ++
 net/colo.h         |  2 ++
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 5a4f335..9bea62a 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -79,13 +79,15 @@ typedef struct CompareState {
      * element type: Connection
      */
     GQueue conn_list;
+    QemuMutex conn_list_lock;
     /* hashtable to save connection */
     GHashTable *connection_track_table;
+
     /* compare thread, a thread for each NIC */
     QemuThread thread;
+
     /* Timer used on the primary to find packets that are never matched */
     QEMUTimer *timer;
-    QemuMutex timer_check_lock;
 } CompareState;
 
 typedef struct CompareClass {
@@ -133,6 +135,7 @@ static int packet_enqueue(CompareState *s, int mode)
     }
     fill_connection_key(pkt, &key);
 
+    qemu_mutex_lock(&s->conn_list_lock);
     conn = connection_get(s->connection_track_table,
                           &key,
                           &s->conn_list);
@@ -141,16 +144,19 @@ static int packet_enqueue(CompareState *s, int mode)
         g_queue_push_tail(&s->conn_list, conn);
         conn->processing = true;
     }
+    qemu_mutex_unlock(&s->conn_list_lock);
 
     if (mode == PRIMARY_IN) {
         if (g_queue_get_length(&conn->primary_list) <=
                                MAX_QUEUE_SIZE) {
+            qemu_mutex_lock(&conn->conn_lock);
             g_queue_push_tail(&conn->primary_list, pkt);
             if (conn->ip_proto == IPPROTO_TCP) {
                 g_queue_sort(&conn->primary_list,
                              (GCompareDataFunc)seq_sorter,
                              NULL);
             }
+            qemu_mutex_unlock(&conn->conn_lock);
         } else {
             error_report("colo compare primary queue size too big,"
                          "drop packet");
@@ -158,12 +164,14 @@ static int packet_enqueue(CompareState *s, int mode)
     } else {
         if (g_queue_get_length(&conn->secondary_list) <=
                                MAX_QUEUE_SIZE) {
+            qemu_mutex_lock(&conn->conn_lock);
             g_queue_push_tail(&conn->secondary_list, pkt);
             if (conn->ip_proto == IPPROTO_TCP) {
                 g_queue_sort(&conn->secondary_list,
                              (GCompareDataFunc)seq_sorter,
                              NULL);
             }
+            qemu_mutex_unlock(&conn->conn_lock);
         } else {
             error_report("colo compare secondary queue size too big,"
                          "drop packet");
@@ -338,10 +346,11 @@ static void colo_old_packet_check_one_conn(void *opaque,
     GList *result = NULL;
     int64_t check_time = REGULAR_PACKET_CHECK_MS;
 
+    qemu_mutex_lock(&conn->conn_lock);
     result = g_queue_find_custom(&conn->primary_list,
                                  &check_time,
                                  (GCompareFunc)colo_old_packet_check_one);
-
+    qemu_mutex_unlock(&conn->conn_lock);
     if (result) {
         /* do checkpoint will flush old packet */
         /* TODO: colo_notify_checkpoint();*/
@@ -357,7 +366,9 @@ static void colo_old_packet_check(void *opaque)
 {
     CompareState *s = opaque;
 
+    qemu_mutex_lock(&s->conn_list_lock);
     g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, NULL);
+    qemu_mutex_unlock(&s->conn_list_lock);
 }
 
 /*
@@ -372,11 +383,10 @@ static void colo_compare_connection(void *opaque, void *user_data)
     GList *result = NULL;
     int ret;
 
+    qemu_mutex_lock(&conn->conn_lock);
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
-        qemu_mutex_lock(&s->timer_check_lock);
         pkt = g_queue_pop_tail(&conn->primary_list);
-        qemu_mutex_unlock(&s->timer_check_lock);
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -411,13 +421,12 @@ static void colo_compare_connection(void *opaque, void *user_data)
              * until next comparison.
              */
             trace_colo_compare_main("packet different");
-            qemu_mutex_lock(&s->timer_check_lock);
             g_queue_push_tail(&conn->primary_list, pkt);
-            qemu_mutex_unlock(&s->timer_check_lock);
             /* TODO: colo_notify_checkpoint();*/
             break;
         }
     }
+    qemu_mutex_unlock(&conn->conn_lock);
 }
 
 static int compare_chr_send(CharBackend *out,
@@ -561,8 +570,10 @@ static void compare_pri_rs_finalize(SocketReadState *pri_rs)
         trace_colo_compare_main("primary: unsupported packet in");
         compare_chr_send(&s->chr_out, pri_rs->buf, pri_rs->packet_len);
     } else {
+        qemu_mutex_lock(&s->conn_list_lock);
         /* compare connection */
         g_queue_foreach(&s->conn_list, colo_compare_connection, s);
+        qemu_mutex_unlock(&s->conn_list_lock);
     }
 }
 
@@ -573,8 +584,10 @@ static void compare_sec_rs_finalize(SocketReadState *sec_rs)
     if (packet_enqueue(s, SECONDARY_IN)) {
         trace_colo_compare_main("secondary: unsupported packet in");
     } else {
+        qemu_mutex_lock(&s->conn_list_lock);
         /* compare connection */
         g_queue_foreach(&s->conn_list, colo_compare_connection, s);
+        qemu_mutex_unlock(&s->conn_list_lock);
     }
 }
 
@@ -618,9 +631,7 @@ static void check_old_packet_regular(void *opaque)
      * TODO: Make timer handler run in compare thread
      * like qemu_chr_add_handlers_full.
      */
-    qemu_mutex_lock(&s->timer_check_lock);
     colo_old_packet_check(s);
-    qemu_mutex_unlock(&s->timer_check_lock);
 }
 
 /*
@@ -665,7 +676,7 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
     g_queue_init(&s->conn_list);
-    qemu_mutex_init(&s->timer_check_lock);
+    qemu_mutex_init(&s->conn_list_lock);
 
     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
                                                       connection_key_equal,
@@ -718,8 +729,10 @@ static void colo_compare_finalize(Object *obj)
     g_queue_free(&s->conn_list);
 
     if (qemu_thread_is_self(&s->thread)) {
+        qemu_mutex_lock(&s->conn_list_lock);
         /* compare connection */
         g_queue_foreach(&s->conn_list, colo_compare_connection, s);
+        qemu_mutex_unlock(&s->conn_list_lock);
         qemu_thread_join(&s->thread);
     }
 
@@ -727,7 +740,7 @@ static void colo_compare_finalize(Object *obj)
         timer_del(s->timer);
     }
 
-    qemu_mutex_destroy(&s->timer_check_lock);
+    qemu_mutex_destroy(&s->conn_list_lock);
 
     g_free(s->pri_indev);
     g_free(s->sec_indev);
diff --git a/net/colo.c b/net/colo.c
index 6a6eacd..267f29c 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -138,6 +138,7 @@ Connection *connection_new(ConnectionKey *key)
     conn->syn_flag = 0;
     g_queue_init(&conn->primary_list);
     g_queue_init(&conn->secondary_list);
+    qemu_mutex_init(&conn->conn_lock);
 
     return conn;
 }
@@ -151,6 +152,7 @@ void connection_destroy(void *opaque)
     g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
     g_queue_free(&conn->secondary_list);
     g_slice_free(Connection, conn);
+    qemu_mutex_destroy(&conn->conn_lock);
 }
 
 Packet *packet_new(const void *data, int size)
diff --git a/net/colo.h b/net/colo.h
index 7c524f3..2d5f9be 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -61,6 +61,8 @@ typedef struct Connection {
     GQueue secondary_list;
     /* flag to enqueue unprocessed_connections */
     bool processing;
+    /* Protect the access of primary_list or secondary list */
+    QemuMutex conn_lock;
     uint8_t ip_proto;
     /* offset = secondary_seq - primary_seq */
     tcp_seq  offset;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 2/3] colo-compare: add API to flush all queued packets while do checkpoint
  2017-01-24 14:05 [Qemu-devel] [PATCH 0/3] colo-compare: Preparing work for combining with COLO frame zhanghailiang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage zhanghailiang
@ 2017-01-24 14:05 ` zhanghailiang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing zhanghailiang
  2 siblings, 0 replies; 22+ messages in thread
From: zhanghailiang @ 2017-01-24 14:05 UTC (permalink / raw)
  To: qemu-devel, jasowang; +Cc: zhangchen.fnst, lizhijian, zhanghailiang

From: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>

While COLO does checkpoint, we need to flush all queued packets of
COLO proxy, use a new API to do these things, and export it which
will be used by COLO frame.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-compare.h | 20 ++++++++++++++++++++
 2 files changed, 68 insertions(+)
 create mode 100644 net/colo-compare.h

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 9bea62a..2ad577b 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,11 +29,15 @@
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
 #include "net/colo.h"
+#include "net/colo-compare.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -88,6 +92,8 @@ typedef struct CompareState {
 
     /* Timer used on the primary to find packets that are never matched */
     QEMUTimer *timer;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -181,6 +187,39 @@ static int packet_enqueue(CompareState *s, int mode)
     return 0;
 }
 
+static void colo_flush_connection(void *opaque, void *user_data)
+{
+    CompareState *s = user_data;
+    Connection *conn = opaque;
+    Packet *pkt = NULL;
+
+    qemu_mutex_lock(&conn->conn_lock);
+    while (!g_queue_is_empty(&conn->primary_list)) {
+        pkt = g_queue_pop_head(&conn->primary_list);
+        compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+        packet_destroy(pkt, NULL);
+    }
+    while (!g_queue_is_empty(&conn->secondary_list)) {
+        pkt = g_queue_pop_head(&conn->secondary_list);
+        packet_destroy(pkt, NULL);
+    }
+    qemu_mutex_unlock(&conn->conn_lock);
+}
+
+/* Fix Me: better to use notifier way */
+void colo_compare_do_checkpoint(void)
+{
+    CompareState *s;
+
+    trace_colo_compare_main("colo_compare_do_checkpoint");
+
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        qemu_mutex_lock(&s->conn_list_lock);
+        g_queue_foreach(&s->conn_list, colo_flush_connection, s);
+        qemu_mutex_unlock(&s->conn_list_lock);
+    }
+}
+
 /*
  * The IP packets sent by primary and secondary
  * will be compared in here
@@ -387,6 +426,11 @@ static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
+        if (!pkt) {
+            error_report("colo-compare pop pkt failed");
+            return;
+        }
+
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -726,6 +770,10 @@ static void colo_compare_finalize(Object *obj)
     qemu_chr_fe_deinit(&s->chr_sec_in);
     qemu_chr_fe_deinit(&s->chr_out);
 
+    if (!QTAILQ_EMPTY(&net_compares)) {
+        QTAILQ_REMOVE(&net_compares, s, next);
+    }
+
     g_queue_free(&s->conn_list);
 
     if (qemu_thread_is_self(&s->thread)) {
diff --git a/net/colo-compare.h b/net/colo-compare.h
new file mode 100644
index 0000000..44f9014
--- /dev/null
+++ b/net/colo-compare.h
@@ -0,0 +1,20 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_COMPARE_H
+#define QEMU_COLO_COMPARE_H
+
+void colo_compare_do_checkpoint(void);
+
+#endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing
  2017-01-24 14:05 [Qemu-devel] [PATCH 0/3] colo-compare: Preparing work for combining with COLO frame zhanghailiang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage zhanghailiang
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 2/3] colo-compare: add API to flush all queued packets while do checkpoint zhanghailiang
@ 2017-01-24 14:05 ` zhanghailiang
  2017-02-03  4:50   ` Jason Wang
  2 siblings, 1 reply; 22+ messages in thread
From: zhanghailiang @ 2017-01-24 14:05 UTC (permalink / raw)
  To: qemu-devel, jasowang; +Cc: zhangchen.fnst, lizhijian, zhanghailiang

It's a good idea to use notifier to notify COLO frame of
inconsistent packets comparing.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
---
 net/colo-compare.c | 24 ++++++++++++++++++++++--
 net/colo-compare.h |  2 ++
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 2ad577b..39c394d 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -30,6 +30,7 @@
 #include "qapi-visit.h"
 #include "net/colo.h"
 #include "net/colo-compare.h"
+#include "migration/migration.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
@@ -38,6 +39,9 @@
 static QTAILQ_HEAD(, CompareState) net_compares =
        QTAILQ_HEAD_INITIALIZER(net_compares);
 
+static NotifierList colo_compare_notifiers =
+    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
@@ -378,6 +382,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
     }
 }
 
+static void colo_compare_inconsistent_notify(void)
+{
+    notifier_list_notify(&colo_compare_notifiers,
+                migrate_get_current());
+}
+
+void colo_compare_register_notifier(Notifier *notify)
+{
+    notifier_list_add(&colo_compare_notifiers, notify);
+}
+
+void colo_compare_unregister_notifier(Notifier *notify)
+{
+    notifier_remove(notify);
+}
+
 static void colo_old_packet_check_one_conn(void *opaque,
                                            void *user_data)
 {
@@ -392,7 +412,7 @@ static void colo_old_packet_check_one_conn(void *opaque,
     qemu_mutex_unlock(&conn->conn_lock);
     if (result) {
         /* do checkpoint will flush old packet */
-        /* TODO: colo_notify_checkpoint();*/
+        colo_compare_inconsistent_notify();
     }
 }
 
@@ -466,7 +486,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
              */
             trace_colo_compare_main("packet different");
             g_queue_push_tail(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+            colo_compare_inconsistent_notify();
             break;
         }
     }
diff --git a/net/colo-compare.h b/net/colo-compare.h
index 44f9014..769f55a 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -16,5 +16,7 @@
 #define QEMU_COLO_COMPARE_H
 
 void colo_compare_do_checkpoint(void);
+void colo_compare_register_notifier(Notifier *notify);
+void colo_compare_unregister_notifier(Notifier *notify);
 
 #endif /* QEMU_COLO_COMPARE_H */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage zhanghailiang
@ 2017-02-03  3:47   ` Jason Wang
  2017-02-06  8:13     ` Hailiang Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2017-02-03  3:47 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel; +Cc: zhangchen.fnst, lizhijian



On 2017年01月24日 22:05, zhanghailiang wrote:
> The original 'timer_check_lock' mutex lock of struct CompareState
> is used to protect the 'conn_list' queue and its child queues which
> are 'primary_list' and 'secondary_list', which is a little abused
> and confusing
>
> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
> which is used to protect 'conn_list' queue, use another 'conn_lock'
> to protect 'primary_list' and 'secondary_list'.
>
> Besides, fix some missing places which need these mutex lock.
>
> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>

Instead of sticking to such kind of mutex, I think it's time to make 
colo timer run in colo thread (there's a TODO in the code).

Thought?

Thanks

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing
  2017-01-24 14:05 ` [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing zhanghailiang
@ 2017-02-03  4:50   ` Jason Wang
  2017-02-06  8:44     ` Hailiang Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2017-02-03  4:50 UTC (permalink / raw)
  To: zhanghailiang, qemu-devel; +Cc: lizhijian, zhangchen.fnst



On 2017年01月24日 22:05, zhanghailiang wrote:
> It's a good idea to use notifier to notify COLO frame of
> inconsistent packets comparing.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> ---
>   net/colo-compare.c | 24 ++++++++++++++++++++++--
>   net/colo-compare.h |  2 ++
>   2 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 2ad577b..39c394d 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -30,6 +30,7 @@
>   #include "qapi-visit.h"
>   #include "net/colo.h"
>   #include "net/colo-compare.h"
> +#include "migration/migration.h"
>   
>   #define TYPE_COLO_COMPARE "colo-compare"
>   #define COLO_COMPARE(obj) \
> @@ -38,6 +39,9 @@
>   static QTAILQ_HEAD(, CompareState) net_compares =
>          QTAILQ_HEAD_INITIALIZER(net_compares);
>   
> +static NotifierList colo_compare_notifiers =
> +    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
> +
>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>   #define MAX_QUEUE_SIZE 1024
>   
> @@ -378,6 +382,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
>       }
>   }
>   
> +static void colo_compare_inconsistent_notify(void)
> +{
> +    notifier_list_notify(&colo_compare_notifiers,
> +                migrate_get_current());
> +}
> +
> +void colo_compare_register_notifier(Notifier *notify)
> +{
> +    notifier_list_add(&colo_compare_notifiers, notify);
> +}
> +
> +void colo_compare_unregister_notifier(Notifier *notify)
> +{
> +    notifier_remove(notify);
> +}
> +
>   static void colo_old_packet_check_one_conn(void *opaque,
>                                              void *user_data)
>   {
> @@ -392,7 +412,7 @@ static void colo_old_packet_check_one_conn(void *opaque,
>       qemu_mutex_unlock(&conn->conn_lock);
>       if (result) {
>           /* do checkpoint will flush old packet */
> -        /* TODO: colo_notify_checkpoint();*/
> +        colo_compare_inconsistent_notify();
>       }
>   }
>   
> @@ -466,7 +486,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
>                */
>               trace_colo_compare_main("packet different");
>               g_queue_push_tail(&conn->primary_list, pkt);
> -            /* TODO: colo_notify_checkpoint();*/
> +            colo_compare_inconsistent_notify();
>               break;
>           }
>       }

I don't see any users for 
colo_compare_register_notifier/colo_compare_unregister_notifier, is any 
patch missed in this series?

And is it safe to do notification in colo thread?

Thanks

> diff --git a/net/colo-compare.h b/net/colo-compare.h
> index 44f9014..769f55a 100644
> --- a/net/colo-compare.h
> +++ b/net/colo-compare.h
> @@ -16,5 +16,7 @@
>   #define QEMU_COLO_COMPARE_H
>   
>   void colo_compare_do_checkpoint(void);
> +void colo_compare_register_notifier(Notifier *notify);
> +void colo_compare_unregister_notifier(Notifier *notify);
>   
>   #endif /* QEMU_COLO_COMPARE_H */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-03  3:47   ` Jason Wang
@ 2017-02-06  8:13     ` Hailiang Zhang
  2017-02-06  8:30       ` Zhang Chen
  2017-02-06  9:35       ` Jason Wang
  0 siblings, 2 replies; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-06  8:13 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: qemu-devel, xuquan8, lizhijian

On 2017/2/3 11:47, Jason Wang wrote:
>
>
> On 2017年01月24日 22:05, zhanghailiang wrote:
>> The original 'timer_check_lock' mutex lock of struct CompareState
>> is used to protect the 'conn_list' queue and its child queues which
>> are 'primary_list' and 'secondary_list', which is a little abused
>> and confusing
>>
>> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>> to protect 'primary_list' and 'secondary_list'.
>>
>> Besides, fix some missing places which need these mutex lock.
>>
>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>
> Instead of sticking to such kind of mutex, I think it's time to make
> colo timer run in colo thread (there's a TODO in the code).
>

Er, it seems that, we still need these mutex locks even we make colo
timer and colo thread run in the same thread, because we may access
the connect/primary/secondary list from colo (migratioin) thread
concurrently.

Besides, it seems to be a little complex to make colo timer run in colo
compare thread, and it is not this series' goal. This series is preparing
work for integrating COLO compare with COLO frame and it is prerequisite.

So, we may consider implementing it later in another series.
Zhang Chen, what's your opinion ?


Thanks,
Hailiang

> Thought?
>
> Thanks
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-06  8:13     ` Hailiang Zhang
@ 2017-02-06  8:30       ` Zhang Chen
  2017-02-06  9:35       ` Jason Wang
  1 sibling, 0 replies; 22+ messages in thread
From: Zhang Chen @ 2017-02-06  8:30 UTC (permalink / raw)
  To: Hailiang Zhang, Jason Wang; +Cc: qemu-devel, xuquan8, lizhijian



On 02/06/2017 04:13 PM, Hailiang Zhang wrote:
> On 2017/2/3 11:47, Jason Wang wrote:
>>
>>
>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>> is used to protect the 'conn_list' queue and its child queues which
>>> are 'primary_list' and 'secondary_list', which is a little abused
>>> and confusing
>>>
>>> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>> to protect 'primary_list' and 'secondary_list'.
>>>
>>> Besides, fix some missing places which need these mutex lock.
>>>
>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>
>> Instead of sticking to such kind of mutex, I think it's time to make
>> colo timer run in colo thread (there's a TODO in the code).
>>
>
> Er, it seems that, we still need these mutex locks even we make colo
> timer and colo thread run in the same thread, because we may access
> the connect/primary/secondary list from colo (migratioin) thread
> concurrently.
>
> Besides, it seems to be a little complex to make colo timer run in colo
> compare thread, and it is not this series' goal. This series is preparing
> work for integrating COLO compare with COLO frame and it is prerequisite.
>
> So, we may consider implementing it later in another series.
> Zhang Chen, what's your opinion ?
>

I agree, The current situation of COLO is not a complete HA solution.
So, We should focus on make Qemu COLO run completely, and then do
this "TODO" job.

Thanks
Zhang Chen

>
> Thanks,
> Hailiang
>
>> Thought?
>>
>> Thanks
>>
>> .
>>
>
>
>
> .
>

-- 
Thanks
Zhang Chen

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing
  2017-02-03  4:50   ` Jason Wang
@ 2017-02-06  8:44     ` Hailiang Zhang
  2017-02-06  9:35       ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-06  8:44 UTC (permalink / raw)
  To: Jason Wang, qemu-devel; +Cc: xuquan8, lizhijian, zhangchen.fnst

On 2017/2/3 12:50, Jason Wang wrote:
>
>
> On 2017年01月24日 22:05, zhanghailiang wrote:
>> It's a good idea to use notifier to notify COLO frame of
>> inconsistent packets comparing.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>    net/colo-compare.c | 24 ++++++++++++++++++++++--
>>    net/colo-compare.h |  2 ++
>>    2 files changed, 24 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 2ad577b..39c394d 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -30,6 +30,7 @@
>>    #include "qapi-visit.h"
>>    #include "net/colo.h"
>>    #include "net/colo-compare.h"
>> +#include "migration/migration.h"
>>
>>    #define TYPE_COLO_COMPARE "colo-compare"
>>    #define COLO_COMPARE(obj) \
>> @@ -38,6 +39,9 @@
>>    static QTAILQ_HEAD(, CompareState) net_compares =
>>           QTAILQ_HEAD_INITIALIZER(net_compares);
>>
>> +static NotifierList colo_compare_notifiers =
>> +    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
>> +
>>    #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>    #define MAX_QUEUE_SIZE 1024
>>
>> @@ -378,6 +382,22 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
>>        }
>>    }
>>
>> +static void colo_compare_inconsistent_notify(void)
>> +{
>> +    notifier_list_notify(&colo_compare_notifiers,
>> +                migrate_get_current());
>> +}
>> +
>> +void colo_compare_register_notifier(Notifier *notify)
>> +{
>> +    notifier_list_add(&colo_compare_notifiers, notify);
>> +}
>> +
>> +void colo_compare_unregister_notifier(Notifier *notify)
>> +{
>> +    notifier_remove(notify);
>> +}
>> +
>>    static void colo_old_packet_check_one_conn(void *opaque,
>>                                               void *user_data)
>>    {
>> @@ -392,7 +412,7 @@ static void colo_old_packet_check_one_conn(void *opaque,
>>        qemu_mutex_unlock(&conn->conn_lock);
>>        if (result) {
>>            /* do checkpoint will flush old packet */
>> -        /* TODO: colo_notify_checkpoint();*/
>> +        colo_compare_inconsistent_notify();
>>        }
>>    }
>>
>> @@ -466,7 +486,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
>>                 */
>>                trace_colo_compare_main("packet different");
>>                g_queue_push_tail(&conn->primary_list, pkt);
>> -            /* TODO: colo_notify_checkpoint();*/
>> +            colo_compare_inconsistent_notify();
>>                break;
>>            }
>>        }
>
> I don't see any users for
> colo_compare_register_notifier/colo_compare_unregister_notifier, is any
> patch missed in this series?
>

No, we will use these functions in later series which integrate COLO compare
with COLO frame.
So, should i move this patch to the series where it is been called ?
patch 2 has the same problem.

> And is it safe to do notification in colo thread?
>

Yes, the notification callback will be quite simple.

Thanks,
hailiang


> Thanks
>
>> diff --git a/net/colo-compare.h b/net/colo-compare.h
>> index 44f9014..769f55a 100644
>> --- a/net/colo-compare.h
>> +++ b/net/colo-compare.h
>> @@ -16,5 +16,7 @@
>>    #define QEMU_COLO_COMPARE_H
>>
>>    void colo_compare_do_checkpoint(void);
>> +void colo_compare_register_notifier(Notifier *notify);
>> +void colo_compare_unregister_notifier(Notifier *notify);
>>
>>    #endif /* QEMU_COLO_COMPARE_H */
>
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-06  8:13     ` Hailiang Zhang
  2017-02-06  8:30       ` Zhang Chen
@ 2017-02-06  9:35       ` Jason Wang
  2017-02-06 11:11         ` Hailiang Zhang
  1 sibling, 1 reply; 22+ messages in thread
From: Jason Wang @ 2017-02-06  9:35 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: qemu-devel, xuquan8, lizhijian



On 2017年02月06日 16:13, Hailiang Zhang wrote:
> On 2017/2/3 11:47, Jason Wang wrote:
>>
>>
>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>> is used to protect the 'conn_list' queue and its child queues which
>>> are 'primary_list' and 'secondary_list', which is a little abused
>>> and confusing
>>>
>>> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>> to protect 'primary_list' and 'secondary_list'.
>>>
>>> Besides, fix some missing places which need these mutex lock.
>>>
>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>
>> Instead of sticking to such kind of mutex, I think it's time to make
>> colo timer run in colo thread (there's a TODO in the code).
>>
>
> Er, it seems that, we still need these mutex locks even we make colo
> timer and colo thread run in the same thread, because we may access
> the connect/primary/secondary list from colo (migratioin) thread
> concurrently.

Just make sure I understand the issue, why need it access the list?

>
> Besides, it seems to be a little complex to make colo timer run in colo
> compare thread, and it is not this series' goal. 

Seems not by just looking at how it was implemented in main loop, but 
maybe I was wrong.

> This series is preparing
> work for integrating COLO compare with COLO frame and it is prerequisite.
>
> So, we may consider implementing it later in another series.
> Zhang Chen, what's your opinion ?

The problem is this patch make things even worse, it introduces one more 
mutex.

Thanks

>
>
> Thanks,
> Hailiang
>
>> Thought?
>>
>> Thanks
>>
>> .
>>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing
  2017-02-06  8:44     ` Hailiang Zhang
@ 2017-02-06  9:35       ` Jason Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2017-02-06  9:35 UTC (permalink / raw)
  To: Hailiang Zhang, qemu-devel; +Cc: xuquan8, lizhijian, zhangchen.fnst



On 2017年02月06日 16:44, Hailiang Zhang wrote:
> On 2017/2/3 12:50, Jason Wang wrote:
>>
>>
>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>> It's a good idea to use notifier to notify COLO frame of
>>> inconsistent packets comparing.
>>>
>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> ---
>>>    net/colo-compare.c | 24 ++++++++++++++++++++++--
>>>    net/colo-compare.h |  2 ++
>>>    2 files changed, 24 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>> index 2ad577b..39c394d 100644
>>> --- a/net/colo-compare.c
>>> +++ b/net/colo-compare.c
>>> @@ -30,6 +30,7 @@
>>>    #include "qapi-visit.h"
>>>    #include "net/colo.h"
>>>    #include "net/colo-compare.h"
>>> +#include "migration/migration.h"
>>>
>>>    #define TYPE_COLO_COMPARE "colo-compare"
>>>    #define COLO_COMPARE(obj) \
>>> @@ -38,6 +39,9 @@
>>>    static QTAILQ_HEAD(, CompareState) net_compares =
>>>           QTAILQ_HEAD_INITIALIZER(net_compares);
>>>
>>> +static NotifierList colo_compare_notifiers =
>>> +    NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers);
>>> +
>>>    #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>>    #define MAX_QUEUE_SIZE 1024
>>>
>>> @@ -378,6 +382,22 @@ static int colo_old_packet_check_one(Packet 
>>> *pkt, int64_t *check_time)
>>>        }
>>>    }
>>>
>>> +static void colo_compare_inconsistent_notify(void)
>>> +{
>>> +    notifier_list_notify(&colo_compare_notifiers,
>>> +                migrate_get_current());
>>> +}
>>> +
>>> +void colo_compare_register_notifier(Notifier *notify)
>>> +{
>>> +    notifier_list_add(&colo_compare_notifiers, notify);
>>> +}
>>> +
>>> +void colo_compare_unregister_notifier(Notifier *notify)
>>> +{
>>> +    notifier_remove(notify);
>>> +}
>>> +
>>>    static void colo_old_packet_check_one_conn(void *opaque,
>>>                                               void *user_data)
>>>    {
>>> @@ -392,7 +412,7 @@ static void colo_old_packet_check_one_conn(void 
>>> *opaque,
>>>        qemu_mutex_unlock(&conn->conn_lock);
>>>        if (result) {
>>>            /* do checkpoint will flush old packet */
>>> -        /* TODO: colo_notify_checkpoint();*/
>>> +        colo_compare_inconsistent_notify();
>>>        }
>>>    }
>>>
>>> @@ -466,7 +486,7 @@ static void colo_compare_connection(void 
>>> *opaque, void *user_data)
>>>                 */
>>>                trace_colo_compare_main("packet different");
>>>                g_queue_push_tail(&conn->primary_list, pkt);
>>> -            /* TODO: colo_notify_checkpoint();*/
>>> +            colo_compare_inconsistent_notify();
>>>                break;
>>>            }
>>>        }
>>
>> I don't see any users for
>> colo_compare_register_notifier/colo_compare_unregister_notifier, is any
>> patch missed in this series?
>>
>
> No, we will use these functions in later series which integrate COLO 
> compare
> with COLO frame.
> So, should i move this patch to the series where it is been called ?
> patch 2 has the same problem. 

Yes, please.

Thanks

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-06  9:35       ` Jason Wang
@ 2017-02-06 11:11         ` Hailiang Zhang
  2017-02-06 12:53           ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-06 11:11 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian

On 2017/2/6 17:35, Jason Wang wrote:
>
>
> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>> On 2017/2/3 11:47, Jason Wang wrote:
>>>
>>>
>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>> is used to protect the 'conn_list' queue and its child queues which
>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>> and confusing
>>>>
>>>> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
>>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>>> to protect 'primary_list' and 'secondary_list'.
>>>>
>>>> Besides, fix some missing places which need these mutex lock.
>>>>
>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>
>>> Instead of sticking to such kind of mutex, I think it's time to make
>>> colo timer run in colo thread (there's a TODO in the code).
>>>
>>
>> Er, it seems that, we still need these mutex locks even we make colo
>> timer and colo thread run in the same thread, because we may access
>> the connect/primary/secondary list from colo (migratioin) thread
>> concurrently.
>
> Just make sure I understand the issue, why need it access the list?
>
>>
>> Besides, it seems to be a little complex to make colo timer run in colo
>> compare thread, and it is not this series' goal.
>
> Seems not by just looking at how it was implemented in main loop, but
> maybe I was wrong.
>
>> This series is preparing
>> work for integrating COLO compare with COLO frame and it is prerequisite.
>>
>> So, we may consider implementing it later in another series.
>> Zhang Chen, what's your opinion ?
>
> The problem is this patch make things even worse, it introduces one more
> mutex.
>

Hmm, for most cases, we need to get these two locks at the same time.
We can use one lock to protect conn_list/primary_list/secondary_list,
(We need to get this lock before operate all these lists, as you can see
in patch 2, while do checkpoint, we may operate these lists in
COLO checkpoint thread concurrently.)

But for the original codes, we didn't got timer_check_lock in
packet_enqueue() while operate conn_list/primary_list/secondary_list,
and didn't got this lock in colo_compare_connection while operate
secondary_list either.

So, is it OK to use the conn_lock instead of timer_check_lock, and
add the lock where it is need ?

Thanks.
Hailiang

> Thanks
>
>>
>>
>> Thanks,
>> Hailiang
>>
>>> Thought?
>>>
>>> Thanks
>>>
>>> .
>>>
>>
>
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-06 11:11         ` Hailiang Zhang
@ 2017-02-06 12:53           ` Jason Wang
  2017-02-07  7:54             ` Hailiang Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2017-02-06 12:53 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian



On 2017年02月06日 19:11, Hailiang Zhang wrote:
> On 2017/2/6 17:35, Jason Wang wrote:
>>
>>
>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>> is used to protect the 'conn_list' queue and its child queues which
>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>> and confusing
>>>>>
>>>>> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
>>>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>
>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>
>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>
>>>> Instead of sticking to such kind of mutex, I think it's time to make
>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>
>>>
>>> Er, it seems that, we still need these mutex locks even we make colo
>>> timer and colo thread run in the same thread, because we may access
>>> the connect/primary/secondary list from colo (migratioin) thread
>>> concurrently.
>>
>> Just make sure I understand the issue, why need it access the list?
>>
>>>
>>> Besides, it seems to be a little complex to make colo timer run in colo
>>> compare thread, and it is not this series' goal.
>>
>> Seems not by just looking at how it was implemented in main loop, but
>> maybe I was wrong.
>>
>>> This series is preparing
>>> work for integrating COLO compare with COLO frame and it is 
>>> prerequisite.
>>>
>>> So, we may consider implementing it later in another series.
>>> Zhang Chen, what's your opinion ?
>>
>> The problem is this patch make things even worse, it introduces one more
>> mutex.
>>
>
> Hmm, for most cases, we need to get these two locks at the same time.
> We can use one lock to protect conn_list/primary_list/secondary_list,
> (We need to get this lock before operate all these lists, as you can see
> in patch 2, while do checkpoint, we may operate these lists in
> COLO checkpoint thread concurrently.)
>
> But for the original codes, we didn't got timer_check_lock in
> packet_enqueue() while operate conn_list/primary_list/secondary_list,
> and didn't got this lock in colo_compare_connection while operate
> secondary_list either.
>
> So, is it OK to use the conn_lock instead of timer_check_lock, and
> add the lock where it is need ?

I'd like to know if timer were run in colo thread (this looks as simple 
as a g_timeout_source_new() followed by a g_source_attach()), why do we 
still need a mutex. And if we need it now but plan to remove it in the 
future, I'd like to use big lock to simplify the codes.

Thanks

>
> Thanks.
> Hailiang
>
>> Thanks
>>
>>>
>>>
>>> Thanks,
>>> Hailiang
>>>
>>>> Thought?
>>>>
>>>> Thanks
>>>>
>>>> .
>>>>
>>>
>>
>>
>> .
>>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-06 12:53           ` Jason Wang
@ 2017-02-07  7:54             ` Hailiang Zhang
  2017-02-07  7:57               ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-07  7:54 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian

Hi Jason,

On 2017/2/6 20:53, Jason Wang wrote:
>
>
> On 2017年02月06日 19:11, Hailiang Zhang wrote:
>> On 2017/2/6 17:35, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>> is used to protect the 'conn_list' queue and its child queues which
>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>> and confusing
>>>>>>
>>>>>> To make it clearer, we rename 'timer_check_lock' to 'conn_list_lock'
>>>>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>
>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>
>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>
>>>>> Instead of sticking to such kind of mutex, I think it's time to make
>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>
>>>>
>>>> Er, it seems that, we still need these mutex locks even we make colo
>>>> timer and colo thread run in the same thread, because we may access
>>>> the connect/primary/secondary list from colo (migratioin) thread
>>>> concurrently.
>>>
>>> Just make sure I understand the issue, why need it access the list?
>>>
>>>>
>>>> Besides, it seems to be a little complex to make colo timer run in colo
>>>> compare thread, and it is not this series' goal.
>>>
>>> Seems not by just looking at how it was implemented in main loop, but
>>> maybe I was wrong.
>>>
>>>> This series is preparing
>>>> work for integrating COLO compare with COLO frame and it is
>>>> prerequisite.
>>>>
>>>> So, we may consider implementing it later in another series.
>>>> Zhang Chen, what's your opinion ?
>>>
>>> The problem is this patch make things even worse, it introduces one more
>>> mutex.
>>>
>>
>> Hmm, for most cases, we need to get these two locks at the same time.
>> We can use one lock to protect conn_list/primary_list/secondary_list,
>> (We need to get this lock before operate all these lists, as you can see
>> in patch 2, while do checkpoint, we may operate these lists in
>> COLO checkpoint thread concurrently.)
>>
>> But for the original codes, we didn't got timer_check_lock in
>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>> and didn't got this lock in colo_compare_connection while operate
>> secondary_list either.
>>
>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>> add the lock where it is need ?
>
> I'd like to know if timer were run in colo thread (this looks as simple
> as a g_timeout_source_new() followed by a g_source_attach()), why do we
> still need a mutex. And if we need it now but plan to remove it in the
> future, I'd like to use big lock to simplify the codes.
>

After investigated your above suggestion, I think it works by using
g_timeout_source_new() and g_source_attach(), but I'm not sure
if it is a good idea to use the big lock to protect the possible
concurrent cases which seem to only happen between COLO migration
thread and COLO compare thread, any further suggestions ?

Thanks,
Hailiang


> Thanks
>
>>
>> Thanks.
>> Hailiang
>>
>>> Thanks
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Hailiang
>>>>
>>>>> Thought?
>>>>>
>>>>> Thanks
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>>
>>> .
>>>
>>
>
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-07  7:54             ` Hailiang Zhang
@ 2017-02-07  7:57               ` Jason Wang
  2017-02-07  8:19                 ` Hailiang Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2017-02-07  7:57 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian



On 2017年02月07日 15:54, Hailiang Zhang wrote:
> Hi Jason,
>
> On 2017/2/6 20:53, Jason Wang wrote:
>>
>>
>> On 2017年02月06日 19:11, Hailiang Zhang wrote:
>>> On 2017/2/6 17:35, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>>> is used to protect the 'conn_list' queue and its child queues which
>>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>>> and confusing
>>>>>>>
>>>>>>> To make it clearer, we rename 'timer_check_lock' to 
>>>>>>> 'conn_list_lock'
>>>>>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>>
>>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>>
>>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>>
>>>>>> Instead of sticking to such kind of mutex, I think it's time to make
>>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>>
>>>>>
>>>>> Er, it seems that, we still need these mutex locks even we make colo
>>>>> timer and colo thread run in the same thread, because we may access
>>>>> the connect/primary/secondary list from colo (migratioin) thread
>>>>> concurrently.
>>>>
>>>> Just make sure I understand the issue, why need it access the list?
>>>>
>>>>>
>>>>> Besides, it seems to be a little complex to make colo timer run in 
>>>>> colo
>>>>> compare thread, and it is not this series' goal.
>>>>
>>>> Seems not by just looking at how it was implemented in main loop, but
>>>> maybe I was wrong.
>>>>
>>>>> This series is preparing
>>>>> work for integrating COLO compare with COLO frame and it is
>>>>> prerequisite.
>>>>>
>>>>> So, we may consider implementing it later in another series.
>>>>> Zhang Chen, what's your opinion ?
>>>>
>>>> The problem is this patch make things even worse, it introduces one 
>>>> more
>>>> mutex.
>>>>
>>>
>>> Hmm, for most cases, we need to get these two locks at the same time.
>>> We can use one lock to protect conn_list/primary_list/secondary_list,
>>> (We need to get this lock before operate all these lists, as you can 
>>> see
>>> in patch 2, while do checkpoint, we may operate these lists in
>>> COLO checkpoint thread concurrently.)
>>>
>>> But for the original codes, we didn't got timer_check_lock in
>>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>>> and didn't got this lock in colo_compare_connection while operate
>>> secondary_list either.
>>>
>>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>>> add the lock where it is need ?
>>
>> I'd like to know if timer were run in colo thread (this looks as simple
>> as a g_timeout_source_new() followed by a g_source_attach()), why do we
>> still need a mutex. And if we need it now but plan to remove it in the
>> future, I'd like to use big lock to simplify the codes.
>>
>
> After investigated your above suggestion, I think it works by using
> g_timeout_source_new() and g_source_attach(), but I'm not sure
> if it is a good idea to use the big lock to protect the possible
> concurrent cases which seem to only happen between COLO migration
> thread and COLO compare thread, any further suggestions ?

I think I need first understand why migration thread need to access the 
list?

Thanks

>
> Thanks,
> Hailiang
>
>
>> Thanks
>>
>>>
>>> Thanks.
>>> Hailiang
>>>
>>>> Thanks
>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Hailiang
>>>>>
>>>>>> Thought?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>
>>>>
>>>> .
>>>>
>>>
>>
>>
>> .
>>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-07  7:57               ` Jason Wang
@ 2017-02-07  8:19                 ` Hailiang Zhang
  2017-02-07  9:21                   ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-07  8:19 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian

On 2017/2/7 15:57, Jason Wang wrote:
>
>
> On 2017年02月07日 15:54, Hailiang Zhang wrote:
>> Hi Jason,
>>
>> On 2017/2/6 20:53, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月06日 19:11, Hailiang Zhang wrote:
>>>> On 2017/2/6 17:35, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>>>> is used to protect the 'conn_list' queue and its child queues which
>>>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>>>> and confusing
>>>>>>>>
>>>>>>>> To make it clearer, we rename 'timer_check_lock' to
>>>>>>>> 'conn_list_lock'
>>>>>>>> which is used to protect 'conn_list' queue, use another 'conn_lock'
>>>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>>>
>>>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>>>
>>>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>>>
>>>>>>> Instead of sticking to such kind of mutex, I think it's time to make
>>>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>>>
>>>>>>
>>>>>> Er, it seems that, we still need these mutex locks even we make colo
>>>>>> timer and colo thread run in the same thread, because we may access
>>>>>> the connect/primary/secondary list from colo (migratioin) thread
>>>>>> concurrently.
>>>>>
>>>>> Just make sure I understand the issue, why need it access the list?
>>>>>
>>>>>>
>>>>>> Besides, it seems to be a little complex to make colo timer run in
>>>>>> colo
>>>>>> compare thread, and it is not this series' goal.
>>>>>
>>>>> Seems not by just looking at how it was implemented in main loop, but
>>>>> maybe I was wrong.
>>>>>
>>>>>> This series is preparing
>>>>>> work for integrating COLO compare with COLO frame and it is
>>>>>> prerequisite.
>>>>>>
>>>>>> So, we may consider implementing it later in another series.
>>>>>> Zhang Chen, what's your opinion ?
>>>>>
>>>>> The problem is this patch make things even worse, it introduces one
>>>>> more
>>>>> mutex.
>>>>>
>>>>
>>>> Hmm, for most cases, we need to get these two locks at the same time.
>>>> We can use one lock to protect conn_list/primary_list/secondary_list,
>>>> (We need to get this lock before operate all these lists, as you can
>>>> see
>>>> in patch 2, while do checkpoint, we may operate these lists in
>>>> COLO checkpoint thread concurrently.)
>>>>
>>>> But for the original codes, we didn't got timer_check_lock in
>>>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>>>> and didn't got this lock in colo_compare_connection while operate
>>>> secondary_list either.
>>>>
>>>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>>>> add the lock where it is need ?
>>>
>>> I'd like to know if timer were run in colo thread (this looks as simple
>>> as a g_timeout_source_new() followed by a g_source_attach()), why do we
>>> still need a mutex. And if we need it now but plan to remove it in the
>>> future, I'd like to use big lock to simplify the codes.
>>>
>>
>> After investigated your above suggestion, I think it works by using
>> g_timeout_source_new() and g_source_attach(), but I'm not sure
>> if it is a good idea to use the big lock to protect the possible
>> concurrent cases which seem to only happen between COLO migration
>> thread and COLO compare thread, any further suggestions ?
>
> I think I need first understand why migration thread need to access the
> list?
>

Er, sorry to confuse you here, to be exactly, it is COLO checkpoint thread,
we reuse the migration thread to realize checkpoint process for COLO,
Because qemu enters into COLO state after a complete migration, so we reuse it.

While do checkpoint, we need to release all the buffered packets that have not
yet been compared, so we need to access the list in COLO checkpoint thread.

Thanks.

> Thanks
>
>>
>> Thanks,
>> Hailiang
>>
>>
>>> Thanks
>>>
>>>>
>>>> Thanks.
>>>> Hailiang
>>>>
>>>>> Thanks
>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Hailiang
>>>>>>
>>>>>>> Thought?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>>
>>> .
>>>
>>
>
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-07  8:19                 ` Hailiang Zhang
@ 2017-02-07  9:21                   ` Jason Wang
  2017-02-07  9:30                     ` Hailiang Zhang
  2017-02-14  2:32                     ` Hailiang Zhang
  0 siblings, 2 replies; 22+ messages in thread
From: Jason Wang @ 2017-02-07  9:21 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian



On 2017年02月07日 16:19, Hailiang Zhang wrote:
> On 2017/2/7 15:57, Jason Wang wrote:
>>
>>
>> On 2017年02月07日 15:54, Hailiang Zhang wrote:
>>> Hi Jason,
>>>
>>> On 2017/2/6 20:53, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017年02月06日 19:11, Hailiang Zhang wrote:
>>>>> On 2017/2/6 17:35, Jason Wang wrote:
>>>>>>
>>>>>>
>>>>>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>>>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>>>>> is used to protect the 'conn_list' queue and its child queues 
>>>>>>>>> which
>>>>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>>>>> and confusing
>>>>>>>>>
>>>>>>>>> To make it clearer, we rename 'timer_check_lock' to
>>>>>>>>> 'conn_list_lock'
>>>>>>>>> which is used to protect 'conn_list' queue, use another 
>>>>>>>>> 'conn_lock'
>>>>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>>>>
>>>>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>>>>
>>>>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>>>>
>>>>>>>> Instead of sticking to such kind of mutex, I think it's time to 
>>>>>>>> make
>>>>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>>>>
>>>>>>>
>>>>>>> Er, it seems that, we still need these mutex locks even we make 
>>>>>>> colo
>>>>>>> timer and colo thread run in the same thread, because we may access
>>>>>>> the connect/primary/secondary list from colo (migratioin) thread
>>>>>>> concurrently.
>>>>>>
>>>>>> Just make sure I understand the issue, why need it access the list?
>>>>>>
>>>>>>>
>>>>>>> Besides, it seems to be a little complex to make colo timer run in
>>>>>>> colo
>>>>>>> compare thread, and it is not this series' goal.
>>>>>>
>>>>>> Seems not by just looking at how it was implemented in main loop, 
>>>>>> but
>>>>>> maybe I was wrong.
>>>>>>
>>>>>>> This series is preparing
>>>>>>> work for integrating COLO compare with COLO frame and it is
>>>>>>> prerequisite.
>>>>>>>
>>>>>>> So, we may consider implementing it later in another series.
>>>>>>> Zhang Chen, what's your opinion ?
>>>>>>
>>>>>> The problem is this patch make things even worse, it introduces one
>>>>>> more
>>>>>> mutex.
>>>>>>
>>>>>
>>>>> Hmm, for most cases, we need to get these two locks at the same time.
>>>>> We can use one lock to protect conn_list/primary_list/secondary_list,
>>>>> (We need to get this lock before operate all these lists, as you can
>>>>> see
>>>>> in patch 2, while do checkpoint, we may operate these lists in
>>>>> COLO checkpoint thread concurrently.)
>>>>>
>>>>> But for the original codes, we didn't got timer_check_lock in
>>>>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>>>>> and didn't got this lock in colo_compare_connection while operate
>>>>> secondary_list either.
>>>>>
>>>>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>>>>> add the lock where it is need ?
>>>>
>>>> I'd like to know if timer were run in colo thread (this looks as 
>>>> simple
>>>> as a g_timeout_source_new() followed by a g_source_attach()), why 
>>>> do we
>>>> still need a mutex. And if we need it now but plan to remove it in the
>>>> future, I'd like to use big lock to simplify the codes.
>>>>
>>>
>>> After investigated your above suggestion, I think it works by using
>>> g_timeout_source_new() and g_source_attach(), but I'm not sure
>>> if it is a good idea to use the big lock to protect the possible
>>> concurrent cases which seem to only happen between COLO migration
>>> thread and COLO compare thread, any further suggestions ?
>>
>> I think I need first understand why migration thread need to access the
>> list?
>>
>
> Er, sorry to confuse you here, to be exactly, it is COLO checkpoint 
> thread,
> we reuse the migration thread to realize checkpoint process for COLO,
> Because qemu enters into COLO state after a complete migration, so we 
> reuse it.
>
> While do checkpoint, we need to release all the buffered packets that 
> have not
> yet been compared, so we need to access the list in COLO checkpoint 
> thread.

I think the better way is notify the comparing thread and let it do the 
releasing. You probably need similar mechanism to notify from comparing 
thread to checkpoint thread.

Thanks

>
> Thanks. 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-07  9:21                   ` Jason Wang
@ 2017-02-07  9:30                     ` Hailiang Zhang
  2017-02-14  2:32                     ` Hailiang Zhang
  1 sibling, 0 replies; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-07  9:30 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian

On 2017/2/7 17:21, Jason Wang wrote:
>
>
> On 2017年02月07日 16:19, Hailiang Zhang wrote:
>> On 2017/2/7 15:57, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月07日 15:54, Hailiang Zhang wrote:
>>>> Hi Jason,
>>>>
>>>> On 2017/2/6 20:53, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017年02月06日 19:11, Hailiang Zhang wrote:
>>>>>> On 2017/2/6 17:35, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>>>>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>>>>>> is used to protect the 'conn_list' queue and its child queues
>>>>>>>>>> which
>>>>>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>>>>>> and confusing
>>>>>>>>>>
>>>>>>>>>> To make it clearer, we rename 'timer_check_lock' to
>>>>>>>>>> 'conn_list_lock'
>>>>>>>>>> which is used to protect 'conn_list' queue, use another
>>>>>>>>>> 'conn_lock'
>>>>>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>>>>>
>>>>>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>>>>>
>>>>>>>>> Instead of sticking to such kind of mutex, I think it's time to
>>>>>>>>> make
>>>>>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>>>>>
>>>>>>>>
>>>>>>>> Er, it seems that, we still need these mutex locks even we make
>>>>>>>> colo
>>>>>>>> timer and colo thread run in the same thread, because we may access
>>>>>>>> the connect/primary/secondary list from colo (migratioin) thread
>>>>>>>> concurrently.
>>>>>>>
>>>>>>> Just make sure I understand the issue, why need it access the list?
>>>>>>>
>>>>>>>>
>>>>>>>> Besides, it seems to be a little complex to make colo timer run in
>>>>>>>> colo
>>>>>>>> compare thread, and it is not this series' goal.
>>>>>>>
>>>>>>> Seems not by just looking at how it was implemented in main loop,
>>>>>>> but
>>>>>>> maybe I was wrong.
>>>>>>>
>>>>>>>> This series is preparing
>>>>>>>> work for integrating COLO compare with COLO frame and it is
>>>>>>>> prerequisite.
>>>>>>>>
>>>>>>>> So, we may consider implementing it later in another series.
>>>>>>>> Zhang Chen, what's your opinion ?
>>>>>>>
>>>>>>> The problem is this patch make things even worse, it introduces one
>>>>>>> more
>>>>>>> mutex.
>>>>>>>
>>>>>>
>>>>>> Hmm, for most cases, we need to get these two locks at the same time.
>>>>>> We can use one lock to protect conn_list/primary_list/secondary_list,
>>>>>> (We need to get this lock before operate all these lists, as you can
>>>>>> see
>>>>>> in patch 2, while do checkpoint, we may operate these lists in
>>>>>> COLO checkpoint thread concurrently.)
>>>>>>
>>>>>> But for the original codes, we didn't got timer_check_lock in
>>>>>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>>>>>> and didn't got this lock in colo_compare_connection while operate
>>>>>> secondary_list either.
>>>>>>
>>>>>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>>>>>> add the lock where it is need ?
>>>>>
>>>>> I'd like to know if timer were run in colo thread (this looks as
>>>>> simple
>>>>> as a g_timeout_source_new() followed by a g_source_attach()), why
>>>>> do we
>>>>> still need a mutex. And if we need it now but plan to remove it in the
>>>>> future, I'd like to use big lock to simplify the codes.
>>>>>
>>>>
>>>> After investigated your above suggestion, I think it works by using
>>>> g_timeout_source_new() and g_source_attach(), but I'm not sure
>>>> if it is a good idea to use the big lock to protect the possible
>>>> concurrent cases which seem to only happen between COLO migration
>>>> thread and COLO compare thread, any further suggestions ?
>>>
>>> I think I need first understand why migration thread need to access the
>>> list?
>>>
>>
>> Er, sorry to confuse you here, to be exactly, it is COLO checkpoint
>> thread,
>> we reuse the migration thread to realize checkpoint process for COLO,
>> Because qemu enters into COLO state after a complete migration, so we
>> reuse it.
>>
>> While do checkpoint, we need to release all the buffered packets that
>> have not
>> yet been compared, so we need to access the list in COLO checkpoint
>> thread.
>
> I think the better way is notify the comparing thread and let it do the
> releasing. You probably need similar mechanism to notify from comparing
> thread to checkpoint thread.
>

Hmm, that's really a good idea, it can avoid using the mutex lock,
I'll try to implement it in next version, thanks very much. :)

> Thanks
>
>>
>> Thanks.
>
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-07  9:21                   ` Jason Wang
  2017-02-07  9:30                     ` Hailiang Zhang
@ 2017-02-14  2:32                     ` Hailiang Zhang
  2017-02-14  4:08                       ` Jason Wang
  1 sibling, 1 reply; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-14  2:32 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian

On 2017/2/7 17:21, Jason Wang wrote:
>
>
> On 2017年02月07日 16:19, Hailiang Zhang wrote:
>> On 2017/2/7 15:57, Jason Wang wrote:
>>>
>>>
>>> On 2017年02月07日 15:54, Hailiang Zhang wrote:
>>>> Hi Jason,
>>>>
>>>> On 2017/2/6 20:53, Jason Wang wrote:
>>>>>
>>>>>
>>>>> On 2017年02月06日 19:11, Hailiang Zhang wrote:
>>>>>> On 2017/2/6 17:35, Jason Wang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2017年02月06日 16:13, Hailiang Zhang wrote:
>>>>>>>> On 2017/2/3 11:47, Jason Wang wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2017年01月24日 22:05, zhanghailiang wrote:
>>>>>>>>>> The original 'timer_check_lock' mutex lock of struct CompareState
>>>>>>>>>> is used to protect the 'conn_list' queue and its child queues
>>>>>>>>>> which
>>>>>>>>>> are 'primary_list' and 'secondary_list', which is a little abused
>>>>>>>>>> and confusing
>>>>>>>>>>
>>>>>>>>>> To make it clearer, we rename 'timer_check_lock' to
>>>>>>>>>> 'conn_list_lock'
>>>>>>>>>> which is used to protect 'conn_list' queue, use another
>>>>>>>>>> 'conn_lock'
>>>>>>>>>> to protect 'primary_list' and 'secondary_list'.
>>>>>>>>>>
>>>>>>>>>> Besides, fix some missing places which need these mutex lock.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: zhanghailiang<zhang.zhanghailiang@huawei.com>
>>>>>>>>>
>>>>>>>>> Instead of sticking to such kind of mutex, I think it's time to
>>>>>>>>> make
>>>>>>>>> colo timer run in colo thread (there's a TODO in the code).
>>>>>>>>>
>>>>>>>>
>>>>>>>> Er, it seems that, we still need these mutex locks even we make
>>>>>>>> colo
>>>>>>>> timer and colo thread run in the same thread, because we may access
>>>>>>>> the connect/primary/secondary list from colo (migratioin) thread
>>>>>>>> concurrently.
>>>>>>>
>>>>>>> Just make sure I understand the issue, why need it access the list?
>>>>>>>
>>>>>>>>
>>>>>>>> Besides, it seems to be a little complex to make colo timer run in
>>>>>>>> colo
>>>>>>>> compare thread, and it is not this series' goal.
>>>>>>>
>>>>>>> Seems not by just looking at how it was implemented in main loop,
>>>>>>> but
>>>>>>> maybe I was wrong.
>>>>>>>
>>>>>>>> This series is preparing
>>>>>>>> work for integrating COLO compare with COLO frame and it is
>>>>>>>> prerequisite.
>>>>>>>>
>>>>>>>> So, we may consider implementing it later in another series.
>>>>>>>> Zhang Chen, what's your opinion ?
>>>>>>>
>>>>>>> The problem is this patch make things even worse, it introduces one
>>>>>>> more
>>>>>>> mutex.
>>>>>>>
>>>>>>
>>>>>> Hmm, for most cases, we need to get these two locks at the same time.
>>>>>> We can use one lock to protect conn_list/primary_list/secondary_list,
>>>>>> (We need to get this lock before operate all these lists, as you can
>>>>>> see
>>>>>> in patch 2, while do checkpoint, we may operate these lists in
>>>>>> COLO checkpoint thread concurrently.)
>>>>>>
>>>>>> But for the original codes, we didn't got timer_check_lock in
>>>>>> packet_enqueue() while operate conn_list/primary_list/secondary_list,
>>>>>> and didn't got this lock in colo_compare_connection while operate
>>>>>> secondary_list either.
>>>>>>
>>>>>> So, is it OK to use the conn_lock instead of timer_check_lock, and
>>>>>> add the lock where it is need ?
>>>>>
>>>>> I'd like to know if timer were run in colo thread (this looks as
>>>>> simple
>>>>> as a g_timeout_source_new() followed by a g_source_attach()), why
>>>>> do we
>>>>> still need a mutex. And if we need it now but plan to remove it in the
>>>>> future, I'd like to use big lock to simplify the codes.
>>>>>
>>>>
>>>> After investigated your above suggestion, I think it works by using
>>>> g_timeout_source_new() and g_source_attach(), but I'm not sure
>>>> if it is a good idea to use the big lock to protect the possible
>>>> concurrent cases which seem to only happen between COLO migration
>>>> thread and COLO compare thread, any further suggestions ?
>>>
>>> I think I need first understand why migration thread need to access the
>>> list?
>>>
>>
>> Er, sorry to confuse you here, to be exactly, it is COLO checkpoint
>> thread,
>> we reuse the migration thread to realize checkpoint process for COLO,
>> Because qemu enters into COLO state after a complete migration, so we
>> reuse it.
>>
>> While do checkpoint, we need to release all the buffered packets that
>> have not
>> yet been compared, so we need to access the list in COLO checkpoint
>> thread.
>

Hi Jason,

> I think the better way is notify the comparing thread and let it do the
> releasing. You probably need similar mechanism to notify from comparing
> thread to checkpoint thread.
>

It seems that there is no available APIs in glib to notify a thread which
are run coroutine to do something (idle source ?). What about using anonymous pipe
as the GPollFD to communicate between colo comparing thread and colo thread ?

Any ideas ?

Hailiang


> Thanks
>
>>
>> Thanks.
>
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-14  2:32                     ` Hailiang Zhang
@ 2017-02-14  4:08                       ` Jason Wang
  2017-02-14  7:33                         ` Hailiang Zhang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2017-02-14  4:08 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian



On 2017年02月14日 10:32, Hailiang Zhang wrote:
>>>
>>
>
> Hi Jason,
>
>> I think the better way is notify the comparing thread and let it do the
>> releasing. You probably need similar mechanism to notify from comparing
>> thread to checkpoint thread.
>>
>
> It seems that there is no available APIs in glib to notify a thread which
> are run coroutine to do something (idle source ?). What about using 
> anonymous pipe
> as the GPollFD to communicate between colo comparing thread and colo 
> thread ?
>
> Any ideas ?
>
> Hailiang

Haven't thought this deeply. But I think you can try event notifier 
first which was designed for such kind of notification.

Thanks

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-14  4:08                       ` Jason Wang
@ 2017-02-14  7:33                         ` Hailiang Zhang
  2017-02-15  3:16                           ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Hailiang Zhang @ 2017-02-14  7:33 UTC (permalink / raw)
  To: Jason Wang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian

On 2017/2/14 12:08, Jason Wang wrote:
>
>
> On 2017年02月14日 10:32, Hailiang Zhang wrote:
>>>>
>>>
>>
>> Hi Jason,
>>
>>> I think the better way is notify the comparing thread and let it do the
>>> releasing. You probably need similar mechanism to notify from comparing
>>> thread to checkpoint thread.
>>>
>>
>> It seems that there is no available APIs in glib to notify a thread which
>> are run coroutine to do something (idle source ?). What about using
>> anonymous pipe
>> as the GPollFD to communicate between colo comparing thread and colo
>> thread ?
>>
>> Any ideas ?
>>
>> Hailiang
>
> Haven't thought this deeply. But I think you can try event notifier
> first which was designed for such kind of notification.
>

Hmm, It is quite same with what i said above except it uses the
return value of eventfd as the GPollFD.
The things here are more complex especially if there are more than
one vNIC, which each is corresponding to one compare thread.

How about using g_source_attach(), g_source_attach(), g_source_remove(),
g_source_set_callback() and g_source_set_priority() to control the process
of releasing packets dynamically ? Which we call g_source_attach()
to add the coroutine of releasing packets when do checkpoint, and remove
it once finishing the checkpoint process.

About colo comparing thread notifies colo thread, i think we can use
qemu_cond_wait(),qemu_cond_broadcast(), just like what we do in
pause_all_vcpus().

Thanks.
Hailiang

> Thanks
>
> .
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage
  2017-02-14  7:33                         ` Hailiang Zhang
@ 2017-02-15  3:16                           ` Jason Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2017-02-15  3:16 UTC (permalink / raw)
  To: Hailiang Zhang, zhangchen.fnst; +Cc: xuquan8, qemu-devel, lizhijian



On 2017年02月14日 15:33, Hailiang Zhang wrote:
> On 2017/2/14 12:08, Jason Wang wrote:
>>
>>
>> On 2017年02月14日 10:32, Hailiang Zhang wrote:
>>>>>
>>>>
>>>
>>> Hi Jason,
>>>
>>>> I think the better way is notify the comparing thread and let it do 
>>>> the
>>>> releasing. You probably need similar mechanism to notify from 
>>>> comparing
>>>> thread to checkpoint thread.
>>>>
>>>
>>> It seems that there is no available APIs in glib to notify a thread 
>>> which
>>> are run coroutine to do something (idle source ?). What about using
>>> anonymous pipe
>>> as the GPollFD to communicate between colo comparing thread and colo
>>> thread ?
>>>
>>> Any ideas ?
>>>
>>> Hailiang
>>
>> Haven't thought this deeply. But I think you can try event notifier
>> first which was designed for such kind of notification.
>>
>
> Hmm, It is quite same with what i said above except it uses the
> return value of eventfd as the GPollFD.
> The things here are more complex especially if there are more than
> one vNIC, which each is corresponding to one compare thread.
>
> How about using g_source_attach(), g_source_attach(), g_source_remove(),
> g_source_set_callback() and g_source_set_priority() to control the 
> process
> of releasing packets dynamically ? Which we call g_source_attach()
> to add the coroutine of releasing packets when do checkpoint, and remove
> it once finishing the checkpoint process.

I'm not sure I get the idea, maybe you can post patch for early review.

>
> About colo comparing thread notifies colo thread, i think we can use
> qemu_cond_wait(),qemu_cond_broadcast(), just like what we do in
> pause_all_vcpus().

Yes, agree.

Thanks

>
> Thanks.
> Hailiang
>
>> Thanks
>>
>> .
>>
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-02-15  3:16 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-24 14:05 [Qemu-devel] [PATCH 0/3] colo-compare: Preparing work for combining with COLO frame zhanghailiang
2017-01-24 14:05 ` [Qemu-devel] [PATCH 1/3] colo-compare: reconstruct the mutex lock usage zhanghailiang
2017-02-03  3:47   ` Jason Wang
2017-02-06  8:13     ` Hailiang Zhang
2017-02-06  8:30       ` Zhang Chen
2017-02-06  9:35       ` Jason Wang
2017-02-06 11:11         ` Hailiang Zhang
2017-02-06 12:53           ` Jason Wang
2017-02-07  7:54             ` Hailiang Zhang
2017-02-07  7:57               ` Jason Wang
2017-02-07  8:19                 ` Hailiang Zhang
2017-02-07  9:21                   ` Jason Wang
2017-02-07  9:30                     ` Hailiang Zhang
2017-02-14  2:32                     ` Hailiang Zhang
2017-02-14  4:08                       ` Jason Wang
2017-02-14  7:33                         ` Hailiang Zhang
2017-02-15  3:16                           ` Jason Wang
2017-01-24 14:05 ` [Qemu-devel] [PATCH 2/3] colo-compare: add API to flush all queued packets while do checkpoint zhanghailiang
2017-01-24 14:05 ` [Qemu-devel] [PATCH 3/3] colo-compare: use notifier to notify inconsistent packets comparing zhanghailiang
2017-02-03  4:50   ` Jason Wang
2017-02-06  8:44     ` Hailiang Zhang
2017-02-06  9:35       ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.