* [PATCH 00/10] Fixed some bugs and optimized some codes for COLO
@ 2021-01-13  2:46 leirao
  2021-01-13  2:46 ` [PATCH 01/10] Remove some duplicate trace code leirao
                   ` (10 more replies)
  0 siblings, 11 replies; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: leirao, qemu-devel

The series of patches includes:
	Fixes for several QEMU crashes.
	Optimizations that reduce the checkpoint time.
	Removal of some unnecessary code to improve COLO.

Rao, Lei (10):
  Remove some duplicate trace code.
  Fix the qemu crash when guest shutdown during checkpoint
  Optimize the function of filter_send
  Remove migrate_set_block_enabled in checkpoint
  Optimize the function of packet_new
  Add the function of colo_compare_cleanup
  Disable auto-converge before entering COLO mode.
  Reduce the PVM stop time during Checkpoint
  Add the function of colo_bitmap_clear_dirty
  Fixed calculation error of pkt->header_size in fill_pkt_tcp_info()

 migration/colo.c      |  6 -----
 migration/migration.c | 20 +++++++++++++++-
 migration/ram.c       | 65 ++++++++++++++++++++++++++++++++++++++++++++++++---
 net/colo-compare.c    | 32 ++++++++++++-------------
 net/colo-compare.h    |  1 +
 net/colo.c            |  4 ++--
 net/colo.h            |  2 +-
 net/filter-mirror.c   |  8 +++----
 net/filter-rewriter.c |  1 -
 net/net.c             |  4 ++++
 softmmu/runstate.c    |  1 +
 11 files changed, 110 insertions(+), 34 deletions(-)

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 01/10] Remove some duplicate trace code.
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-20 18:43   ` Lukas Straub
  2021-01-13  2:46 ` [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint leirao
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

The same trace code already exists in colo_compare_packet_payload(),
so remove the duplicate here.

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 net/colo-compare.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 84db497..9e18baa 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -590,19 +590,6 @@ static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
     uint16_t offset = ppkt->vnet_hdr_len;
 
     trace_colo_compare_main("compare other");
-    if (trace_event_get_state_backends(TRACE_COLO_COMPARE_IP_INFO)) {
-        char pri_ip_src[20], pri_ip_dst[20], sec_ip_src[20], sec_ip_dst[20];
-
-        strcpy(pri_ip_src, inet_ntoa(ppkt->ip->ip_src));
-        strcpy(pri_ip_dst, inet_ntoa(ppkt->ip->ip_dst));
-        strcpy(sec_ip_src, inet_ntoa(spkt->ip->ip_src));
-        strcpy(sec_ip_dst, inet_ntoa(spkt->ip->ip_dst));
-
-        trace_colo_compare_ip_info(ppkt->size, pri_ip_src,
-                                   pri_ip_dst, spkt->size,
-                                   sec_ip_src, sec_ip_dst);
-    }
-
     if (ppkt->size != spkt->size) {
         trace_colo_compare_main("Other: payload size of packets are different");
         return -1;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
  2021-01-13  2:46 ` [PATCH 01/10] Remove some duplicate trace code leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-20 19:12   ` Lukas Straub
  2021-01-13  2:46 ` [PATCH 03/10] Optimize the function of filter_send leirao
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

This patch fixes the following:
    qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
    Aborted (core dumped)
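
For context: QEMU only permits run-state changes that appear in the
runstate_transitions_def table and aborts on anything else, which is where
the message above comes from. A minimal standalone model of that check
(illustrative names only, not QEMU's actual code):

    #include <stdio.h>
    #include <stdlib.h>

    enum state { ST_COLO, ST_RUNNING, ST_SHUTDOWN, ST_MAX };

    /* Allowed (from, to) pairs, like runstate_transitions_def. */
    static const int allowed[ST_MAX][ST_MAX] = {
        [ST_COLO][ST_RUNNING]  = 1,
        [ST_COLO][ST_SHUTDOWN] = 1,   /* the pair this patch adds */
    };

    static enum state current = ST_COLO;

    static void set_state(enum state next)
    {
        if (!allowed[current][next]) {
            fprintf(stderr, "invalid runstate transition\n");
            abort();                  /* the abort seen in the crash above */
        }
        current = next;
    }

    int main(void)
    {
        set_state(ST_SHUTDOWN);       /* only valid once the pair is in the table */
        return 0;
    }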

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 softmmu/runstate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 636aab0..455ad0d 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
 
     { RUN_STATE_COLO, RUN_STATE_RUNNING },
+    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
 
     { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
     { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 03/10] Optimize the function of filter_send
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
  2021-01-13  2:46 ` [PATCH 01/10] Remove some duplicate trace code leirao
  2021-01-13  2:46 ` [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-20 19:21   ` Lukas Straub
  2021-01-13  2:46 ` [PATCH 04/10] Remove migrate_set_block_enabled in checkpoint leirao
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

The iov_size has already been calculated in filter_send(), so we can
directly return that size. This way, there is no need to repeat the
calculation in filter_redirector_receive_iov().

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 net/filter-mirror.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index f8e6500..7fa2eb3 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -88,7 +88,7 @@ static int filter_send(MirrorState *s,
         goto err;
     }
 
-    return 0;
+    return size;
 
 err:
     return ret < 0 ? ret : -EIO;
@@ -159,7 +159,7 @@ static ssize_t filter_mirror_receive_iov(NetFilterState *nf,
     int ret;
 
     ret = filter_send(s, iov, iovcnt);
-    if (ret) {
+    if (ret <= 0) {
         error_report("filter mirror send failed(%s)", strerror(-ret));
     }
 
@@ -182,10 +182,10 @@ static ssize_t filter_redirector_receive_iov(NetFilterState *nf,
 
     if (qemu_chr_fe_backend_connected(&s->chr_out)) {
         ret = filter_send(s, iov, iovcnt);
-        if (ret) {
+        if (ret <= 0) {
             error_report("filter redirector send failed(%s)", strerror(-ret));
         }
-        return iov_size(iov, iovcnt);
+        return ret;
     } else {
         return 0;
     }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 04/10] Remove migrate_set_block_enabled in checkpoint
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (2 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 03/10] Optimize the function of filter_send leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-20 19:28   ` Lukas Straub
  2021-01-13  2:46 ` [PATCH 05/10] Optimize the function of packet_new leirao
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

We can detect disk migration in migrate_prepare(): if disk migration
is enabled in COLO mode, we can directly report an error, and there
is no need to disable block migration at every checkpoint.

Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 migration/colo.c      | 6 ------
 migration/migration.c | 4 ++++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index de27662..1aaf316 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -435,12 +435,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     if (failover_get_state() != FAILOVER_STATUS_NONE) {
         goto out;
     }
-
-    /* Disable block migration */
-    migrate_set_block_enabled(false, &local_err);
-    if (local_err) {
-        goto out;
-    }
     qemu_mutex_lock_iothread();
 
 #ifdef CONFIG_REPLICATION
diff --git a/migration/migration.c b/migration/migration.c
index a5da718..31417ce 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2107,6 +2107,10 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
     }
 
     if (blk || blk_inc) {
+        if (migrate_colo_enabled()) {
+            error_setg(errp, "No disk migration is required in COLO mode");
+            return false;
+        }
         if (migrate_use_block() || migrate_use_block_incremental()) {
             error_setg(errp, "Command options are incompatible with "
                        "current migration capabilities");
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 05/10] Optimize the function of packet_new
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (3 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 04/10] Remove migrate_set_block_enabled in checkpoint leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-20 19:45   ` Lukas Straub
  2021-01-13  2:46 ` [PATCH 06/10] Add the function of colo_compare_cleanup leirao
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

If we move the data copy out of packet_new(), then the
filter-rewriter module performs one less memory copy while
processing each network packet.
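
A minimal sketch of the ownership change (simplified types, glib only; an
illustration, not the actual QEMU structures): before the patch,
packet_new() always did its own g_memdup(), so filter-rewriter copied the
iovec into a temporary buffer, packet_new() copied it a second time, and
the temporary was freed. After the patch the caller hands the buffer over
and packet_new() just stores the pointer:

    #include <glib.h>

    typedef struct {
        void *data;
        int size;
    } Pkt;                                 /* simplified stand-in for Packet */

    /* After the patch: takes ownership of 'data', no second copy. */
    static Pkt *pkt_new(void *data, int size)
    {
        Pkt *p = g_slice_new(Pkt);
        p->data = data;
        p->size = size;
        return p;
    }

    int main(void)
    {
        char *buf = g_strdup("payload");   /* stands in for the iov_to_buf() copy */
        Pkt *p = pkt_new(buf, 8);          /* buffer now owned by the packet */
        g_free(p->data);
        g_slice_free(Pkt, p);
        return 0;
    }

For colo-compare the caller now calls g_memdup() explicitly, so its copy
count is unchanged; only the filter-rewriter path saves a copy.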

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 net/colo-compare.c    | 7 +++++--
 net/colo.c            | 4 ++--
 net/colo.h            | 2 +-
 net/filter-rewriter.c | 1 -
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 9e18baa..8bdf5a8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -247,14 +247,17 @@ static int packet_enqueue(CompareState *s, int mode, Connection **con)
     ConnectionKey key;
     Packet *pkt = NULL;
     Connection *conn;
+    char *data = NULL;
     int ret;
 
     if (mode == PRIMARY_IN) {
-        pkt = packet_new(s->pri_rs.buf,
+        data = g_memdup(s->pri_rs.buf, s->pri_rs.packet_len);
+        pkt = packet_new(data,
                          s->pri_rs.packet_len,
                          s->pri_rs.vnet_hdr_len);
     } else {
-        pkt = packet_new(s->sec_rs.buf,
+        data = g_memdup(s->sec_rs.buf, s->sec_rs.packet_len);
+        pkt = packet_new(data,
                          s->sec_rs.packet_len,
                          s->sec_rs.vnet_hdr_len);
     }
diff --git a/net/colo.c b/net/colo.c
index ef00609..08fb37e 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -155,11 +155,11 @@ void connection_destroy(void *opaque)
     g_slice_free(Connection, conn);
 }
 
-Packet *packet_new(const void *data, int size, int vnet_hdr_len)
+Packet *packet_new(void *data, int size, int vnet_hdr_len)
 {
     Packet *pkt = g_slice_new(Packet);
 
-    pkt->data = g_memdup(data, size);
+    pkt->data = data;
     pkt->size = size;
     pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     pkt->vnet_hdr_len = vnet_hdr_len;
diff --git a/net/colo.h b/net/colo.h
index 573ab91..bd2d719 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -100,7 +100,7 @@ Connection *connection_get(GHashTable *connection_track_table,
 bool connection_has_tracked(GHashTable *connection_track_table,
                             ConnectionKey *key);
 void connection_hashtable_reset(GHashTable *connection_track_table);
-Packet *packet_new(const void *data, int size, int vnet_hdr_len);
+Packet *packet_new(void *data, int size, int vnet_hdr_len);
 void packet_destroy(void *opaque, void *user_data);
 void packet_destroy_partial(void *opaque, void *user_data);
 
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index fc0e64c..e24afe5 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -271,7 +271,6 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
     }
 
     pkt = packet_new(buf, size, vnet_hdr_len);
-    g_free(buf);
 
     /*
      * if we get tcp packet
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 06/10] Add the function of colo_compare_cleanup
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (4 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 05/10] Optimize the function of packet_new leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-13  2:46 ` [PATCH 07/10] Disable auto-converge before entering COLO mode leirao
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

This patch fixes the following:
    #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    #1  0x00007f6ae4559859 in __GI_abort () at abort.c:79
    #2  0x0000559aaa386720 in error_exit (err=16, msg=0x559aaa5973d0 <__func__.16227> "qemu_mutex_destroy") at util/qemu-thread-posix.c:36
    #3  0x0000559aaa3868c5 in qemu_mutex_destroy (mutex=0x559aabffe828) at util/qemu-thread-posix.c:69
    #4  0x0000559aaa2f93a8 in char_finalize (obj=0x559aabffe800) at chardev/char.c:285
    #5  0x0000559aaa23318a in object_deinit (obj=0x559aabffe800, type=0x559aabfd7d20) at qom/object.c:606
    #6  0x0000559aaa2331b8 in object_deinit (obj=0x559aabffe800, type=0x559aabfd9060) at qom/object.c:610
    #7  0x0000559aaa233200 in object_finalize (data=0x559aabffe800) at qom/object.c:620
    #8  0x0000559aaa234202 in object_unref (obj=0x559aabffe800) at qom/object.c:1074
    #9  0x0000559aaa2356b6 in object_finalize_child_property (obj=0x559aac0dac10, name=0x559aac778760 "compare0-0", opaque=0x559aabffe800) at qom/object.c:1584
    #10 0x0000559aaa232f70 in object_property_del_all (obj=0x559aac0dac10) at qom/object.c:557
    #11 0x0000559aaa2331ed in object_finalize (data=0x559aac0dac10) at qom/object.c:619
    #12 0x0000559aaa234202 in object_unref (obj=0x559aac0dac10) at qom/object.c:1074
    #13 0x0000559aaa2356b6 in object_finalize_child_property (obj=0x559aac0c75c0, name=0x559aac0dadc0 "chardevs", opaque=0x559aac0dac10) at qom/object.c:1584
    #14 0x0000559aaa233071 in object_property_del_child (obj=0x559aac0c75c0, child=0x559aac0dac10, errp=0x0) at qom/object.c:580
    #15 0x0000559aaa233155 in object_unparent (obj=0x559aac0dac10) at qom/object.c:599
    #16 0x0000559aaa2fb721 in qemu_chr_cleanup () at chardev/char.c:1159
    #17 0x0000559aa9f9b110 in main (argc=54, argv=0x7ffeb62fa998, envp=0x7ffeb62fab50) at vl.c:4539

When a chardev is cleaned up, its chr_write_lock needs to be destroyed.
But when the guest powers off, the colo-compare module is not cleaned up
before that point and is still holding chr_write_lock, which makes QEMU
crash. So we add colo_compare_cleanup() and call it before
qemu_chr_cleanup() to fix the bug.
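
The failure mode can be reproduced outside QEMU with a plain pthread mutex
(illustrative only): as the backtrace shows, qemu_mutex_destroy() wraps
pthread_mutex_destroy() and treats any error as fatal via error_exit().

    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

        pthread_mutex_lock(&lock);               /* colo-compare still holds chr_write_lock */
        int err = pthread_mutex_destroy(&lock);  /* char_finalize() tearing down the chardev */
        printf("pthread_mutex_destroy -> %d (%s)\n", err, strerror(err));
        /* On glibc this typically reports 16 (EBUSY), matching
         * error_exit(err=16, ...) in the backtrace; QEMU then abort()s. */
        return 0;
    }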

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 net/colo-compare.c | 10 ++++++++++
 net/colo-compare.h |  1 +
 net/net.c          |  4 ++++
 3 files changed, 15 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 8bdf5a8..06f2c28 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -1404,6 +1404,16 @@ static void colo_compare_init(Object *obj)
                              compare_set_vnet_hdr);
 }
 
+void colo_compare_cleanup(void)
+{
+    CompareState *tmp = NULL;
+    CompareState *n = NULL;
+
+    QTAILQ_FOREACH_SAFE(tmp, &net_compares, next, n) {
+        object_unparent(OBJECT(tmp));
+    }
+}
+
 static void colo_compare_finalize(Object *obj)
 {
     CompareState *s = COLO_COMPARE(obj);
diff --git a/net/colo-compare.h b/net/colo-compare.h
index 22ddd51..b055270 100644
--- a/net/colo-compare.h
+++ b/net/colo-compare.h
@@ -20,5 +20,6 @@
 void colo_notify_compares_event(void *opaque, int event, Error **errp);
 void colo_compare_register_notifier(Notifier *notify);
 void colo_compare_unregister_notifier(Notifier *notify);
+void colo_compare_cleanup(void);
 
 #endif /* QEMU_COLO_COMPARE_H */
diff --git a/net/net.c b/net/net.c
index e1035f2..f69db4b 100644
--- a/net/net.c
+++ b/net/net.c
@@ -53,6 +53,7 @@
 #include "sysemu/qtest.h"
 #include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
+#include "net/colo-compare.h"
 #include "net/filter.h"
 #include "qapi/string-output-visitor.h"
 
@@ -1366,6 +1367,9 @@ void net_cleanup(void)
 {
     NetClientState *nc;
 
+    /*cleanup colo compare module for COLO*/
+    colo_compare_cleanup();
+
     /* We may del multiple entries during qemu_del_net_client(),
      * so QTAILQ_FOREACH_SAFE() is also not safe here.
      */
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 07/10] Disable auto-converge before entering COLO mode.
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (5 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 06/10] Add the function of colo_compare_cleanup leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-13 11:31   ` Dr. David Alan Gilbert
  2021-02-14 10:52   ` Lukas Straub
  2021-01-13  2:46 ` [PATCH 08/10] Reduce the PVM stop time during Checkpoint leirao
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

If we don't disable the auto-converge feature of live migration
before entering COLO mode, it will keep running after COLO starts,
and eventually the system will hang because the CPU throttle reaches
DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
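
A rough standalone illustration of the ratchet (the 20/10/99 numbers are
assumed to be QEMU's usual cpu-throttle defaults and are used here only as
assumptions): roughly speaking, if auto-converge stays enabled, the
periodic checkpoints behave like migration iterations that never converge,
so the throttle keeps being bumped until it pins near the maximum and the
guest appears hung.

    #include <stdio.h>

    int main(void)
    {
        /* Assumed defaults: initial 20%, +10% per round, capped at 99%. */
        int throttle = 0;

        for (int round = 1; round <= 12; round++) {
            throttle = throttle == 0 ? 20 : throttle + 10;
            if (throttle > 99) {
                throttle = 99;        /* DEFAULT_MIGRATE_MAX_CPU_THROTTLE-like cap */
            }
            printf("checkpoint round %2d -> cpu throttle %d%%\n", round, throttle);
        }
        return 0;
    }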

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 migration/migration.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 31417ce..6ab37e5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error **errp)
     qapi_free_MigrationCapabilityStatusList(cap);
 }
 
+static void colo_auto_converge_enabled(bool value, Error **errp)
+{
+    MigrationCapabilityStatusList *cap = NULL;
+
+    if (migrate_colo_enabled() && migrate_auto_converge()) {
+        QAPI_LIST_PREPEND(cap,
+                          migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
+                                          value));
+        qmp_migrate_set_capabilities(cap, errp);
+        qapi_free_MigrationCapabilityStatusList(cap);
+    }
+    cpu_throttle_stop();
+}
+
 static void migrate_set_block_incremental(MigrationState *s, bool value)
 {
     s->parameters.block_incremental = value;
@@ -3401,7 +3415,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
 static void migration_iteration_finish(MigrationState *s)
 {
     /* If we enabled cpu throttling for auto-converge, turn it off. */
-    cpu_throttle_stop();
+    colo_auto_converge_enabled(false, &error_abort);
 
     qemu_mutex_lock_iothread();
     switch (s->state) {
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 08/10] Reduce the PVM stop time during Checkpoint
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (6 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 07/10] Disable auto-converge before entering COLO mode leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-13  2:46 ` [PATCH 09/10] Add the function of colo_bitmap_clear_dirty leirao
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

When flushing memory from the RAM cache to RAM during every checkpoint
on the secondary VM, we can copy contiguous chunks of memory instead of
4096 bytes at a time, which reduces the VM downtime during the checkpoint.
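
A standalone sketch of the idea (illustrative only; the real code uses
find_next_bit()/find_next_zero_bit() on the RAMBlock dirty bitmap):
instead of flushing one page per loop iteration, find a run of consecutive
dirty pages and flush the whole run with a single memcpy().

    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 4096
    #define NPAGES    16

    /* Return the first dirty page at or after 'start' and the run length
     * in *num, or NPAGES if nothing is dirty any more. */
    static int find_dirty_run(const int *dirty, int start, int *num)
    {
        int first = start;

        while (first < NPAGES && !dirty[first]) {
            first++;
        }
        if (first == NPAGES) {
            return NPAGES;
        }
        int next = first + 1;
        while (next < NPAGES && dirty[next]) {
            next++;
        }
        *num = next - first;
        return first;
    }

    int main(void)
    {
        static char cache[NPAGES][PAGE_SIZE], ram[NPAGES][PAGE_SIZE];
        int dirty[NPAGES] = { [3] = 1, [4] = 1, [5] = 1, [9] = 1 };
        int offset = 0, num = 0, copies = 0;

        while ((offset = find_dirty_run(dirty, offset, &num)) < NPAGES) {
            memcpy(ram[offset], cache[offset], (size_t)num * PAGE_SIZE);
            memset(&dirty[offset], 0, num * sizeof(dirty[0]));
            offset += num;
            copies++;
        }
        printf("4 dirty pages flushed with %d memcpy calls\n", copies);
        return 0;
    }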

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 migration/ram.c | 44 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 7811cde..d875e9a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -806,6 +806,39 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     return next;
 }
 
+/*
+ * colo_bitmap_find_dirty: find contiguous dirty pages from start
+ *
+ * Returns the page offset within the memory region of the start of the
+ * contiguous dirty pages
+ *
+ * @rs: current RAM state
+ * @rb: RAMBlock where to search for dirty pages
+ * @start: page where we start the search
+ * @num: the number of contiguous dirty pages
+ */
+static inline
+unsigned long colo_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
+                                     unsigned long start, unsigned long *num)
+{
+    unsigned long size = rb->used_length >> TARGET_PAGE_BITS;
+    unsigned long *bitmap = rb->bmap;
+    unsigned long first, next;
+
+    if (ramblock_is_ignored(rb)) {
+        return size;
+    }
+
+    first = find_next_bit(bitmap, size, start);
+    if (first >= size) {
+        return first;
+    }
+    next = find_next_zero_bit(bitmap, size, first + 1);
+    assert(next >= first);
+    *num = next - first;
+    return first;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
                                                 RAMBlock *rb,
                                                 unsigned long page)
@@ -3372,6 +3405,8 @@ void colo_flush_ram_cache(void)
     void *dst_host;
     void *src_host;
     unsigned long offset = 0;
+    unsigned long num = 0;
+    unsigned long i = 0;
 
     memory_global_dirty_log_sync();
     WITH_RCU_READ_LOCK_GUARD() {
@@ -3385,19 +3420,22 @@ void colo_flush_ram_cache(void)
         block = QLIST_FIRST_RCU(&ram_list.blocks);
 
         while (block) {
-            offset = migration_bitmap_find_dirty(ram_state, block, offset);
+            offset = colo_bitmap_find_dirty(ram_state, block, offset, &num);
 
             if (((ram_addr_t)offset) << TARGET_PAGE_BITS
                 >= block->used_length) {
                 offset = 0;
+                num = 0;
                 block = QLIST_NEXT_RCU(block, next);
             } else {
-                migration_bitmap_clear_dirty(ram_state, block, offset);
+                for (i = 0; i < num; i++) {
+                    migration_bitmap_clear_dirty(ram_state, block, offset + i);
+                }
                 dst_host = block->host
                          + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
                 src_host = block->colo_cache
                          + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
-                memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+                memcpy(dst_host, src_host, TARGET_PAGE_SIZE * num);
             }
         }
     }
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 09/10] Add the function of colo_bitmap_clear_dirty
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (7 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 08/10] Reduce the PVM stop time during Checkpoint leirao
@ 2021-01-13  2:46 ` leirao
  2021-01-13  2:46 ` [PATCH 10/10] Fixed calculation error of pkt->header_size in fill_pkt_tcp_info() leirao
  2021-02-14 11:50 ` [PATCH 00/10] Fixed some bugs and optimized some codes for COLO Lukas Straub
  10 siblings, 0 replies; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

Now that we flush the RAM cache on the secondary VM by copying
contiguous runs of dirty memory, we can also clear the corresponding
bitmap bits in one pass. This further reduces the VM downtime during
the checkpoint.

Signed-off-by: Lei Rao <lei.rao@intel.com>
---
 migration/ram.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index d875e9a..0f43b79 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -839,6 +839,30 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     return first;
 }
 
+/**
+ * colo_bitmap_clear_dirty: when we flush the ram cache to ram, we use a
+ * contiguous memory copy, so we can also clean up the bitmap of contiguous
+ * dirty memory in one pass.
+ */
+static inline bool colo_bitmap_clear_dirty(RAMState *rs,
+                                           RAMBlock *rb,
+                                           unsigned long start,
+                                           unsigned long num)
+{
+    bool ret;
+    unsigned long i = 0;
+
+    qemu_mutex_lock(&rs->bitmap_mutex);
+    for (i = 0; i < num; i++) {
+        ret = test_and_clear_bit(start + i, rb->bmap);
+        if (ret) {
+            rs->migration_dirty_pages--;
+        }
+    }
+    qemu_mutex_unlock(&rs->bitmap_mutex);
+    return ret;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
                                                 RAMBlock *rb,
                                                 unsigned long page)
@@ -3406,7 +3430,6 @@ void colo_flush_ram_cache(void)
     void *src_host;
     unsigned long offset = 0;
     unsigned long num = 0;
-    unsigned long i = 0;
 
     memory_global_dirty_log_sync();
     WITH_RCU_READ_LOCK_GUARD() {
@@ -3428,9 +3451,7 @@ void colo_flush_ram_cache(void)
                 num = 0;
                 block = QLIST_NEXT_RCU(block, next);
             } else {
-                for (i = 0; i < num; i++) {
-                    migration_bitmap_clear_dirty(ram_state, block, offset + i);
-                }
+                colo_bitmap_clear_dirty(ram_state, block, offset, num);
                 dst_host = block->host
                          + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
                 src_host = block->colo_cache
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/10] Fixed calculation error of pkt->header_size in fill_pkt_tcp_info()
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (8 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 09/10] Add the function of colo_bitmap_clear_dirty leirao
@ 2021-01-13  2:46 ` leirao
  2021-02-14 11:50 ` [PATCH 00/10] Fixed some bugs and optimized some codes for COLO Lukas Straub
  10 siblings, 0 replies; 27+ messages in thread
From: leirao @ 2021-01-13  2:46 UTC (permalink / raw)
  To: chen.zhang, lizhijian, jasowang, zhang.zhanghailiang, quintela, dgilbert
  Cc: Rao, Lei, qemu-devel

From: "Rao, Lei" <lei.rao@intel.com>

The data pointer has already skipped vnet_hdr_len in
parse_packet_early(), so we must not subtract vnet_hdr_len again
when calculating pkt->header_size in fill_pkt_tcp_info(). Otherwise,
it causes network packet comparison errors and greatly increases
the frequency of checkpoints.
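
A worked example with made-up sizes (all numbers are hypothetical, chosen
only to show the effect): assume vnet_hdr_len = 12, a 14-byte Ethernet
header, a 20-byte IP header and a 20-byte TCP header (th_off << 2 = 20).
Since parse_packet_early() derives the headers from pkt->data +
vnet_hdr_len, transport_header sits roughly 12 + 14 + 20 = 46 bytes into
pkt->data, so:

    correct header_size = 46 + 20      = 66
    old (buggy) value   = 46 + 20 - 12 = 54
    payload_size        = pkt->size - header_size

With the old value, payload_size is 12 bytes too large and the payload
comparison starts 12 bytes too early (inside the TCP header), so identical
packets can compare as different and COLO triggers unnecessary checkpoints.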

Signed-off-by: Lei Rao <lei.rao@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 net/colo-compare.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 06f2c28..af30490 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -211,7 +211,7 @@ static void fill_pkt_tcp_info(void *data, uint32_t *max_ack)
     pkt->tcp_ack = ntohl(tcphd->th_ack);
     *max_ack = *max_ack > pkt->tcp_ack ? *max_ack : pkt->tcp_ack;
     pkt->header_size = pkt->transport_header - (uint8_t *)pkt->data
-                       + (tcphd->th_off << 2) - pkt->vnet_hdr_len;
+                       + (tcphd->th_off << 2);
     pkt->payload_size = pkt->size - pkt->header_size;
     pkt->seq_end = pkt->tcp_seq + pkt->payload_size;
     pkt->flags = tcphd->th_flags;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 07/10] Disable auto-converge before entering COLO mode.
  2021-01-13  2:46 ` [PATCH 07/10] Disable auto-converge before entering COLO mode leirao
@ 2021-01-13 11:31   ` Dr. David Alan Gilbert
  2021-01-14  3:21     ` Rao, Lei
  2021-02-14 10:52   ` Lukas Straub
  1 sibling, 1 reply; 27+ messages in thread
From: Dr. David Alan Gilbert @ 2021-01-13 11:31 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, qemu-devel,
	chen.zhang

* leirao (lei.rao@intel.com) wrote:
> From: "Rao, Lei" <lei.rao@intel.com>
> 
> If we don't disable the feature of auto-converge for live migration
> before entering COLO mode, it will continue to run with COLO running,
> and eventually the system will hang due to the CPU throttle reaching
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>

I don't think that's the right answer, because it would seem reasonable
to use auto-converge to ensure that a COLO snapshot succeeded by
limiting guest CPU time.  Is the right fix here to reset the state of
the auto-converge counters at the start of each colo snapshot?

Dave

> ---
>  migration/migration.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error **errp)
>      qapi_free_MigrationCapabilityStatusList(cap);
>  }
>  
> +static void colo_auto_converge_enabled(bool value, Error **errp)
> +{
> +    MigrationCapabilityStatusList *cap = NULL;
> +
> +    if (migrate_colo_enabled() && migrate_auto_converge()) {
> +        QAPI_LIST_PREPEND(cap,
> +                          migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +                                          value));
> +        qmp_migrate_set_capabilities(cap, errp);
> +        qapi_free_MigrationCapabilityStatusList(cap);
> +    }
> +    cpu_throttle_stop();
> +}
> +
>  static void migrate_set_block_incremental(MigrationState *s, bool value)
>  {
>      s->parameters.block_incremental = value;
> @@ -3401,7 +3415,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
>  static void migration_iteration_finish(MigrationState *s)
>  {
>      /* If we enabled cpu throttling for auto-converge, turn it off. */
> -    cpu_throttle_stop();
> +    colo_auto_converge_enabled(false, &error_abort);
>  
>      qemu_mutex_lock_iothread();
>      switch (s->state) {
> -- 
> 1.8.3.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 07/10] Disable auto-converge before entering COLO mode.
  2021-01-13 11:31   ` Dr. David Alan Gilbert
@ 2021-01-14  3:21     ` Rao, Lei
  0 siblings, 0 replies; 27+ messages in thread
From: Rao, Lei @ 2021-01-14  3:21 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, qemu-devel,
	Zhang, Chen

I think there is a difference between doing checkpoints in COLO and live migration.
The purpose of auto-converge is to ensure that live migration succeeds even when dirty pages are generated faster than they can be transferred.
But for COLO, we force the VM to stop whenever a checkpoint is taken. That already guarantees the checkpoint succeeds, so it has nothing to do with auto-converge.

Thanks,
Lei.

-----Original Message-----
From: Dr. David Alan Gilbert <dgilbert@redhat.com> 
Sent: Wednesday, January 13, 2021 7:32 PM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 07/10] Disable auto-converge before entering COLO mode.

* leirao (lei.rao@intel.com) wrote:
> From: "Rao, Lei" <lei.rao@intel.com>
> 
> If we don't disable the feature of auto-converge for live migration 
> before entering COLO mode, it will continue to run with COLO running, 
> and eventually the system will hang due to the CPU throttle reaching 
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>

I don't think that's the right answer, because it would seem reasonable to use auto-converge to ensure that a COLO snapshot succeeded by limiting guest CPU time.  Is the right fix here to reset the state of the auto-converge counters at the start of each colo snapshot?

Dave

> ---
>  migration/migration.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c index 
> 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error **errp)
>      qapi_free_MigrationCapabilityStatusList(cap);
>  }
>  
> +static void colo_auto_converge_enabled(bool value, Error **errp) {
> +    MigrationCapabilityStatusList *cap = NULL;
> +
> +    if (migrate_colo_enabled() && migrate_auto_converge()) {
> +        QAPI_LIST_PREPEND(cap,
> +                          migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +                                          value));
> +        qmp_migrate_set_capabilities(cap, errp);
> +        qapi_free_MigrationCapabilityStatusList(cap);
> +    }
> +    cpu_throttle_stop();
> +}
> +
>  static void migrate_set_block_incremental(MigrationState *s, bool 
> value)  {
>      s->parameters.block_incremental = value; @@ -3401,7 +3415,7 @@ 
> static MigIterateState migration_iteration_run(MigrationState *s)  
> static void migration_iteration_finish(MigrationState *s)  {
>      /* If we enabled cpu throttling for auto-converge, turn it off. */
> -    cpu_throttle_stop();
> +    colo_auto_converge_enabled(false, &error_abort);
>  
>      qemu_mutex_lock_iothread();
>      switch (s->state) {
> --
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 01/10] Remove some duplicate trace code.
  2021-01-13  2:46 ` [PATCH 01/10] Remove some duplicate trace code leirao
@ 2021-01-20 18:43   ` Lukas Straub
  0 siblings, 0 replies; 27+ messages in thread
From: Lukas Straub @ 2021-01-20 18:43 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 1508 bytes --]

On Wed, 13 Jan 2021 10:46:26 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> There is the same trace code in the colo_compare_packet_payload.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>

Looks good to me,

Reviewed-by: Lukas Straub <lukasstraub2@web.de>

> ---
>  net/colo-compare.c | 13 -------------
>  1 file changed, 13 deletions(-)
> 
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 84db497..9e18baa 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -590,19 +590,6 @@ static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
>      uint16_t offset = ppkt->vnet_hdr_len;
>  
>      trace_colo_compare_main("compare other");
> -    if (trace_event_get_state_backends(TRACE_COLO_COMPARE_IP_INFO)) {
> -        char pri_ip_src[20], pri_ip_dst[20], sec_ip_src[20], sec_ip_dst[20];
> -
> -        strcpy(pri_ip_src, inet_ntoa(ppkt->ip->ip_src));
> -        strcpy(pri_ip_dst, inet_ntoa(ppkt->ip->ip_dst));
> -        strcpy(sec_ip_src, inet_ntoa(spkt->ip->ip_src));
> -        strcpy(sec_ip_dst, inet_ntoa(spkt->ip->ip_dst));
> -
> -        trace_colo_compare_ip_info(ppkt->size, pri_ip_src,
> -                                   pri_ip_dst, spkt->size,
> -                                   sec_ip_src, sec_ip_dst);
> -    }
> -
>      if (ppkt->size != spkt->size) {
>          trace_colo_compare_main("Other: payload size of packets are different");
>          return -1;



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-01-13  2:46 ` [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint leirao
@ 2021-01-20 19:12   ` Lukas Straub
  2021-01-21  1:48     ` Rao, Lei
  0 siblings, 1 reply; 27+ messages in thread
From: Lukas Straub @ 2021-01-20 19:12 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 1134 bytes --]

On Wed, 13 Jan 2021 10:46:27 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> This patch fixes the following:
>     qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
>     Aborted (core dumped)
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>

I wonder how that is possible, since the VM is stopped during 'colo' state.

Unrelated to this patch, I think this area needs some work since
the following unintended runstate transition is possible:
'shutdown' -> 'colo' -> 'running'.

> ---
>  softmmu/runstate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index 636aab0..455ad0d 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>  
>      { RUN_STATE_COLO, RUN_STATE_RUNNING },
> +    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
>  
>      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
>      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 03/10] Optimize the function of filter_send
  2021-01-13  2:46 ` [PATCH 03/10] Optimize the function of filter_send leirao
@ 2021-01-20 19:21   ` Lukas Straub
  2021-01-21  1:02     ` Rao, Lei
  0 siblings, 1 reply; 27+ messages in thread
From: Lukas Straub @ 2021-01-20 19:21 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 1611 bytes --]

On Wed, 13 Jan 2021 10:46:28 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> The iov_size has been calculated in filter_send(). we can directly
> return the size.In this way, this is no need to repeat calculations
> in filter_redirector_receive_iov();
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>
> ---
>  net/filter-mirror.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/filter-mirror.c b/net/filter-mirror.c
> index f8e6500..7fa2eb3 100644
> --- a/net/filter-mirror.c
> +++ b/net/filter-mirror.c
> @@ -88,7 +88,7 @@ static int filter_send(MirrorState *s,
>          goto err;
>      }
>  
> -    return 0;
> +    return size;
>  
>  err:
>      return ret < 0 ? ret : -EIO;
> @@ -159,7 +159,7 @@ static ssize_t filter_mirror_receive_iov(NetFilterState *nf,
>      int ret;
>  
>      ret = filter_send(s, iov, iovcnt);
> -    if (ret) {
> +    if (ret <= 0) {
>          error_report("filter mirror send failed(%s)", strerror(-ret));
>      }

0 is a valid return value if the data to send has size = 0.

> @@ -182,10 +182,10 @@ static ssize_t filter_redirector_receive_iov(NetFilterState *nf,
>  
>      if (qemu_chr_fe_backend_connected(&s->chr_out)) {
>          ret = filter_send(s, iov, iovcnt);
> -        if (ret) {
> +        if (ret <= 0) {
>              error_report("filter redirector send failed(%s)", strerror(-ret));
>          }

ditto

> -        return iov_size(iov, iovcnt);
> +        return ret;
>      } else {
>          return 0;
>      }



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 04/10] Remove migrate_set_block_enabled in checkpoint
  2021-01-13  2:46 ` [PATCH 04/10] Remove migrate_set_block_enabled in checkpoint leirao
@ 2021-01-20 19:28   ` Lukas Straub
  0 siblings, 0 replies; 27+ messages in thread
From: Lukas Straub @ 2021-01-20 19:28 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 1860 bytes --]

On Wed, 13 Jan 2021 10:46:29 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> We can detect disk migration in migrate_prepare, if disk migration
> is enabled in COLO mode, we can directly report an error.and there
> is no need to disable block migration at every checkpoint.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>

Looks good to me,

Reviewed-by: Lukas Straub <lukasstraub2@web.de>


> ---
>  migration/colo.c      | 6 ------
>  migration/migration.c | 4 ++++
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index de27662..1aaf316 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -435,12 +435,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>      if (failover_get_state() != FAILOVER_STATUS_NONE) {
>          goto out;
>      }
> -
> -    /* Disable block migration */
> -    migrate_set_block_enabled(false, &local_err);
> -    if (local_err) {
> -        goto out;
> -    }
>      qemu_mutex_lock_iothread();
>  
>  #ifdef CONFIG_REPLICATION
> diff --git a/migration/migration.c b/migration/migration.c
> index a5da718..31417ce 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2107,6 +2107,10 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
>      }
>  
>      if (blk || blk_inc) {
> +        if (migrate_colo_enabled()) {
> +            error_setg(errp, "No disk migration is required in COLO mode");
> +            return false;
> +        }
>          if (migrate_use_block() || migrate_use_block_incremental()) {
>              error_setg(errp, "Command options are incompatible with "
>                         "current migration capabilities");



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 05/10] Optimize the function of packet_new
  2021-01-13  2:46 ` [PATCH 05/10] Optimize the function of packet_new leirao
@ 2021-01-20 19:45   ` Lukas Straub
  0 siblings, 0 replies; 27+ messages in thread
From: Lukas Straub @ 2021-01-20 19:45 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 3200 bytes --]

On Wed, 13 Jan 2021 10:46:30 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> if we put the data copy outside the packet_new(), then for the
> filter-rewrite module, there will be one less memory copy in the
> processing of each network packet.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>

Looks good to me,

Reviewed-by: Lukas Straub <lukasstraub2@web.de>

> ---
>  net/colo-compare.c    | 7 +++++--
>  net/colo.c            | 4 ++--
>  net/colo.h            | 2 +-
>  net/filter-rewriter.c | 1 -
>  4 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 9e18baa..8bdf5a8 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -247,14 +247,17 @@ static int packet_enqueue(CompareState *s, int mode, Connection **con)
>      ConnectionKey key;
>      Packet *pkt = NULL;
>      Connection *conn;
> +    char *data = NULL;
>      int ret;
>  
>      if (mode == PRIMARY_IN) {
> -        pkt = packet_new(s->pri_rs.buf,
> +        data = g_memdup(s->pri_rs.buf, s->pri_rs.packet_len);
> +        pkt = packet_new(data,
>                           s->pri_rs.packet_len,
>                           s->pri_rs.vnet_hdr_len);
>      } else {
> -        pkt = packet_new(s->sec_rs.buf,
> +        data = g_memdup(s->sec_rs.buf, s->sec_rs.packet_len);
> +        pkt = packet_new(data,
>                           s->sec_rs.packet_len,
>                           s->sec_rs.vnet_hdr_len);
>      }
> diff --git a/net/colo.c b/net/colo.c
> index ef00609..08fb37e 100644
> --- a/net/colo.c
> +++ b/net/colo.c
> @@ -155,11 +155,11 @@ void connection_destroy(void *opaque)
>      g_slice_free(Connection, conn);
>  }
>  
> -Packet *packet_new(const void *data, int size, int vnet_hdr_len)
> +Packet *packet_new(void *data, int size, int vnet_hdr_len)
>  {
>      Packet *pkt = g_slice_new(Packet);
>  
> -    pkt->data = g_memdup(data, size);
> +    pkt->data = data;
>      pkt->size = size;
>      pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      pkt->vnet_hdr_len = vnet_hdr_len;
> diff --git a/net/colo.h b/net/colo.h
> index 573ab91..bd2d719 100644
> --- a/net/colo.h
> +++ b/net/colo.h
> @@ -100,7 +100,7 @@ Connection *connection_get(GHashTable *connection_track_table,
>  bool connection_has_tracked(GHashTable *connection_track_table,
>                              ConnectionKey *key);
>  void connection_hashtable_reset(GHashTable *connection_track_table);
> -Packet *packet_new(const void *data, int size, int vnet_hdr_len);
> +Packet *packet_new(void *data, int size, int vnet_hdr_len);
>  void packet_destroy(void *opaque, void *user_data);
>  void packet_destroy_partial(void *opaque, void *user_data);
>  
> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
> index fc0e64c..e24afe5 100644
> --- a/net/filter-rewriter.c
> +++ b/net/filter-rewriter.c
> @@ -271,7 +271,6 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>      }
>  
>      pkt = packet_new(buf, size, vnet_hdr_len);
> -    g_free(buf);
>  
>      /*
>       * if we get tcp packet



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 03/10] Optimize the function of filter_send
  2021-01-20 19:21   ` Lukas Straub
@ 2021-01-21  1:02     ` Rao, Lei
  0 siblings, 0 replies; 27+ messages in thread
From: Rao, Lei @ 2021-01-21  1:02 UTC (permalink / raw)
  To: Lukas Straub
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

OK, you are right,  I will change it in V2.

Thanks,
Lei.

-----Original Message-----
From: Lukas Straub <lukasstraub2@web.de> 
Sent: Thursday, January 21, 2021 3:21 AM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 03/10] Optimize the function of filter_send

On Wed, 13 Jan 2021 10:46:28 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> The iov_size has been calculated in filter_send(). we can directly 
> return the size.In this way, this is no need to repeat calculations in 
> filter_redirector_receive_iov();
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>
> ---
>  net/filter-mirror.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/filter-mirror.c b/net/filter-mirror.c index 
> f8e6500..7fa2eb3 100644
> --- a/net/filter-mirror.c
> +++ b/net/filter-mirror.c
> @@ -88,7 +88,7 @@ static int filter_send(MirrorState *s,
>          goto err;
>      }
>  
> -    return 0;
> +    return size;
>  
>  err:
>      return ret < 0 ? ret : -EIO;
> @@ -159,7 +159,7 @@ static ssize_t filter_mirror_receive_iov(NetFilterState *nf,
>      int ret;
>  
>      ret = filter_send(s, iov, iovcnt);
> -    if (ret) {
> +    if (ret <= 0) {
>          error_report("filter mirror send failed(%s)", strerror(-ret));
>      }

0 is a valid return value if the data to send has size = 0.

> @@ -182,10 +182,10 @@ static ssize_t 
> filter_redirector_receive_iov(NetFilterState *nf,
>  
>      if (qemu_chr_fe_backend_connected(&s->chr_out)) {
>          ret = filter_send(s, iov, iovcnt);
> -        if (ret) {
> +        if (ret <= 0) {
>              error_report("filter redirector send failed(%s)", strerror(-ret));
>          }

ditto

> -        return iov_size(iov, iovcnt);
> +        return ret;
>      } else {
>          return 0;
>      }



-- 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-01-20 19:12   ` Lukas Straub
@ 2021-01-21  1:48     ` Rao, Lei
  2021-01-27 18:24       ` Lukas Straub
  0 siblings, 1 reply; 27+ messages in thread
From: Rao, Lei @ 2021-01-21  1:48 UTC (permalink / raw)
  To: Lukas Straub
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

The Primary VM can be shut down while it is in the COLO state, which may trigger this bug.
About 'shutdown' -> 'colo' -> 'running', I think you are right, I did hit the problem you describe. For 'shutdown' -> 'colo', the fix (commit 5647051f432b7c9b57525470b0a79a31339062d2) has already been merged.
Recently, I found another bug in testing, as follows.
	qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
    	Aborted (core dumped)
The gdb bt as following:
    #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
    #1  0x00007faa3d613859 in __GI_abort () at abort.c:79
    #2  0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
    #3  0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206
    #4  0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
    #5  0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376
    #6  0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527
    #7  0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519
    #8  0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
    #9  0x00007faa3d710293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

For the bug, I made the following changes:
	@@ -3379,7 +3379,9 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
         if (s->vm_was_running) {
-            vm_start();
+            if (!runstate_check(RUN_STATE_SHUTDOWN)) {
+                vm_start();
+            }
         } else {
             if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
                 runstate_set(RUN_STATE_POSTMIGRATE);
				 
I will send the patch to the community after more testing.

Thanks,
Lei.

-----Original Message-----
From: Lukas Straub <lukasstraub2@web.de> 
Sent: Thursday, January 21, 2021 3:13 AM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

On Wed, 13 Jan 2021 10:46:27 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> This patch fixes the following:
>     qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
>     Aborted (core dumped)
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>

I wonder how that is possible, since the VM is stopped during 'colo' state.

Unrelated to this patch, I think this area needs some work since the following unintended runstate transition is possible:
'shutdown' -> 'colo' -> 'running'.

> ---
>  softmmu/runstate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> 636aab0..455ad0d 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>  
>      { RUN_STATE_COLO, RUN_STATE_RUNNING },
> +    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
>  
>      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
>      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },



-- 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-01-21  1:48     ` Rao, Lei
@ 2021-01-27 18:24       ` Lukas Straub
  2021-01-29  2:57         ` Rao, Lei
  0 siblings, 1 reply; 27+ messages in thread
From: Lukas Straub @ 2021-01-27 18:24 UTC (permalink / raw)
  To: Rao, Lei
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

[-- Attachment #1: Type: text/plain, Size: 3782 bytes --]

On Thu, 21 Jan 2021 01:48:31 +0000
"Rao, Lei" <lei.rao@intel.com> wrote:

> The Primary VM can be shut down when it is in COLO state, which may trigger this bug.

Do you have a backtrace for this bug?

> About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have the problems you said. For 'shutdown'->'colo', The fixed patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
> Recently, I found another bug as follows in the test.
> 	qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
>     	Aborted (core dumped)
> The gdb bt as following:
>     #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>     #1  0x00007faa3d613859 in __GI_abort () at abort.c:79
>     #2  0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
>     #3  0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206
>     #4  0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
>     #5  0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376
>     #6  0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527
>     #7  0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519
>     #8  0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
>     #9  0x00007faa3d710293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
> For the bug, I made the following changes:
> 	@@ -3379,7 +3379,9 @@ static void migration_iteration_finish(MigrationState *s)
>      case MIGRATION_STATUS_CANCELLED:
>      case MIGRATION_STATUS_CANCELLING:
>          if (s->vm_was_running) {
> -            vm_start();
> +            if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> +                vm_start();
> +            }
>          } else {
>              if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
>                  runstate_set(RUN_STATE_POSTMIGRATE);
> 				 
> I will send the patch to community after more test.
> 
> Thanks,
> Lei.
> 
> -----Original Message-----
> From: Lukas Straub <lukasstraub2@web.de> 
> Sent: Thursday, January 21, 2021 3:13 AM
> To: Rao, Lei <lei.rao@intel.com>
> Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
> 
> On Wed, 13 Jan 2021 10:46:27 +0800
> leirao <lei.rao@intel.com> wrote:
> 
> > From: "Rao, Lei" <lei.rao@intel.com>
> > 
> > This patch fixes the following:
> >     qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> >     Aborted (core dumped)
> > 
> > Signed-off-by: Lei Rao <lei.rao@intel.com>  
> 
> I wonder how that is possible, since the VM is stopped during 'colo' state.
> 
> Unrelated to this patch, I think this area needs some work since the following unintended runstate transition is possible:
> 'shutdown' -> 'colo' -> 'running'.
> 
> > ---
> >  softmmu/runstate.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> > 636aab0..455ad0d 100644
> > --- a/softmmu/runstate.c
> > +++ b/softmmu/runstate.c
> > @@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
> >      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> >  
> >      { RUN_STATE_COLO, RUN_STATE_RUNNING },
> > +    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
> >  
> >      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
> >      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },  
> 
> 
> 



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-01-27 18:24       ` Lukas Straub
@ 2021-01-29  2:57         ` Rao, Lei
  2021-02-14 11:45           ` Lukas Straub
  0 siblings, 1 reply; 27+ messages in thread
From: Rao, Lei @ 2021-01-29  2:57 UTC (permalink / raw)
  To: Lukas Straub
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

The state is set to RUN_STATE_COLO in colo_do_checkpoint_transaction(). If the guest powers off or shuts down at this time, the QEMU main thread will call vm_shutdown(), which sets the state to RUN_STATE_SHUTDOWN.
The transition from RUN_STATE_COLO to RUN_STATE_SHUTDOWN is not defined in runstate_transitions_def, so this crashes QEMU. Although the probability is small, it can still happen. By the way, do you have any comments on the other patches?
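
To make the failing path concrete, here is a rough, self-contained sketch of the check that aborts. It is simplified standalone C, not the real QEMU enum or table from softmmu/runstate.c, but it shows why a pair missing from runstate_transitions_def[] ends in abort():

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef enum {
        RUN_STATE_COLO,
        RUN_STATE_RUNNING,
        RUN_STATE_SHUTDOWN,
        RUN_STATE__MAX
    } RunState;

    /* valid[from][to] is filled from a transition table like
     * runstate_transitions_def[]; any pair not listed stays false. */
    static bool valid[RUN_STATE__MAX][RUN_STATE__MAX];
    static RunState current_run_state = RUN_STATE_COLO;

    static void runstate_set_sketch(RunState new_state)
    {
        if (!valid[current_run_state][new_state]) {
            fprintf(stderr, "invalid runstate transition: %d -> %d\n",
                    current_run_state, new_state);
            abort();    /* the "Aborted (core dumped)" in the report */
        }
        current_run_state = new_state;
    }

    int main(void)
    {
        valid[RUN_STATE_COLO][RUN_STATE_RUNNING] = true;
        /* Without the patch, COLO -> SHUTDOWN is never marked valid, so
         * a guest shutdown during a checkpoint aborts right here: */
        runstate_set_sketch(RUN_STATE_SHUTDOWN);
        return 0;
    }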

Thanks,
Lei.

-----Original Message-----
From: Lukas Straub <lukasstraub2@web.de> 
Sent: Thursday, January 28, 2021 2:24 AM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

On Thu, 21 Jan 2021 01:48:31 +0000
"Rao, Lei" <lei.rao@intel.com> wrote:

> The Primary VM can be shut down when it is in COLO state, which may trigger this bug.

Do you have a backtrace for this bug?

> About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have the problems you said. For 'shutdown'->'colo', The fixed patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
> Recently, I found another bug as follows in the test.
> 	qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
>     	Aborted (core dumped)
> The gdb bt as following:
>     #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>     #1  0x00007faa3d613859 in __GI_abort () at abort.c:79
>     #2  0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
>     #3  0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206
>     #4  0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
>     #5  0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376
>     #6  0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527
>     #7  0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519
>     #8  0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
>     #9  0x00007faa3d710293 in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
> For the bug, I made the following changes:
> 	@@ -3379,7 +3379,9 @@ static void migration_iteration_finish(MigrationState *s)
>      case MIGRATION_STATUS_CANCELLED:
>      case MIGRATION_STATUS_CANCELLING:
>          if (s->vm_was_running) {
> -            vm_start();
> +            if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> +                vm_start();
> +            }
>          } else {
>              if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
>                  runstate_set(RUN_STATE_POSTMIGRATE);
> 				 
> I will send the patch to community after more test.
> 
> Thanks,
> Lei.
> 
> -----Original Message-----
> From: Lukas Straub <lukasstraub2@web.de>
> Sent: Thursday, January 21, 2021 3:13 AM
> To: Rao, Lei <lei.rao@intel.com>
> Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; 
> jasowang@redhat.com; zhang.zhanghailiang@huawei.com; 
> quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> during checkpoint
> 
> On Wed, 13 Jan 2021 10:46:27 +0800
> leirao <lei.rao@intel.com> wrote:
> 
> > From: "Rao, Lei" <lei.rao@intel.com>
> > 
> > This patch fixes the following:
> >     qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> >     Aborted (core dumped)
> > 
> > Signed-off-by: Lei Rao <lei.rao@intel.com>
> 
> I wonder how that is possible, since the VM is stopped during 'colo' state.
> 
> Unrelated to this patch, I think this area needs some work since the following unintended runstate transition is possible:
> 'shutdown' -> 'colo' -> 'running'.
> 
> > ---
> >  softmmu/runstate.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> > 636aab0..455ad0d 100644
> > --- a/softmmu/runstate.c
> > +++ b/softmmu/runstate.c
> > @@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
> >      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> >  
> >      { RUN_STATE_COLO, RUN_STATE_RUNNING },
> > +    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
> >  
> >      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
> >      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
> 
> 
> 



-- 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 07/10] Disable auto-coverge before entering COLO mode.
  2021-01-13  2:46 ` [PATCH 07/10] Disable auto-coverge before entering COLO mode leirao
  2021-01-13 11:31   ` Dr. David Alan Gilbert
@ 2021-02-14 10:52   ` Lukas Straub
  2021-02-25  9:22     ` Rao, Lei
  1 sibling, 1 reply; 27+ messages in thread
From: Lukas Straub @ 2021-02-14 10:52 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 2024 bytes --]

On Wed, 13 Jan 2021 10:46:32 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> If we don't disable the feature of auto-converge for live migration
> before entering COLO mode, it will continue to run with COLO running,
> and eventually the system will hang due to the CPU throttle reaching
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>
> ---
>  migration/migration.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error **errp)
>      qapi_free_MigrationCapabilityStatusList(cap);
>  }
>  
> +static void colo_auto_converge_enabled(bool value, Error **errp)
> +{
> +    MigrationCapabilityStatusList *cap = NULL;
> +
> +    if (migrate_colo_enabled() && migrate_auto_converge()) {
> +        QAPI_LIST_PREPEND(cap,
> +                          migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +                                          value));
> +        qmp_migrate_set_capabilities(cap, errp);
> +        qapi_free_MigrationCapabilityStatusList(cap);
> +    }
> +    cpu_throttle_stop();
> +}
> +

I think it's better to error out in migration_prepare or migrate_caps_check
if both colo and auto-converge is enabled.

>  static void migrate_set_block_incremental(MigrationState *s, bool value)
>  {
>      s->parameters.block_incremental = value;
> @@ -3401,7 +3415,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
>  static void migration_iteration_finish(MigrationState *s)
>  {
>      /* If we enabled cpu throttling for auto-converge, turn it off. */
> -    cpu_throttle_stop();
> +    colo_auto_converge_enabled(false, &error_abort);
>  
>      qemu_mutex_lock_iothread();
>      switch (s->state) {



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-01-29  2:57         ` Rao, Lei
@ 2021-02-14 11:45           ` Lukas Straub
  2021-02-25  9:40             ` Rao, Lei
  0 siblings, 1 reply; 27+ messages in thread
From: Lukas Straub @ 2021-02-14 11:45 UTC (permalink / raw)
  To: Rao, Lei
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

[-- Attachment #1: Type: text/plain, Size: 5241 bytes --]

On Fri, 29 Jan 2021 02:57:57 +0000
"Rao, Lei" <lei.rao@intel.com> wrote:

> The state will be set RUN_STATE_COLO in colo_do_checkpoint_transaction(). If the guest executes power off or shutdown at this time and the QEMU main thread will call vm_shutdown(), it will set the state to RUN_STATE_SHUTDOWN.
> The state switch from RUN_STATE_COLO to RUN_STATE_SHUTDOWN is not defined in runstate_transitions_def. this will cause QEMU crash. Although this is small probability, it may still happen.

This patch fixes the 'colo' -> 'shutdown' transition. AFAIK then
colo_do_checkpoint_transaction will call vm_start() again, which
does 'shutdown' -> 'running' and (rightfully) crashes. So I think
it is better to crash here too.

>  By the way. Do you have any comments about other patches?
> Thanks,
> Lei.
> 
> -----Original Message-----
> From: Lukas Straub <lukasstraub2@web.de> 
> Sent: Thursday, January 28, 2021 2:24 AM
> To: Rao, Lei <lei.rao@intel.com>
> Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
> 
> On Thu, 21 Jan 2021 01:48:31 +0000
> "Rao, Lei" <lei.rao@intel.com> wrote:
> 
> > The Primary VM can be shut down when it is in COLO state, which may trigger this bug.  
> 
> Do you have a backtrace for this bug?
> 
> > About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have the problems you said. For 'shutdown'->'colo', The fixed patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
> > Recently, I found another bug as follows in the test.
> > 	qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
> >     	Aborted (core dumped)
> > The gdb bt as following:
> >     #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> >     #1  0x00007faa3d613859 in __GI_abort () at abort.c:79
> >     #2  0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
> >     #3  0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206
> >     #4  0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
> >     #5  0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376
> >     #6  0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527
> >     #7  0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519
> >     #8  0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
> >     #9  0x00007faa3d710293 in clone () at 
> > ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > 
> > For the bug, I made the following changes:
> > 	@@ -3379,7 +3379,9 @@ static void migration_iteration_finish(MigrationState *s)
> >      case MIGRATION_STATUS_CANCELLED:
> >      case MIGRATION_STATUS_CANCELLING:
> >          if (s->vm_was_running) {
> > -            vm_start();
> > +            if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> > +                vm_start();
> > +            }
> >          } else {
> >              if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
> >                  runstate_set(RUN_STATE_POSTMIGRATE);
> > 				 
> > I will send the patch to community after more test.
> > 
> > Thanks,
> > Lei.
> > 
> > -----Original Message-----
> > From: Lukas Straub <lukasstraub2@web.de>
> > Sent: Thursday, January 21, 2021 3:13 AM
> > To: Rao, Lei <lei.rao@intel.com>
> > Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; 
> > jasowang@redhat.com; zhang.zhanghailiang@huawei.com; 
> > quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
> > Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> > during checkpoint
> > 
> > On Wed, 13 Jan 2021 10:46:27 +0800
> > leirao <lei.rao@intel.com> wrote:
> >   
> > > From: "Rao, Lei" <lei.rao@intel.com>
> > > 
> > > This patch fixes the following:
> > >     qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> > >     Aborted (core dumped)
> > > 
> > > Signed-off-by: Lei Rao <lei.rao@intel.com>  
> > 
> > I wonder how that is possible, since the VM is stopped during 'colo' state.
> > 
> > Unrelated to this patch, I think this area needs some work since the following unintended runstate transition is possible:
> > 'shutdown' -> 'colo' -> 'running'.
> >   
> > > ---
> > >  softmmu/runstate.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> > > 636aab0..455ad0d 100644
> > > --- a/softmmu/runstate.c
> > > +++ b/softmmu/runstate.c
> > > @@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
> > >      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> > >  
> > >      { RUN_STATE_COLO, RUN_STATE_RUNNING },
> > > +    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
> > >  
> > >      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
> > >      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },  
> > 
> > 
> >   
> 
> 
> 



-- 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 00/10] Fixed some bugs and optimized some codes for COLO
  2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
                   ` (9 preceding siblings ...)
  2021-01-13  2:46 ` [PATCH 10/10] Fixed calculation error of pkt->header_size in fill_pkt_tcp_info() leirao
@ 2021-02-14 11:50 ` Lukas Straub
  10 siblings, 0 replies; 27+ messages in thread
From: Lukas Straub @ 2021-02-14 11:50 UTC (permalink / raw)
  To: leirao
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, chen.zhang

[-- Attachment #1: Type: text/plain, Size: 605 bytes --]

On Wed, 13 Jan 2021 10:46:25 +0800
leirao <lei.rao@intel.com> wrote:

> The series of patches include:
> 	Fixed some bugs of qemu crash.
> 	Optimized some code to reduce the time of checkpoint.
> 	Remove some unnecessary code to improve COLO.
> 

The rest of the patches look good to me. Can you address my comments
and send a v2? I'll then run my test-suite over it and give you my
Reviewed-by for the whole series.
Also, you should split this into two patch series: one for network and
one for migration, so the respective maintainers can take it through
their trees.

Regards,
Lukas Straub

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 07/10] Disable auto-coverge before entering COLO mode.
  2021-02-14 10:52   ` Lukas Straub
@ 2021-02-25  9:22     ` Rao, Lei
  0 siblings, 0 replies; 27+ messages in thread
From: Rao, Lei @ 2021-02-25  9:22 UTC (permalink / raw)
  To: Lukas Straub
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

Sorry for the late reply due to the Chinese New Year (CNY) holiday.
Auto-converge ensures that live migration can complete smoothly when there are too many dirty pages.
COLO may encounter the same situation when rebuilding a new secondary VM.
So, I think it is necessary to allow COLO and auto-converge to be enabled at the same time.
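
To recap the mechanism both messages refer to, here is a rough, self-contained sketch. The constants mirror QEMU's default throttle parameters but are assumptions here, and the loop is a simplification rather than the real migration code:

    #include <stdbool.h>
    #include <stdio.h>

    enum { THROTTLE_INITIAL = 20, THROTTLE_INCREMENT = 10, THROTTLE_MAX = 99 };

    static int throttle;

    /* Called once per dirty-sync round when the guest dirties memory
     * faster than it can be transferred. */
    static void mig_throttle_guest_down_sketch(void)
    {
        throttle = throttle ? throttle + THROTTLE_INCREMENT : THROTTLE_INITIAL;
        if (throttle > THROTTLE_MAX) {
            throttle = THROTTLE_MAX;
        }
    }

    int main(void)
    {
        for (int round = 1; round <= 10; round++) {
            bool dirty_pages_outpace_transfer = true;  /* pessimistic assumption */
            if (dirty_pages_outpace_transfer) {
                mig_throttle_guest_down_sketch();
            }
            printf("round %2d: guest vCPUs throttled to %d%%\n", round, throttle);
        }
        /* The throttle only ratchets upward; unless auto-converge is
         * disabled (or cpu_throttle_stop() runs) when entering COLO,
         * it stays pinned near the maximum and the PVM appears hung. */
        return 0;
    }

The sketch shows both sides: throttling is what lets the secondary rebuild converge, but nothing lowers it again once COLO keeps checkpointing, which is the hang the patch description refers to.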

Thanks,
Lei.

-----Original Message-----
From: Lukas Straub <lukasstraub2@web.de> 
Sent: Sunday, February 14, 2021 6:52 PM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 07/10] Disable auto-coverge before entering COLO mode.

On Wed, 13 Jan 2021 10:46:32 +0800
leirao <lei.rao@intel.com> wrote:

> From: "Rao, Lei" <lei.rao@intel.com>
> 
> If we don't disable the feature of auto-converge for live migration 
> before entering COLO mode, it will continue to run with COLO running, 
> and eventually the system will hang due to the CPU throttle reaching 
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao <lei.rao@intel.com>
> ---
>  migration/migration.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c index 
> 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error **errp)
>      qapi_free_MigrationCapabilityStatusList(cap);
>  }
>  
> +static void colo_auto_converge_enabled(bool value, Error **errp) {
> +    MigrationCapabilityStatusList *cap = NULL;
> +
> +    if (migrate_colo_enabled() && migrate_auto_converge()) {
> +        QAPI_LIST_PREPEND(cap,
> +                          migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +                                          value));
> +        qmp_migrate_set_capabilities(cap, errp);
> +        qapi_free_MigrationCapabilityStatusList(cap);
> +    }
> +    cpu_throttle_stop();
> +}
> +

I think it's better to error out in migration_prepare or migrate_caps_check if both colo and auto-converge is enabled.

>  static void migrate_set_block_incremental(MigrationState *s, bool 
> value)  {
>      s->parameters.block_incremental = value; @@ -3401,7 +3415,7 @@ 
> static MigIterateState migration_iteration_run(MigrationState *s)  
> static void migration_iteration_finish(MigrationState *s)  {
>      /* If we enabled cpu throttling for auto-converge, turn it off. */
> -    cpu_throttle_stop();
> +    colo_auto_converge_enabled(false, &error_abort);
>  
>      qemu_mutex_lock_iothread();
>      switch (s->state) {



-- 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint
  2021-02-14 11:45           ` Lukas Straub
@ 2021-02-25  9:40             ` Rao, Lei
  0 siblings, 0 replies; 27+ messages in thread
From: Rao, Lei @ 2021-02-25  9:40 UTC (permalink / raw)
  To: Lukas Straub
  Cc: zhang.zhanghailiang, lizhijian, quintela, jasowang, dgilbert,
	qemu-devel, Zhang, Chen

If the user shuts down the guest normally and QEMU crashes, I think that is unacceptable.
Since we can avoid this situation, why not do it?

Thanks,
Lei.

-----Original Message-----
From: Lukas Straub <lukasstraub2@web.de> 
Sent: Sunday, February 14, 2021 7:46 PM
To: Rao, Lei <lei.rao@intel.com>
Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; jasowang@redhat.com; zhang.zhanghailiang@huawei.com; quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

On Fri, 29 Jan 2021 02:57:57 +0000
"Rao, Lei" <lei.rao@intel.com> wrote:

> The state will be set RUN_STATE_COLO in colo_do_checkpoint_transaction(). If the guest executes power off or shutdown at this time and the QEMU main thread will call vm_shutdown(), it will set the state to RUN_STATE_SHUTDOWN.
> The state switch from RUN_STATE_COLO to RUN_STATE_SHUTDOWN is not defined in runstate_transitions_def. this will cause QEMU crash. Although this is small probability, it may still happen.

This patch fixes the 'colo' -> 'shutdown' transition. AFAIK then colo_do_checkpoint_transaction will call vm_start() again, which does 'shutdown' -> 'running' and (rightfully) crashes. So I think it is better to crash here too.

>  By the way. Do you have any comments about other patches?
> Thanks,
> Lei.
> 
> -----Original Message-----
> From: Lukas Straub <lukasstraub2@web.de>
> Sent: Thursday, January 28, 2021 2:24 AM
> To: Rao, Lei <lei.rao@intel.com>
> Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; 
> jasowang@redhat.com; zhang.zhanghailiang@huawei.com; 
> quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> during checkpoint
> 
> On Thu, 21 Jan 2021 01:48:31 +0000
> "Rao, Lei" <lei.rao@intel.com> wrote:
> 
> > The Primary VM can be shut down when it is in COLO state, which may trigger this bug.  
> 
> Do you have a backtrace for this bug?
> 
> > About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have the problems you said. For 'shutdown'->'colo', The fixed patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
> > Recently, I found another bug as follows in the test.
> > 	qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
> >     	Aborted (core dumped)
> > The gdb bt as following:
> >     #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> >     #1  0x00007faa3d613859 in __GI_abort () at abort.c:79
> >     #2  0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
> >     #3  0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206
> >     #4  0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
> >     #5  0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376
> >     #6  0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527
> >     #7  0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519
> >     #8  0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477
> >     #9  0x00007faa3d710293 in clone () at
> > ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > 
> > For the bug, I made the following changes:
> > 	@@ -3379,7 +3379,9 @@ static void migration_iteration_finish(MigrationState *s)
> >      case MIGRATION_STATUS_CANCELLED:
> >      case MIGRATION_STATUS_CANCELLING:
> >          if (s->vm_was_running) {
> > -            vm_start();
> > +            if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> > +                vm_start();
> > +            }
> >          } else {
> >              if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
> >                  runstate_set(RUN_STATE_POSTMIGRATE);
> > 				 
> > I will send the patch to community after more test.
> > 
> > Thanks,
> > Lei.
> > 
> > -----Original Message-----
> > From: Lukas Straub <lukasstraub2@web.de>
> > Sent: Thursday, January 21, 2021 3:13 AM
> > To: Rao, Lei <lei.rao@intel.com>
> > Cc: Zhang, Chen <chen.zhang@intel.com>; lizhijian@cn.fujitsu.com; 
> > jasowang@redhat.com; zhang.zhanghailiang@huawei.com; 
> > quintela@redhat.com; dgilbert@redhat.com; qemu-devel@nongnu.org
> > Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> > during checkpoint
> > 
> > On Wed, 13 Jan 2021 10:46:27 +0800
> > leirao <lei.rao@intel.com> wrote:
> >   
> > > From: "Rao, Lei" <lei.rao@intel.com>
> > > 
> > > This patch fixes the following:
> > >     qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> > >     Aborted (core dumped)
> > > 
> > > Signed-off-by: Lei Rao <lei.rao@intel.com>
> > 
> > I wonder how that is possible, since the VM is stopped during 'colo' state.
> > 
> > Unrelated to this patch, I think this area needs some work since the following unintended runstate transition is possible:
> > 'shutdown' -> 'colo' -> 'running'.
> >   
> > > ---
> > >  softmmu/runstate.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> > > 636aab0..455ad0d 100644
> > > --- a/softmmu/runstate.c
> > > +++ b/softmmu/runstate.c
> > > @@ -125,6 +125,7 @@ static const RunStateTransition runstate_transitions_def[] = {
> > >      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> > >  
> > >      { RUN_STATE_COLO, RUN_STATE_RUNNING },
> > > +    { RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
> > >  
> > >      { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
> > >      { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
> > 
> > 
> >   
> 
> 
> 



-- 



^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-02-25  9:41 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-13  2:46 [PATCH 00/10] Fixed some bugs and optimized some codes for COLO leirao
2021-01-13  2:46 ` [PATCH 01/10] Remove some duplicate trace code leirao
2021-01-20 18:43   ` Lukas Straub
2021-01-13  2:46 ` [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint leirao
2021-01-20 19:12   ` Lukas Straub
2021-01-21  1:48     ` Rao, Lei
2021-01-27 18:24       ` Lukas Straub
2021-01-29  2:57         ` Rao, Lei
2021-02-14 11:45           ` Lukas Straub
2021-02-25  9:40             ` Rao, Lei
2021-01-13  2:46 ` [PATCH 03/10] Optimize the function of filter_send leirao
2021-01-20 19:21   ` Lukas Straub
2021-01-21  1:02     ` Rao, Lei
2021-01-13  2:46 ` [PATCH 04/10] Remove migrate_set_block_enabled in checkpoint leirao
2021-01-20 19:28   ` Lukas Straub
2021-01-13  2:46 ` [PATCH 05/10] Optimize the function of packet_new leirao
2021-01-20 19:45   ` Lukas Straub
2021-01-13  2:46 ` [PATCH 06/10] Add the function of colo_compare_cleanup leirao
2021-01-13  2:46 ` [PATCH 07/10] Disable auto-coverge before entering COLO mode leirao
2021-01-13 11:31   ` Dr. David Alan Gilbert
2021-01-14  3:21     ` Rao, Lei
2021-02-14 10:52   ` Lukas Straub
2021-02-25  9:22     ` Rao, Lei
2021-01-13  2:46 ` [PATCH 08/10] Reduce the PVM stop time during Checkpoint leirao
2021-01-13  2:46 ` [PATCH 09/10] Add the function of colo_bitmap_clear_diry leirao
2021-01-13  2:46 ` [PATCH 10/10] Fixed calculation error of pkt->header_size in fill_pkt_tcp_info() leirao
2021-02-14 11:50 ` [PATCH 00/10] Fixed some bugs and optimized some codes for COLO Lukas Straub
