[Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine
@ 2016-05-05  7:32 Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 1/9] migration: Fix multi-thread compression bug Liang Li
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

This patch set fixed a bug which will block live migration and another
potential issue when using multi-thread (de)compression.

The last patches try to refine the code and make the using of lock more
clear. Some of the code snippets are from Juan's multiple-fd patches,
with very small change. Thanks for Juan's work.


Liang Li (9):
  migration: Fix multi-thread compression bug
  migration: Fix a potential issue
  migration: remove useless code
  qemu-file: Fix qemu_put_compression_data flaw
  migration: refine ram_save_compressed_page
  migration: protect the quit flag by lock
  migration: refine the compression code
  migration: refine the decompression code
  migration: code clean up

 migration/qemu-file.c |  23 ++++-
 migration/ram.c       | 251 ++++++++++++++++++++++++++++----------------------
 2 files changed, 162 insertions(+), 112 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 1/9] migration: Fix multi-thread compression bug
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue Liang Li
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

Recently, a bug related to multiple thread compression feature for
live migration is reported. The destination side will be blocked
during live migration if there are heavy workload in host and
memory intensive workload in guest, this is most likely to happen
when there is one decompression thread.

Some parts of the decompression code are incorrect:
1. The main thread receives data from source side will enter a busy
loop to wait for a free decompression thread.
2. A lock is needed to protect the decomp_param[idx]->start, because
it is checked in the main thread and is updated in the decompression
thread.

Fix these two issues by following the code pattern for compression.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reported-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Daniel P. Berrange <berrange@redhat.com>

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 3f05738..7ab6ab5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -263,6 +263,7 @@ typedef struct CompressParam CompressParam;
 
 struct DecompressParam {
     bool start;
+    bool done;
     QemuMutex mutex;
     QemuCond cond;
     void *des;
@@ -287,6 +288,8 @@ static bool quit_comp_thread;
 static bool quit_decomp_thread;
 static DecompressParam *decomp_param;
 static QemuThread *decompress_threads;
+static QemuMutex decomp_done_lock;
+static QemuCond decomp_done_cond;
 
 static int do_compress_ram_page(CompressParam *param);
 
@@ -834,6 +837,7 @@ static inline void start_compression(CompressParam *param)
 
 static inline void start_decompression(DecompressParam *param)
 {
+    param->done = false;
     qemu_mutex_lock(&param->mutex);
     param->start = true;
     qemu_cond_signal(&param->cond);
@@ -2193,19 +2197,24 @@ static void *do_data_decompress(void *opaque)
         qemu_mutex_lock(&param->mutex);
         while (!param->start && !quit_decomp_thread) {
             qemu_cond_wait(&param->cond, &param->mutex);
+        }
+        if (!quit_decomp_thread) {
             pagesize = TARGET_PAGE_SIZE;
-            if (!quit_decomp_thread) {
-                /* uncompress() will return failed in some case, especially
-                 * when the page is dirted when doing the compression, it's
-                 * not a problem because the dirty page will be retransferred
-                 * and uncompress() won't break the data in other pages.
-                 */
-                uncompress((Bytef *)param->des, &pagesize,
-                           (const Bytef *)param->compbuf, param->len);
-            }
-            param->start = false;
+            /* uncompress() will return failed in some case, especially
+             * when the page is dirted when doing the compression, it's
+             * not a problem because the dirty page will be retransferred
+             * and uncompress() won't break the data in other pages.
+             */
+            uncompress((Bytef *)param->des, &pagesize,
+                       (const Bytef *)param->compbuf, param->len);
         }
+        param->start = false;
         qemu_mutex_unlock(&param->mutex);
+
+        qemu_mutex_lock(&decomp_done_lock);
+        param->done = true;
+        qemu_cond_signal(&decomp_done_cond);
+        qemu_mutex_unlock(&decomp_done_lock);
     }
 
     return NULL;
@@ -2219,10 +2228,13 @@ void migrate_decompress_threads_create(void)
     decompress_threads = g_new0(QemuThread, thread_count);
     decomp_param = g_new0(DecompressParam, thread_count);
     quit_decomp_thread = false;
+    qemu_mutex_init(&decomp_done_lock);
+    qemu_cond_init(&decomp_done_cond);
     for (i = 0; i < thread_count; i++) {
         qemu_mutex_init(&decomp_param[i].mutex);
         qemu_cond_init(&decomp_param[i].cond);
         decomp_param[i].compbuf = g_malloc0(compressBound(TARGET_PAGE_SIZE));
+        decomp_param[i].done = true;
         qemu_thread_create(decompress_threads + i, "decompress",
                            do_data_decompress, decomp_param + i,
                            QEMU_THREAD_JOINABLE);
@@ -2258,9 +2270,10 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
     int idx, thread_count;
 
     thread_count = migrate_decompress_threads();
+    qemu_mutex_lock(&decomp_done_lock);
     while (true) {
         for (idx = 0; idx < thread_count; idx++) {
-            if (!decomp_param[idx].start) {
+            if (decomp_param[idx].done) {
                 qemu_get_buffer(f, decomp_param[idx].compbuf, len);
                 decomp_param[idx].des = host;
                 decomp_param[idx].len = len;
@@ -2270,8 +2283,11 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
         }
         if (idx < thread_count) {
             break;
+        } else {
+            qemu_cond_wait(&decomp_done_cond, &decomp_done_lock);
         }
     }
+    qemu_mutex_unlock(&decomp_done_lock);
 }
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 1/9] migration: Fix multi-thread compression bug Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-06-10 13:39   ` Amit Shah
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 3/9] migration: remove useless code Liang Li
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

At the end of live migration and before vm_start() on the destination
side, we should make sure all the decompression tasks are finished, if
this can not be guaranteed, the VM may get the incorrect memory data,
or the updated memory may be overwritten by the decompression thread.
Add the code to fix this potential issue.

Suggested-by: David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 7ab6ab5..8a59a08 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2220,6 +2220,24 @@ static void *do_data_decompress(void *opaque)
     return NULL;
 }
 
+static void wait_for_decompress_done(void)
+{
+    int idx, thread_count;
+
+    if (!migrate_use_compression()) {
+        return;
+    }
+
+    thread_count = migrate_decompress_threads();
+    qemu_mutex_lock(&decomp_done_lock);
+    for (idx = 0; idx < thread_count; idx++) {
+        while (!decomp_param[idx].done) {
+            qemu_cond_wait(&decomp_done_cond, &decomp_done_lock);
+        }
+    }
+    qemu_mutex_unlock(&decomp_done_lock);
+}
+
 void migrate_decompress_threads_create(void)
 {
     int i, thread_count;
@@ -2554,6 +2572,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
         }
     }
 
+    wait_for_decompress_done();
     rcu_read_unlock();
     DPRINTF("Completed load of VM with exit code %d seq iteration "
             "%" PRIu64 "\n", ret, seq_iter);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 3/9] migration: remove useless code
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 1/9] migration: Fix multi-thread compression bug Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 4/9] qemu-file: Fix qemu_put_compression_data flaw Liang Li
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

page_buffer is set twice repeatedly, remove the previous set.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8a59a08..5d544c0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2356,7 +2356,6 @@ static int ram_load_postcopy(QEMUFile *f)
                 ret = -EINVAL;
                 break;
             }
-            page_buffer = host;
             /*
              * Postcopy requires that we place whole host pages atomically.
              * To make it atomic, the data is read into a temporary page
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 4/9] qemu-file: Fix qemu_put_compression_data flaw
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (2 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 3/9] migration: remove useless code Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 5/9] migration: refine ram_save_compressed_page Liang Li
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

Current qemu_put_compression_data can only work with no writable
QEMUFile, and can't work with the writable QEMUFile. But it does
not provide any measure to prevent users from using it with a
writable QEMUFile.

We should fix this flaw to make it works with writable QEMUFile.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.c | 23 +++++++++++++++++++++--
 migration/ram.c       | 18 +++++++++++++-----
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 6f4a129..b0ef1f3 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -607,8 +607,14 @@ uint64_t qemu_get_be64(QEMUFile *f)
     return v;
 }
 
-/* compress size bytes of data start at p with specific compression
+/* Compress size bytes of data start at p with specific compression
  * level and store the compressed data to the buffer of f.
+ *
+ * When f is not writable, return -1 if f has no space to save the
+ * compressed data.
+ * When f is wirtable and it has no space to save the compressed data,
+ * do fflush first, if f still has no space to save the compressed
+ * data, return -1.
  */
 
 ssize_t qemu_put_compression_data(QEMUFile *f, const uint8_t *p, size_t size,
@@ -617,7 +623,14 @@ ssize_t qemu_put_compression_data(QEMUFile *f, const uint8_t *p, size_t size,
     ssize_t blen = IO_BUF_SIZE - f->buf_index - sizeof(int32_t);
 
     if (blen < compressBound(size)) {
-        return 0;
+        if (!qemu_file_is_writable(f)) {
+            return -1;
+        }
+        qemu_fflush(f);
+        blen = IO_BUF_SIZE - sizeof(int32_t);
+        if (blen < compressBound(size)) {
+            return -1;
+        }
     }
     if (compress2(f->buf + f->buf_index + sizeof(int32_t), (uLongf *)&blen,
                   (Bytef *)p, size, level) != Z_OK) {
@@ -625,7 +638,13 @@ ssize_t qemu_put_compression_data(QEMUFile *f, const uint8_t *p, size_t size,
         return 0;
     }
     qemu_put_be32(f, blen);
+    if (f->ops->writev_buffer) {
+        add_to_iovec(f, f->buf + f->buf_index, blen);
+    }
     f->buf_index += blen;
+    if (f->buf_index == IO_BUF_SIZE) {
+        qemu_fflush(f);
+    }
     return blen + sizeof(int32_t);
 }
 
diff --git a/migration/ram.c b/migration/ram.c
index 5d544c0..8a21c2c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -821,7 +821,13 @@ static int do_compress_ram_page(CompressParam *param)
                                   RAM_SAVE_FLAG_COMPRESS_PAGE);
     blen = qemu_put_compression_data(param->file, p, TARGET_PAGE_SIZE,
                                      migrate_compress_level());
-    bytes_sent += blen;
+    if (blen < 0) {
+        bytes_sent = 0;
+        qemu_file_set_error(migrate_get_current()->to_dst_file, blen);
+        error_report("compressed data failed!");
+    } else {
+        bytes_sent += blen;
+    }
 
     return bytes_sent;
 }
@@ -965,10 +971,12 @@ static int ram_save_compressed_page(QEMUFile *f, PageSearchStatus *pss,
                  * first page is sent out before other pages
                  */
                 bytes_xmit = do_compress_ram_page(&comp_param[0]);
-                acct_info.norm_pages++;
-                qemu_put_qemu_file(f, comp_param[0].file);
-                *bytes_transferred += bytes_xmit;
-                pages = 1;
+                if (bytes_xmit > 0) {
+                    acct_info.norm_pages++;
+                    qemu_put_qemu_file(f, comp_param[0].file);
+                    *bytes_transferred += bytes_xmit;
+                    pages = 1;
+                }
             }
         } else {
             pages = save_zero_page(f, block, offset, p, bytes_transferred);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 5/9] migration: refine ram_save_compressed_page
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (3 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 4/9] qemu-file: Fix qemu_put_compression_data flaw Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 6/9] migration: protect the quit flag by lock Liang Li
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

Use qemu_put_compression_data to do the compression directly
instead of using do_compress_ram_page, avoid some data copy.
very small improvement, at the same time, add code to check
if the compression is successful.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8a21c2c..1a45227 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -929,24 +929,20 @@ static int ram_save_compressed_page(QEMUFile *f, PageSearchStatus *pss,
                                     uint64_t *bytes_transferred)
 {
     int pages = -1;
-    uint64_t bytes_xmit;
+    uint64_t bytes_xmit = 0;
     uint8_t *p;
-    int ret;
+    int ret, blen;
     RAMBlock *block = pss->block;
     ram_addr_t offset = pss->offset;
 
     p = block->host + offset;
 
-    bytes_xmit = 0;
     ret = ram_control_save_page(f, block->offset,
                                 offset, TARGET_PAGE_SIZE, &bytes_xmit);
     if (bytes_xmit) {
         *bytes_transferred += bytes_xmit;
         pages = 1;
     }
-    if (block == last_sent_block) {
-        offset |= RAM_SAVE_FLAG_CONTINUE;
-    }
     if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
         if (ret != RAM_SAVE_CONTROL_DELAYED) {
             if (bytes_xmit > 0) {
@@ -966,19 +962,22 @@ static int ram_save_compressed_page(QEMUFile *f, PageSearchStatus *pss,
             flush_compressed_data(f);
             pages = save_zero_page(f, block, offset, p, bytes_transferred);
             if (pages == -1) {
-                set_compress_params(&comp_param[0], block, offset);
-                /* Use the qemu thread to compress the data to make sure the
-                 * first page is sent out before other pages
-                 */
-                bytes_xmit = do_compress_ram_page(&comp_param[0]);
-                if (bytes_xmit > 0) {
+                /* Make sure the first page is sent out before other pages */
+                bytes_xmit = save_page_header(f, block, offset |
+                                              RAM_SAVE_FLAG_COMPRESS_PAGE);
+                blen = qemu_put_compression_data(f, p, TARGET_PAGE_SIZE,
+                                                 migrate_compress_level());
+                if (blen > 0) {
+                    *bytes_transferred += bytes_xmit + blen;
                     acct_info.norm_pages++;
-                    qemu_put_qemu_file(f, comp_param[0].file);
-                    *bytes_transferred += bytes_xmit;
                     pages = 1;
+                } else {
+                    qemu_file_set_error(f, blen);
+                    error_report("compressed data failed!");
                 }
             }
         } else {
+            offset |= RAM_SAVE_FLAG_CONTINUE;
             pages = save_zero_page(f, block, offset, p, bytes_transferred);
             if (pages == -1) {
                 pages = compress_page_with_multi_thread(f, block, offset,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 6/9] migration: protect the quit flag by lock
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (4 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 5/9] migration: refine ram_save_compressed_page Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 7/9] migration: refine the compression code Liang Li
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

quit_comp_thread and quit_decomp_thread are accessed by several
thread, it's better to protect them with locks. We use a per
thread flag to replace the global one, and the new flag is protected
by a lock.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 1a45227..50eae7c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -253,6 +253,7 @@ static struct BitmapRcu {
 struct CompressParam {
     bool start;
     bool done;
+    bool quit;
     QEMUFile *file;
     QemuMutex mutex;
     QemuCond cond;
@@ -264,6 +265,7 @@ typedef struct CompressParam CompressParam;
 struct DecompressParam {
     bool start;
     bool done;
+    bool quit;
     QemuMutex mutex;
     QemuCond cond;
     void *des;
@@ -284,8 +286,6 @@ static QemuCond *comp_done_cond;
 static const QEMUFileOps empty_ops = { };
 
 static bool compression_switch;
-static bool quit_comp_thread;
-static bool quit_decomp_thread;
 static DecompressParam *decomp_param;
 static QemuThread *decompress_threads;
 static QemuMutex decomp_done_lock;
@@ -297,18 +297,18 @@ static void *do_data_compress(void *opaque)
 {
     CompressParam *param = opaque;
 
-    while (!quit_comp_thread) {
+    while (!param->quit) {
         qemu_mutex_lock(&param->mutex);
-        /* Re-check the quit_comp_thread in case of
+        /* Re-check the quit flag in case of
          * terminate_compression_threads is called just before
          * qemu_mutex_lock(&param->mutex) and after
-         * while(!quit_comp_thread), re-check it here can make
+         * while(!param->quit), re-check it here can make
          * sure the compression thread terminate as expected.
          */
-        while (!param->start && !quit_comp_thread) {
+        while (!param->start && !param->quit) {
             qemu_cond_wait(&param->cond, &param->mutex);
         }
-        if (!quit_comp_thread) {
+        if (!param->quit) {
             do_compress_ram_page(param);
         }
         param->start = false;
@@ -328,9 +328,9 @@ static inline void terminate_compression_threads(void)
     int idx, thread_count;
 
     thread_count = migrate_compress_threads();
-    quit_comp_thread = true;
     for (idx = 0; idx < thread_count; idx++) {
         qemu_mutex_lock(&comp_param[idx].mutex);
+        comp_param[idx].quit = true;
         qemu_cond_signal(&comp_param[idx].cond);
         qemu_mutex_unlock(&comp_param[idx].mutex);
     }
@@ -370,7 +370,6 @@ void migrate_compress_threads_create(void)
     if (!migrate_use_compression()) {
         return;
     }
-    quit_comp_thread = false;
     compression_switch = true;
     thread_count = migrate_compress_threads();
     compress_threads = g_new0(QemuThread, thread_count);
@@ -385,6 +384,7 @@ void migrate_compress_threads_create(void)
          */
         comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
         comp_param[i].done = true;
+        comp_param[i].quit = false;
         qemu_mutex_init(&comp_param[i].mutex);
         qemu_cond_init(&comp_param[i].cond);
         qemu_thread_create(compress_threads + i, "compress",
@@ -863,12 +863,12 @@ static void flush_compressed_data(QEMUFile *f)
     for (idx = 0; idx < thread_count; idx++) {
         if (!comp_param[idx].done) {
             qemu_mutex_lock(comp_done_lock);
-            while (!comp_param[idx].done && !quit_comp_thread) {
+            while (!comp_param[idx].done && !comp_param[idx].quit) {
                 qemu_cond_wait(comp_done_cond, comp_done_lock);
             }
             qemu_mutex_unlock(comp_done_lock);
         }
-        if (!quit_comp_thread) {
+        if (!comp_param[idx].quit) {
             len = qemu_put_qemu_file(f, comp_param[idx].file);
             bytes_transferred += len;
         }
@@ -2200,12 +2200,12 @@ static void *do_data_decompress(void *opaque)
     DecompressParam *param = opaque;
     unsigned long pagesize;
 
-    while (!quit_decomp_thread) {
+    while (!param->quit) {
         qemu_mutex_lock(&param->mutex);
-        while (!param->start && !quit_decomp_thread) {
+        while (!param->start && !param->quit) {
             qemu_cond_wait(&param->cond, &param->mutex);
         }
-        if (!quit_decomp_thread) {
+        if (!param->quit) {
             pagesize = TARGET_PAGE_SIZE;
             /* uncompress() will return failed in some case, especially
              * when the page is dirted when doing the compression, it's
@@ -2252,7 +2252,6 @@ void migrate_decompress_threads_create(void)
     thread_count = migrate_decompress_threads();
     decompress_threads = g_new0(QemuThread, thread_count);
     decomp_param = g_new0(DecompressParam, thread_count);
-    quit_decomp_thread = false;
     qemu_mutex_init(&decomp_done_lock);
     qemu_cond_init(&decomp_done_cond);
     for (i = 0; i < thread_count; i++) {
@@ -2260,6 +2259,7 @@ void migrate_decompress_threads_create(void)
         qemu_cond_init(&decomp_param[i].cond);
         decomp_param[i].compbuf = g_malloc0(compressBound(TARGET_PAGE_SIZE));
         decomp_param[i].done = true;
+        decomp_param[i].quit = false;
         qemu_thread_create(decompress_threads + i, "decompress",
                            do_data_decompress, decomp_param + i,
                            QEMU_THREAD_JOINABLE);
@@ -2270,10 +2270,10 @@ void migrate_decompress_threads_join(void)
 {
     int i, thread_count;
 
-    quit_decomp_thread = true;
     thread_count = migrate_decompress_threads();
     for (i = 0; i < thread_count; i++) {
         qemu_mutex_lock(&decomp_param[i].mutex);
+        decomp_param[i].quit = true;
         qemu_cond_signal(&decomp_param[i].cond);
         qemu_mutex_unlock(&decomp_param[i].mutex);
     }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 7/9] migration: refine the compression code
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (5 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 6/9] migration: protect the quit flag by lock Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 8/9] migration: refine the decompression code Liang Li
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

The current code for multi-thread compression is not clear,
especially in the aspect of using lock. Refine the code
to make it clear.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 84 +++++++++++++++++++++++++++------------------------------
 1 file changed, 40 insertions(+), 44 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 50eae7c..f44f833 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -251,7 +251,6 @@ static struct BitmapRcu {
 } *migration_bitmap_rcu;
 
 struct CompressParam {
-    bool start;
     bool done;
     bool quit;
     QEMUFile *file;
@@ -291,34 +290,36 @@ static QemuThread *decompress_threads;
 static QemuMutex decomp_done_lock;
 static QemuCond decomp_done_cond;
 
-static int do_compress_ram_page(CompressParam *param);
+static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
+                                ram_addr_t offset);
 
 static void *do_data_compress(void *opaque)
 {
     CompressParam *param = opaque;
+    RAMBlock *block;
+    ram_addr_t offset;
 
+    qemu_mutex_lock(&param->mutex);
     while (!param->quit) {
-        qemu_mutex_lock(&param->mutex);
-        /* Re-check the quit flag in case of
-         * terminate_compression_threads is called just before
-         * qemu_mutex_lock(&param->mutex) and after
-         * while(!param->quit), re-check it here can make
-         * sure the compression thread terminate as expected.
-         */
-        while (!param->start && !param->quit) {
+        if (param->block) {
+            block = param->block;
+            offset = param->offset;
+            param->block = NULL;
+            qemu_mutex_unlock(&param->mutex);
+
+            do_compress_ram_page(param->file, block, offset);
+
+            qemu_mutex_lock(comp_done_lock);
+            param->done = true;
+            qemu_cond_signal(comp_done_cond);
+            qemu_mutex_unlock(comp_done_lock);
+
+            qemu_mutex_lock(&param->mutex);
+        } else {
             qemu_cond_wait(&param->cond, &param->mutex);
         }
-        if (!param->quit) {
-            do_compress_ram_page(param);
-        }
-        param->start = false;
-        qemu_mutex_unlock(&param->mutex);
-
-        qemu_mutex_lock(comp_done_lock);
-        param->done = true;
-        qemu_cond_signal(comp_done_cond);
-        qemu_mutex_unlock(comp_done_lock);
     }
+    qemu_mutex_unlock(&param->mutex);
 
     return NULL;
 }
@@ -808,18 +809,15 @@ static int ram_save_page(QEMUFile *f, PageSearchStatus *pss,
     return pages;
 }
 
-static int do_compress_ram_page(CompressParam *param)
+static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
+                                ram_addr_t offset)
 {
     int bytes_sent, blen;
-    uint8_t *p;
-    RAMBlock *block = param->block;
-    ram_addr_t offset = param->offset;
+    uint8_t *p = block->host + (offset & TARGET_PAGE_MASK);
 
-    p = block->host + (offset & TARGET_PAGE_MASK);
-
-    bytes_sent = save_page_header(param->file, block, offset |
+    bytes_sent = save_page_header(f, block, offset |
                                   RAM_SAVE_FLAG_COMPRESS_PAGE);
-    blen = qemu_put_compression_data(param->file, p, TARGET_PAGE_SIZE,
+    blen = qemu_put_compression_data(f, p, TARGET_PAGE_SIZE,
                                      migrate_compress_level());
     if (blen < 0) {
         bytes_sent = 0;
@@ -832,15 +830,6 @@ static int do_compress_ram_page(CompressParam *param)
     return bytes_sent;
 }
 
-static inline void start_compression(CompressParam *param)
-{
-    param->done = false;
-    qemu_mutex_lock(&param->mutex);
-    param->start = true;
-    qemu_cond_signal(&param->cond);
-    qemu_mutex_unlock(&param->mutex);
-}
-
 static inline void start_decompression(DecompressParam *param)
 {
     param->done = false;
@@ -860,18 +849,22 @@ static void flush_compressed_data(QEMUFile *f)
         return;
     }
     thread_count = migrate_compress_threads();
+
+    qemu_mutex_lock(comp_done_lock);
     for (idx = 0; idx < thread_count; idx++) {
-        if (!comp_param[idx].done) {
-            qemu_mutex_lock(comp_done_lock);
-            while (!comp_param[idx].done && !comp_param[idx].quit) {
-                qemu_cond_wait(comp_done_cond, comp_done_lock);
-            }
-            qemu_mutex_unlock(comp_done_lock);
+        while (!comp_param[idx].done) {
+            qemu_cond_wait(comp_done_cond, comp_done_lock);
         }
+    }
+    qemu_mutex_unlock(comp_done_lock);
+
+    for (idx = 0; idx < thread_count; idx++) {
+        qemu_mutex_lock(&comp_param[idx].mutex);
         if (!comp_param[idx].quit) {
             len = qemu_put_qemu_file(f, comp_param[idx].file);
             bytes_transferred += len;
         }
+        qemu_mutex_unlock(&comp_param[idx].mutex);
     }
 }
 
@@ -893,9 +886,12 @@ static int compress_page_with_multi_thread(QEMUFile *f, RAMBlock *block,
     while (true) {
         for (idx = 0; idx < thread_count; idx++) {
             if (comp_param[idx].done) {
+                comp_param[idx].done = false;
                 bytes_xmit = qemu_put_qemu_file(f, comp_param[idx].file);
+                qemu_mutex_lock(&comp_param[idx].mutex);
                 set_compress_params(&comp_param[idx], block, offset);
-                start_compression(&comp_param[idx]);
+                qemu_cond_signal(&comp_param[idx].cond);
+                qemu_mutex_unlock(&comp_param[idx].mutex);
                 pages = 1;
                 acct_info.norm_pages++;
                 *bytes_transferred += bytes_xmit;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 8/9] migration: refine the decompression code
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (6 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 7/9] migration: refine the compression code Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 9/9] migration: code clean up Liang Li
  2016-05-23  8:55 ` [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Amit Shah
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

The current code for multi-thread decompression is not clear,
especially in the aspect of using lock. Refine the code
to make it clear.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 50 +++++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f44f833..51de958 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -262,7 +262,6 @@ struct CompressParam {
 typedef struct CompressParam CompressParam;
 
 struct DecompressParam {
-    bool start;
     bool done;
     bool quit;
     QemuMutex mutex;
@@ -830,15 +829,6 @@ static int do_compress_ram_page(QEMUFile *f, RAMBlock *block,
     return bytes_sent;
 }
 
-static inline void start_decompression(DecompressParam *param)
-{
-    param->done = false;
-    qemu_mutex_lock(&param->mutex);
-    param->start = true;
-    qemu_cond_signal(&param->cond);
-    qemu_mutex_unlock(&param->mutex);
-}
-
 static uint64_t bytes_transferred;
 
 static void flush_compressed_data(QEMUFile *f)
@@ -2195,30 +2185,37 @@ static void *do_data_decompress(void *opaque)
 {
     DecompressParam *param = opaque;
     unsigned long pagesize;
+    uint8_t *des;
+    int len;
 
+    qemu_mutex_lock(&param->mutex);
     while (!param->quit) {
-        qemu_mutex_lock(&param->mutex);
-        while (!param->start && !param->quit) {
-            qemu_cond_wait(&param->cond, &param->mutex);
-        }
-        if (!param->quit) {
+        if (param->des) {
+            des = param->des;
+            len = param->len;
+            param->des = 0;
+            qemu_mutex_unlock(&param->mutex);
+
             pagesize = TARGET_PAGE_SIZE;
             /* uncompress() will return failed in some case, especially
              * when the page is dirted when doing the compression, it's
              * not a problem because the dirty page will be retransferred
              * and uncompress() won't break the data in other pages.
              */
-            uncompress((Bytef *)param->des, &pagesize,
-                       (const Bytef *)param->compbuf, param->len);
-        }
-        param->start = false;
-        qemu_mutex_unlock(&param->mutex);
+            uncompress((Bytef *)des, &pagesize,
+                       (const Bytef *)param->compbuf, len);
 
-        qemu_mutex_lock(&decomp_done_lock);
-        param->done = true;
-        qemu_cond_signal(&decomp_done_cond);
-        qemu_mutex_unlock(&decomp_done_lock);
+            qemu_mutex_lock(&decomp_done_lock);
+            param->done = true;
+            qemu_cond_signal(&decomp_done_cond);
+            qemu_mutex_unlock(&decomp_done_lock);
+
+            qemu_mutex_lock(&param->mutex);
+        } else {
+            qemu_cond_wait(&param->cond, &param->mutex);
+        }
     }
+    qemu_mutex_unlock(&param->mutex);
 
     return NULL;
 }
@@ -2295,10 +2292,13 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
     while (true) {
         for (idx = 0; idx < thread_count; idx++) {
             if (decomp_param[idx].done) {
+                decomp_param[idx].done = false;
+                qemu_mutex_lock(&decomp_param[idx].mutex);
                 qemu_get_buffer(f, decomp_param[idx].compbuf, len);
                 decomp_param[idx].des = host;
                 decomp_param[idx].len = len;
-                start_decompression(&decomp_param[idx]);
+                qemu_cond_signal(&decomp_param[idx].cond);
+                qemu_mutex_unlock(&decomp_param[idx].mutex);
                 break;
             }
         }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v2 9/9] migration: code clean up
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (7 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 8/9] migration: refine the decompression code Liang Li
@ 2016-05-05  7:32 ` Liang Li
  2016-05-23  8:55 ` [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Amit Shah
  9 siblings, 0 replies; 16+ messages in thread
From: Liang Li @ 2016-05-05  7:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: quintela, amit.shah, dgilbert, berrange, Liang Li

Use 'QemuMutex comp_done_lock' and 'QemuCond comp_done_cond' instead
of 'QemuMutex *comp_done_lock' and 'QemuCond comp_done_cond'. To keep
consistent with 'QemuMutex decomp_done_lock' and
'QemuCond comp_done_cond'.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 51de958..5076862 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -278,8 +278,8 @@ static QemuThread *compress_threads;
  * one of the compression threads has finished the compression.
  * comp_done_lock is used to co-work with comp_done_cond.
  */
-static QemuMutex *comp_done_lock;
-static QemuCond *comp_done_cond;
+static QemuMutex comp_done_lock;
+static QemuCond comp_done_cond;
 /* The empty QEMUFileOps will be used by file in CompressParam */
 static const QEMUFileOps empty_ops = { };
 
@@ -308,10 +308,10 @@ static void *do_data_compress(void *opaque)
 
             do_compress_ram_page(param->file, block, offset);
 
-            qemu_mutex_lock(comp_done_lock);
+            qemu_mutex_lock(&comp_done_lock);
             param->done = true;
-            qemu_cond_signal(comp_done_cond);
-            qemu_mutex_unlock(comp_done_lock);
+            qemu_cond_signal(&comp_done_cond);
+            qemu_mutex_unlock(&comp_done_lock);
 
             qemu_mutex_lock(&param->mutex);
         } else {
@@ -351,16 +351,12 @@ void migrate_compress_threads_join(void)
         qemu_mutex_destroy(&comp_param[i].mutex);
         qemu_cond_destroy(&comp_param[i].cond);
     }
-    qemu_mutex_destroy(comp_done_lock);
-    qemu_cond_destroy(comp_done_cond);
+    qemu_mutex_destroy(&comp_done_lock);
+    qemu_cond_destroy(&comp_done_cond);
     g_free(compress_threads);
     g_free(comp_param);
-    g_free(comp_done_cond);
-    g_free(comp_done_lock);
     compress_threads = NULL;
     comp_param = NULL;
-    comp_done_cond = NULL;
-    comp_done_lock = NULL;
 }
 
 void migrate_compress_threads_create(void)
@@ -374,10 +370,8 @@ void migrate_compress_threads_create(void)
     thread_count = migrate_compress_threads();
     compress_threads = g_new0(QemuThread, thread_count);
     comp_param = g_new0(CompressParam, thread_count);
-    comp_done_cond = g_new0(QemuCond, 1);
-    comp_done_lock = g_new0(QemuMutex, 1);
-    qemu_cond_init(comp_done_cond);
-    qemu_mutex_init(comp_done_lock);
+    qemu_cond_init(&comp_done_cond);
+    qemu_mutex_init(&comp_done_lock);
     for (i = 0; i < thread_count; i++) {
         /* com_param[i].file is just used as a dummy buffer to save data, set
          * it's ops to empty.
@@ -840,13 +834,13 @@ static void flush_compressed_data(QEMUFile *f)
     }
     thread_count = migrate_compress_threads();
 
-    qemu_mutex_lock(comp_done_lock);
+    qemu_mutex_lock(&comp_done_lock);
     for (idx = 0; idx < thread_count; idx++) {
         while (!comp_param[idx].done) {
-            qemu_cond_wait(comp_done_cond, comp_done_lock);
+            qemu_cond_wait(&comp_done_cond, &comp_done_lock);
         }
     }
-    qemu_mutex_unlock(comp_done_lock);
+    qemu_mutex_unlock(&comp_done_lock);
 
     for (idx = 0; idx < thread_count; idx++) {
         qemu_mutex_lock(&comp_param[idx].mutex);
@@ -872,7 +866,7 @@ static int compress_page_with_multi_thread(QEMUFile *f, RAMBlock *block,
     int idx, thread_count, bytes_xmit = -1, pages = -1;
 
     thread_count = migrate_compress_threads();
-    qemu_mutex_lock(comp_done_lock);
+    qemu_mutex_lock(&comp_done_lock);
     while (true) {
         for (idx = 0; idx < thread_count; idx++) {
             if (comp_param[idx].done) {
@@ -891,10 +885,10 @@ static int compress_page_with_multi_thread(QEMUFile *f, RAMBlock *block,
         if (pages > 0) {
             break;
         } else {
-            qemu_cond_wait(comp_done_cond, comp_done_lock);
+            qemu_cond_wait(&comp_done_cond, &comp_done_lock);
         }
     }
-    qemu_mutex_unlock(comp_done_lock);
+    qemu_mutex_unlock(&comp_done_lock);
 
     return pages;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine
  2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
                   ` (8 preceding siblings ...)
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 9/9] migration: code clean up Liang Li
@ 2016-05-23  8:55 ` Amit Shah
  9 siblings, 0 replies; 16+ messages in thread
From: Amit Shah @ 2016-05-23  8:55 UTC (permalink / raw)
  To: Liang Li; +Cc: qemu-devel, quintela, dgilbert, berrange

I'll wait for Juan's review before picking this up.

On (Thu) 05 May 2016 [15:32:50], Liang Li wrote:
> This patch set fixed a bug which will block live migration and another
> potential issue when using multi-thread (de)compression.
> 
> The last patches try to refine the code and make the using of lock more
> clear. Some of the code snippets are from Juan's multiple-fd patches,
> with very small change. Thanks for Juan's work.
> 
> 
> Liang Li (9):
>   migration: Fix multi-thread compression bug
>   migration: Fix a potential issue
>   migration: remove useless code
>   qemu-file: Fix qemu_put_compression_data flaw
>   migration: refine ram_save_compressed_page
>   migration: protect the quit flag by lock
>   migration: refine the compression code
>   migration: refine the decompression code
>   migration: code clean up
> 
>  migration/qemu-file.c |  23 ++++-
>  migration/ram.c       | 251 ++++++++++++++++++++++++++++----------------------
>  2 files changed, 162 insertions(+), 112 deletions(-)
> 
> -- 
> 1.9.1
> 

		Amit

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue
  2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue Liang Li
@ 2016-06-10 13:39   ` Amit Shah
  2016-06-10 15:03     ` Li, Liang Z
  0 siblings, 1 reply; 16+ messages in thread
From: Amit Shah @ 2016-06-10 13:39 UTC (permalink / raw)
  To: Liang Li; +Cc: qemu-devel, quintela, dgilbert, berrange

On (Thu) 05 May 2016 [15:32:52], Liang Li wrote:
> At the end of live migration and before vm_start() on the destination
> side, we should make sure all the decompression tasks are finished, if
> this can not be guaranteed, the VM may get the incorrect memory data,
> or the updated memory may be overwritten by the decompression thread.
> Add the code to fix this potential issue.
> 
> Suggested-by: David Alan Gilbert <dgilbert@redhat.com>
> Suggested-by: Juan Quintela <quintela@redhat.com>
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  migration/ram.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 7ab6ab5..8a59a08 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2220,6 +2220,24 @@ static void *do_data_decompress(void *opaque)
>      return NULL;
>  }
>  
> +static void wait_for_decompress_done(void)
> +{
> +    int idx, thread_count;
> +
> +    if (!migrate_use_compression()) {
> +        return;
> +    }
> +
> +    thread_count = migrate_decompress_threads();
> +    qemu_mutex_lock(&decomp_done_lock);
> +    for (idx = 0; idx < thread_count; idx++) {
> +        while (!decomp_param[idx].done) {
> +            qemu_cond_wait(&decomp_done_cond, &decomp_done_lock);
> +        }
> +    }
> +    qemu_mutex_unlock(&decomp_done_lock);

Not sure how this works: in the previous patch, done is set to false
under the decomp_done_lock.  Here, we take the lock, and wait for done
to turn false.  That can't happen because this thread holds the lock.
My reading is this is going to lead to a deadlock.  What am I missing?


		Amit

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue
  2016-06-10 13:39   ` Amit Shah
@ 2016-06-10 15:03     ` Li, Liang Z
  2016-06-13  4:36       ` Amit Shah
  0 siblings, 1 reply; 16+ messages in thread
From: Li, Liang Z @ 2016-06-10 15:03 UTC (permalink / raw)
  To: Amit Shah; +Cc: qemu-devel, quintela, dgilbert, berrange

> Subject: Re: [PATCH v2 2/9] migration: Fix a potential issue
> 
> On (Thu) 05 May 2016 [15:32:52], Liang Li wrote:
> > At the end of live migration and before vm_start() on the destination
> > side, we should make sure all the decompression tasks are finished, if
> > this can not be guaranteed, the VM may get the incorrect memory data,
> > or the updated memory may be overwritten by the decompression thread.
> > Add the code to fix this potential issue.
> >
> > Suggested-by: David Alan Gilbert <dgilbert@redhat.com>
> > Suggested-by: Juan Quintela <quintela@redhat.com>
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > ---
> >  migration/ram.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/migration/ram.c b/migration/ram.c index 7ab6ab5..8a59a08
> > 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -2220,6 +2220,24 @@ static void *do_data_decompress(void *opaque)
> >      return NULL;
> >  }
> >
> > +static void wait_for_decompress_done(void) {
> > +    int idx, thread_count;
> > +
> > +    if (!migrate_use_compression()) {
> > +        return;
> > +    }
> > +
> > +    thread_count = migrate_decompress_threads();
> > +    qemu_mutex_lock(&decomp_done_lock);
> > +    for (idx = 0; idx < thread_count; idx++) {
> > +        while (!decomp_param[idx].done) {
> > +            qemu_cond_wait(&decomp_done_cond, &decomp_done_lock);
> > +        }
> > +    }
> > +    qemu_mutex_unlock(&decomp_done_lock);
> 
> Not sure how this works: in the previous patch, done is set to false under the
> decomp_done_lock.  Here, we take the lock, and wait for done to turn false.
> That can't happen because this thread holds the lock.
> My reading is this is going to lead to a deadlock.  What am I missing?
> 
> 
> 		Amit

This is the typical usage of the QemuCond, actually, in qemu_cond_wait() ,
decomp_done_lock will be unlocked at first and then locked again before 
qemu_cond_wait() return.  So deadlock won't happen.

Liang

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue
  2016-06-10 15:03     ` Li, Liang Z
@ 2016-06-13  4:36       ` Amit Shah
  2016-06-13  5:07         ` Li, Liang Z
  0 siblings, 1 reply; 16+ messages in thread
From: Amit Shah @ 2016-06-13  4:36 UTC (permalink / raw)
  To: Li, Liang Z; +Cc: qemu-devel, quintela, dgilbert, berrange

On (Fri) 10 Jun 2016 [15:03:15], Li, Liang Z wrote:
> > Subject: Re: [PATCH v2 2/9] migration: Fix a potential issue
> > 
> > On (Thu) 05 May 2016 [15:32:52], Liang Li wrote:
> > > At the end of live migration and before vm_start() on the destination
> > > side, we should make sure all the decompression tasks are finished, if
> > > this can not be guaranteed, the VM may get the incorrect memory data,
> > > or the updated memory may be overwritten by the decompression thread.
> > > Add the code to fix this potential issue.
> > >
> > > Suggested-by: David Alan Gilbert <dgilbert@redhat.com>
> > > Suggested-by: Juan Quintela <quintela@redhat.com>
> > > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > > ---
> > >  migration/ram.c | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c index 7ab6ab5..8a59a08
> > > 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -2220,6 +2220,24 @@ static void *do_data_decompress(void *opaque)
> > >      return NULL;
> > >  }
> > >
> > > +static void wait_for_decompress_done(void) {
> > > +    int idx, thread_count;
> > > +
> > > +    if (!migrate_use_compression()) {
> > > +        return;
> > > +    }
> > > +
> > > +    thread_count = migrate_decompress_threads();
> > > +    qemu_mutex_lock(&decomp_done_lock);
> > > +    for (idx = 0; idx < thread_count; idx++) {
> > > +        while (!decomp_param[idx].done) {
> > > +            qemu_cond_wait(&decomp_done_cond, &decomp_done_lock);
> > > +        }
> > > +    }
> > > +    qemu_mutex_unlock(&decomp_done_lock);
> > 
> > Not sure how this works: in the previous patch, done is set to false under the
> > decomp_done_lock.  Here, we take the lock, and wait for done to turn false.
> > That can't happen because this thread holds the lock.
> > My reading is this is going to lead to a deadlock.  What am I missing?
> > 
> 
> This is the typical usage of the QemuCond, actually, in qemu_cond_wait() ,
> decomp_done_lock will be unlocked at first and then locked again before 
> qemu_cond_wait() return.  So deadlock won't happen.

In qemu-thread-posix.c, I don't see such unlock/lock.


		Amit

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue
  2016-06-13  4:36       ` Amit Shah
@ 2016-06-13  5:07         ` Li, Liang Z
  2016-06-13 10:33           ` Amit Shah
  0 siblings, 1 reply; 16+ messages in thread
From: Li, Liang Z @ 2016-06-13  5:07 UTC (permalink / raw)
  To: Amit Shah; +Cc: qemu-devel, quintela, dgilbert, berrange

> > > > +static void wait_for_decompress_done(void) {
> > > > +    int idx, thread_count;
> > > > +
> > > > +    if (!migrate_use_compression()) {
> > > > +        return;
> > > > +    }
> > > > +
> > > > +    thread_count = migrate_decompress_threads();
> > > > +    qemu_mutex_lock(&decomp_done_lock);
> > > > +    for (idx = 0; idx < thread_count; idx++) {
> > > > +        while (!decomp_param[idx].done) {
> > > > +            qemu_cond_wait(&decomp_done_cond,
> &decomp_done_lock);
> > > > +        }
> > > > +    }
> > > > +    qemu_mutex_unlock(&decomp_done_lock);
> > >
> > > Not sure how this works: in the previous patch, done is set to false
> > > under the decomp_done_lock.  Here, we take the lock, and wait for done
> to turn false.
> > > That can't happen because this thread holds the lock.
> > > My reading is this is going to lead to a deadlock.  What am I missing?
> > >
> >
> > This is the typical usage of the QemuCond, actually, in
> > qemu_cond_wait() , decomp_done_lock will be unlocked at first and then
> > locked again before
> > qemu_cond_wait() return.  So deadlock won't happen.
> 
> In qemu-thread-posix.c, I don't see such unlock/lock.
> 
> 
> 		Amit

I mean in the 'pthread_cond_wait()' which called by qemu_cond_wait().

Liang

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue
  2016-06-13  5:07         ` Li, Liang Z
@ 2016-06-13 10:33           ` Amit Shah
  0 siblings, 0 replies; 16+ messages in thread
From: Amit Shah @ 2016-06-13 10:33 UTC (permalink / raw)
  To: Li, Liang Z; +Cc: qemu-devel, quintela, dgilbert, berrange

On (Mon) 13 Jun 2016 [05:07:39], Li, Liang Z wrote:
> > > > > +static void wait_for_decompress_done(void) {
> > > > > +    int idx, thread_count;
> > > > > +
> > > > > +    if (!migrate_use_compression()) {
> > > > > +        return;
> > > > > +    }
> > > > > +
> > > > > +    thread_count = migrate_decompress_threads();
> > > > > +    qemu_mutex_lock(&decomp_done_lock);
> > > > > +    for (idx = 0; idx < thread_count; idx++) {
> > > > > +        while (!decomp_param[idx].done) {
> > > > > +            qemu_cond_wait(&decomp_done_cond,
> > &decomp_done_lock);
> > > > > +        }
> > > > > +    }
> > > > > +    qemu_mutex_unlock(&decomp_done_lock);
> > > >
> > > > Not sure how this works: in the previous patch, done is set to false
> > > > under the decomp_done_lock.  Here, we take the lock, and wait for done
> > to turn false.
> > > > That can't happen because this thread holds the lock.
> > > > My reading is this is going to lead to a deadlock.  What am I missing?
> > > >
> > >
> > > This is the typical usage of the QemuCond, actually, in
> > > qemu_cond_wait() , decomp_done_lock will be unlocked at first and then
> > > locked again before
> > > qemu_cond_wait() return.  So deadlock won't happen.
> > 
> > In qemu-thread-posix.c, I don't see such unlock/lock.
> > 
> > 
> > 		Amit
> 
> I mean in the 'pthread_cond_wait()' which called by qemu_cond_wait().

Yes, OK - makes sense now.  Thanks, I'll continue the review.

		Amit

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-06-13 10:33 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-05  7:32 [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 1/9] migration: Fix multi-thread compression bug Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 2/9] migration: Fix a potential issue Liang Li
2016-06-10 13:39   ` Amit Shah
2016-06-10 15:03     ` Li, Liang Z
2016-06-13  4:36       ` Amit Shah
2016-06-13  5:07         ` Li, Liang Z
2016-06-13 10:33           ` Amit Shah
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 3/9] migration: remove useless code Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 4/9] qemu-file: Fix qemu_put_compression_data flaw Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 5/9] migration: refine ram_save_compressed_page Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 6/9] migration: protect the quit flag by lock Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 7/9] migration: refine the compression code Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 8/9] migration: refine the decompression code Liang Li
2016-05-05  7:32 ` [Qemu-devel] [PATCH v2 9/9] migration: code clean up Liang Li
2016-05-23  8:55 ` [Qemu-devel] [PATCH v2 0/9] live migration bug fix and refine Amit Shah

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.