From: zhanghailiang <zhang.zhanghailiang@huawei.com>
To: qemu-devel@nongnu.org
Cc: amit.shah@redhat.com, quintela@redhat.com, dgilbert@redhat.com,
	peter.huangpeng@huawei.com, eddie.dong@intel.com,
	wency@cn.fujitsu.com, lizhijian@cn.fujitsu.com,
	zhangchen.fnst@cn.fujitsu.com, xiecl.fnst@cn.fujitsu.com,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	Gonglei <arei.gonglei@huawei.com>
Subject: [Qemu-devel] [PATCH COLO-Frame v18 10/34] COLO: Load PVM's dirty pages into SVM's RAM cache temporarily
Date: Wed, 3 Aug 2016 20:25:48 +0800
Message-ID: <1470227172-13704-11-git-send-email-zhang.zhanghailiang@huawei.com>
In-Reply-To: <1470227172-13704-1-git-send-email-zhang.zhanghailiang@huawei.com>

We should not load PVM's state directly into SVM, because errors may occur
while SVM is receiving the data, and a partially loaded state would break SVM.

We need to ensure that all data has been received before loading the state
into SVM. We use extra memory to cache this data (PVM's RAM). The RAM cache on
the secondary side is initially the same as SVM/PVM's memory. During each
checkpoint, we first cache PVM's dirty pages in this RAM cache, so the cache
always matches PVM's memory at every checkpoint. We then flush the cached RAM
to SVM after receiving all of PVM's state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
v12:
- Fix minor error in error_report (Dave's comment)
- Add Reviewed-by tag
v11:
- Rename 'host_cache' to 'colo_cache' (Dave's suggestion)
v10:
- Split the process of dirty pages recording into a new patch
---
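For reviewers, the staging scheme described in the commit message can be
illustrated with a minimal standalone sketch. It is not the QEMU code:
DemoBlock, demo_cache_init(), demo_load_page(), demo_flush() and
demo_cache_release() are hypothetical names, malloc() stands in for
qemu_anon_ram_alloc(), and the flush copies the whole cache for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4   /* tiny pages keep the demo readable */
#define DEMO_NUM_PAGES 4

/* Hypothetical stand-in for a RAMBlock: live memory plus a staging cache. */
typedef struct {
    unsigned char host[DEMO_NUM_PAGES * DEMO_PAGE_SIZE];
    unsigned char *cache;
} DemoBlock;

static bool demo_cache_enabled;

/* Allocate the cache and seed it with the current contents of host memory. */
static int demo_cache_init(DemoBlock *b)
{
    b->cache = malloc(sizeof(b->host));
    if (!b->cache) {
        return -1;
    }
    memcpy(b->cache, b->host, sizeof(b->host));
    demo_cache_enabled = true;
    return 0;
}

/* Incoming pages go to the cache once it is enabled, else to host memory. */
static int demo_load_page(DemoBlock *b, size_t page, const unsigned char *data)
{
    unsigned char *base = demo_cache_enabled ? b->cache : b->host;

    if (page >= DEMO_NUM_PAGES || !base) {
        return -1;
    }
    memcpy(base + page * DEMO_PAGE_SIZE, data, DEMO_PAGE_SIZE);
    return 0;
}

/* Called only after the whole checkpoint has been received successfully. */
static void demo_flush(DemoBlock *b)
{
    assert(b->cache);
    memcpy(b->host, b->cache, sizeof(b->host));
}

/* Disable the cache first so later loads cannot target freed memory. */
static void demo_cache_release(DemoBlock *b)
{
    demo_cache_enabled = false;
    free(b->cache);
    b->cache = NULL;
}
```

If receiving fails partway through a checkpoint, the caller simply skips
demo_flush(), so host memory keeps the state of the previous checkpoint.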
 include/exec/ram_addr.h       |  1 +
 include/migration/migration.h |  4 +++
 migration/colo.c              | 11 +++++++
 migration/ram.c               | 73 ++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 2a9465d..b4c04fb 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -26,6 +26,7 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    uint8_t *colo_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
diff --git a/include/migration/migration.h b/include/migration/migration.h
index f4b215a..55679b5 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -362,4 +362,8 @@ int ram_save_queue_pages(MigrationState *ms, const char *rbname,
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
 PostcopyState postcopy_state_set(PostcopyState new_state);
+
+/* ram cache */
+int colo_init_ram_cache(void);
+void colo_release_ram_cache(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index ec43250..0401629 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -287,6 +287,7 @@ void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
     Error *local_err = NULL;
+    int ret;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
@@ -303,6 +304,12 @@ void *colo_process_incoming_thread(void *opaque)
      */
     qemu_file_set_blocking(mis->from_src_file, true);
 
+    ret = colo_init_ram_cache();
+    if (ret < 0) {
+        error_report("Failed to initialize ram cache");
+        goto out;
+    }
+
     colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
                       &local_err);
     if (local_err) {
@@ -353,6 +360,10 @@ out:
         error_report_err(local_err);
     }
 
+    qemu_mutex_lock_iothread();
+    colo_release_ram_cache();
+    qemu_mutex_unlock_iothread();
+
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
diff --git a/migration/ram.c b/migration/ram.c
index e9067c5..ac55980 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -227,6 +227,7 @@ static RAMBlock *last_sent_block;
 static ram_addr_t last_offset;
 static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
+static bool ram_cache_enable;
 static uint32_t last_version;
 static bool ram_bulk_stage;
 
@@ -2186,6 +2187,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
     return block->host + offset;
 }
 
+static inline void *colo_cache_from_block_offset(RAMBlock *block,
+                                                 ram_addr_t offset)
+{
+    if (!offset_in_ramblock(block, offset)) {
+        return NULL;
+    }
+    if (!block->colo_cache) {
+        error_report("%s: colo_cache is NULL in block: %s",
+                     __func__, block->idstr);
+        return NULL;
+    }
+    return block->colo_cache + offset;
+}
+
 /*
  * If a page (or a whole RDMA chunk) has been
  * determined to be zero, then zap it.
@@ -2501,7 +2516,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                      RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
             RAMBlock *block = ram_block_from_stream(f, flags);
 
-            host = host_from_ram_block_offset(block, addr);
+            /* After going into COLO, we should load the page into colo_cache */
+            if (ram_cache_enable) {
+                host = colo_cache_from_block_offset(block, addr);
+            } else {
+                host = host_from_ram_block_offset(block, addr);
+            }
             if (!host) {
                 error_report("Illegal RAM offset " RAM_ADDR_FMT, addr);
                 ret = -EINVAL;
@@ -2597,6 +2617,57 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
+/*
+ * COLO cache: used on the secondary VM only. It caches the whole memory
+ * of the secondary VM. This function is called after the first migration.
+ */
+int colo_init_ram_cache(void)
+{
+    RAMBlock *block;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL);
+        if (!block->colo_cache) {
+            error_report("%s: Can't alloc memory for COLO cache of block %s,"
+                         " size 0x" RAM_ADDR_FMT, __func__, block->idstr,
+                         block->used_length);
+            goto out_locked;
+        }
+        memcpy(block->colo_cache, block->host, block->used_length);
+    }
+    rcu_read_unlock();
+    ram_cache_enable = true;
+    return 0;
+
+out_locked:
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+
+    rcu_read_unlock();
+    return -errno;
+}
+
+void colo_release_ram_cache(void)
+{
+    RAMBlock *block;
+
+    ram_cache_enable = false;
+
+    rcu_read_lock();
+    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
+        if (block->colo_cache) {
+            qemu_anon_ram_free(block->colo_cache, block->used_length);
+            block->colo_cache = NULL;
+        }
+    }
+    rcu_read_unlock();
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
-- 
1.8.3.1
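
The all-or-nothing allocation in colo_init_ram_cache() can be sketched in
isolation as follows. This is an illustration under assumptions, not the patch
itself: DemoRegion, demo_alloc(), demo_fail_after and demo_init_caches() are
hypothetical, a plain array replaces the RCU-protected block list, and the
sketch returns a fixed -ENOMEM where the patch returns -errno (errno can be
overwritten by calls made between the failed allocation and the return).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a RAM block with a host buffer and a cache. */
typedef struct {
    unsigned char *host;
    unsigned char *cache;
    size_t len;
} DemoRegion;

/* Test hook: number of allocations that succeed before one fails (-1: never fail). */
static int demo_fail_after = -1;

static void *demo_alloc(size_t len)
{
    if (demo_fail_after == 0) {
        return NULL;
    }
    if (demo_fail_after > 0) {
        demo_fail_after--;
    }
    return malloc(len);
}

/* Allocate and seed a cache for every region, or leave none allocated. */
static int demo_init_caches(DemoRegion *r, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        r[i].cache = demo_alloc(r[i].len);
        if (!r[i].cache) {
            goto rollback;
        }
        memcpy(r[i].cache, r[i].host, r[i].len);
    }
    return 0;

rollback:
    /* Free only the caches allocated before the failing region. */
    while (i-- > 0) {
        free(r[i].cache);
        r[i].cache = NULL;
    }
    return -ENOMEM;
}
```

Resetting each pointer to NULL after freeing it keeps a later release pass
safe to run unconditionally, which is the same property the patch relies on in
colo_release_ram_cache().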
