* [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue
@ 2014-04-04  9:57 arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 01/10] XBZRLE: Fix one XBZRLE corruption issue arei.gonglei
                   ` (11 more replies)
  0 siblings, 12 replies; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

From: ChenLiang <chenliang88@huawei.com>

Changes v4 -> v5:
* Fix two issues: one is that cache_insert did not update a page
  that was already in the cache; the other avoids the risk of running
  xbzrle_encode_buffer on data that is changing concurrently.


a. Optimizing the xbzrle cache remarkably decreases cache misses;
    the efficiency of compression increases more than fifty-fold.
    Before this patch set, the cache missed almost totally whenever
    the number of cache items was smaller than the number of dirty
    pages. Now the hot pages in the cache are no longer replaced by
    other pages (see the sketch below).

b. Reduce data copying.

c. Fix one corruption issue.
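The idea behind (a), as a minimal self-contained C sketch; the names
Item and try_insert are illustrative, not QEMU's, and the real logic
lives in page_cache.c in patch 5:

#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE 4096
#define CACHED_PAGE_LIFETIME 2  /* a hit protects a page for two syncs */

typedef struct {
    uint64_t addr;   /* guest address cached in this slot */
    uint64_t age;    /* bitmap_sync_count at last hit/insert */
    bool valid;
    uint8_t data[PAGE_SIZE];
} Item;

/* Refuse to evict a recently hit page; the caller then sends the new
 * page uncompressed instead of thrashing the cache. */
static bool try_insert(Item *slot, uint64_t addr, const uint8_t *page,
                       uint64_t current_age)
{
    if (slot->valid && slot->addr != addr &&
        slot->age + CACHED_PAGE_LIFETIME > current_age) {
        return false;   /* slot still holds a hot page: keep it */
    }
    memcpy(slot->data, page, PAGE_SIZE);
    slot->addr = addr;
    slot->age = current_age;
    slot->valid = true;
    return true;
}

int main(void)
{
    static Item slot;
    uint8_t page[PAGE_SIZE] = {0};

    (void)try_insert(&slot, 0x1000, page, 1);
    printf("conflict at age 2: %s\n",
           try_insert(&slot, 0x2000, page, 2) ? "evicted" : "kept hot page");
    printf("conflict at age 4: %s\n",
           try_insert(&slot, 0x2000, page, 4) ? "evicted" : "kept hot page");
    return 0;
}

A page that keeps being hit keeps its age current and therefore stays
in the cache across dirty bitmap syncs, which is where the fifty-fold
gain above comes from.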

ChenLiang (10):
  XBZRLE: Fix one XBZRLE corruption issue
  migration: Add counts of updating the dirty bitmap
  migration: expose the bitmap_sync_count to the end user
  migration: expose xbzrle cache miss rate
  XBZRLE: optimize XBZRLE to decrease the cache misses
  XBZRLE: rebuild the cache_is_cached function
  xbzrle: don't check the value in the vm ram repeatedly
  xbzrle: check 8 bytes at a time after a concurrency scene
  migration: optimize xbzrle by reducing data copy
  migration: clear the dead code

 arch_init.c                    |  74 +++++++++++++++++-------------
 docs/xbzrle.txt                |   8 ++++
 hmp.c                          |   4 ++
 include/migration/migration.h  |   2 +
 include/migration/page_cache.h |  10 ++--
 migration.c                    |   3 ++
 page_cache.c                   | 101 +++++++++++------------------------------
 qapi-schema.json               |   9 +++-
 qmp-commands.hx                |  15 ++++--
 xbzrle.c                       |  48 ++++++++++++++------
 10 files changed, 144 insertions(+), 130 deletions(-)

-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 01/10] XBZRLE: Fix one XBZRLE corruption issue
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 02/10] migration: Add counts of updating the dirty bitmap arei.gonglei
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

The page may not be inserted into the cache after save_xbzrle_page
executes. If the insertion fails, the original page should be sent
rather than the page in the cache.
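As a hedged toy model of the fixed contract (the names below are
illustrative, not QEMU's): save_xbzrle_page now takes uint8_t ** and
repoints it at the cached copy only when the insertion succeeded, so
a failed insert leaves the caller sending the original page:

#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE 4096

static uint8_t cache_page[PAGE_SIZE];
static int cache_busy;  /* 1: insertion will be refused */

static int cache_insert(const uint8_t *page)
{
    if (cache_busy) {
        return -1;      /* e.g. the slot still holds a hot page */
    }
    memcpy(cache_page, page, PAGE_SIZE);
    return 0;
}

/* Repoint *current_data only on successful insertion. */
static void save_page(uint8_t **current_data)
{
    if (cache_insert(*current_data) == 0) {
        *current_data = cache_page;  /* send the stable cached copy */
    }
    /* on failure *current_data still points at the original page */
}

int main(void)
{
    uint8_t guest_page[PAGE_SIZE] = {0};
    uint8_t *p = guest_page;

    cache_busy = 1;
    save_page(&p);
    printf("insert refused -> send %s\n",
           p == guest_page ? "original page" : "cached copy");

    cache_busy = 0;
    save_page(&p);
    printf("insert ok      -> send %s\n",
           p == cache_page ? "cached copy" : "original page");
    return 0;
}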

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 60c975d..2ac68c2 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -340,7 +340,7 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
 
 #define ENCODING_FLAG_XBZRLE 0x1
 
-static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
+static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
                             ram_addr_t current_addr, RAMBlock *block,
                             ram_addr_t offset, int cont, bool last_stage)
 {
@@ -348,19 +348,23 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
     uint8_t *prev_cached_page;
 
     if (!cache_is_cached(XBZRLE.cache, current_addr)) {
+        acct_info.xbzrle_cache_miss++;
         if (!last_stage) {
-            if (cache_insert(XBZRLE.cache, current_addr, current_data) == -1) {
+            if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
                 return -1;
+            } else {
+                /* update *current_data when the page has been
+                   inserted into cache */
+                *current_data = get_cached_data(XBZRLE.cache, current_addr);
             }
         }
-        acct_info.xbzrle_cache_miss++;
         return -1;
     }
 
     prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
 
     /* save current buffer into memory */
-    memcpy(XBZRLE.current_buf, current_data, TARGET_PAGE_SIZE);
+    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
 
     /* XBZRLE encoding (if there is no overflow) */
     encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
@@ -373,7 +377,10 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
         DPRINTF("Overflow\n");
         acct_info.xbzrle_overflows++;
         /* update data in the cache */
-        memcpy(prev_cached_page, current_data, TARGET_PAGE_SIZE);
+        if (!last_stage) {
+            memcpy(prev_cached_page, *current_data, TARGET_PAGE_SIZE);
+            *current_data = prev_cached_page;
+        }
         return -1;
     }
 
@@ -598,15 +605,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
                  */
                 xbzrle_cache_zero_page(current_addr);
             } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
-                bytes_sent = save_xbzrle_page(f, p, current_addr, block,
+                bytes_sent = save_xbzrle_page(f, &p, current_addr, block,
                                               offset, cont, last_stage);
                 if (!last_stage) {
-                    /* We must send exactly what's in the xbzrle cache
-                     * even if the page wasn't xbzrle compressed, so that
-                     * it's right next time.
-                     */
-                    p = get_cached_data(XBZRLE.cache, current_addr);
-
                     /* Can't send this cached data async, since the cache page
                      * might get updated before it gets to the wire
                      */
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 02/10] migration: Add counts of updating the dirty bitmap
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 01/10] XBZRLE: Fix one XBZRLE corruption issue arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 03/10] migration: expose the bitmap_sync_count to the end user arei.gonglei
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Add a counter that records how many times the dirty bitmap has been
synchronized.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 arch_init.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch_init.c b/arch_init.c
index 2ac68c2..200af0e 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -110,6 +110,8 @@ static bool mig_throttle_on;
 static int dirty_rate_high_cnt;
 static void check_guest_throttling(void);
 
+static uint64_t bitmap_sync_count;
+
 /***********************************************************/
 /* ram save/restore */
 
@@ -487,6 +489,8 @@ static void migration_bitmap_sync(void)
     int64_t end_time;
     int64_t bytes_xfer_now;
 
+    bitmap_sync_count++;
+
     if (!bytes_xfer_prev) {
         bytes_xfer_prev = ram_bytes_transferred();
     }
@@ -734,6 +738,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     migration_dirty_pages = ram_pages;
     mig_throttle_on = false;
     dirty_rate_high_cnt = 0;
+    bitmap_sync_count = 0;
 
     if (migrate_use_xbzrle()) {
         qemu_mutex_lock_iothread();
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 03/10] migration: expose the bitmap_sync_count to the end user
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 01/10] XBZRLE: Fix one XBZRLE corruption issue arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 02/10] migration: Add counts of updating the dirty bitmap arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 04/10] migration: expose xbzrle cache miss rate arei.gonglei
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Expose the counter that records how many times the dirty bitmap has
been synchronized to the end user.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 arch_init.c                   |  1 +
 hmp.c                         |  2 ++
 include/migration/migration.h |  1 +
 migration.c                   |  2 ++
 qapi-schema.json              |  4 +++-
 qmp-commands.hx               | 13 +++++++++----
 6 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 200af0e..a7c87de 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -536,6 +536,7 @@ static void migration_bitmap_sync(void)
         s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE;
         start_time = end_time;
         num_dirty_pages_period = 0;
+        s->dirty_sync_count = bitmap_sync_count;
     }
 }
 
diff --git a/hmp.c b/hmp.c
index 2f279c4..77a8d18 100644
--- a/hmp.c
+++ b/hmp.c
@@ -188,6 +188,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->ram->normal);
         monitor_printf(mon, "normal bytes: %" PRIu64 " kbytes\n",
                        info->ram->normal_bytes >> 10);
+        monitor_printf(mon, "dirty sync count: %" PRIu64 "\n",
+                       info->ram->dirty_sync_count);
         if (info->ram->dirty_pages_rate) {
             monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
                            info->ram->dirty_pages_rate);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 3e1e6c7..9e62b99 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -61,6 +61,7 @@ struct MigrationState
     bool enabled_capabilities[MIGRATION_CAPABILITY_MAX];
     int64_t xbzrle_cache_size;
     int64_t setup_time;
+    int64_t dirty_sync_count;
 };
 
 void process_incoming_migration(QEMUFile *f);
diff --git a/migration.c b/migration.c
index bd1fb91..816f5fb 100644
--- a/migration.c
+++ b/migration.c
@@ -215,6 +215,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
         info->ram->normal_bytes = norm_mig_bytes_transferred();
         info->ram->dirty_pages_rate = s->dirty_pages_rate;
         info->ram->mbps = s->mbps;
+        info->ram->dirty_sync_count = s->dirty_sync_count;
 
         if (blk_mig_active()) {
             info->has_disk = true;
@@ -248,6 +249,7 @@ MigrationInfo *qmp_query_migrate(Error **errp)
         info->ram->normal = norm_mig_pages_transferred();
         info->ram->normal_bytes = norm_mig_bytes_transferred();
         info->ram->mbps = s->mbps;
+        info->ram->dirty_sync_count = s->dirty_sync_count;
         break;
     case MIG_STATE_ERROR:
         info->has_status = true;
diff --git a/qapi-schema.json b/qapi-schema.json
index 391356f..8a60a6e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -651,13 +651,15 @@
 #
 # @mbps: throughput in megabits/sec. (since 1.6)
 #
+# @dirty-sync-count: number of times that dirty ram was synchronized (since 2.1)
+#
 # Since: 0.14.0
 ##
 { 'type': 'MigrationStats',
   'data': {'transferred': 'int', 'remaining': 'int', 'total': 'int' ,
            'duplicate': 'int', 'skipped': 'int', 'normal': 'int',
            'normal-bytes': 'int', 'dirty-pages-rate' : 'int',
-           'mbps' : 'number' } }
+           'mbps' : 'number', 'dirty-sync-count' : 'int' } }
 
 ##
 # @XBZRLECacheStats
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ed3ab92..aadcd04 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2967,6 +2967,7 @@ The main json-object contains the following:
             pages. This is just normal pages times size of one page,
             but this way upper levels don't need to care about page
             size (json-int)
+         - "dirty-sync-count": times that dirty ram was synchronized (json-int)
 - "disk": only present if "status" is "active" and it is a block migration,
   it is a json-object with the following disk information:
          - "transferred": amount transferred in bytes (json-int)
@@ -3004,7 +3005,8 @@ Examples:
           "downtime":12345,
           "duplicate":123,
           "normal":123,
-          "normal-bytes":123456
+          "normal-bytes":123456,
+          "dirty-sync-count":15
         }
      }
    }
@@ -3029,7 +3031,8 @@ Examples:
             "expected-downtime":12345,
             "duplicate":123,
             "normal":123,
-            "normal-bytes":123456
+            "normal-bytes":123456,
+            "dirty-sync-count":15
          }
       }
    }
@@ -3049,7 +3052,8 @@ Examples:
             "expected-downtime":12345,
             "duplicate":123,
             "normal":123,
-            "normal-bytes":123456
+            "normal-bytes":123456,
+            "dirty-sync-count":15
          },
          "disk":{
             "total":20971520,
@@ -3075,7 +3079,8 @@ Examples:
             "expected-downtime":12345,
             "duplicate":10,
             "normal":3333,
-            "normal-bytes":3412992
+            "normal-bytes":3412992,
+            "dirty-sync-count":15
          },
          "xbzrle-cache":{
             "cache-size":67108864,
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 04/10] migration: expose xbzrle cache miss rate
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (2 preceding siblings ...)
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 03/10] migration: expose the bitmap_sync_count to the end user arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Expose the xbzrle cache miss rate to the end user.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 arch_init.c                   | 18 ++++++++++++++++++
 hmp.c                         |  2 ++
 include/migration/migration.h |  1 +
 migration.c                   |  1 +
 qapi-schema.json              |  5 ++++-
 qmp-commands.hx               |  2 ++
 6 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index a7c87de..15ca4c0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -235,6 +235,7 @@ typedef struct AccountingInfo {
     uint64_t xbzrle_bytes;
     uint64_t xbzrle_pages;
     uint64_t xbzrle_cache_miss;
+    double xbzrle_cache_miss_rate;
     uint64_t xbzrle_overflows;
 } AccountingInfo;
 
@@ -290,6 +291,11 @@ uint64_t xbzrle_mig_pages_cache_miss(void)
     return acct_info.xbzrle_cache_miss;
 }
 
+double xbzrle_mig_cache_miss_rate(void)
+{
+    return acct_info.xbzrle_cache_miss_rate;
+}
+
 uint64_t xbzrle_mig_pages_overflow(void)
 {
     return acct_info.xbzrle_overflows;
@@ -488,6 +494,8 @@ static void migration_bitmap_sync(void)
     static int64_t num_dirty_pages_period;
     int64_t end_time;
     int64_t bytes_xfer_now;
+    static uint64_t xbzrle_cache_miss_prev;
+    static uint64_t iterations_prev;
 
     bitmap_sync_count++;
 
@@ -531,6 +539,16 @@ static void migration_bitmap_sync(void)
         } else {
              mig_throttle_on = false;
         }
+        if (migrate_use_xbzrle()) {
+            if (iterations_prev != 0) {
+                acct_info.xbzrle_cache_miss_rate =
+                   (double)(acct_info.xbzrle_cache_miss -
+                            xbzrle_cache_miss_prev) /
+                   (acct_info.iterations - iterations_prev);
+            }
+            iterations_prev = acct_info.iterations;
+            xbzrle_cache_miss_prev = acct_info.xbzrle_cache_miss;
+        }
         s->dirty_pages_rate = num_dirty_pages_period * 1000
             / (end_time - start_time);
         s->dirty_bytes_rate = s->dirty_pages_rate * TARGET_PAGE_SIZE;
diff --git a/hmp.c b/hmp.c
index 77a8d18..18a850d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -214,6 +214,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->xbzrle_cache->pages);
         monitor_printf(mon, "xbzrle cache miss: %" PRIu64 "\n",
                        info->xbzrle_cache->cache_miss);
+        monitor_printf(mon, "xbzrle cache miss rate: %0.2f\n",
+                       info->xbzrle_cache->cache_miss_rate);
         monitor_printf(mon, "xbzrle overflow : %" PRIu64 "\n",
                        info->xbzrle_cache->overflow);
     }
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 9e62b99..430e48d 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -126,6 +126,7 @@ uint64_t xbzrle_mig_bytes_transferred(void);
 uint64_t xbzrle_mig_pages_transferred(void);
 uint64_t xbzrle_mig_pages_overflow(void);
 uint64_t xbzrle_mig_pages_cache_miss(void);
+double xbzrle_mig_cache_miss_rate(void);
 
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
diff --git a/migration.c b/migration.c
index 816f5fb..44b6ca2 100644
--- a/migration.c
+++ b/migration.c
@@ -174,6 +174,7 @@ static void get_xbzrle_cache_stats(MigrationInfo *info)
         info->xbzrle_cache->bytes = xbzrle_mig_bytes_transferred();
         info->xbzrle_cache->pages = xbzrle_mig_pages_transferred();
         info->xbzrle_cache->cache_miss = xbzrle_mig_pages_cache_miss();
+        info->xbzrle_cache->cache_miss_rate = xbzrle_mig_cache_miss_rate();
         info->xbzrle_cache->overflow = xbzrle_mig_pages_overflow();
     }
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 8a60a6e..8450275 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -674,13 +674,16 @@
 #
 # @cache-miss: number of cache miss
 #
+# @cache-miss-rate: rate of cache miss (since 2.1)
+#
 # @overflow: number of overflows
 #
 # Since: 1.2
 ##
 { 'type': 'XBZRLECacheStats',
   'data': {'cache-size': 'int', 'bytes': 'int', 'pages': 'int',
-           'cache-miss': 'int', 'overflow': 'int' } }
+           'cache-miss': 'int', 'cache-miss-rate': 'number',
+           'overflow': 'int' } }
 
 ##
 # @MigrationInfo
diff --git a/qmp-commands.hx b/qmp-commands.hx
index aadcd04..f437937 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2979,6 +2979,7 @@ The main json-object contains the following:
          - "bytes": number of bytes transferred for XBZRLE compressed pages
          - "pages": number of XBZRLE compressed pages
          - "cache-miss": number of XBRZRLE page cache misses
+         - "cache-miss-rate": rate of XBRZRLE page cache misses
          - "overflow": number of times XBZRLE overflows.  This means
            that the XBZRLE encoding was bigger than just sent the
            whole page, and then we sent the whole page instead (as as
@@ -3087,6 +3088,7 @@ Examples:
             "bytes":20971520,
             "pages":2444343,
             "cache-miss":2244,
+            "cache-miss-rate":0.123,
             "overflow":34434
          }
       }
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (3 preceding siblings ...)
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 04/10] migration: expose xbzrle cache miss rate arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-05-01 14:24   ` Eric Blake
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function arei.gonglei
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Avoid hot pages in the cache being replaced by other pages; this
remarkably decreases cache misses.

Sample results with the test program (quoted from xbzrle.txt) run in
the VM, with a migration bandwidth of 1 GbE and an xbzrle cache size
of 8 MB:

the test program:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    /* 16 MiB buffer; each pass dirties 4 bytes in every 4 KiB page */
    char *buf = (char *) calloc(4096, 4096);
    while (1) {
        int i;
        for (i = 0; i < 4096 * 4; i++) {
            buf[i * 4096 / 4]++;
        }
        printf(".");
    }
}

before this patch:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":1020,"xbzrle-cache":{"bytes":1108284,
"cache-size":8388608,"cache-miss-rate":0.987013,"pages":18297,"overflow":8,
"cache-miss":1228737},"status":"active","setup-time":10,"total-time":52398,
"ram":{"total":12466991104,"remaining":1695744,"mbps":935.559472,
"transferred":5780760580,"dirty-sync-counter":271,"duplicate":2878530,
"dirty-pages-rate":29130,"skipped":0,"normal-bytes":5748592640,
"normal":1403465}},"id":"libvirt-706"}

18k pages were sent compressed in 52 seconds.
The cache-miss-rate is 98.7%: the cache misses almost totally.

after optimizing:
virsh qemu-monitor-command test_vm '{"execute": "query-migrate"}'
{"return":{"expected-downtime":2054,"xbzrle-cache":{"bytes":5066763,
"cache-size":8388608,"cache-miss-rate":0.485924,"pages":194823,"overflow":0,
"cache-miss":210653},"status":"active","setup-time":11,"total-time":18729,
"ram":{"total":12466991104,"remaining":3895296,"mbps":937.663549,
"transferred":1615042219,"dirty-sync-counter":98,"duplicate":2869840,
"dirty-pages-rate":58781,"skipped":0,"normal-bytes":1588404224,
"normal":387794}},"id":"libvirt-266"}

194k pages were sent compressed in 18 seconds.
The cache-miss-rate decreased to 48.59%.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 arch_init.c                    |  8 +++++---
 docs/xbzrle.txt                |  8 ++++++++
 include/migration/page_cache.h | 10 +++++++---
 page_cache.c                   | 23 +++++++++++++++++++----
 4 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 15ca4c0..84a4bd3 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -343,7 +343,8 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
 
     /* We don't care if this fails to allocate a new cache page
      * as long as it updated an old one */
-    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE);
+    cache_insert(XBZRLE.cache, current_addr, ZERO_TARGET_PAGE,
+                 bitmap_sync_count);
 }
 
 #define ENCODING_FLAG_XBZRLE 0x1
@@ -355,10 +356,11 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
     int encoded_len = 0, bytes_sent = -1;
     uint8_t *prev_cached_page;
 
-    if (!cache_is_cached(XBZRLE.cache, current_addr)) {
+    if (!cache_is_cached(XBZRLE.cache, current_addr, bitmap_sync_count)) {
         acct_info.xbzrle_cache_miss++;
         if (!last_stage) {
-            if (cache_insert(XBZRLE.cache, current_addr, *current_data) == -1) {
+            if (cache_insert(XBZRLE.cache, current_addr, *current_data,
+                             bitmap_sync_count) == -1) {
                 return -1;
             } else {
                 /* update *current_data when the page has been
diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
index cc3a26a..52c8511 100644
--- a/docs/xbzrle.txt
+++ b/docs/xbzrle.txt
@@ -71,6 +71,14 @@ encoded buffer:
 encoded length 24
 e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69
 
+Cache update strategy
+=====================
+Keeping the hot pages in the cache is effective for decreased cache
+misses. XBZRLE uses a counter as the age of each page. The counter will
+increase after each ram dirty bitmap sync. When a cache conflict is
+detected, XBZRLE will only evict pages in the cache that are older than
+a threshold.
+
 Usage
 ======================
 1. Verify the destination QEMU version is able to decode the new format.
diff --git a/include/migration/page_cache.h b/include/migration/page_cache.h
index 2d5ce2d..10ed532 100644
--- a/include/migration/page_cache.h
+++ b/include/migration/page_cache.h
@@ -43,8 +43,10 @@ void cache_fini(PageCache *cache);
  *
  * @cache pointer to the PageCache struct
  * @addr: page addr
+ * @current_age: current bitmap generation
  */
-bool cache_is_cached(const PageCache *cache, uint64_t addr);
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+                     uint64_t current_age);
 
 /**
  * get_cached_data: Get the data cached for an addr
@@ -60,13 +62,15 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr);
  * cache_insert: insert the page into the cache. the page cache
  * will dup the data on insert. the previous value will be overwritten
  *
- * Returns -1 on error
+ * Returns -1 when the page isn't inserted into cache
  *
  * @cache pointer to the PageCache struct
  * @addr: page address
  * @pdata: pointer to the page
+ * @current_age: current bitmap generation
  */
-int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata);
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
+                 uint64_t current_age);
 
 /**
  * cache_resize: resize the page cache. In case of size reduction the extra
diff --git a/page_cache.c b/page_cache.c
index b033681..78f7590 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -33,6 +33,9 @@
     do { } while (0)
 #endif
 
+/* the page in cache will not be replaced in two cycles */
+#define CACHED_PAGE_LIFETIME 2
+
 typedef struct CacheItem CacheItem;
 
 struct CacheItem {
@@ -121,7 +124,8 @@ static size_t cache_get_cache_pos(const PageCache *cache,
     return pos;
 }
 
-bool cache_is_cached(const PageCache *cache, uint64_t addr)
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+                     uint64_t current_age)
 {
     size_t pos;
 
@@ -130,7 +134,12 @@ bool cache_is_cached(const PageCache *cache, uint64_t addr)
 
     pos = cache_get_cache_pos(cache, addr);
 
-    return (cache->page_cache[pos].it_addr == addr);
+    if (cache->page_cache[pos].it_addr == addr) {
+        /* update the it_age when the cache hit */
+        cache->page_cache[pos].it_age = current_age;
+        return true;
+    }
+    return false;
 }
 
 static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
@@ -150,7 +159,8 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
     return cache_get_by_addr(cache, addr)->it_data;
 }
 
-int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
+int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
+                 uint64_t current_age)
 {
 
     CacheItem *it = NULL;
@@ -161,6 +171,11 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
     /* actual update of entry */
     it = cache_get_by_addr(cache, addr);
 
+    if (it->it_data && it->it_addr != addr &&
+        it->it_age + CACHED_PAGE_LIFETIME > current_age) {
+        /* the cache page is fresh, don't replace it */
+        return -1;
+    }
     /* allocate page */
     if (!it->it_data) {
         it->it_data = g_try_malloc(cache->page_size);
@@ -173,7 +188,7 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata)
 
     memcpy(it->it_data, pdata, cache->page_size);
 
-    it->it_age = ++cache->max_item_age;
+    it->it_age = current_age;
     it->it_addr = addr;
 
     return 0;
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (4 preceding siblings ...)
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-05-01 14:29   ` Eric Blake
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly arei.gonglei
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Rebuild the cache_is_cached function on top of cache_get_by_addr, and
drop the asserts, because the caller already asserts the same thing.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 page_cache.c | 38 ++++++++++++++++----------------------
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/page_cache.c b/page_cache.c
index 78f7590..2626d3d 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -124,24 +124,6 @@ static size_t cache_get_cache_pos(const PageCache *cache,
     return pos;
 }
 
-bool cache_is_cached(const PageCache *cache, uint64_t addr,
-                     uint64_t current_age)
-{
-    size_t pos;
-
-    g_assert(cache);
-    g_assert(cache->page_cache);
-
-    pos = cache_get_cache_pos(cache, addr);
-
-    if (cache->page_cache[pos].it_addr == addr) {
-        /* update the it_age when the cache hit */
-        cache->page_cache[pos].it_age = current_age;
-        return true;
-    }
-    return false;
-}
-
 static CacheItem *cache_get_by_addr(const PageCache *cache, uint64_t addr)
 {
     size_t pos;
@@ -159,14 +141,26 @@ uint8_t *get_cached_data(const PageCache *cache, uint64_t addr)
     return cache_get_by_addr(cache, addr)->it_data;
 }
 
+bool cache_is_cached(const PageCache *cache, uint64_t addr,
+                     uint64_t current_age)
+{
+    CacheItem *it;
+
+    it = cache_get_by_addr(cache, addr);
+
+    if (it->it_addr == addr) {
+        /* update the it_age when the cache hit */
+        it->it_age = current_age;
+        return true;
+    }
+    return false;
+}
+
 int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
                  uint64_t current_age)
 {
 
-    CacheItem *it = NULL;
-
-    g_assert(cache);
-    g_assert(cache->page_cache);
+    CacheItem *it;
 
     /* actual update of entry */
     it = cache_get_by_addr(cache, addr);
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (5 preceding siblings ...)
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function arei.gonglei
@ 2014-04-04  9:57 ` arei.gonglei
  2014-04-04 19:50   ` Dr. David Alan Gilbert
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene arei.gonglei
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

xbzrle_encode_buffer checks the values in VM RAM repeatedly. That is
risky when xbzrle_encode_buffer runs on data that is changing under
it, and the repeated checks are unnecessary.
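For reference, the bit trick the patch leans on: in
(xor - mask) & ~xor & (mask << 7), the result is nonzero iff some byte
of xor is zero, i.e. iff old and new agree in some byte of the current
long. The patch then scans the bytes of the already computed xor value
instead of re-reading guest RAM. A stand-alone demo (assuming a 64-bit
little-endian host, which the byte-indexing of xor relies on):

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* old and new differ in bytes 0..2 and agree from byte 3 on */
    unsigned char old_buf[8] = {1, 2, 3, 9, 9, 9, 9, 9};
    unsigned char new_buf[8] = {4, 5, 6, 9, 9, 9, 9, 9};
    unsigned long a, b, xor, mask = 0x0101010101010101UL;
    unsigned char *xor_ptr = (unsigned char *)&xor;
    size_t j;

    memcpy(&a, old_buf, sizeof(a));
    memcpy(&b, new_buf, sizeof(b));
    xor = a ^ b;

    /* nonzero iff some byte of xor is zero (an equal byte pair) */
    if ((xor - mask) & ~xor & (mask << 7)) {
        /* find the first zero byte of xor: no second read of the
         * (possibly changing) guest buffer is needed */
        for (j = 0; j < sizeof(unsigned long); j++) {
            if (xor_ptr[j] == 0) {
                break;
            }
        }
        printf("nzrun ends after %zu bytes\n", j);  /* prints 3 */
    }
    return 0;
}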

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 xbzrle.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/xbzrle.c b/xbzrle.c
index fbcb35d..92cccd7 100644
--- a/xbzrle.c
+++ b/xbzrle.c
@@ -27,9 +27,10 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
                          uint8_t *dst, int dlen)
 {
     uint32_t zrun_len = 0, nzrun_len = 0;
-    int d = 0, i = 0;
+    int d = 0, i = 0, j;
     long res, xor;
     uint8_t *nzrun_start = NULL;
+    uint8_t *xor_ptr = (uint8_t *)(&xor);
 
     g_assert(!(((uintptr_t)old_buf | (uintptr_t)new_buf | slen) %
                sizeof(long)));
@@ -82,6 +83,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
         if (d + 2 > dlen) {
             return -1;
         }
+        i++;
+        nzrun_len++;
         /* not aligned to sizeof(long) */
         res = (slen - i) % sizeof(long);
         while (res && old_buf[i] != new_buf[i]) {
@@ -98,11 +101,16 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
                 xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
                 if ((xor - mask) & ~xor & (mask << 7)) {
                     /* found the end of an nzrun within the current long */
-                    while (old_buf[i] != new_buf[i]) {
-                        nzrun_len++;
-                        i++;
+                    for (j = 0; j < sizeof(long); j++) {
+                        if (0 == xor_ptr[j]) {
+                            break;
+                        }
+                    }
+                    i += j;
+                    nzrun_len += j;
+                    if (j != sizeof(long)) {
+                        break;
                     }
-                    break;
                 } else {
                     i += sizeof(long);
                     nzrun_len += sizeof(long);
@@ -118,6 +126,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
         memcpy(dst + d, nzrun_start, nzrun_len);
         d += nzrun_len;
         nzrun_len = 0;
+        i++;
+        zrun_len++;
     }
 
     return d;
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (6 preceding siblings ...)
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly arei.gonglei
@ 2014-04-04  9:58 ` arei.gonglei
  2014-04-04 19:52   ` Dr. David Alan Gilbert
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy arei.gonglei
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

The logic of the old code is correct, but checking byte by byte wastes
time once a concurrent modification has been detected.
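A minimal stand-alone sketch of the scan structure the patch switches
to (assuming, as the real code asserts, that slen is a multiple of
sizeof(long); the function name is illustrative): longs are compared a
word at a time, and the byte-by-byte fallback is confined to the one
word where the longs differed.

#include <stdio.h>
#include <string.h>

static size_t zrun_len(const unsigned char *old_buf,
                       const unsigned char *new_buf, size_t slen)
{
    size_t i = 0, j;
    long a, b;

    while (i < slen) {
        memcpy(&a, old_buf + i, sizeof(long));
        memcpy(&b, new_buf + i, sizeof(long));
        if (a == b) {
            i += sizeof(long);
            continue;
        }
        /* go over the rest of this word only */
        for (j = 0; j < sizeof(long) && old_buf[i] == new_buf[i]; j++) {
            i++;
        }
        if (j != sizeof(long)) {
            break;  /* genuine mismatch found at old_buf[i] */
        }
        /* all bytes matched after all (data changed under us): go on */
    }
    return i;
}

int main(void)
{
    unsigned char o[16] = {0}, n[16] = {0};
    n[11] = 1;  /* first difference at byte 11 */
    printf("zero run: %zu bytes\n", zrun_len(o, n, sizeof(o)));
    return 0;
}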

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 xbzrle.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/xbzrle.c b/xbzrle.c
index 92cccd7..9d67309 100644
--- a/xbzrle.c
+++ b/xbzrle.c
@@ -51,16 +51,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
 
         /* word at a time for speed */
         if (!res) {
-            while (i < slen &&
-                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
-                i += sizeof(long);
-                zrun_len += sizeof(long);
-            }
-
-            /* go over the rest */
-            while (i < slen && old_buf[i] == new_buf[i]) {
-                zrun_len++;
-                i++;
+            while (i < slen) {
+                if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
+                    i += sizeof(long);
+                    zrun_len += sizeof(long);
+                } else {
+                    /* go over the rest */
+                    for (j = 0; j < sizeof(long); j++) {
+                        if (old_buf[i] == new_buf[i]) {
+                            i++;
+                            zrun_len++;
+                        } else {
+                            break;
+                        }
+                    }
+                    if (j != sizeof(long)) {
+                        break;
+                    }
+                }
             }
         }
 
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (7 preceding siblings ...)
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene arei.gonglei
@ 2014-04-04  9:58 ` arei.gonglei
  2014-04-04 19:59   ` Dr. David Alan Gilbert
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 10/10] migration: clear the dead code arei.gonglei
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Reducing data copying reduces CPU overhead.
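The subtle part is not the removed memcpy itself but cache
consistency: encoding now reads straight from guest RAM, which can
change mid-encode, so the cache is updated by decoding the delta that
was actually sent rather than by copying guest RAM again. A toy demo
of that dataflow, with a trivial XOR delta standing in for the real
XBZRLE codec:

#include <stdint.h>
#include <stdio.h>

#define N 8

/* stand-ins for xbzrle_encode_buffer/xbzrle_decode_buffer */
static void encode(const uint8_t *old, const uint8_t *cur, uint8_t *delta)
{
    for (int i = 0; i < N; i++) {
        delta[i] = old[i] ^ cur[i];
    }
}

static void decode(const uint8_t *delta, uint8_t *page)
{
    for (int i = 0; i < N; i++) {
        page[i] ^= delta[i];
    }
}

int main(void)
{
    uint8_t cache[N] = {0};
    uint8_t guest[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint8_t delta[N];

    /* encode straight from guest RAM (no staging copy)... */
    encode(cache, guest, delta);
    guest[0] = 99;  /* the guest dirties the page mid-migration */

    /* ...then update the cache from the delta that was actually sent,
     * so it matches what the destination reconstructs */
    decode(delta, cache);

    printf("cache[0] = %d (destination also sees 1, not 99)\n", cache[0]);
    return 0;
}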

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 arch_init.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 84a4bd3..94b62e2 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -373,11 +373,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
 
     prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
 
-    /* save current buffer into memory */
-    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
-
     /* XBZRLE encoding (if there is no overflow) */
-    encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
+    encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
                                        TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
                                        TARGET_PAGE_SIZE);
     if (encoded_len == 0) {
@@ -396,7 +393,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
 
     /* we need to update the data in the cache, in order to get the same data */
     if (!last_stage) {
-        memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
+        xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
+                             TARGET_PAGE_SIZE);
     }
 
     /* Send XBZRLE based compressed page */
-- 
1.7.12.4

* [Qemu-devel] [PATCH v5 10/10] migration: clear the dead code
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (8 preceding siblings ...)
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy arei.gonglei
@ 2014-04-04  9:58 ` arei.gonglei
  2014-04-04 17:16 ` [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue Dr. David Alan Gilbert
  2014-04-04 17:24 ` [Qemu-devel] For 2.0? " Eric Blake
  11 siblings, 0 replies; 20+ messages in thread
From: arei.gonglei @ 2014-04-04  9:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, Gonglei,
	pbonzini

From: ChenLiang <chenliang88@huawei.com>

Remove the dead code.

Signed-off-by: ChenLiang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c  | 13 -------------
 page_cache.c | 58 ----------------------------------------------------------
 2 files changed, 71 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 94b62e2..db38c9a 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -164,14 +164,11 @@ static inline bool is_zero_range(uint8_t *p, uint64_t size)
 static struct {
     /* buffer used for XBZRLE encoding */
     uint8_t *encoded_buf;
-    /* buffer for storing page content */
-    uint8_t *current_buf;
     /* Cache for XBZRLE, Protected by lock. */
     PageCache *cache;
     QemuMutex lock;
 } XBZRLE = {
     .encoded_buf = NULL,
-    .current_buf = NULL,
     .cache = NULL,
 };
 /* buffer used for XBZRLE decoding */
@@ -723,10 +720,8 @@ static void migration_end(void)
         cache_fini(XBZRLE.cache);
         g_free(XBZRLE.cache);
         g_free(XBZRLE.encoded_buf);
-        g_free(XBZRLE.current_buf);
         XBZRLE.cache = NULL;
         XBZRLE.encoded_buf = NULL;
-        XBZRLE.current_buf = NULL;
     }
     XBZRLE_cache_unlock();
 }
@@ -779,14 +774,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
             return -1;
         }
 
-        XBZRLE.current_buf = g_try_malloc(TARGET_PAGE_SIZE);
-        if (!XBZRLE.current_buf) {
-            DPRINTF("Error allocating current_buf\n");
-            g_free(XBZRLE.encoded_buf);
-            XBZRLE.encoded_buf = NULL;
-            return -1;
-        }
-
         acct_clear();
     }
 
diff --git a/page_cache.c b/page_cache.c
index 2626d3d..978dc1e 100644
--- a/page_cache.c
+++ b/page_cache.c
@@ -48,7 +48,6 @@ struct PageCache {
     CacheItem *page_cache;
     unsigned int page_size;
     int64_t max_num_items;
-    uint64_t max_item_age;
     int64_t num_items;
 };
 
@@ -76,7 +75,6 @@ PageCache *cache_init(int64_t num_pages, unsigned int page_size)
     }
     cache->page_size = page_size;
     cache->num_items = 0;
-    cache->max_item_age = 0;
     cache->max_num_items = num_pages;
 
     DPRINTF("Setting cache buckets to %" PRId64 "\n", cache->max_num_items);
@@ -187,59 +185,3 @@ int cache_insert(PageCache *cache, uint64_t addr, const uint8_t *pdata,
 
     return 0;
 }
-
-int64_t cache_resize(PageCache *cache, int64_t new_num_pages)
-{
-    PageCache *new_cache;
-    int64_t i;
-
-    CacheItem *old_it, *new_it;
-
-    g_assert(cache);
-
-    /* cache was not inited */
-    if (cache->page_cache == NULL) {
-        return -1;
-    }
-
-    /* same size */
-    if (pow2floor(new_num_pages) == cache->max_num_items) {
-        return cache->max_num_items;
-    }
-
-    new_cache = cache_init(new_num_pages, cache->page_size);
-    if (!(new_cache)) {
-        DPRINTF("Error creating new cache\n");
-        return -1;
-    }
-
-    /* move all data from old cache */
-    for (i = 0; i < cache->max_num_items; i++) {
-        old_it = &cache->page_cache[i];
-        if (old_it->it_addr != -1) {
-            /* check for collision, if there is, keep MRU page */
-            new_it = cache_get_by_addr(new_cache, old_it->it_addr);
-            if (new_it->it_data && new_it->it_age >= old_it->it_age) {
-                /* keep the MRU page */
-                g_free(old_it->it_data);
-            } else {
-                if (!new_it->it_data) {
-                    new_cache->num_items++;
-                }
-                g_free(new_it->it_data);
-                new_it->it_data = old_it->it_data;
-                new_it->it_age = old_it->it_age;
-                new_it->it_addr = old_it->it_addr;
-            }
-        }
-    }
-
-    g_free(cache->page_cache);
-    cache->page_cache = new_cache->page_cache;
-    cache->max_num_items = new_cache->max_num_items;
-    cache->num_items = new_cache->num_items;
-
-    g_free(new_cache);
-
-    return cache->max_num_items;
-}
-- 
1.7.12.4

* Re: [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (9 preceding siblings ...)
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 10/10] migration: clear the dead code arei.gonglei
@ 2014-04-04 17:16 ` Dr. David Alan Gilbert
  2014-04-04 17:24 ` [Qemu-devel] For 2.0? " Eric Blake
  11 siblings, 0 replies; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-04 17:16 UTC (permalink / raw)
  To: arei.gonglei; +Cc: ChenLiang, weidong.huang, quintela, qemu-devel, pbonzini

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Changes v4 -> v5:
> * Fix two issues: one is that cache_insert did not update a page
>   that was already in the cache; the other avoids the risk of running
>   xbzrle_encode_buffer on data that is changing concurrently.
> 

I've just been running this, and yes it does seem to be stable
now with this set (built on top of 2.0.0-rc1).
Thanks for the fixes!

Dave

> 
> a. Optimizing the xbzrle cache remarkably decreases cache misses;
>     the efficiency of compression increases more than fifty-fold.
>     Before this patch set, the cache missed almost totally whenever
>     the number of cache items was smaller than the number of dirty
>     pages. Now the hot pages in the cache are no longer replaced by
>     other pages.
> 
> b. Reduce data copying.
> 
> c. Fix one corruption issue.
> 
> ChenLiang (10):
>   XBZRLE: Fix one XBZRLE corruption issue
>   migration: Add counts of updating the dirty bitmap
>   migration: expose the bitmap_sync_count to the end user
>   migration: expose xbzrle cache miss rate
>   XBZRLE: optimize XBZRLE to decrease the cache misses
>   XBZRLE: rebuild the cache_is_cached function
>   xbzrle: don't check the value in the vm ram repeatedly
>   xbzrle: check 8 bytes at a time after a concurrency scene
>   migration: optimize xbzrle by reducing data copy
>   migration: clear the dead code
> 
>  arch_init.c                    |  74 +++++++++++++++++-------------
>  docs/xbzrle.txt                |   8 ++++
>  hmp.c                          |   4 ++
>  include/migration/migration.h  |   2 +
>  include/migration/page_cache.h |  10 ++--
>  migration.c                    |   3 ++
>  page_cache.c                   | 101 +++++++++++------------------------------
>  qapi-schema.json               |   9 +++-
>  qmp-commands.hx                |  15 ++++--
>  xbzrle.c                       |  48 ++++++++++++++------
>  10 files changed, 144 insertions(+), 130 deletions(-)
> 
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* [Qemu-devel] For 2.0? Re: [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue
  2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
                   ` (10 preceding siblings ...)
  2014-04-04 17:16 ` [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue Dr. David Alan Gilbert
@ 2014-04-04 17:24 ` Eric Blake
  2014-04-04 17:40   ` Dr. David Alan Gilbert
  11 siblings, 1 reply; 20+ messages in thread
From: Eric Blake @ 2014-04-04 17:24 UTC (permalink / raw)
  To: arei.gonglei, qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

On 04/04/2014 03:57 AM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Changes v4 -> v5:
> * Fix two issues: one is that cache_insert did not update a page
>   that was already in the cache; the other avoids the risk of running
>   xbzrle_encode_buffer on data that is changing concurrently.

Is all or part of this series essential to have in 2.0 to avoid having a
regression?

> 
> 
> a. Optimizing the xbzrle cache remarkably decreases cache misses;
>     the efficiency of compression increases more than fifty-fold.
>     Before this patch set, the cache missed almost totally whenever
>     the number of cache items was smaller than the number of dirty
>     pages. Now the hot pages in the cache are no longer replaced by
>     other pages.
> 
> b. Reduce data copying.
> 
> c. Fix one corruption issue.
> 
> ChenLiang (10):
>   XBZRLE: Fix one XBZRLE corruption issue

Based on the name, this patch is worth including in the release, if it
is not too late.

>   migration: Add counts of updating the dirty bitmap
>   migration: expose the bitmap_sync_count to the end user
>   migration: expose xbzrle cache miss rate

whereas these names sound like new features, and thus should wait for 2.1.

>   XBZRLE: optimize XBZRLE to decrease the cache misses
>   XBZRLE: rebuild the cache_is_cached function
>   xbzrle: don't check the value in the vm ram repeatedly
>   xbzrle: check 8 bytes at a time after a concurrency scene
>   migration: optimize xbzrle by reducing data copy
>   migration: clear the dead code
> 
>  arch_init.c                    |  74 +++++++++++++++++-------------
>  docs/xbzrle.txt                |   8 ++++
>  hmp.c                          |   4 ++
>  include/migration/migration.h  |   2 +
>  include/migration/page_cache.h |  10 ++--
>  migration.c                    |   3 ++
>  page_cache.c                   | 101 +++++++++++------------------------------
>  qapi-schema.json               |   9 +++-
>  qmp-commands.hx                |  15 ++++--
>  xbzrle.c                       |  48 ++++++++++++++------
>  10 files changed, 144 insertions(+), 130 deletions(-)
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] For 2.0? Re: [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue
  2014-04-04 17:24 ` [Qemu-devel] For 2.0? " Eric Blake
@ 2014-04-04 17:40   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-04 17:40 UTC (permalink / raw)
  To: Eric Blake
  Cc: ChenLiang, weidong.huang, quintela, qemu-devel, arei.gonglei, pbonzini

* Eric Blake (eblake@redhat.com) wrote:
> On 04/04/2014 03:57 AM, arei.gonglei@huawei.com wrote:
> > From: ChenLiang <chenliang88@huawei.com>
> > 
> > Changes v4 -> v5:
> > * Fix two issues: one is that cache_insert did not update a page
> >   that was already in the cache; the other avoids the risk of running
> >   xbzrle_encode_buffer on data that is changing concurrently.
> 
> Is all or part of this series essential to have in 2.0 to avoid having a
> regression?

The first one (Fix one XBZRLE corruption issue) is probably worth it;
however, my understanding is that it can only cause this corruption
when the source qemu runs out of RAM, and until a few months ago the
source qemu would simply abort at that point, so it wasn't pleasant
then either.

Everything else is an improvement.

Dave

> > 
> > 
> > a. Optimizing the xbzrle cache remarkably decreases cache misses;
> >     the efficiency of compression increases more than fifty-fold.
> >     Before this patch set, the cache missed almost totally whenever
> >     the number of cache items was smaller than the number of dirty
> >     pages. Now the hot pages in the cache are no longer replaced by
> >     other pages.
> > 
> > b. Reduce data copying.
> > 
> > c. Fix one corruption issue.
> > 
> > ChenLiang (10):
> >   XBZRLE: Fix one XBZRLE corruption issue
> 
> Based on name, this patch is worth including in the release, if it is
> not too late.
> 
> >   migration: Add counts of updating the dirty bitmap
> >   migration: expose the bitmap_sync_count to the end user
> >   migration: expose xbzrle cache miss rate
> 
> whereas these names sound like new features, and thus should wait for 2.1.
> 
> >   XBZRLE: optimize XBZRLE to decrease the cache misses
> >   XBZRLE: rebuild the cache_is_cached function
> >   xbzrle: don't check the value in the vm ram repeatedly
> >   xbzrle: check 8 bytes at a time after a concurrency scene
> >   migration: optimize xbzrle by reducing data copy
> >   migration: clear the dead code
> > 
> >  arch_init.c                    |  74 +++++++++++++++++-------------
> >  docs/xbzrle.txt                |   8 ++++
> >  hmp.c                          |   4 ++
> >  include/migration/migration.h  |   2 +
> >  include/migration/page_cache.h |  10 ++--
> >  migration.c                    |   3 ++
> >  page_cache.c                   | 101 +++++++++++------------------------------
> >  qapi-schema.json               |   9 +++-
> >  qmp-commands.hx                |  15 ++++--
> >  xbzrle.c                       |  48 ++++++++++++++------
> >  10 files changed, 144 insertions(+), 130 deletions(-)
> > 
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly arei.gonglei
@ 2014-04-04 19:50   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-04 19:50 UTC (permalink / raw)
  To: arei.gonglei
  Cc: ChenLiang, weidong.huang, quintela, qemu-devel, dgilbert, pbonzini

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> xbzrle_encode_buffer checks the values in VM RAM repeatedly. That is
> risky when xbzrle_encode_buffer runs on data that is changing under
> it, and the repeated checks are unnecessary.
> 
> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  xbzrle.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/xbzrle.c b/xbzrle.c
> index fbcb35d..92cccd7 100644
> --- a/xbzrle.c
> +++ b/xbzrle.c
> @@ -27,9 +27,10 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>                           uint8_t *dst, int dlen)
>  {
>      uint32_t zrun_len = 0, nzrun_len = 0;
> -    int d = 0, i = 0;
> +    int d = 0, i = 0, j;
>      long res, xor;
>      uint8_t *nzrun_start = NULL;
> +    uint8_t *xor_ptr = (uint8_t *)(&xor);
>  
>      g_assert(!(((uintptr_t)old_buf | (uintptr_t)new_buf | slen) %
>                 sizeof(long)));
> @@ -82,6 +83,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>          if (d + 2 > dlen) {
>              return -1;
>          }
> +        i++;
> +        nzrun_len++;

Yes, I think that's safe - I was checking whether an overflow was
possible, but my reading is that before this 'i++', i can be at most
slen-1, so here it's at most slen and the next loop won't run in that
case.

>          /* not aligned to sizeof(long) */
>          res = (slen - i) % sizeof(long);
>          while (res && old_buf[i] != new_buf[i]) {
> @@ -98,11 +101,16 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>                  xor = *(long *)(old_buf + i) ^ *(long *)(new_buf + i);
>                  if ((xor - mask) & ~xor & (mask << 7)) {
>                      /* found the end of an nzrun within the current long */
> -                    while (old_buf[i] != new_buf[i]) {
> -                        nzrun_len++;
> -                        i++;
> +                    for (j = 0; j < sizeof(long); j++) {
> +                        if (0 == xor_ptr[j]) {
> +                            break;
> +                        }
> +                    }
> +                    i += j;
> +                    nzrun_len += j;

> +                    if (j != sizeof(long)) {
> +                        break;
>                      }
> -                    break;
>                  } else {
>                      i += sizeof(long);
>                      nzrun_len += sizeof(long);
> @@ -118,6 +126,8 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>          memcpy(dst + d, nzrun_start, nzrun_len);
>          d += nzrun_len;
>          nzrun_len = 0;
> +        i++;
> +        zrun_len++;

I think that's also safe: if i were now 'slen' the main loop would exit,
which would mean the last zero run wasn't encoded - but there already seems
to be a check that deliberately leaves the final zero run unencoded.
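
As an aside for anyone reading along: the "(xor - mask) & ~xor & (mask << 7)"
test in the hunk above is the standard has-zero-byte-in-a-word trick (mask
is 0x0101...01 in the surrounding xbzrle.c code), and a zero byte in xor
marks the first position where old_buf and new_buf agree again. A minimal
standalone sketch, assuming an 8-byte long:

#include <stdio.h>

/* Nonzero iff x contains at least one zero byte: subtracting mask turns
 * a 0x00 byte into 0xff (setting its high bit), "& ~x" discards bytes
 * whose high bit was already set in x, and "& (mask << 7)" keeps only
 * the per-byte high bits. */
static int contains_zero_byte(unsigned long x)
{
    const unsigned long mask = (unsigned long)-1 / 0xff;   /* 0x0101..01 */
    return ((x - mask) & ~x & (mask << 7)) != 0;
}

int main(void)
{
    printf("%d\n", contains_zero_byte(0x1122330044556677UL));   /* 1 */
    printf("%d\n", contains_zero_byte(0x1122334455667788UL));   /* 0 */
    return 0;
}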

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

>      }
>  
>      return d;
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene arei.gonglei
@ 2014-04-04 19:52   ` Dr. David Alan Gilbert
  2014-04-05  2:39     ` 陈梁
  0 siblings, 1 reply; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-04 19:52 UTC (permalink / raw)
  To: arei.gonglei; +Cc: ChenLiang, weidong.huang, quintela, qemu-devel, pbonzini

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> The logic of the old code is correct, but checking byte by byte
> wastes time after a concurrency scene.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  xbzrle.c | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/xbzrle.c b/xbzrle.c
> index 92cccd7..9d67309 100644
> --- a/xbzrle.c
> +++ b/xbzrle.c
> @@ -51,16 +51,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>  
>          /* word at a time for speed */
>          if (!res) {
> -            while (i < slen &&
> -                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> -                i += sizeof(long);
> -                zrun_len += sizeof(long);
> -            }
> -
> -            /* go over the rest */
> -            while (i < slen && old_buf[i] == new_buf[i]) {
> -                zrun_len++;
> -                i++;
> +            while (i < slen) {
> +                if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
> +                    i += sizeof(long);
> +                    zrun_len += sizeof(long);
> +                } else {
> +                    /* go over the rest */
> +                    for (j = 0; j < sizeof(long); j++) {
> +                        if (old_buf[i] == new_buf[i]) {
> +                            i++;
> +                            zrun_len++;
> +                        } else {
> +                            break;
> +                        }
> +                    }
> +                    if (j != sizeof(long)) {
> +                        break;

Is it not possible to make this code the same as the other loop? You could
xor in the same way and just change the comparison.
(What do other people think? I was thinking that would be better since it
would be symmetric.)
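
Something like this untested sketch, pulled out as a standalone helper so
the shape is clear (the helper name and signature are mine, not from the
patch):

#include <stdint.h>
#include <stddef.h>

/* Advance over the run of equal bytes starting at *i, which must be
 * aligned to sizeof(long); *i and *zrun_len are bumped together. */
static void skip_zrun_xor(const uint8_t *old_buf, const uint8_t *new_buf,
                          int slen, int *i, uint32_t *zrun_len)
{
    while (*i < slen) {
        long xor = *(const long *)(old_buf + *i) ^
                   *(const long *)(new_buf + *i);
        if (xor == 0) {                     /* whole word unchanged */
            *i += sizeof(long);
            *zrun_len += sizeof(long);
            continue;
        }
        /* the first nonzero byte of xor ends the zero run */
        const uint8_t *xor_ptr = (const uint8_t *)&xor;
        size_t j = 0;
        while (j < sizeof(long) && xor_ptr[j] == 0) {
            j++;
        }
        *i += j;
        *zrun_len += j;
        break;
    }
}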

Dave
> +                    }
> +                }
>              }
>          }
>  
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy
  2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy arei.gonglei
@ 2014-04-04 19:59   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 20+ messages in thread
From: Dr. David Alan Gilbert @ 2014-04-04 19:59 UTC (permalink / raw)
  To: arei.gonglei; +Cc: ChenLiang, weidong.huang, quintela, qemu-devel, pbonzini

* arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Reducing data copies reduces CPU overhead.
> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>

Yes, with the previous patches that's now correct.
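
Spelling out why it is correct now: the guest can dirty the page while it
is being encoded, so afterwards *current_data is not guaranteed to match
the bytes that actually went over the wire. Replaying the delta we sent
into the cached page - exactly what the destination does - keeps both
sides identical. A sketch using the names from the patch:

    /* encode against the cached copy; the guest may be writing to
     * *current_data concurrently */
    encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
                                       TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
                                       TARGET_PAGE_SIZE);

    /* don't memcpy *current_data into the cache - it may no longer be
     * what was encoded; decode what we are sending instead */
    xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
                         TARGET_PAGE_SIZE);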

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  arch_init.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index 84a4bd3..94b62e2 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -373,11 +373,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>  
>      prev_cached_page = get_cached_data(XBZRLE.cache, current_addr);
>  
> -    /* save current buffer into memory */
> -    memcpy(XBZRLE.current_buf, *current_data, TARGET_PAGE_SIZE);
> -
>      /* XBZRLE encoding (if there is no overflow) */
> -    encoded_len = xbzrle_encode_buffer(prev_cached_page, XBZRLE.current_buf,
> +    encoded_len = xbzrle_encode_buffer(prev_cached_page, *current_data,
>                                         TARGET_PAGE_SIZE, XBZRLE.encoded_buf,
>                                         TARGET_PAGE_SIZE);
>      if (encoded_len == 0) {
> @@ -396,7 +393,8 @@ static int save_xbzrle_page(QEMUFile *f, uint8_t **current_data,
>  
>      /* we need to update the data in the cache, in order to get the same data */
>      if (!last_stage) {
> -        memcpy(prev_cached_page, XBZRLE.current_buf, TARGET_PAGE_SIZE);
> +        xbzrle_decode_buffer(XBZRLE.encoded_buf, encoded_len, prev_cached_page,
> +                             TARGET_PAGE_SIZE);
>      }
>  
>      /* Send XBZRLE based compressed page */
> -- 
> 1.7.12.4
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene
  2014-04-04 19:52   ` Dr. David Alan Gilbert
@ 2014-04-05  2:39     ` 陈梁
  0 siblings, 0 replies; 20+ messages in thread
From: 陈梁 @ 2014-04-05  2:39 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: ChenLiang, weidong.huang, arei.gonglei, quintela, qemu-devel,
	陈梁,
	pbonzini


> * arei.gonglei@huawei.com (arei.gonglei@huawei.com) wrote:
>> From: ChenLiang <chenliang88@huawei.com>
>> 
>> The logic of the old code is correct, but checking byte by byte
>> wastes time after a concurrency scene.
>> 
>> Signed-off-by: ChenLiang <chenliang88@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>> xbzrle.c | 28 ++++++++++++++++++----------
>> 1 file changed, 18 insertions(+), 10 deletions(-)
>> 
>> diff --git a/xbzrle.c b/xbzrle.c
>> index 92cccd7..9d67309 100644
>> --- a/xbzrle.c
>> +++ b/xbzrle.c
>> @@ -51,16 +51,24 @@ int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>> 
>>         /* word at a time for speed */
>>         if (!res) {
>> -            while (i < slen &&
>> -                   (*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> -                i += sizeof(long);
>> -                zrun_len += sizeof(long);
>> -            }
>> -
>> -            /* go over the rest */
>> -            while (i < slen && old_buf[i] == new_buf[i]) {
>> -                zrun_len++;
>> -                i++;
>> +            while (i < slen) {
>> +                if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
>> +                    i += sizeof(long);
>> +                    zrun_len += sizeof(long);
>> +                } else {
>> +                    /* go over the rest */
>> +                    for (j = 0; j < sizeof(long); j++) {
>> +                        if (old_buf[i] == new_buf[i]) {
>> +                            i++;
>> +                            zrun_len++;
>> +                        } else {
>> +                            break;
>> +                        }
>> +                    }
>> +                    if (j != sizeof(long)) {
>> +                        break;
> 
> Is it not possible to make this code the same as the other loop? You could
> xor in the same way and just change the comparison.
> (What do other people think? I was thinking that would be better since it
> would be symmetric.)
> 
> Dave
Hi,
yeah, it would be symmetric. The two forms are equivalent:
(*(long *)(old_buf + i)) == (*(long *)(new_buf + i))

!(xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i)))

But the second is less efficient, IMHO, because xbzrle assumes that most
bytes in the dirty page are not modified, so the xor result mostly goes
unused. ps: it's just my idea.
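
For concreteness, in the common (equal) case the second form materialises
a value that is then thrown away:

if ((*(long *)(old_buf + i)) == (*(long *)(new_buf + i))) {
    /* form 1: one compare, nothing else computed */
}

xor = (*(long *)(old_buf + i)) ^ (*(long *)(new_buf + i));
if (!xor) {
    /* form 2: compute the xor, then test it; the xor value is only
     * needed on the (rare) unequal path */
}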

Best regards
Chen Liang
>> +                    }
>> +                }
>>             }
>>         }
>> 
>> -- 
>> 1.7.12.4
>> 
>> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
@ 2014-05-01 14:24   ` Eric Blake
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2014-05-01 14:24 UTC (permalink / raw)
  To: arei.gonglei, qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

On 04/04/2014 03:57 AM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Avoid hot pages being replaced by others, which remarkably decreases
> cache misses.
> 
> Sample results with the test program quoted from xbzrle.txt, run in a
> vm (migration bandwidth: 1GE, xbzrle cache size: 8MB):
> 

> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  arch_init.c                    |  8 +++++---
>  docs/xbzrle.txt                |  8 ++++++++
>  include/migration/page_cache.h | 10 +++++++---
>  page_cache.c                   | 23 +++++++++++++++++++----
>  4 files changed, 39 insertions(+), 10 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function
  2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function arei.gonglei
@ 2014-05-01 14:29   ` Eric Blake
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Blake @ 2014-05-01 14:29 UTC (permalink / raw)
  To: arei.gonglei, qemu-devel
  Cc: ChenLiang, weidong.huang, quintela, dgilbert, owasserm, pbonzini

On 04/04/2014 03:57 AM, arei.gonglei@huawei.com wrote:
> From: ChenLiang <chenliang88@huawei.com>
> 
> Rebuild the cache_is_cached function using cache_get_by_addr, and
> drop the asserts because the caller is also asserting the same
> thing.

Technically, the caller (parent - arch_init.c:save_xbzrle_page) is not
asserting the same thing; rather, it is the callee (child -
cache_get_by_addr() used first in the two methods where you dropped the
asserts) that is making the same assertion.

> 
> Signed-off-by: ChenLiang <chenliang88@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  page_cache.c | 38 ++++++++++++++++----------------------
>  1 file changed, 16 insertions(+), 22 deletions(-)

The commit message can be trivially amended; so
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-05-01 14:29 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-04  9:57 [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue arei.gonglei
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 01/10] XBZRLE: Fix one XBZRLE corruption issue arei.gonglei
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 02/10] migration: Add counts of updating the dirty bitmap arei.gonglei
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 03/10] migration: expose the bitmap_sync_count to the end user arei.gonglei
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 04/10] migration: expose xbzrle cache miss rate arei.gonglei
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 05/10] XBZRLE: optimize XBZRLE to decrease the cache misses arei.gonglei
2014-05-01 14:24   ` Eric Blake
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 06/10] XBZRLE: rebuild the cache_is_cached function arei.gonglei
2014-05-01 14:29   ` Eric Blake
2014-04-04  9:57 ` [Qemu-devel] [PATCH v5 07/10] xbzrle: don't check the value in the vm ram repeatedly arei.gonglei
2014-04-04 19:50   ` Dr. David Alan Gilbert
2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 08/10] xbzrle: check 8 bytes at a time after a concurrency scene arei.gonglei
2014-04-04 19:52   ` Dr. David Alan Gilbert
2014-04-05  2:39     ` 陈梁
2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 09/10] migration: optimize xbzrle by reducing data copy arei.gonglei
2014-04-04 19:59   ` Dr. David Alan Gilbert
2014-04-04  9:58 ` [Qemu-devel] [PATCH v5 10/10] migration: clear the dead code arei.gonglei
2014-04-04 17:16 ` [Qemu-devel] [PATCH v5 00/10] migration: Optimize the xbzrle and fix one corruption issue Dr. David Alan Gilbert
2014-04-04 17:24 ` [Qemu-devel] For 2.0? " Eric Blake
2014-04-04 17:40   ` Dr. David Alan Gilbert
