* [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration
@ 2017-07-10 16:30 Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 01/16] migration: add has_postcopy savevm handler Vladimir Sementsov-Ogievskiy
                   ` (15 more replies)
  0 siblings, 16 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Hi all!

Here is a new version of the dirty bitmap postcopy migration series.

v7

clone: tag postcopy-v7 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v7

- rebased on dirty-bitmap byte-based interfaces
    (based on git://repo.or.cz/qemu/ericb.git branch nbd-byte-dirty-v4)
- migration of persistent bitmaps should fail for shared-storage migration for now,
  as persistent dirty bitmaps are stored/loaded on inactivate/invalidate-cache.
  Also, even for non-shared-storage migration there would be a useless saving of dirty
  bitmaps on the source. This will all be optimized later.

01: stuff from include/migration/vmstate.h moved to include/migration/register.h (rebase)
03: some structural changes due to rebase - drop r-b
04: stuff from include/migration/vmstate.h moved to include/migration/register.h (rebase)
    stuff from include/sysemu/sysemu.h moved to migration/savevm.h (rebase)
05: fix patch header: block -> block/dirty-bitmap
    add locking, drop r-b
06: stuff from include/migration/migration.h moved to migration/migration.h (rebase)
07: add locking, drop r-b
09: stuff from include/migration/qemu-file.h moved to migration/qemu-file.h (rebase)
10: stuff from include/migration/vmstate.h moved to include/migration/register.h (rebase)
11: new patch
12: a lot of changes/fixes (mostly by Fam's comments) + rebase
    header-definition movement
    remove include <assert.h>
    add some includes
    fix/refactor bitmap flags send
    byte-based interface for dirty bitmaps (rebase)
    freeze bitmaps on source
    init_dirty_bitmap_migration can return an error if some of the bitmaps are already
      frozen
    bdrv_ref drives with bitmaps
    fprintf -> error_report
    check version_id in _load function

v6:

clone: tag postcopy-v6 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v6

rebase on master.

03 - tiny contextual change

12 - a small change, but it should be reviewed. The call of init_dirty_bitmap_incoming_migration()
    (which only initializes a mutex) moved from the start of process_incoming_migration_co (it was
    immediately after "mis = migration_incoming_state_new(f)") to migration_incoming_get_current(),
    to stay with the initialization code.
    I removed the r-b's, but hope that this will not be a problem. The only change in this patch
    is the moved call of init_dirty_bitmap_incoming_migration.
    I did so because of the recent

commit b4b076daf324894dd288cbdb67ff1e3c7434df7b
Author: Juan Quintela <quintela@redhat.com>
Date:   Mon Jan 23 22:32:06 2017 +0100

    migration: create Migration Incoming State at init time

15 - add Max's r-b

v5:

clone: tag postcopy-v5 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v5

- move 'signed-off' over 'reviewed-by' in patches.

03,04 - add comments. I hope they are OK for you, so I added Juan's r-b.
    If not, let me know and I'll resend.

06,08,12 - add Max's r-b
07,09,10,11,12 - add Juan's r-b

14 - used the last version of this patch from the qcow2-bitmap series with
     Max's r-b. It has contextual changes due to the different base.

15 - fix 041 iotest, add default node-name only if path is specified and
     node-name is not specified
16 - handle whitespaces
       s/"exec: cat " + fifo/"exec: cat '" + fifo + "'"/
     fix indentation
     add Max's r-b
17 - fix typos, wrong size in comment, s/md5/sha256/
     add Max's r-b

v4:

clone: tag postcopy-v4 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v4

reroll to fix the "failed automatic build test"

rebase on master!

07: since 2.8 -> since 2.9
14: fix build of test-hbitmap
    since 2.8 -> since 2.9 and small rewording of the comment (this change was not done for the
    same patch in my parallel series about the qcow2 bitmap extension)

v3:

rebased on Max's block branch: https://github.com/XanClic/qemu/commits/block
clone: tag postcopy-v3 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=postcopy-v3

01  - r-b by Juan
02  - r-b by Juan and David
04  - fix indent
07  - s/since 2.6/since 2.8/
10  - change variable name s/buf/str/
12  - improve copyright message and move it up
      fix memory loss (thanks to Juan)
      switch from DPRINTF to trace events
14* - switch to sha256 and qcrypto_hash_*
      separate qmp command x-debug-block-dirty-bitmap-sha256
16  - use path_suffix for multi-vm test
      fix indent
      fix copyright
      use x-debug-block-dirty-bitmap-sha256 instead of md5
17  - use x-debug-block-dirty-bitmap-sha256 instead of md5
      remove not existing 170 test from qemu-iotests/group

*Note: patch 14 is also used in my second series 'qcow2 persistent dirty bitmaps'


v2:
some bugs fixed, iotests a bit changed and merged into one test.
based on block-next (https://github.com/XanClic/qemu/commits/block-next)
clone: tag postcopy-v2 from https://src.openvz.org/scm/~vsementsov/qemu.git
online: https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=refs%2Ftags%2Fpostcopy-v2

v1:

This series is derived from my 'Dirty bitmaps migration' series. The
core idea is to switch to postcopy migration and drop the usage of meta
bitmaps.

These patches provide dirty bitmap postcopy migration feature. Only
named dirty bitmaps are to be migrated. Migration may be enabled using
migration capabilities.
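From the patch subjects (06 "qapi: add dirty-bitmaps migration capability" and 08 "migration: include migrate_dirty_bitmaps in migrate_postcopy"), the capability plumbing looks roughly like the sketch below. This is hedged: the helper names are inferred from the patch titles, and the real QEMU code reads the flags from MigrationState->enabled_capabilities rather than plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch only: stand-ins for the capability flags that QEMU keeps in
 * MigrationState->enabled_capabilities. */
static bool cap_postcopy_ram;
static bool cap_dirty_bitmaps;

static bool migrate_postcopy_ram(void)   { return cap_postcopy_ram; }
static bool migrate_dirty_bitmaps(void)  { return cap_dirty_bitmaps; }

/* Postcopy is possible if any postcopy-able entity is enabled, so a
 * bitmap-only postcopy works even with postcopy-ram disabled. */
static bool migrate_postcopy(void)
{
    return migrate_postcopy_ram() || migrate_dirty_bitmaps();
}
```

With only the dirty-bitmaps capability set, migrate_postcopy() is true while migrate_postcopy_ram() stays false, which is exactly the situation the later RAM-vs-common split has to handle.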

The overall method (thanks to John Snow):

1. migrate bitmaps meta data in .save_live_setup
   - create/find related bitmaps on target
   - disable them
   - create successors (anonymous children) only for enabled migrated
     bitmaps
2. do nothing in precopy stage
3. just before target vm start: enable successors, created in (1)
4. migrate bitmap data
5. reclaim bitmaps (merge successors into their parents)
6. enable bitmaps (only those which were enabled on the source)
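The successor dance in steps 1, 3, 5 and 6 can be modelled with a toy bitmap type. This is illustrative only: the struct, field and function names are mine, not QEMU's block/dirty-bitmap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a dirty bitmap with an anonymous successor. */
typedef struct Bitmap {
    uint64_t bits;             /* 64-bit toy dirty bitmap */
    bool enabled;
    struct Bitmap *successor;  /* anonymous child, NULL if none */
} Bitmap;

/* step 1: disable the migrated bitmap and attach a successor
 * (the successor itself is only enabled in step 3) */
static void bitmap_install_successor(Bitmap *b, Bitmap *succ)
{
    b->enabled = false;
    succ->enabled = false;
    b->successor = succ;
}

/* writes land in whichever bitmap is currently recording */
static void bitmap_mark_dirty(Bitmap *b, int bit)
{
    if (b->enabled) {
        b->bits |= 1ULL << bit;
    } else if (b->successor && b->successor->enabled) {
        b->successor->bits |= 1ULL << bit;
    }
}

/* step 5: reclaim = merge the successor back into its parent */
static void bitmap_reclaim(Bitmap *b)
{
    if (b->successor) {
        b->bits |= b->successor->bits;
        b->successor = NULL;
    }
}
```

The point of the scheme: while bitmap data is still in flight the parent is frozen, yet no writes on the target are lost, because they accumulate in the enabled successor and are merged in at reclaim time.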


Some patches are unchanged from (v7) of 'Dirty bitmaps migration'
(DBMv7). I've left the Reviewed-by's for them; if you don't like that, tell me
and I'll drop them in the following version.

So, relative to the last DBMv7:

01-04: new patches, splitting common postcopy migration out of ram
       postcopy migration
   05: equal to DBMv7.05
   06: new
   07: equal to DBMv7.06
   08: new
   09: equal to DBMv7.07
   10: new
   11: derived from DBMv7.08, see below
12-15: equal to DBMv7.09-12
   16: derived from DBMv7.13
       - switch from fifo to socket, as postcopy doesn't work with fifo
         for now
       - change parameters: size, granularity, regions
       - add time.sleep to wait for the postcopy migration phase (a bad
         temporary solution)
       - drop Reviewed-by
   17: new

   11: the core patch of the series, it is derived from
       [DBMv7.08: migration: add migration_block-dirty-bitmap.c]
       There are a lot of changes related to switching from precopy to
       postcopy, but the functions related to the migration stream itself
       (structs, send/load sequences) are mostly unchanged.

       So, the changes to switch from precopy to postcopy:
       - removed all the stuff related to meta bitmaps and the dirty phase!!!
       - add dirty_bitmap_mig_enable_successors, and call it before
         target vm start in loadvm_postcopy_handle_run
       - add enabled_bitmaps list of bitmaps for
         dirty_bitmap_mig_enable_successors

       - the enabled flag is sent with the start bitmap chunk instead of
         the completion chunk
       - sectors_per_chunk is calculated directly from CHUNK_SIZE, not
         using meta bitmap granularity

       - dirty_bitmap_save_iterate: remove dirty_phase, move bulk_phase
         to postcopy stage
       - dirty_bitmap_save_pending: remove dirty phase related pending,
         switch pending to non-postcopyable
       - dirty_bitmap_load_start: get enabled flag and prepare
         successors for enabled bitmaps, also add them to
         enabled_bitmaps list
       - dirty_bitmap_load_complete: for enabled bitmaps: merge them
         with successors and enable

       - savevm handlers:
         * remove the separate savevm_dirty_bitmap_live_iterate_handlers state
           (it was a bad idea anyway), and move its save_live_iterate to
           savevm_dirty_bitmap_handlers
         * add is_active_iterate savevm handler, which allows iterations
           only in postcopy stage (after stopping source vm)
         * add a has_postcopy savevm handler (of course, just returning true)
         * use save_live_complete_postcopy instead of
           save_live_complete_precopy

       Other changes:
       - some debug output changed
       - remove MIN_LIVE_SIZE, is_live_iterative and related stuff (it
         was needed to omit iterations if the bitmap data is small; possibly
         this should be reimplemented)

Vladimir Sementsov-Ogievskiy (16):
  migration: add has_postcopy savevm handler
  migration: fix ram_save_pending
  migration: split common postcopy out of ram postcopy
  migration: introduce postcopy-only pending
  block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()
  qapi: add dirty-bitmaps migration capability
  block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor
  migration: include migrate_dirty_bitmaps in migrate_postcopy
  migration/qemu-file: add qemu_put_counted_string()
  migration: add is_active_iterate handler
  block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen
  migration: add postcopy migration of dirty bitmaps
  iotests: add add_incoming_migration to VM class
  iotests: add default node-name
  iotests: add dirty bitmap migration test
  iotests: add dirty bitmap postcopy test

 block/dirty-bitmap.c           |  43 ++-
 include/block/dirty-bitmap.h   |   4 +
 include/migration/misc.h       |   3 +
 include/migration/register.h   |  19 +-
 migration/Makefile.objs        |   1 +
 migration/block-dirty-bitmap.c | 700 +++++++++++++++++++++++++++++++++++++++++
 migration/block.c              |   7 +-
 migration/migration.c          |  66 ++--
 migration/migration.h          |   6 +
 migration/qemu-file.c          |  13 +
 migration/qemu-file.h          |   2 +
 migration/ram.c                |  19 +-
 migration/savevm.c             |  74 ++++-
 migration/savevm.h             |   5 +-
 migration/trace-events         |  16 +-
 qapi-schema.json               |   4 +-
 tests/qemu-iotests/169         | 140 +++++++++
 tests/qemu-iotests/169.out     |   5 +
 tests/qemu-iotests/group       |   1 +
 tests/qemu-iotests/iotests.py  |  15 +-
 vl.c                           |   1 +
 21 files changed, 1092 insertions(+), 52 deletions(-)
 create mode 100644 migration/block-dirty-bitmap.c
 create mode 100755 tests/qemu-iotests/169
 create mode 100644 tests/qemu-iotests/169.out

-- 
2.11.1


* [Qemu-devel] [PATCH v7 01/16] migration: add has_postcopy savevm handler
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 02/16] migration: fix ram_save_pending Vladimir Sementsov-Ogievskiy
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Now postcopy-able states are recognized by a non-NULL
save_live_complete_postcopy handler. But when we have several different
postcopy-able states, this is not convenient: RAM postcopy may be
disabled while some other postcopy is enabled, and in this case the RAM
state should behave as if it is not postcopy-able.

This patch adds a separate has_postcopy handler to specify the behaviour
of a savevm state.
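In other words, the filter in qemu_savevm_state_iterate() changes from "has a postcopy completion hook" to "reports postcopy support right now". A reduced model of that condition (SaveVMHandlers is cut down to the two fields that matter; the helper names are mine):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two-field stand-in for QEMU's SaveVMHandlers. */
typedef struct {
    void (*save_live_complete_postcopy)(void *opaque);
    bool (*has_postcopy)(void *opaque);
} SaveVMHandlers;

/* before this patch: postcopy-able iff the completion hook exists */
static bool skip_in_postcopy_old(const SaveVMHandlers *ops)
{
    return ops->save_live_complete_postcopy == NULL;
}

/* after this patch: the state must also answer "yes" at runtime */
static bool skip_in_postcopy_new(const SaveVMHandlers *ops, void *opaque)
{
    return !(ops->has_postcopy && ops->has_postcopy(opaque));
}

/* example state: RAM, whose answer depends on the postcopy-ram capability */
static bool ram_postcopy_enabled;
static bool ram_has_postcopy(void *opaque) { (void)opaque; return ram_postcopy_enabled; }
static void ram_complete(void *opaque)     { (void)opaque; }
```

With postcopy-ram disabled, the old check would still treat RAM as postcopy-able (the hook is non-NULL), while the new check correctly skips it.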

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 include/migration/register.h | 1 +
 migration/ram.c              | 6 ++++++
 migration/savevm.c           | 6 ++++--
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index d9498d95eb..b999917c9d 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -24,6 +24,7 @@ typedef struct SaveVMHandlers {
 
     /* This runs both outside and inside the iothread lock.  */
     bool (*is_active)(void *opaque);
+    bool (*has_postcopy)(void *opaque);
 
     /* This runs outside the iothread lock in the migration case, and
      * within the lock in the savevm case.  The callback had better only
diff --git a/migration/ram.c b/migration/ram.c
index 0baa1e0d56..64d006bc8b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2622,10 +2622,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     return ret;
 }
 
+static bool ram_has_postcopy(void *opaque)
+{
+    return migrate_postcopy_ram();
+}
+
 static SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
     .save_live_complete_postcopy = ram_save_complete,
+    .has_postcopy = ram_has_postcopy,
     .save_live_complete_precopy = ram_save_complete,
     .save_live_pending = ram_save_pending,
     .load_state = ram_load,
diff --git a/migration/savevm.c b/migration/savevm.c
index be3f885119..37da83509f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1008,7 +1008,8 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
          * call that's already run, it might get confused if we call
          * iterate afterwards.
          */
-        if (postcopy && !se->ops->save_live_complete_postcopy) {
+        if (postcopy &&
+            !(se->ops->has_postcopy && se->ops->has_postcopy(se->opaque))) {
             continue;
         }
         if (qemu_file_rate_limit(f)) {
@@ -1097,7 +1098,8 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
         if (!se->ops ||
-            (in_postcopy && se->ops->save_live_complete_postcopy) ||
+            (in_postcopy && se->ops->has_postcopy &&
+             se->ops->has_postcopy(se->opaque)) ||
             (in_postcopy && !iterable_only) ||
             !se->ops->save_live_complete_precopy) {
             continue;
-- 
2.11.1


* [Qemu-devel] [PATCH v7 02/16] migration: fix ram_save_pending
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 01/16] migration: add has_postcopy savevm handler Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Fill the postcopy-able pending only if RAM postcopy is enabled.
This is necessary because there will be other postcopy-able states, and
when RAM postcopy is disabled it should not spoil the common
postcopy-related pending.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 64d006bc8b..b43589e3d2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2068,8 +2068,12 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
         remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
     }
 
-    /* We can do postcopy, and all the data is postcopiable */
-    *postcopiable_pending += remaining_size;
+    if (migrate_postcopy_ram()) {
+        /* We can do postcopy, and all the data is postcopiable */
+        *postcopiable_pending += remaining_size;
+    } else {
+        *non_postcopiable_pending += remaining_size;
+    }
 }
 
 static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
-- 
2.11.1


* [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 01/16] migration: add has_postcopy savevm handler Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 02/16] migration: fix ram_save_pending Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-11 13:06   ` Dr. David Alan Gilbert
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 04/16] migration: introduce postcopy-only pending Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Split the common postcopy stuff out of the RAM postcopy stuff.
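One visible effect of the split is that POSTCOPY_ADVISE becomes variable-length (see the mig_cmd_args change and the new comment in the savevm.c hunk): 16 bytes holding two big-endian 64-bit page sizes when RAM postcopy participates, and an empty payload for bitmap-only postcopy. A standalone sketch of that encoding, not QEMU's qemu_put*/qemu_savevm_command_send API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Serialize a 64-bit value big-endian, like cpu_to_be64 + memcpy. */
static void put_be64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        buf[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Returns the POSTCOPY_ADVISE payload length: 16 with RAM postcopy
 * (page-size summary + target page size), 0 without. */
static size_t encode_postcopy_advise(uint8_t *buf, bool postcopy_ram,
                                     uint64_t pagesize_summary,
                                     uint64_t target_page_size)
{
    if (!postcopy_ram) {
        return 0;                 /* bitmap-only postcopy: empty command */
    }
    put_be64(buf, pagesize_summary);
    put_be64(buf + 8, target_page_size);
    return 16;
}
```

The length-0 case is why the command's declared .len changes from 16 to -1 (variable) in mig_cmd_args, and why the comment warns that any new postcopy entity must keep the format self-describing.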

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 migration/migration.c | 39 ++++++++++++++++++++++++++-------------
 migration/migration.h |  2 ++
 migration/savevm.c    | 48 +++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 67 insertions(+), 22 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 51ccd1a4c5..d3a2fd405a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1221,6 +1221,11 @@ bool migrate_postcopy_ram(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_RAM];
 }
 
+bool migrate_postcopy(void)
+{
+    return migrate_postcopy_ram();
+}
+
 bool migrate_auto_converge(void)
 {
     MigrationState *s;
@@ -1577,9 +1582,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
      * need to tell the destination to throw any pages it's already received
      * that are dirty
      */
-    if (ram_postcopy_send_discard_bitmap(ms)) {
-        error_report("postcopy send discard bitmap failed");
-        goto fail;
+    if (migrate_postcopy_ram()) {
+        if (ram_postcopy_send_discard_bitmap(ms)) {
+            error_report("postcopy send discard bitmap failed");
+            goto fail;
+        }
     }
 
     /*
@@ -1588,8 +1595,10 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
      * wrap their state up here
      */
     qemu_file_set_rate_limit(ms->to_dst_file, INT64_MAX);
-    /* Ping just for debugging, helps line traces up */
-    qemu_savevm_send_ping(ms->to_dst_file, 2);
+    if (migrate_postcopy_ram()) {
+        /* Ping just for debugging, helps line traces up */
+        qemu_savevm_send_ping(ms->to_dst_file, 2);
+    }
 
     /*
      * While loading the device state we may trigger page transfer
@@ -1614,7 +1623,9 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
     qemu_savevm_send_postcopy_listen(fb);
 
     qemu_savevm_state_complete_precopy(fb, false, false);
-    qemu_savevm_send_ping(fb, 3);
+    if (migrate_postcopy_ram()) {
+        qemu_savevm_send_ping(fb, 3);
+    }
 
     qemu_savevm_send_postcopy_run(fb);
 
@@ -1649,11 +1660,13 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
 
     qemu_mutex_unlock_iothread();
 
-    /*
-     * Although this ping is just for debug, it could potentially be
-     * used for getting a better measurement of downtime at the source.
-     */
-    qemu_savevm_send_ping(ms->to_dst_file, 4);
+    if (migrate_postcopy_ram()) {
+        /*
+         * Although this ping is just for debug, it could potentially be
+         * used for getting a better measurement of downtime at the source.
+         */
+        qemu_savevm_send_ping(ms->to_dst_file, 4);
+    }
 
     if (migrate_release_ram()) {
         ram_postcopy_migrated_memory_release(ms);
@@ -1831,7 +1844,7 @@ static void *migration_thread(void *opaque)
         qemu_savevm_send_ping(s->to_dst_file, 1);
     }
 
-    if (migrate_postcopy_ram()) {
+    if (migrate_postcopy()) {
         /*
          * Tell the destination that we *might* want to do postcopy later;
          * if the other end can't do postcopy it should fail now, nice and
@@ -1864,7 +1877,7 @@ static void *migration_thread(void *opaque)
             if (pending_size && pending_size >= threshold_size) {
                 /* Still a significant amount to transfer */
 
-                if (migrate_postcopy_ram() &&
+                if (migrate_postcopy() &&
                     s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
                     pend_nonpost <= threshold_size &&
                     atomic_read(&s->start_postcopy)) {
diff --git a/migration/migration.h b/migration/migration.h
index 148c9facbc..1d974bacce 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -165,6 +165,8 @@ bool migration_is_blocked(Error **errp);
 bool migration_in_postcopy(void);
 MigrationState *migrate_get_current(void);
 
+bool migrate_postcopy(void);
+
 bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
diff --git a/migration/savevm.c b/migration/savevm.c
index 37da83509f..cf79e1d3ac 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -89,7 +89,7 @@ static struct mig_cmd_args {
     [MIG_CMD_INVALID]          = { .len = -1, .name = "INVALID" },
     [MIG_CMD_OPEN_RETURN_PATH] = { .len =  0, .name = "OPEN_RETURN_PATH" },
     [MIG_CMD_PING]             = { .len = sizeof(uint32_t), .name = "PING" },
-    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = 16, .name = "POSTCOPY_ADVISE" },
+    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = -1, .name = "POSTCOPY_ADVISE" },
     [MIG_CMD_POSTCOPY_LISTEN]  = { .len =  0, .name = "POSTCOPY_LISTEN" },
     [MIG_CMD_POSTCOPY_RUN]     = { .len =  0, .name = "POSTCOPY_RUN" },
     [MIG_CMD_POSTCOPY_RAM_DISCARD] = {
@@ -98,6 +98,23 @@ static struct mig_cmd_args {
     [MIG_CMD_MAX]              = { .len = -1, .name = "MAX" },
 };
 
+/* Note for MIG_CMD_POSTCOPY_ADVISE:
+ * The format of the arguments depends on the postcopy mode:
+ * - postcopy RAM only
+ *   uint64_t host page size
+ *   uint64_t target page size
+ *
+ * - postcopy RAM and postcopy dirty bitmaps
+ *   format is the same as for postcopy RAM only
+ *
+ * - postcopy dirty bitmaps only
+ *   Nothing. Command length field is 0.
+ *
+ * Be careful: adding a new postcopy entity with some other parameters should
+ * not break format self-description ability. Good way is to introduce some
+ * generic extendable format with an exception for two old entities.
+ */
+
 static int announce_self_create(uint8_t *buf,
                                 uint8_t *mac_addr)
 {
@@ -861,12 +878,17 @@ int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
 /* Send prior to any postcopy transfer */
 void qemu_savevm_send_postcopy_advise(QEMUFile *f)
 {
-    uint64_t tmp[2];
-    tmp[0] = cpu_to_be64(ram_pagesize_summary());
-    tmp[1] = cpu_to_be64(qemu_target_page_size());
+    if (migrate_postcopy_ram()) {
+        uint64_t tmp[2];
+        tmp[0] = cpu_to_be64(ram_pagesize_summary());
+        tmp[1] = cpu_to_be64(qemu_target_page_size());
 
-    trace_qemu_savevm_send_postcopy_advise();
-    qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 16, (uint8_t *)tmp);
+        trace_qemu_savevm_send_postcopy_advise();
+        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE,
+                                 16, (uint8_t *)tmp);
+    } else {
+        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 0, NULL);
+    }
 }
 
 /* Sent prior to starting the destination running in postcopy, discard pages
@@ -1352,6 +1374,10 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
         return -1;
     }
 
+    if (!migrate_postcopy_ram()) {
+        return 0;
+    }
+
     if (!postcopy_ram_supported_by_host()) {
         postcopy_state_set(POSTCOPY_INCOMING_NONE);
         return -1;
@@ -1562,7 +1588,9 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
          * A rare case, we entered listen without having to do any discards,
          * so do the setup that's normally done at the time of the 1st discard.
          */
-        postcopy_ram_prepare_discard(mis);
+        if (migrate_postcopy_ram()) {
+            postcopy_ram_prepare_discard(mis);
+        }
     }
 
     /*
@@ -1570,8 +1598,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
      * However, at this point the CPU shouldn't be running, and the IO
      * shouldn't be doing anything yet so don't actually expect requests
      */
-    if (postcopy_ram_enable_notify(mis)) {
-        return -1;
+    if (migrate_postcopy_ram()) {
+        if (postcopy_ram_enable_notify(mis)) {
+            return -1;
+        }
     }
 
     if (mis->have_listen_thread) {
-- 
2.11.1


* [Qemu-devel] [PATCH v7 04/16] migration: introduce postcopy-only pending
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 05/16] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor() Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

There will be savevm states (dirty-bitmap) which can migrate only in the
postcopy stage. The corresponding pending is introduced here.
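The interesting consumer of the new three-way split is migration_thread(): with a postcopy-only pending, postcopy can now start not only on user request but also automatically, once nothing that *requires* precopy is left. A toy model of the condition in the migration.c hunk (simplified; no MigrationState, no migration-state check):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the postcopy-start condition in migration_thread():
 * precopy-only data must fit under the threshold, and postcopy starts
 * either because the user asked for it, or because precopy-only plus
 * compatible data would already fit, i.e. only postcopy-only data
 * remains significant. */
static bool should_start_postcopy(uint64_t pend_pre, uint64_t pend_compat,
                                  uint64_t threshold_size,
                                  bool user_requested_postcopy)
{
    return pend_pre <= threshold_size &&
           (user_requested_postcopy ||
            pend_pre + pend_compat <= threshold_size);
}
```

This mirrors the hunk's `pend_pre <= threshold_size && (atomic_read(&s->start_postcopy) || (pend_pre + pend_compat <= threshold_size))`, with the surrounding pending_size and migration-state checks omitted for brevity.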

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 include/migration/register.h | 17 +++++++++++++++--
 migration/block.c            |  7 ++++---
 migration/migration.c        | 15 ++++++++-------
 migration/ram.c              |  9 +++++----
 migration/savevm.c           | 13 ++++++++-----
 migration/savevm.h           |  5 +++--
 migration/trace-events       |  2 +-
 7 files changed, 44 insertions(+), 24 deletions(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index b999917c9d..ebf68992dd 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -37,8 +37,21 @@ typedef struct SaveVMHandlers {
     int (*save_live_setup)(QEMUFile *f, void *opaque);
     void (*save_live_pending)(QEMUFile *f, void *opaque,
                               uint64_t threshold_size,
-                              uint64_t *non_postcopiable_pending,
-                              uint64_t *postcopiable_pending);
+                              uint64_t *res_precopy_only,
+                              uint64_t *res_compatible,
+                              uint64_t *res_postcopy_only);
+    /* Note for save_live_pending:
+     * - res_precopy_only is for data which must be migrated in precopy phase
+     *     or in stopped state, in other words - before target vm start
+     * - res_compatible is for data which may be migrated in any phase
+     * - res_postcopy_only is for data which must be migrated in postcopy phase
+     *     or in stopped state, in other words - after source vm stop
+     *
+     * Sum of res_precopy_only, res_compatible and res_postcopy_only is the
+     * whole amount of pending data.
+     */
+
+
     LoadStateHandler *load_state;
 } SaveVMHandlers;
 
diff --git a/migration/block.c b/migration/block.c
index 2844149d34..0897c90fd2 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -857,8 +857,9 @@ static int block_save_complete(QEMUFile *f, void *opaque)
 }
 
 static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
-                               uint64_t *non_postcopiable_pending,
-                               uint64_t *postcopiable_pending)
+                               uint64_t *res_precopy_only,
+                               uint64_t *res_compatible,
+                               uint64_t *res_postcopy_only)
 {
     /* Estimate pending number of bytes to send */
     uint64_t pending;
@@ -879,7 +880,7 @@ static void block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
 
     DPRINTF("Enter save live pending  %" PRIu64 "\n", pending);
     /* We don't do postcopy */
-    *non_postcopiable_pending += pending;
+    *res_precopy_only += pending;
 }
 
 static int block_load(QEMUFile *f, void *opaque, int version_id)
diff --git a/migration/migration.c b/migration/migration.c
index d3a2fd405a..b13a9f496a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1867,20 +1867,21 @@ static void *migration_thread(void *opaque)
         uint64_t pending_size;
 
         if (!qemu_file_rate_limit(s->to_dst_file)) {
-            uint64_t pend_post, pend_nonpost;
+            uint64_t pend_pre, pend_compat, pend_post;
 
-            qemu_savevm_state_pending(s->to_dst_file, threshold_size,
-                                      &pend_nonpost, &pend_post);
-            pending_size = pend_nonpost + pend_post;
+            qemu_savevm_state_pending(s->to_dst_file, threshold_size, &pend_pre,
+                                      &pend_compat, &pend_post);
+            pending_size = pend_pre + pend_compat + pend_post;
             trace_migrate_pending(pending_size, threshold_size,
-                                  pend_post, pend_nonpost);
+                                  pend_pre, pend_compat, pend_post);
             if (pending_size && pending_size >= threshold_size) {
                 /* Still a significant amount to transfer */
 
                 if (migrate_postcopy() &&
                     s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
-                    pend_nonpost <= threshold_size &&
-                    atomic_read(&s->start_postcopy)) {
+                    pend_pre <= threshold_size &&
+                    (atomic_read(&s->start_postcopy) ||
+                     (pend_pre + pend_compat <= threshold_size))) {
 
                     if (!postcopy_start(s, &old_vm_running)) {
                         current_active_state = MIGRATION_STATUS_POSTCOPY_ACTIVE;
diff --git a/migration/ram.c b/migration/ram.c
index b43589e3d2..0f564d1ff3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2049,8 +2049,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 }
 
 static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
-                             uint64_t *non_postcopiable_pending,
-                             uint64_t *postcopiable_pending)
+                             uint64_t *res_precopy_only,
+                             uint64_t *res_compatible,
+                             uint64_t *res_postcopy_only)
 {
     RAMState **temp = opaque;
     RAMState *rs = *temp;
@@ -2070,9 +2071,9 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
 
     if (migrate_postcopy_ram()) {
         /* We can do postcopy, and all the data is postcopiable */
-        *postcopiable_pending += remaining_size;
+        *res_compatible += remaining_size;
     } else {
-        *non_postcopiable_pending += remaining_size;
+        *res_precopy_only += remaining_size;
     }
 }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index cf79e1d3ac..0ab036489a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1210,13 +1210,15 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
  * for units that can't do postcopy.
  */
 void qemu_savevm_state_pending(QEMUFile *f, uint64_t threshold_size,
-                               uint64_t *res_non_postcopiable,
-                               uint64_t *res_postcopiable)
+                               uint64_t *res_precopy_only,
+                               uint64_t *res_compatible,
+                               uint64_t *res_postcopy_only)
 {
     SaveStateEntry *se;
 
-    *res_non_postcopiable = 0;
-    *res_postcopiable = 0;
+    *res_precopy_only = 0;
+    *res_compatible = 0;
+    *res_postcopy_only = 0;
 
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
@@ -1229,7 +1231,8 @@ void qemu_savevm_state_pending(QEMUFile *f, uint64_t threshold_size,
             }
         }
         se->ops->save_live_pending(f, se->opaque, threshold_size,
-                                   res_non_postcopiable, res_postcopiable);
+                                   res_precopy_only, res_compatible,
+                                   res_postcopy_only);
     }
 }
 
diff --git a/migration/savevm.h b/migration/savevm.h
index 5a2ed1161d..775f7eec0f 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -38,8 +38,9 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f);
 int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
                                        bool inactivate_disks);
 void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
-                               uint64_t *res_non_postcopiable,
-                               uint64_t *res_postcopiable);
+                               uint64_t *res_precopy_only,
+                               uint64_t *res_compatible,
+                               uint64_t *res_postcopy_only);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
 int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
diff --git a/migration/trace-events b/migration/trace-events
index 38345be9c3..34cc2adafc 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -83,7 +83,7 @@ migrate_fd_cleanup(void) ""
 migrate_fd_error(const char *error_desc) "error=%s"
 migrate_fd_cancel(void) ""
 migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len) "in %s at %zx len %zx"
-migrate_pending(uint64_t size, uint64_t max, uint64_t post, uint64_t nonpost) "pending size %" PRIu64 " max %" PRIu64 " (post=%" PRIu64 " nonpost=%" PRIu64 ")"
+migrate_pending(uint64_t size, uint64_t max, uint64_t pre, uint64_t compat, uint64_t post) "pending size %" PRIu64 " max %" PRIu64 " (pre=%" PRIu64 " compat=%" PRIu64 " post=%" PRIu64 ")"
 migrate_send_rp_message(int msg_type, uint16_t len) "%d: len %d"
 migration_completion_file_err(void) ""
 migration_completion_postcopy_end(void) ""
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 05/16] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 04/16] migration: introduce postcopy-only pending Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 06/16] qapi: add dirty-bitmaps migration capability Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Enabling a bitmap's successor is needed so that the successors of bitmaps
being migrated can be enabled before the target VM starts.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/dirty-bitmap.c         | 8 ++++++++
 include/block/dirty-bitmap.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 0b349f0b5a..7b598c36bf 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -237,6 +237,14 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
     return 0;
 }
 
+/* Called with BQL taken. */
+void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap)
+{
+    qemu_mutex_lock(bitmap->mutex);
+    bdrv_enable_dirty_bitmap(bitmap->successor);
+    qemu_mutex_unlock(bitmap->mutex);
+}
+
 /**
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 0341a605d7..d00a7a4b7c 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -20,6 +20,7 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
 BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
                                            BdrvDirtyBitmap *bitmap,
                                            Error **errp);
+void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
                                         const char *name);
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 06/16] qapi: add dirty-bitmaps migration capability
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 05/16] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor() Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 21:03   ` Eric Blake
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 07/16] block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 9 +++++++++
 migration/migration.h | 1 +
 qapi-schema.json      | 4 +++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index b13a9f496a..dac67eab69 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1280,6 +1280,15 @@ int migrate_decompress_threads(void)
     return s->parameters.decompress_threads;
 }
 
+bool migrate_dirty_bitmaps(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS];
+}
+
 bool migrate_use_events(void)
 {
     MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index 1d974bacce..0c6ba56423 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -170,6 +170,7 @@ bool migrate_postcopy(void);
 bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
+bool migrate_dirty_bitmaps(void);
 
 bool migrate_auto_converge(void);
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 485767f1ab..f1d32547ec 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -903,12 +903,14 @@
 # @return-path: If enabled, migration will use the return path even
 #               for precopy. (since 2.10)
 #
+# @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps. (since 2.9)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
            'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
-           'block', 'return-path' ] }
+           'block', 'return-path', 'dirty-bitmaps' ] }
 
 ##
 # @MigrationCapabilityStatus:
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 07/16] block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 06/16] qapi: add dirty-bitmaps migration capability Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-08-14  9:37   ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 08/16] migration: include migrate_dirty_bitmaps in migrate_postcopy Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Add a helper that just releases the successor and unfreezes the bitmap,
without any additional work.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
---
 block/dirty-bitmap.c         | 13 +++++++++++++
 include/block/dirty-bitmap.h |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 7b598c36bf..b5678d2d25 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -303,6 +303,19 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
     return parent;
 }
 
+/* Called with BQL taken. */
+void bdrv_dirty_bitmap_release_successor(BlockDriverState *bs,
+                                         BdrvDirtyBitmap *parent)
+{
+    qemu_mutex_lock(parent->mutex);
+
+    if (parent->successor) {
+        bdrv_release_dirty_bitmap(bs, parent->successor);
+        parent->successor = NULL;
+    }
+
+    qemu_mutex_unlock(parent->mutex);
+}
 /**
  * Truncates _all_ bitmaps attached to a BDS.
  * Called with BQL taken.
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index d00a7a4b7c..fa2f670050 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -20,6 +20,8 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
 BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
                                            BdrvDirtyBitmap *bitmap,
                                            Error **errp);
+void bdrv_dirty_bitmap_release_successor(BlockDriverState *bs,
+                                         BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
                                         const char *name);
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 08/16] migration: include migrate_dirty_bitmaps in migrate_postcopy
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 07/16] block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 09/16] migration/qemu-file: add qemu_put_counted_string() Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Enable postcopy if dirty bitmap migration is enabled.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index dac67eab69..48adc812bc 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1223,7 +1223,7 @@ bool migrate_postcopy_ram(void)
 
 bool migrate_postcopy(void)
 {
-    return migrate_postcopy_ram();
+    return migrate_postcopy_ram() || migrate_dirty_bitmaps();
 }
 
 bool migrate_auto_converge(void)
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 09/16] migration/qemu-file: add qemu_put_counted_string()
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 08/16] migration: include migrate_dirty_bitmaps in migrate_postcopy Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 10/16] migration: add is_active_iterate handler Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Add the counterpart of qemu_get_counted_string:
qemu_put_counted_string puts a one-byte length of the string (so the
string must not be longer than 255 characters) and then the string
itself, without the trailing zero byte.
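For illustration, a minimal Python sketch of the same counted-string wire format (one-byte length prefix, string bytes, no trailing NUL). This only mirrors the semantics described above; it is not QEMU code:

```python
def put_counted_string(buf: bytearray, s: str) -> None:
    """Append a one-byte length followed by the string bytes (no trailing NUL)."""
    data = s.encode("utf-8")
    assert len(data) < 256, "counted strings are limited to 255 bytes"
    buf.append(len(data))
    buf.extend(data)

def get_counted_string(buf: bytes, pos: int):
    """Read back a counted string; returns (string, new position)."""
    n = buf[pos]
    return buf[pos + 1:pos + 1 + n].decode("utf-8"), pos + 1 + n
```

Round-tripping a name through both functions shows why the 255-byte limit exists: the length must fit in the single prefix byte.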

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/qemu-file.c | 13 +++++++++++++
 migration/qemu-file.h |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 2ab2bf362d..e85f501f86 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -734,6 +734,19 @@ size_t qemu_get_counted_string(QEMUFile *f, char buf[256])
 }
 
 /*
+ * Put a string with one preceding byte containing its length. The length of
+ * the string should be less than 256.
+ */
+void qemu_put_counted_string(QEMUFile *f, const char *str)
+{
+    size_t len = strlen(str);
+
+    assert(len < 256);
+    qemu_put_byte(f, len);
+    qemu_put_buffer(f, (const uint8_t *)str, len);
+}
+
+/*
  * Set the blocking state of the QEMUFile.
  * Note: On some transports the OS only keeps a single blocking state for
  *       both directions, and thus changing the blocking on the main
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index aae4e5ed36..f4f356ab12 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -174,4 +174,6 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
                              ram_addr_t offset, size_t size,
                              uint64_t *bytes_sent);
 
+void qemu_put_counted_string(QEMUFile *f, const char *name);
+
 #endif
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 10/16] migration: add is_active_iterate handler
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 09/16] migration/qemu-file: add qemu_put_counted_string() Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 11/16] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Postcopy-only savevm states (dirty bitmaps) don't need live iteration,
so add a new savevm handler that lets them opt out of it and stops the
transfer of empty sections.
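As a rough sketch of the dispatch this enables (a hypothetical handler table, not QEMU's actual SaveStateEntry structures): a handler whose is_active_iterate callback returns false is simply skipped during live iteration.

```python
def savevm_iterate(handlers):
    """Mirror of the skip logic: run each handler unless it opts out."""
    ran = []
    for h in handlers:
        if "is_active" in h and not h["is_active"]():
            continue
        if "is_active_iterate" in h and not h["is_active_iterate"]():
            continue  # postcopy-only state: no live iteration
        ran.append(h["name"])
    return ran
```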

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 include/migration/register.h | 1 +
 migration/savevm.c           | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/migration/register.h b/include/migration/register.h
index ebf68992dd..3655341544 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -25,6 +25,7 @@ typedef struct SaveVMHandlers {
     /* This runs both outside and inside the iothread lock.  */
     bool (*is_active)(void *opaque);
     bool (*has_postcopy)(void *opaque);
+    bool (*is_active_iterate)(void *opaque);
 
     /* This runs outside the iothread lock in the migration case, and
      * within the lock in the savevm case.  The callback had better only
diff --git a/migration/savevm.c b/migration/savevm.c
index 0ab036489a..f03ee02f67 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1024,6 +1024,11 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy)
                 continue;
             }
         }
+        if (se->ops && se->ops->is_active_iterate) {
+            if (!se->ops->is_active_iterate(se->opaque)) {
+                continue;
+            }
+        }
         /*
          * In the postcopy phase, any device that doesn't know how to
          * do postcopy should have saved it's state in the _complete
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 11/16] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 10/16] migration: add is_active_iterate handler Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 12/16] migration: add postcopy migration of dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Make it possible to set a bitmap 'frozen' without a successor.
This is needed to protect the bitmap during outgoing bitmap postcopy
migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/dirty-bitmap.c         | 22 ++++++++++++++++++++--
 include/block/dirty-bitmap.h |  1 +
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index b5678d2d25..d79e29786f 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -40,6 +40,8 @@ struct BdrvDirtyBitmap {
     QemuMutex *mutex;
     HBitmap *bitmap;            /* Dirty bitmap implementation */
     HBitmap *meta;              /* Meta dirty bitmap */
+    bool frozen;                /* Bitmap is frozen, it can't be modified
+                                   through QMP */
     BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
     char *name;                 /* Optional non-empty unique ID */
     int64_t size;               /* Size of the bitmap, in bytes */
@@ -183,13 +185,22 @@ const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
 /* Called with BQL taken.  */
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
-    return bitmap->successor;
+    return bitmap->frozen;
+}
+
+/* Called with BQL taken.  */
+void bdrv_dirty_bitmap_set_frozen(BdrvDirtyBitmap *bitmap, bool frozen)
+{
+    qemu_mutex_lock(bitmap->mutex);
+    assert(bitmap->successor == NULL);
+    bitmap->frozen = frozen;
+    qemu_mutex_unlock(bitmap->mutex);
 }
 
 /* Called with BQL taken.  */
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
 {
-    return !(bitmap->disabled || bitmap->successor);
+    return !(bitmap->disabled || (bitmap->successor != NULL));
 }
 
 /* Called with BQL taken.  */
@@ -234,6 +245,7 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
 
     /* Install the successor and freeze the parent */
     bitmap->successor = child;
+    bitmap->frozen = true;
     return 0;
 }
 
@@ -266,6 +278,8 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
     name = bitmap->name;
     bitmap->name = NULL;
     successor->name = name;
+    assert(bitmap->frozen);
+    bitmap->frozen = false;
     bitmap->successor = NULL;
     successor->persistent = bitmap->persistent;
     bitmap->persistent = false;
@@ -298,6 +312,8 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
         return NULL;
     }
     bdrv_release_dirty_bitmap(bs, successor);
+    assert(parent->frozen);
+    parent->frozen = false;
     parent->successor = NULL;
 
     return parent;
@@ -311,6 +327,8 @@ void bdrv_dirty_bitmap_release_successor(BlockDriverState *bs,
 
     if (parent->successor) {
         bdrv_release_dirty_bitmap(bs, parent->successor);
+        assert(parent->frozen);
+        parent->frozen = false;
         parent->successor = NULL;
     }
 
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index fa2f670050..cd831d40de 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -39,6 +39,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
+void bdrv_dirty_bitmap_set_frozen(BdrvDirtyBitmap *bitmap, bool frozen);
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
 int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Qemu-devel] [PATCH v7 12/16] migration: add postcopy migration of dirty bitmaps
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 11/16] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-09-18 14:07   ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 13/16] iotests: add add_incoming_migration to VM class Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Add postcopy migration of dirty bitmaps. Only named dirty bitmaps
associated with root nodes and non-root named nodes are migrated.

If the destination QEMU already contains a dirty bitmap with the same name
as a migrated bitmap (for the same node), the migration succeeds if their
granularities match; otherwise an error is generated.

If the destination QEMU doesn't contain such a bitmap, it is created.
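The patch encodes the per-chunk flags field in 1, 2 or 4 bytes, using the top bit of a byte to signal that more flag bytes follow. A minimal Python sketch of the decoder, mirroring qemu_get_bitmap_flags from this patch (an illustration, not QEMU code):

```python
DIRTY_BITMAP_MIG_EXTRA_FLAGS = 0x80

def get_bitmap_flags(data: bytes) -> int:
    """Decode a big-endian flags field of 1, 2 or 4 bytes."""
    flags = data[0]
    if flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS:        # a second byte follows
        flags = (flags << 8) | data[1]
        if flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS:    # two more bytes follow
            flags = (flags << 16) | (data[2] << 8) | data[3]
    return flags
```

Note that the extension bits stay set in the decoded value, which is why the sender asserts that it never emits flags wider than one byte.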

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/migration/misc.h       |   3 +
 migration/Makefile.objs        |   1 +
 migration/block-dirty-bitmap.c | 700 +++++++++++++++++++++++++++++++++++++++++
 migration/migration.c          |   3 +
 migration/migration.h          |   3 +
 migration/savevm.c             |   2 +
 migration/trace-events         |  14 +
 vl.c                           |   1 +
 8 files changed, 727 insertions(+)
 create mode 100644 migration/block-dirty-bitmap.c

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 22551216bb..c53689eb0f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -56,4 +56,7 @@ bool migration_in_postcopy_after_devices(MigrationState *);
 void migration_only_migratable_set(void);
 void migration_global_dump(Monitor *mon);
 
+/* migration/block-dirty-bitmap.c */
+void dirty_bitmap_mig_init(void);
+
 #endif
diff --git a/migration/Makefile.objs b/migration/Makefile.objs
index 1c7770da46..525cc8293e 100644
--- a/migration/Makefile.objs
+++ b/migration/Makefile.objs
@@ -6,6 +6,7 @@ common-obj-y += qemu-file.o global_state.o
 common-obj-y += qemu-file-channel.o
 common-obj-y += xbzrle.o postcopy-ram.o
 common-obj-y += qjson.o
+common-obj-y += block-dirty-bitmap.o
 
 common-obj-$(CONFIG_RDMA) += rdma.o
 
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
new file mode 100644
index 0000000000..e4cd17bcc8
--- /dev/null
+++ b/migration/block-dirty-bitmap.c
@@ -0,0 +1,700 @@
+/*
+ * Block dirty bitmap postcopy migration
+ *
+ * Copyright IBM, Corp. 2009
+ * Copyright (C) 2016 Parallels IP Holdings GmbH. All rights reserved.
+ *
+ * Authors:
+ *  Liran Schour   <lirans@il.ibm.com>
+ *  Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ * This file is derived from migration/block.c, so its author and IBM copyright
+ * are here, although the content is quite different.
+ *
+ * Contributions after 2012-01-13 are licensed under the terms of the
+ * GNU GPL, version 2 or (at your option) any later version.
+ *
+ *                                ***
+ *
+ * Here postcopy migration of dirty bitmaps is realized. Only named dirty
+ * bitmaps, associated with root nodes and non-root named nodes are migrated.
+ *
+ * If the destination QEMU already contains a dirty bitmap with the same name
+ * as a migrated bitmap (for the same node), the migration succeeds if their
+ * granularities match; otherwise an error is generated.
+ *
+ * If the destination QEMU doesn't contain such a bitmap, it is created.
+ *
+ * format of migration:
+ *
+ * # Header (shared for different chunk types)
+ * 1, 2 or 4 bytes: flags (see qemu_{get,put}_bitmap_flags)
+ * [ 1 byte: node name size ] \  flags & DEVICE_NAME
+ * [ n bytes: node name     ] /
+ * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
+ * [ n bytes: bitmap name     ] /
+ *
+ * # Start of bitmap migration (flags & START)
+ * header
+ * be64: granularity
+ * 1 byte: bitmap enabled flag
+ *
+ * # Complete of bitmap migration (flags & COMPLETE)
+ * header
+ *
+ * # Data chunk of bitmap migration
+ * header
+ * be64: start sector
+ * be32: number of sectors
+ * [ be64: buffer size  ] \ ! (flags & ZEROES)
+ * [ n bytes: buffer    ] /
+ *
+ * The last chunk in the stream should contain flags & EOS. A chunk may omit
+ * the device and/or bitmap name, in which case they are assumed to be the
+ * same as in the previous chunk.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block.h"
+#include "block/block_int.h"
+#include "sysemu/block-backend.h"
+#include "qemu/main-loop.h"
+#include "qemu/error-report.h"
+#include "migration/misc.h"
+#include "migration/migration.h"
+#include "migration/qemu-file.h"
+#include "migration/vmstate.h"
+#include "migration/register.h"
+#include "qemu/hbitmap.h"
+#include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#define CHUNK_SIZE     (1 << 10)
+
+/* Flags occupy one, two or four bytes (Big Endian). The size is determined as
+ * follows:
+ * in the first (most significant) byte the top bit is clear  -->  one byte
+ * in the first byte the top bit is set  -->  two or four bytes, depending on
+ *                                            the second byte:
+ *    | in the second byte the top bit is clear  -->  two bytes
+ *    | in the second byte the top bit is set    -->  four bytes
+ */
+#define DIRTY_BITMAP_MIG_FLAG_EOS           0x01
+#define DIRTY_BITMAP_MIG_FLAG_ZEROES        0x02
+#define DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME   0x04
+#define DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME   0x08
+#define DIRTY_BITMAP_MIG_FLAG_START         0x10
+#define DIRTY_BITMAP_MIG_FLAG_COMPLETE      0x20
+#define DIRTY_BITMAP_MIG_FLAG_BITS          0x40
+
+#define DIRTY_BITMAP_MIG_EXTRA_FLAGS        0x80
+
+typedef struct DirtyBitmapMigBitmapState {
+    /* Written during setup phase. */
+    BlockDriverState *bs;
+    const char *node_name;
+    BdrvDirtyBitmap *bitmap;
+    uint64_t total_sectors;
+    uint64_t sectors_per_chunk;
+    QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
+
+    /* For bulk phase. */
+    bool bulk_completed;
+    uint64_t cur_sector;
+} DirtyBitmapMigBitmapState;
+
+typedef struct DirtyBitmapMigState {
+    QSIMPLEQ_HEAD(dbms_list, DirtyBitmapMigBitmapState) dbms_list;
+
+    bool bulk_completed;
+
+    /* for send_bitmap_bits() */
+    BlockDriverState *prev_bs;
+    BdrvDirtyBitmap *prev_bitmap;
+} DirtyBitmapMigState;
+
+typedef struct DirtyBitmapLoadState {
+    uint32_t flags;
+    char node_name[256];
+    char bitmap_name[256];
+    BlockDriverState *bs;
+    BdrvDirtyBitmap *bitmap;
+} DirtyBitmapLoadState;
+
+static DirtyBitmapMigState dirty_bitmap_mig_state;
+
+typedef struct DirtyBitmapLoadBitmapState {
+    BlockDriverState *bs;
+    BdrvDirtyBitmap *bitmap;
+    bool migrated;
+} DirtyBitmapLoadBitmapState;
+static GSList *enabled_bitmaps;
+QemuMutex finish_lock;
+
+void init_dirty_bitmap_incoming_migration(void)
+{
+    qemu_mutex_init(&finish_lock);
+}
+
+static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
+{
+    uint8_t flags = qemu_get_byte(f);
+    if (flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS) {
+        flags = flags << 8 | qemu_get_byte(f);
+        if (flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS) {
+            flags = flags << 16 | qemu_get_be16(f);
+        }
+    }
+
+    return flags;
+}
+
+static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
+{
+    /* The code currently does not send flags wider than one byte */
+    assert(!(flags & (0xffffff00 | DIRTY_BITMAP_MIG_EXTRA_FLAGS)));
+
+    qemu_put_byte(f, flags);
+}
+
+static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+                               uint32_t additional_flags)
+{
+    BlockDriverState *bs = dbms->bs;
+    BdrvDirtyBitmap *bitmap = dbms->bitmap;
+    uint32_t flags = additional_flags;
+    trace_send_bitmap_header_enter();
+
+    if (bs != dirty_bitmap_mig_state.prev_bs) {
+        dirty_bitmap_mig_state.prev_bs = bs;
+        flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
+    }
+
+    if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
+        dirty_bitmap_mig_state.prev_bitmap = bitmap;
+        flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
+    }
+
+    qemu_put_bitmap_flags(f, flags);
+
+    if (flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
+        qemu_put_counted_string(f, dbms->node_name);
+    }
+
+    if (flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
+        qemu_put_counted_string(f, bdrv_dirty_bitmap_name(bitmap));
+    }
+}
+
+static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+{
+    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
+    qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
+    qemu_put_byte(f, bdrv_dirty_bitmap_enabled(dbms->bitmap));
+}
+
+static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+{
+    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
+}
+
+static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+                             uint64_t start_sector, uint32_t nr_sectors)
+{
+    /* align for buffer_is_zero() */
+    uint64_t align = 4 * sizeof(long);
+    uint64_t unaligned_size =
+        bdrv_dirty_bitmap_serialization_size(
+            dbms->bitmap, start_sector << BDRV_SECTOR_BITS,
+            (uint64_t)nr_sectors << BDRV_SECTOR_BITS);
+    uint64_t buf_size = (unaligned_size + align - 1) & ~(align - 1);
+    uint8_t *buf = g_malloc0(buf_size);
+    uint32_t flags = DIRTY_BITMAP_MIG_FLAG_BITS;
+
+    bdrv_dirty_bitmap_serialize_part(
+        dbms->bitmap, buf, start_sector << BDRV_SECTOR_BITS,
+        (uint64_t)nr_sectors << BDRV_SECTOR_BITS);
+
+    if (buffer_is_zero(buf, buf_size)) {
+        g_free(buf);
+        buf = NULL;
+        flags |= DIRTY_BITMAP_MIG_FLAG_ZEROES;
+    }
+
+    trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);
+
+    send_bitmap_header(f, dbms, flags);
+
+    qemu_put_be64(f, start_sector);
+    qemu_put_be32(f, nr_sectors);
+
+    /* If the chunk is all zeroes, flush here: the network bandwidth is
+     * now much higher than the storage bandwidth, so queueing zero
+     * chunks would only slow the migration down. */
+    if (flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
+        qemu_fflush(f);
+    } else {
+        qemu_put_be64(f, buf_size);
+        qemu_put_buffer(f, buf, buf_size);
+    }
+
+    g_free(buf);
+}
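
The buf_size rounding in send_bitmap_bits() above is the usual power-of-two
align-up idiom, needed so buffer_is_zero() can scan whole long-sized words.
A quick stand-alone check (Python, illustration only; align is a power of
two, as 4 * sizeof(long) always is):

```python
def align_up(size, align):
    """Round size up to a multiple of align (align must be a power of two)."""
    return (size + align - 1) & ~(align - 1)

# with align = 4 * sizeof(long) = 32 on a 64-bit host
assert align_up(1, 32) == 32
assert align_up(32, 32) == 32
assert align_up(33, 32) == 64
```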
+
+
+/* Called with iothread lock taken.  */
+
+static int init_dirty_bitmap_migration(void)
+{
+    BlockDriverState *bs;
+    BdrvDirtyBitmap *bitmap;
+    DirtyBitmapMigBitmapState *dbms;
+    BdrvNextIterator it;
+
+    dirty_bitmap_mig_state.bulk_completed = false;
+    dirty_bitmap_mig_state.prev_bs = NULL;
+    dirty_bitmap_mig_state.prev_bitmap = NULL;
+
+    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+        if (!bdrv_get_device_or_node_name(bs)) {
+            /* skip unnamed non-root nodes */
+            continue;
+        }
+
+        for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap;
+             bitmap = bdrv_dirty_bitmap_next(bs, bitmap)) {
+            if (!bdrv_dirty_bitmap_name(bitmap)) {
+                continue;
+            }
+
+            if (bdrv_dirty_bitmap_frozen(bitmap)) {
+                error_report("Can't migrate frozen dirty bitmap: '%s'",
+                             bdrv_dirty_bitmap_name(bitmap));
+                return -1;
+            }
+        }
+    }
+
+    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+        if (!bdrv_get_device_or_node_name(bs)) {
+            /* skip unnamed non-root nodes */
+            continue;
+        }
+
+        for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap;
+             bitmap = bdrv_dirty_bitmap_next(bs, bitmap)) {
+            if (!bdrv_dirty_bitmap_name(bitmap)) {
+                continue;
+            }
+
+            bdrv_ref(bs);
+            bdrv_dirty_bitmap_set_frozen(bitmap, true);
+
+            dbms = g_new0(DirtyBitmapMigBitmapState, 1);
+            dbms->bs = bs;
+            dbms->node_name = bdrv_get_node_name(bs);
+            if (!dbms->node_name || dbms->node_name[0] == '\0') {
+                dbms->node_name = bdrv_get_device_name(bs);
+            }
+            dbms->bitmap = bitmap;
+            dbms->total_sectors = bdrv_nb_sectors(bs);
+            dbms->sectors_per_chunk = CHUNK_SIZE * 8 *
+                bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
+
+            QSIMPLEQ_INSERT_TAIL(&dirty_bitmap_mig_state.dbms_list,
+                                 dbms, entry);
+        }
+    }
+
+    return 0;
+}
+
+/* Called with no lock taken.  */
+static void bulk_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+{
+    uint32_t nr_sectors = MIN(dbms->total_sectors - dbms->cur_sector,
+                             dbms->sectors_per_chunk);
+
+    send_bitmap_bits(f, dbms, dbms->cur_sector, nr_sectors);
+
+    dbms->cur_sector += nr_sectors;
+    if (dbms->cur_sector >= dbms->total_sectors) {
+        dbms->bulk_completed = true;
+    }
+}
+
+/* Called with no lock taken.  */
+static void bulk_phase(QEMUFile *f, bool limit)
+{
+    DirtyBitmapMigBitmapState *dbms;
+
+    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+        while (!dbms->bulk_completed) {
+            bulk_phase_send_chunk(f, dbms);
+            if (limit && qemu_file_rate_limit(f)) {
+                return;
+            }
+        }
+    }
+
+    dirty_bitmap_mig_state.bulk_completed = true;
+}
+
+/* Called with iothread lock taken.  */
+static void dirty_bitmap_mig_cleanup(void)
+{
+    DirtyBitmapMigBitmapState *dbms;
+
+    while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
+        QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
+        bdrv_dirty_bitmap_set_frozen(dbms->bitmap, false);
+        bdrv_unref(dbms->bs);
+        g_free(dbms);
+    }
+}
+
+/* for SaveVMHandlers */
+static void dirty_bitmap_migration_cleanup(void *opaque)
+{
+    dirty_bitmap_mig_cleanup();
+}
+
+static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
+{
+    trace_dirty_bitmap_save_iterate(migration_in_postcopy());
+
+    if (migration_in_postcopy() && !dirty_bitmap_mig_state.bulk_completed) {
+        bulk_phase(f, true);
+    }
+
+    qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
+
+    return dirty_bitmap_mig_state.bulk_completed;
+}
+
+/* Called with iothread lock taken.  */
+
+static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
+{
+    DirtyBitmapMigBitmapState *dbms;
+    trace_dirty_bitmap_save_complete_enter();
+
+    if (!dirty_bitmap_mig_state.bulk_completed) {
+        bulk_phase(f, false);
+    }
+
+    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+        send_bitmap_complete(f, dbms);
+    }
+
+    qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
+
+    trace_dirty_bitmap_save_complete_finish();
+
+    dirty_bitmap_mig_cleanup();
+    return 0;
+}
+
+static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque,
+                                      uint64_t max_size,
+                                      uint64_t *res_precopy_only,
+                                      uint64_t *res_compatible,
+                                      uint64_t *res_postcopy_only)
+{
+    DirtyBitmapMigBitmapState *dbms;
+    uint64_t pending = 0;
+
+    qemu_mutex_lock_iothread();
+
+    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+        uint64_t gran = bdrv_dirty_bitmap_granularity(dbms->bitmap);
+        uint64_t sectors = dbms->bulk_completed ? 0 :
+                           dbms->total_sectors - dbms->cur_sector;
+
+        pending += (sectors * BDRV_SECTOR_SIZE + gran - 1) / gran;
+    }
+
+    qemu_mutex_unlock_iothread();
+
+    trace_dirty_bitmap_save_pending(pending, max_size);
+
+    *res_postcopy_only += pending;
+}
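
The pending estimate above is a ceiling division: bytes of guest data not
yet covered by the bulk phase, divided by the bitmap granularity, i.e.
roughly how many bitmap chunks remain to be serialized. The same arithmetic
stands alone (Python, illustration only):

```python
BDRV_SECTOR_SIZE = 512

def bitmap_pending(total_sectors, cur_sector, granularity, bulk_completed):
    """Mirror the per-bitmap estimate in dirty_bitmap_save_pending()."""
    sectors = 0 if bulk_completed else total_sectors - cur_sector
    # ceiling division of the remaining bytes by the bitmap granularity
    return (sectors * BDRV_SECTOR_SIZE + granularity - 1) // granularity

assert bitmap_pending(2048, 0, 65536, False) == 16  # 1 MiB left, 64 KiB granules
assert bitmap_pending(2048, 2048, 65536, False) == 0
```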
+
+/* First occurrence of this bitmap. It should be created if it doesn't exist. */
+static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
+{
+    Error *local_err = NULL;
+    uint32_t granularity = qemu_get_be32(f);
+    bool enabled = qemu_get_byte(f);
+
+    if (!s->bitmap) {
+        s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
+                                             s->bitmap_name, &local_err);
+        if (!s->bitmap) {
+            error_report_err(local_err);
+            return -EINVAL;
+        }
+    } else {
+        uint32_t dest_granularity =
+            bdrv_dirty_bitmap_granularity(s->bitmap);
+        if (dest_granularity != granularity) {
+            error_report("Error: "
+                         "Migrated bitmap granularity (%" PRIu32 ") "
+                         "doesn't match the destination bitmap '%s' "
+                         "granularity (%" PRIu32 ")",
+                         granularity,
+                         bdrv_dirty_bitmap_name(s->bitmap),
+                         dest_granularity);
+            return -EINVAL;
+        }
+    }
+
+    bdrv_disable_dirty_bitmap(s->bitmap);
+    if (enabled) {
+        DirtyBitmapLoadBitmapState *b;
+
+        bdrv_dirty_bitmap_create_successor(s->bs, s->bitmap, &local_err);
+        if (local_err) {
+            error_report_err(local_err);
+            return -EINVAL;
+        }
+
+        b = g_new(DirtyBitmapLoadBitmapState, 1);
+        b->bs = s->bs;
+        b->bitmap = s->bitmap;
+        b->migrated = false;
+        enabled_bitmaps = g_slist_prepend(enabled_bitmaps, b);
+    }
+
+    return 0;
+}
+
+void dirty_bitmap_mig_before_vm_start(void)
+{
+    GSList *item;
+
+    qemu_mutex_lock(&finish_lock);
+
+    for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
+        DirtyBitmapLoadBitmapState *b = item->data;
+
+        if (b->migrated) {
+            bdrv_enable_dirty_bitmap(b->bitmap);
+        } else {
+            bdrv_dirty_bitmap_enable_successor(b->bitmap);
+        }
+
+        g_free(b);
+    }
+
+    g_slist_free(enabled_bitmaps);
+    enabled_bitmaps = NULL;
+
+    qemu_mutex_unlock(&finish_lock);
+}
+
+static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
+{
+    GSList *item;
+    trace_dirty_bitmap_load_complete();
+    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
+
+    qemu_mutex_lock(&finish_lock);
+
+    for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
+        DirtyBitmapLoadBitmapState *b = item->data;
+
+        if (b->bitmap == s->bitmap) {
+            b->migrated = true;
+        }
+    }
+
+    if (bdrv_dirty_bitmap_frozen(s->bitmap)) {
+        if (enabled_bitmaps == NULL) {
+            /* in postcopy */
+            AioContext *aio_context = bdrv_get_aio_context(s->bs);
+            aio_context_acquire(aio_context);
+
+            bdrv_reclaim_dirty_bitmap(s->bs, s->bitmap, &error_abort);
+            bdrv_enable_dirty_bitmap(s->bitmap);
+
+            aio_context_release(aio_context);
+        } else {
+            /* target not started, successor is empty */
+            bdrv_dirty_bitmap_release_successor(s->bs, s->bitmap);
+        }
+    }
+
+    qemu_mutex_unlock(&finish_lock);
+}
+
+static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
+{
+    uint64_t first_byte = qemu_get_be64(f) << BDRV_SECTOR_BITS;
+    uint64_t nr_bytes = (uint64_t)qemu_get_be32(f) << BDRV_SECTOR_BITS;
+    trace_dirty_bitmap_load_bits_enter(first_byte >> BDRV_SECTOR_BITS,
+                                       nr_bytes >> BDRV_SECTOR_BITS);
+
+    if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
+        trace_dirty_bitmap_load_bits_zeroes();
+        bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
+                                             false);
+    } else {
+        uint8_t *buf;
+        uint64_t buf_size = qemu_get_be64(f);
+        uint64_t needed_size =
+            bdrv_dirty_bitmap_serialization_size(s->bitmap,
+                                                 first_byte, nr_bytes);
+
+        if (needed_size > buf_size) {
+            error_report("Error: Migrated bitmap bit data for bitmap '%s' "
+                         "is smaller than expected; granularity mismatch?",
+                         bdrv_dirty_bitmap_name(s->bitmap));
+            return -EINVAL;
+        }
+
+        buf = g_malloc(buf_size);
+        qemu_get_buffer(f, buf, buf_size);
+        bdrv_dirty_bitmap_deserialize_part(s->bitmap, buf, first_byte, nr_bytes,
+                                           false);
+        g_free(buf);
+    }
+
+    return 0;
+}
+
+static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
+{
+    Error *local_err = NULL;
+    s->flags = qemu_get_bitmap_flags(f);
+    trace_dirty_bitmap_load_header(s->flags);
+
+    if (s->flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
+        if (!qemu_get_counted_string(f, s->node_name)) {
+            error_report("Unable to read node name string");
+            return -EINVAL;
+        }
+        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
+        if (!s->bs) {
+            error_report_err(local_err);
+            return -EINVAL;
+        }
+    } else if (!s->bs) {
+        error_report("Error: block device name is not set");
+        return -EINVAL;
+    }
+
+    if (s->flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
+        if (!qemu_get_counted_string(f, s->bitmap_name)) {
+            error_report("Unable to read bitmap name string");
+            return -EINVAL;
+        }
+        s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
+
+        /* bitmap may be NULL here, it wouldn't be an error if it is the
+         * first occurrence of the bitmap */
+        if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
+            error_report("Error: unknown dirty bitmap "
+                         "'%s' for block device '%s'",
+                         s->bitmap_name, s->node_name);
+            return -EINVAL;
+        }
+    } else if (!s->bitmap) {
+        error_report("Error: bitmap name is not set");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
+{
+    static DirtyBitmapLoadState s;
+    int ret = 0;
+
+    trace_dirty_bitmap_load_enter();
+
+    if (version_id != 1) {
+        return -EINVAL;
+    }
+
+    do {
+        ret = dirty_bitmap_load_header(f, &s);
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (s.flags & DIRTY_BITMAP_MIG_FLAG_START) {
+            ret = dirty_bitmap_load_start(f, &s);
+        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_COMPLETE) {
+            dirty_bitmap_load_complete(f, &s);
+        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_BITS) {
+            ret = dirty_bitmap_load_bits(f, &s);
+        }
+
+        if (!ret) {
+            ret = qemu_file_get_error(f);
+        }
+
+        if (ret) {
+            return ret;
+        }
+    } while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
+
+    trace_dirty_bitmap_load_success();
+    return 0;
+}
+
+static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
+{
+    DirtyBitmapMigBitmapState *dbms = NULL;
+    if (init_dirty_bitmap_migration() < 0) {
+        return -1;
+    }
+
+    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
+        send_bitmap_start(f, dbms);
+    }
+    qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
+
+    return 0;
+}
+
+static bool dirty_bitmap_is_active(void *opaque)
+{
+    return migrate_dirty_bitmaps();
+}
+
+static bool dirty_bitmap_is_active_iterate(void *opaque)
+{
+    return dirty_bitmap_is_active(opaque) && !runstate_is_running();
+}
+
+static bool dirty_bitmap_has_postcopy(void *opaque)
+{
+    return true;
+}
+
+static SaveVMHandlers savevm_dirty_bitmap_handlers = {
+    .save_live_setup = dirty_bitmap_save_setup,
+    .save_live_complete_postcopy = dirty_bitmap_save_complete,
+    .save_live_complete_precopy = dirty_bitmap_save_complete,
+    .has_postcopy = dirty_bitmap_has_postcopy,
+    .save_live_pending = dirty_bitmap_save_pending,
+    .save_live_iterate = dirty_bitmap_save_iterate,
+    .is_active_iterate = dirty_bitmap_is_active_iterate,
+    .load_state = dirty_bitmap_load,
+    .cleanup = dirty_bitmap_migration_cleanup,
+    .is_active = dirty_bitmap_is_active,
+};
+
+void dirty_bitmap_mig_init(void)
+{
+    QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
+
+    register_savevm_live(NULL, "dirty-bitmap", 0, 1,
+                         &savevm_dirty_bitmap_handlers,
+                         &dirty_bitmap_mig_state);
+}
diff --git a/migration/migration.c b/migration/migration.c
index 48adc812bc..5c49e68f8e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -143,6 +143,9 @@ MigrationIncomingState *migration_incoming_get_current(void)
         memset(&mis_current, 0, sizeof(MigrationIncomingState));
         qemu_mutex_init(&mis_current.rp_mutex);
         qemu_event_init(&mis_current.main_thread_load_event, false);
+
+        init_dirty_bitmap_incoming_migration();
+
         once = true;
     }
     return &mis_current;
diff --git a/migration/migration.h b/migration/migration.h
index 0c6ba56423..8d9f4cd41e 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -196,4 +196,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
 void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
                               ram_addr_t start, size_t len);
 
+void dirty_bitmap_mig_before_vm_start(void);
+void init_dirty_bitmap_incoming_migration(void);
+
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index f03ee02f67..c67c1d9110 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1660,6 +1660,8 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
 
     trace_loadvm_postcopy_handle_run_vmstart();
 
+    dirty_bitmap_mig_before_vm_start();
+
     if (autostart) {
         /* Hold onto your hats, starting the CPU */
         vm_start();
diff --git a/migration/trace-events b/migration/trace-events
index 34cc2adafc..672551bd5a 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -224,3 +224,17 @@ colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
 colo_send_message(const char *msg) "Send '%s' message"
 colo_receive_message(const char *msg) "Receive '%s' message"
 colo_failover_set_state(const char *new_state) "new state %s"
+
+# migration/block-dirty-bitmap.c
+send_bitmap_header_enter(void) ""
+send_bitmap_bits(uint32_t flags, uint64_t start_sector, uint32_t nr_sectors, uint64_t data_size) "\n   flags:        %x\n   start_sector: %" PRIu64 "\n   nr_sectors:   %" PRIu32 "\n   data_size:    %" PRIu64 "\n"
+dirty_bitmap_save_iterate(int in_postcopy) "in postcopy: %d"
+dirty_bitmap_save_complete_enter(void) ""
+dirty_bitmap_save_complete_finish(void) ""
+dirty_bitmap_save_pending(uint64_t pending, uint64_t max_size) "pending %" PRIu64 " max: %" PRIu64
+dirty_bitmap_load_complete(void) ""
+dirty_bitmap_load_bits_enter(uint64_t first_sector, uint32_t nr_sectors) "chunk: %" PRIu64 " %" PRIu32
+dirty_bitmap_load_bits_zeroes(void) ""
+dirty_bitmap_load_header(uint32_t flags) "flags %x"
+dirty_bitmap_load_enter(void) ""
+dirty_bitmap_load_success(void) ""
diff --git a/vl.c b/vl.c
index 36ff3f4345..369d8cecd0 100644
--- a/vl.c
+++ b/vl.c
@@ -4523,6 +4523,7 @@ int main(int argc, char **argv, char **envp)
 
     blk_mig_init();
     ram_mig_init();
+    dirty_bitmap_mig_init();
 
     /* If the currently selected machine wishes to override the units-per-bus
      * property of its default HBA interface type, do so now. */
-- 
2.11.1


* [Qemu-devel] [PATCH v7 13/16] iotests: add add_incoming_migration to VM class
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 12/16] migration: add postcopy migration of dirty bitmaps Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 14/16] iotests: add default node-name Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 tests/qemu-iotests/iotests.py | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index abcf3c10e2..f64277d0b4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -185,6 +185,12 @@ class VM(qtest.QEMUQtestMachine):
             self._args.append(','.join(opts))
         return self
 
+    def add_incoming_migration(self, desc):
+        '''Add an incoming migration to the VM'''
+        self._args.append('-incoming')
+        self._args.append(desc)
+        return self
+
     def pause_drive(self, drive, event=None):
         '''Pause drive r/w operations'''
         if not event:
-- 
2.11.1


* [Qemu-devel] [PATCH v7 14/16] iotests: add default node-name
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 13/16] iotests: add add_incoming_migration to VM class Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 15/16] iotests: add dirty bitmap migration test Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 16/16] iotests: add dirty bitmap postcopy test Vladimir Sementsov-Ogievskiy
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

When testing migration, the node-names auto-generated by qemu differ
between the source and the destination qemu, and migration fails. After
this patch, node-names auto-generated by iotests will be the same.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/iotests.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f64277d0b4..3dcf539730 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -168,6 +168,8 @@ class VM(qtest.QEMUQtestMachine):
             options.append('file=%s' % path)
             options.append('format=%s' % format)
             options.append('cache=%s' % cachemode)
+            if 'node-name' not in opts:
+                options.append('node-name=drivenode%d' % self._num_drives)
 
         if opts:
             options.append(opts)
-- 
2.11.1


* [Qemu-devel] [PATCH v7 15/16] iotests: add dirty bitmap migration test
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 14/16] iotests: add default node-name Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 16/16] iotests: add dirty bitmap postcopy test Vladimir Sementsov-Ogievskiy
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

The test starts two vms (vm_a, vm_b), creates a dirty bitmap in
the first one, does several writes to the corresponding device and
then migrates vm_a to vm_b with dirty bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/169     | 86 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/169.out |  5 +++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 92 insertions(+)
 create mode 100755 tests/qemu-iotests/169
 create mode 100644 tests/qemu-iotests/169.out

diff --git a/tests/qemu-iotests/169 b/tests/qemu-iotests/169
new file mode 100755
index 0000000000..ec6cd46a4d
--- /dev/null
+++ b/tests/qemu-iotests/169
@@ -0,0 +1,86 @@
+#!/usr/bin/env python
+#
+# Tests for dirty bitmaps migration.
+#
+# Copyright (C) Vladimir Sementsov-Ogievskiy 2015-2016
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+import time
+from iotests import qemu_img
+
+disk_a = os.path.join(iotests.test_dir, 'disk_a')
+disk_b = os.path.join(iotests.test_dir, 'disk_b')
+fifo = os.path.join(iotests.test_dir, 'mig_fifo')
+
+class TestDirtyBitmapMigration(iotests.QMPTestCase):
+
+    def setUp(self):
+        size = 0x400000000 # 16G
+        os.mkfifo(fifo)
+        qemu_img('create', '-f', iotests.imgfmt, disk_a, str(size))
+        qemu_img('create', '-f', iotests.imgfmt, disk_b, str(size))
+        self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
+        self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+        self.vm_b.add_incoming_migration("exec: cat '" + fifo + "'")
+        self.vm_a.launch()
+        self.vm_b.launch()
+
+    def tearDown(self):
+        self.vm_a.shutdown()
+        self.vm_b.shutdown()
+        os.remove(disk_a)
+        os.remove(disk_b)
+        os.remove(fifo)
+
+    def test_migration(self):
+        granularity = 512
+        regions = [
+            { 'start': 0,           'count': 0x100000 },
+            { 'start': 0x100000000, 'count': 0x200000  },
+            { 'start': 0x399900000, 'count': 0x100000  }
+            ]
+
+        result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+                               name='bitmap', granularity=granularity)
+        self.assert_qmp(result, 'return', {});
+
+        for r in regions:
+            self.vm_a.hmp_qemu_io('drive0',
+                                  'write %d %d' % (r['start'], r['count']))
+
+        result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap')
+        sha256 = result['return']['sha256']
+
+        result = self.vm_a.qmp('migrate-set-capabilities',
+                               capabilities=[{'capability': 'dirty-bitmaps',
+                                              'state': True}])
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
+        self.assertNotEqual(self.vm_a.event_wait("STOP"), None)
+        self.assertNotEqual(self.vm_b.event_wait("RESUME"), None)
+        time.sleep(2)
+
+        result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap')
+        self.assert_qmp(result, 'return/sha256', sha256);
+
+
+if __name__ == '__main__':
+    iotests.main()
diff --git a/tests/qemu-iotests/169.out b/tests/qemu-iotests/169.out
new file mode 100644
index 0000000000..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/169.out
@@ -0,0 +1,5 @@
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 40bef995fe..a0b7972879 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -165,6 +165,7 @@
 160 rw auto quick
 162 auto quick
 165 rw auto quick
+169 rw auto quick
 170 rw auto quick
 171 rw auto quick
 172 auto
-- 
2.11.1


* [Qemu-devel] [PATCH v7 16/16] iotests: add dirty bitmap postcopy test
  2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 15/16] iotests: add dirty bitmap migration test Vladimir Sementsov-Ogievskiy
@ 2017-07-10 16:30 ` Vladimir Sementsov-Ogievskiy
  15 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-10 16:30 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, vsementsov,
	lirans

Test
- start two vms (vm_a, vm_b)

- in a
    - do writes from set A
    - do writes from set B
    - fix bitmap sha256
    - clear bitmap
    - do writes from set A
    - start migration
- then, in b
    - wait for vm start (postcopy should start)
    - do writes from set B
    - check bitmap sha256

The test verifies postcopy migration and the subsequent merge with the
delta (changes made in the target during the postcopy process).
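
The sha256 equality this relies on — bits dirtied in the source after the
clear, merged with bits dirtied in the target during postcopy, equal a
single bitmap that saw both write sets — can be modelled with plain sets
(sizes scaled down; the helper names are illustrative, not QEMU API):

```python
GRANULARITY = 512
CHUNK = 4096
STEP = 0x10000
WRITE_SIZE = 0x100000  # scaled down from the test's 1G of writes

def mark(bitmap, offset, length, granularity=GRANULARITY):
    """Mark the granules touched by a write, like a block dirty bitmap."""
    for g in range(offset // granularity, (offset + length - 1) // granularity + 1):
        bitmap.add(g)

def writes(start):
    s = start
    while s < WRITE_SIZE:
        yield s
        s += STEP

# Reference run: write sets A and B applied to one bitmap.
ref = set()
for s in writes(0x0000):
    mark(ref, s, CHUNK)
for s in writes(0x8000):
    mark(ref, s, CHUNK)

# Migration run: the source redoes set A after a clear, the target does
# set B during postcopy; the merged result must match the reference.
src, dst = set(), set()
for s in writes(0x0000):
    mark(src, s, CHUNK)
for s in writes(0x8000):
    mark(dst, s, CHUNK)
assert src | dst == ref  # same sha256 in the real test
```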

Reduce the supported cache modes to 'none' only, because with caching
enabled the time from source.STOP to target.RESUME is unpredictable and
we can hit a timeout while waiting for target.RESUME.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/169        | 74 +++++++++++++++++++++++++++++++++++++------
 tests/qemu-iotests/169.out    |  4 +--
 tests/qemu-iotests/iotests.py |  7 +++-
 3 files changed, 72 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/169 b/tests/qemu-iotests/169
index ec6cd46a4d..333df58e0d 100755
--- a/tests/qemu-iotests/169
+++ b/tests/qemu-iotests/169
@@ -29,8 +29,14 @@ fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 class TestDirtyBitmapMigration(iotests.QMPTestCase):
 
-    def setUp(self):
-        size = 0x400000000 # 1G
+    def tearDown(self):
+        self.vm_a.shutdown()
+        self.vm_b.shutdown()
+        os.remove(disk_a)
+        os.remove(disk_b)
+        os.remove(fifo)
+
+    def init(self, size):
         os.mkfifo(fifo)
         qemu_img('create', '-f', iotests.imgfmt, disk_a, str(size))
         qemu_img('create', '-f', iotests.imgfmt, disk_b, str(size))
@@ -40,14 +46,8 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
         self.vm_a.launch()
         self.vm_b.launch()
 
-    def tearDown(self):
-        self.vm_a.shutdown()
-        self.vm_b.shutdown()
-        os.remove(disk_a)
-        os.remove(disk_b)
-        os.remove(fifo)
-
     def test_migration(self):
+        self.init(0x400000000) # 16G
         granularity = 512
         regions = [
             { 'start': 0,           'count': 0x100000 },
@@ -81,6 +81,60 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
                                node='drive0', name='bitmap')
         self.assert_qmp(result, 'return/sha256', sha256);
 
+    def test_postcopy(self):
+        self.init(0x4000000000) # 256G
+        write_size = 0x40000000
+        granularity = 512
+        chunk = 4096
+
+        result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+                               name='bitmap', granularity=granularity)
+        self.assert_qmp(result, 'return', {});
+
+        s = 0
+        while s < write_size:
+            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x10000
+        s = 0x8000
+        while s < write_size:
+            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x10000
+
+        result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap')
+        sha256 = result['return']['sha256']
+
+        result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
+                               name='bitmap')
+        self.assert_qmp(result, 'return', {});
+        s = 0
+        while s < write_size:
+            self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x10000
+
+        result = self.vm_a.qmp('migrate-set-capabilities',
+                               capabilities=[{'capability': 'dirty-bitmaps',
+                                              'state': True}])
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
+        self.assertNotEqual(self.vm_a.event_wait("STOP"), None)
+        self.assertNotEqual(self.vm_b.event_wait("RESUME"), None)
+
+        s = 0x8000
+        while s < write_size:
+            self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+            s += 0x10000
+
+        result = self.vm_b.qmp('query-block');
+        while len(result['return'][0]['dirty-bitmaps']) > 1:
+            time.sleep(2)
+            result = self.vm_b.qmp('query-block');
+
+        result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
+                               node='drive0', name='bitmap')
+
+        self.assert_qmp(result, 'return/sha256', sha256);
 
 if __name__ == '__main__':
-    iotests.main()
+    iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'])
diff --git a/tests/qemu-iotests/169.out b/tests/qemu-iotests/169.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/169.out
+++ b/tests/qemu-iotests/169.out
@@ -1,5 +1,5 @@
-.
+..
 ----------------------------------------------------------------------
-Ran 1 tests
+Ran 2 tests
 
 OK
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 3dcf539730..3a0b40f8e9 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -371,6 +371,10 @@ def verify_platform(supported_oses=['linux']):
     if True not in [sys.platform.startswith(x) for x in supported_oses]:
         notrun('not suitable for this OS: %s' % sys.platform)
 
+def verify_cache_mode(supported_cache_modes=[]):
+    if supported_cache_modes and (cachemode not in supported_cache_modes):
+        notrun('not suitable for this cache mode: %s' % cachemode)
+
 def supports_quorum():
     return 'quorum' in qemu_img_pipe('--help')
 
@@ -379,7 +383,7 @@ def verify_quorum():
     if not supports_quorum():
         notrun('quorum support missing')
 
-def main(supported_fmts=[], supported_oses=['linux']):
+def main(supported_fmts=[], supported_oses=['linux'], supported_cache_modes=[]):
     '''Run tests'''
 
     global debug
@@ -396,6 +400,7 @@ def main(supported_fmts=[], supported_oses=['linux']):
     verbosity = 1
     verify_image_format(supported_fmts)
     verify_platform(supported_oses)
+    verify_cache_mode(supported_cache_modes)
 
     # We need to filter out the time taken from the output so that qemu-iotest
     # can reliably diff the results against master output.
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v7 06/16] qapi: add dirty-bitmaps migration capability
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 06/16] qapi: add dirty-bitmaps migration capability Vladimir Sementsov-Ogievskiy
@ 2017-07-10 21:03   ` Eric Blake
  0 siblings, 0 replies; 34+ messages in thread
From: Eric Blake @ 2017-07-10 21:03 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: pbonzini, armbru, famz, stefanha, amit.shah, quintela, mreitz,
	kwolf, peter.maydell, dgilbert, den, jsnow, lirans

[-- Attachment #1: Type: text/plain, Size: 1257 bytes --]

On 07/10/2017 11:30 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/migration.c | 9 +++++++++
>  migration/migration.h | 1 +
>  qapi-schema.json      | 4 +++-
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 

> +++ b/qapi-schema.json
> @@ -903,12 +903,14 @@
>  # @return-path: If enabled, migration will use the return path even
>  #               for precopy. (since 2.10)
>  #
> +# @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps. (since 2.9)

We've missed 2.9; this should be 2.10.

> +#
>  # Since: 1.2
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
>             'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
> -           'block', 'return-path' ] }
> +           'block', 'return-path', 'dirty-bitmaps' ] }
>  
>  ##
>  # @MigrationCapabilityStatus:
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy Vladimir Sementsov-Ogievskiy
@ 2017-07-11 13:06   ` Dr. David Alan Gilbert
  2017-07-11 13:38     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 34+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-11 13:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: qemu-block, qemu-devel, pbonzini, armbru, eblake, famz, stefanha,
	quintela, mreitz, kwolf, peter.maydell, den, jsnow, lirans

* Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> Split common postcopy stuff from ram postcopy stuff.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I'm OK with this; I'm not sure I'd have bothered making the pings
dependent on it being RAM.

(These first few are pretty much a separable series).

Note a few things below I'd prefer if you reroll:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 39 ++++++++++++++++++++++++++-------------
>  migration/migration.h |  2 ++
>  migration/savevm.c    | 48 +++++++++++++++++++++++++++++++++++++++---------
>  3 files changed, 67 insertions(+), 22 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 51ccd1a4c5..d3a2fd405a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1221,6 +1221,11 @@ bool migrate_postcopy_ram(void)
>      return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_RAM];
>  }
>  
> +bool migrate_postcopy(void)
> +{
> +    return migrate_postcopy_ram();
> +}
> +
>  bool migrate_auto_converge(void)
>  {
>      MigrationState *s;
> @@ -1577,9 +1582,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>       * need to tell the destination to throw any pages it's already received
>       * that are dirty
>       */
> -    if (ram_postcopy_send_discard_bitmap(ms)) {
> -        error_report("postcopy send discard bitmap failed");
> -        goto fail;
> +    if (migrate_postcopy_ram()) {
> +        if (ram_postcopy_send_discard_bitmap(ms)) {

I'd rather:
       if (migrate_postcopy_ram() &&
           ram_postcopy_send_discard_bitmap(ms)) {

avoiding one extra layer of {}'s

Dave

> +            error_report("postcopy send discard bitmap failed");
> +            goto fail;
> +        }
>      }
>  
>      /*
> @@ -1588,8 +1595,10 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>       * wrap their state up here
>       */
>      qemu_file_set_rate_limit(ms->to_dst_file, INT64_MAX);
> -    /* Ping just for debugging, helps line traces up */
> -    qemu_savevm_send_ping(ms->to_dst_file, 2);
> +    if (migrate_postcopy_ram()) {
> +        /* Ping just for debugging, helps line traces up */
> +        qemu_savevm_send_ping(ms->to_dst_file, 2);
> +    }
>  
>      /*
>       * While loading the device state we may trigger page transfer
> @@ -1614,7 +1623,9 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>      qemu_savevm_send_postcopy_listen(fb);
>  
>      qemu_savevm_state_complete_precopy(fb, false, false);
> -    qemu_savevm_send_ping(fb, 3);
> +    if (migrate_postcopy_ram()) {
> +        qemu_savevm_send_ping(fb, 3);
> +    }
>  
>      qemu_savevm_send_postcopy_run(fb);
>  
> @@ -1649,11 +1660,13 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>  
>      qemu_mutex_unlock_iothread();
>  
> -    /*
> -     * Although this ping is just for debug, it could potentially be
> -     * used for getting a better measurement of downtime at the source.
> -     */
> -    qemu_savevm_send_ping(ms->to_dst_file, 4);
> +    if (migrate_postcopy_ram()) {
> +        /*
> +         * Although this ping is just for debug, it could potentially be
> +         * used for getting a better measurement of downtime at the source.
> +         */
> +        qemu_savevm_send_ping(ms->to_dst_file, 4);
> +    }
>  
>      if (migrate_release_ram()) {
>          ram_postcopy_migrated_memory_release(ms);
> @@ -1831,7 +1844,7 @@ static void *migration_thread(void *opaque)
>          qemu_savevm_send_ping(s->to_dst_file, 1);
>      }
>  
> -    if (migrate_postcopy_ram()) {
> +    if (migrate_postcopy()) {
>          /*
>           * Tell the destination that we *might* want to do postcopy later;
>           * if the other end can't do postcopy it should fail now, nice and
> @@ -1864,7 +1877,7 @@ static void *migration_thread(void *opaque)
>              if (pending_size && pending_size >= threshold_size) {
>                  /* Still a significant amount to transfer */
>  
> -                if (migrate_postcopy_ram() &&
> +                if (migrate_postcopy() &&
>                      s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
>                      pend_nonpost <= threshold_size &&
>                      atomic_read(&s->start_postcopy)) {
> diff --git a/migration/migration.h b/migration/migration.h
> index 148c9facbc..1d974bacce 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -165,6 +165,8 @@ bool migration_is_blocked(Error **errp);
>  bool migration_in_postcopy(void);
>  MigrationState *migrate_get_current(void);
>  
> +bool migrate_postcopy(void);
> +
>  bool migrate_release_ram(void);
>  bool migrate_postcopy_ram(void);
>  bool migrate_zero_blocks(void);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 37da83509f..cf79e1d3ac 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -89,7 +89,7 @@ static struct mig_cmd_args {
>      [MIG_CMD_INVALID]          = { .len = -1, .name = "INVALID" },
>      [MIG_CMD_OPEN_RETURN_PATH] = { .len =  0, .name = "OPEN_RETURN_PATH" },
>      [MIG_CMD_PING]             = { .len = sizeof(uint32_t), .name = "PING" },
> -    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = 16, .name = "POSTCOPY_ADVISE" },
> +    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = -1, .name = "POSTCOPY_ADVISE" },
>      [MIG_CMD_POSTCOPY_LISTEN]  = { .len =  0, .name = "POSTCOPY_LISTEN" },
>      [MIG_CMD_POSTCOPY_RUN]     = { .len =  0, .name = "POSTCOPY_RUN" },
>      [MIG_CMD_POSTCOPY_RAM_DISCARD] = {
> @@ -98,6 +98,23 @@ static struct mig_cmd_args {
>      [MIG_CMD_MAX]              = { .len = -1, .name = "MAX" },
>  };
>  
> +/* Note for MIG_CMD_POSTCOPY_ADVISE:
> + * The format of arguments depends on the postcopy mode:
> + * - postcopy RAM only
> + *   uint64_t host page size
> + *   uint64_t target page size
> + *
> + * - postcopy RAM and postcopy dirty bitmaps
> + *   format is the same as for postcopy RAM only
> + *
> + * - postcopy dirty bitmaps only
> + *   Nothing. Command length field is 0.
> + *
> + * Be careful: adding a new postcopy entity with some other parameters should
> + * not break the format's self-description ability. A good way is to introduce
> + * a generic extensible format with an exception for the two old entities.
> + */
> +
>  static int announce_self_create(uint8_t *buf,
>                                  uint8_t *mac_addr)
>  {
> @@ -861,12 +878,17 @@ int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
>  /* Send prior to any postcopy transfer */
>  void qemu_savevm_send_postcopy_advise(QEMUFile *f)
>  {
> -    uint64_t tmp[2];
> -    tmp[0] = cpu_to_be64(ram_pagesize_summary());
> -    tmp[1] = cpu_to_be64(qemu_target_page_size());
> +    if (migrate_postcopy_ram()) {
> +        uint64_t tmp[2];
> +        tmp[0] = cpu_to_be64(ram_pagesize_summary());
> +        tmp[1] = cpu_to_be64(qemu_target_page_size());
>  
> -    trace_qemu_savevm_send_postcopy_advise();
> -    qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 16, (uint8_t *)tmp);
> +        trace_qemu_savevm_send_postcopy_advise();
> +        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE,
> +                                 16, (uint8_t *)tmp);
> +    } else {
> +        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 0, NULL);
> +    }
>  }
>  
>  /* Sent prior to starting the destination running in postcopy, discard pages
> @@ -1352,6 +1374,10 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
>          return -1;
>      }
>  
> +    if (!migrate_postcopy_ram()) {
> +        return 0;
> +    }
> +
>      if (!postcopy_ram_supported_by_host()) {
>          postcopy_state_set(POSTCOPY_INCOMING_NONE);
>          return -1;
> @@ -1562,7 +1588,9 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>           * A rare case, we entered listen without having to do any discards,
>           * so do the setup that's normally done at the time of the 1st discard.
>           */
> -        postcopy_ram_prepare_discard(mis);
> +        if (migrate_postcopy_ram()) {
> +            postcopy_ram_prepare_discard(mis);
> +        }
>      }
>  
>      /*
> @@ -1570,8 +1598,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>       * However, at this point the CPU shouldn't be running, and the IO
>       * shouldn't be doing anything yet so don't actually expect requests
>       */
> -    if (postcopy_ram_enable_notify(mis)) {
> -        return -1;
> +    if (migrate_postcopy_ram()) {
> +        if (postcopy_ram_enable_notify(mis)) {
> +            return -1;
> +        }
>      }
>  
>      if (mis->have_listen_thread) {
> -- 
> 2.11.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-07-11 13:06   ` Dr. David Alan Gilbert
@ 2017-07-11 13:38     ` Vladimir Sementsov-Ogievskiy
  2017-08-21 23:34       ` John Snow
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-07-11 13:38 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-block, qemu-devel, pbonzini, armbru, eblake, famz, stefanha,
	quintela, mreitz, kwolf, peter.maydell, den, jsnow, lirans

11.07.2017 16:06, Dr. David Alan Gilbert wrote:
> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>> Split common postcopy stuff from ram postcopy stuff.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> I'm OK with this; I'm not sure I'd have bothered making the pings
> dependent on it being RAM.
>
> (These first few are pretty much a separable series).

It would be great if you (or Juan?) could take them separately.

>
> Note a few things below I'd prefer if you reroll:
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
>> ---
>>   migration/migration.c | 39 ++++++++++++++++++++++++++-------------
>>   migration/migration.h |  2 ++
>>   migration/savevm.c    | 48 +++++++++++++++++++++++++++++++++++++++---------
>>   3 files changed, 67 insertions(+), 22 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 51ccd1a4c5..d3a2fd405a 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1221,6 +1221,11 @@ bool migrate_postcopy_ram(void)
>>       return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_RAM];
>>   }
>>   
>> +bool migrate_postcopy(void)
>> +{
>> +    return migrate_postcopy_ram();
>> +}
>> +
>>   bool migrate_auto_converge(void)
>>   {
>>       MigrationState *s;
>> @@ -1577,9 +1582,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>>        * need to tell the destination to throw any pages it's already received
>>        * that are dirty
>>        */
>> -    if (ram_postcopy_send_discard_bitmap(ms)) {
>> -        error_report("postcopy send discard bitmap failed");
>> -        goto fail;
>> +    if (migrate_postcopy_ram()) {
>> +        if (ram_postcopy_send_discard_bitmap(ms)) {
> I'd rather:
>         if (migrate_postcopy_ram() &&
>             ram_postcopy_send_discard_bitmap(ms)) {
>
> avoiding one extra layer of {}'s
>
> Dave
>
>> +            error_report("postcopy send discard bitmap failed");
>> +            goto fail;
>> +        }
>>       }
>>   
>>       /*
>> @@ -1588,8 +1595,10 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>>        * wrap their state up here
>>        */
>>       qemu_file_set_rate_limit(ms->to_dst_file, INT64_MAX);
>> -    /* Ping just for debugging, helps line traces up */
>> -    qemu_savevm_send_ping(ms->to_dst_file, 2);
>> +    if (migrate_postcopy_ram()) {
>> +        /* Ping just for debugging, helps line traces up */
>> +        qemu_savevm_send_ping(ms->to_dst_file, 2);
>> +    }
>>   
>>       /*
>>        * While loading the device state we may trigger page transfer
>> @@ -1614,7 +1623,9 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>>       qemu_savevm_send_postcopy_listen(fb);
>>   
>>       qemu_savevm_state_complete_precopy(fb, false, false);
>> -    qemu_savevm_send_ping(fb, 3);
>> +    if (migrate_postcopy_ram()) {
>> +        qemu_savevm_send_ping(fb, 3);
>> +    }
>>   
>>       qemu_savevm_send_postcopy_run(fb);
>>   
>> @@ -1649,11 +1660,13 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>>   
>>       qemu_mutex_unlock_iothread();
>>   
>> -    /*
>> -     * Although this ping is just for debug, it could potentially be
>> -     * used for getting a better measurement of downtime at the source.
>> -     */
>> -    qemu_savevm_send_ping(ms->to_dst_file, 4);
>> +    if (migrate_postcopy_ram()) {
>> +        /*
>> +         * Although this ping is just for debug, it could potentially be
>> +         * used for getting a better measurement of downtime at the source.
>> +         */
>> +        qemu_savevm_send_ping(ms->to_dst_file, 4);
>> +    }
>>   
>>       if (migrate_release_ram()) {
>>           ram_postcopy_migrated_memory_release(ms);
>> @@ -1831,7 +1844,7 @@ static void *migration_thread(void *opaque)
>>           qemu_savevm_send_ping(s->to_dst_file, 1);
>>       }
>>   
>> -    if (migrate_postcopy_ram()) {
>> +    if (migrate_postcopy()) {
>>           /*
>>            * Tell the destination that we *might* want to do postcopy later;
>>            * if the other end can't do postcopy it should fail now, nice and
>> @@ -1864,7 +1877,7 @@ static void *migration_thread(void *opaque)
>>               if (pending_size && pending_size >= threshold_size) {
>>                   /* Still a significant amount to transfer */
>>   
>> -                if (migrate_postcopy_ram() &&
>> +                if (migrate_postcopy() &&
>>                       s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
>>                       pend_nonpost <= threshold_size &&
>>                       atomic_read(&s->start_postcopy)) {
>> diff --git a/migration/migration.h b/migration/migration.h
>> index 148c9facbc..1d974bacce 100644
>> --- a/migration/migration.h
>> +++ b/migration/migration.h
>> @@ -165,6 +165,8 @@ bool migration_is_blocked(Error **errp);
>>   bool migration_in_postcopy(void);
>>   MigrationState *migrate_get_current(void);
>>   
>> +bool migrate_postcopy(void);
>> +
>>   bool migrate_release_ram(void);
>>   bool migrate_postcopy_ram(void);
>>   bool migrate_zero_blocks(void);
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 37da83509f..cf79e1d3ac 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -89,7 +89,7 @@ static struct mig_cmd_args {
>>       [MIG_CMD_INVALID]          = { .len = -1, .name = "INVALID" },
>>       [MIG_CMD_OPEN_RETURN_PATH] = { .len =  0, .name = "OPEN_RETURN_PATH" },
>>       [MIG_CMD_PING]             = { .len = sizeof(uint32_t), .name = "PING" },
>> -    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = 16, .name = "POSTCOPY_ADVISE" },
>> +    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = -1, .name = "POSTCOPY_ADVISE" },
>>       [MIG_CMD_POSTCOPY_LISTEN]  = { .len =  0, .name = "POSTCOPY_LISTEN" },
>>       [MIG_CMD_POSTCOPY_RUN]     = { .len =  0, .name = "POSTCOPY_RUN" },
>>       [MIG_CMD_POSTCOPY_RAM_DISCARD] = {
>> @@ -98,6 +98,23 @@ static struct mig_cmd_args {
>>       [MIG_CMD_MAX]              = { .len = -1, .name = "MAX" },
>>   };
>>   
>> +/* Note for MIG_CMD_POSTCOPY_ADVISE:
>> + * The format of arguments depends on the postcopy mode:
>> + * - postcopy RAM only
>> + *   uint64_t host page size
>> + *   uint64_t target page size
>> + *
>> + * - postcopy RAM and postcopy dirty bitmaps
>> + *   format is the same as for postcopy RAM only
>> + *
>> + * - postcopy dirty bitmaps only
>> + *   Nothing. Command length field is 0.
>> + *
>> + * Be careful: adding a new postcopy entity with some other parameters should
>> + * not break the format's self-description ability. A good way is to introduce
>> + * a generic extensible format with an exception for the two old entities.
>> + */
>> +
>>   static int announce_self_create(uint8_t *buf,
>>                                   uint8_t *mac_addr)
>>   {
>> @@ -861,12 +878,17 @@ int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
>>   /* Send prior to any postcopy transfer */
>>   void qemu_savevm_send_postcopy_advise(QEMUFile *f)
>>   {
>> -    uint64_t tmp[2];
>> -    tmp[0] = cpu_to_be64(ram_pagesize_summary());
>> -    tmp[1] = cpu_to_be64(qemu_target_page_size());
>> +    if (migrate_postcopy_ram()) {
>> +        uint64_t tmp[2];
>> +        tmp[0] = cpu_to_be64(ram_pagesize_summary());
>> +        tmp[1] = cpu_to_be64(qemu_target_page_size());
>>   
>> -    trace_qemu_savevm_send_postcopy_advise();
>> -    qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 16, (uint8_t *)tmp);
>> +        trace_qemu_savevm_send_postcopy_advise();
>> +        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE,
>> +                                 16, (uint8_t *)tmp);
>> +    } else {
>> +        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 0, NULL);
>> +    }
>>   }
>>   
>>   /* Sent prior to starting the destination running in postcopy, discard pages
>> @@ -1352,6 +1374,10 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
>>           return -1;
>>       }
>>   
>> +    if (!migrate_postcopy_ram()) {
>> +        return 0;
>> +    }
>> +
>>       if (!postcopy_ram_supported_by_host()) {
>>           postcopy_state_set(POSTCOPY_INCOMING_NONE);
>>           return -1;
>> @@ -1562,7 +1588,9 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>>            * A rare case, we entered listen without having to do any discards,
>>            * so do the setup that's normally done at the time of the 1st discard.
>>            */
>> -        postcopy_ram_prepare_discard(mis);
>> +        if (migrate_postcopy_ram()) {
>> +            postcopy_ram_prepare_discard(mis);
>> +        }
>>       }
>>   
>>       /*
>> @@ -1570,8 +1598,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>>        * However, at this point the CPU shouldn't be running, and the IO
>>        * shouldn't be doing anything yet so don't actually expect requests
>>        */
>> -    if (postcopy_ram_enable_notify(mis)) {
>> -        return -1;
>> +    if (migrate_postcopy_ram()) {
>> +        if (postcopy_ram_enable_notify(mis)) {
>> +            return -1;
>> +        }
>>       }
>>   
>>       if (mis->have_listen_thread) {
>> -- 
>> 2.11.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v7 07/16] block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 07/16] block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor Vladimir Sementsov-Ogievskiy
@ 2017-08-14  9:37   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-08-14  9:37 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, lirans

10.07.2017 19:30, Vladimir Sementsov-Ogievskiy wrote:
> To just release successor and unfreeze bitmap without any additional
> work.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> ---
>   block/dirty-bitmap.c         | 13 +++++++++++++
>   include/block/dirty-bitmap.h |  2 ++
>   2 files changed, 15 insertions(+)
>
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 7b598c36bf..b5678d2d25 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -303,6 +303,19 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
>       return parent;
>   }
>   
> +/* Called with BQL taken. */
> +void bdrv_dirty_bitmap_release_successor(BlockDriverState *bs,
> +                                         BdrvDirtyBitmap *parent)
> +{
> +    qemu_mutex_lock(parent->mutex);
> +
> +    if (parent->successor) {
> +        bdrv_release_dirty_bitmap(bs, parent->successor);

this will try to lock the same mutex and deadlock.

> +        parent->successor = NULL;
> +    }
> +
> +    qemu_mutex_unlock(parent->mutex);
> +}
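
A minimal, hypothetical sketch of how such a self-deadlock can be avoided:
detach the successor while holding the shared lock, but call the releasing
function only after dropping it. This is illustration of the pattern only
(modelled in Python with a non-reentrant lock), not the actual QEMU fix:

```python
import threading

class Bitmap:
    def __init__(self, name, lock):
        self.name = name
        self.lock = lock          # parent and successor share one lock
        self.successor = None

released = []

def release_bitmap(bitmap):
    # Like bdrv_release_dirty_bitmap(): takes the bitmap's own lock,
    # so it must not be called while that lock is already held.
    with bitmap.lock:
        released.append(bitmap.name)

def release_successor(parent):
    # Detach under the lock, then release outside of it -- calling
    # release_bitmap() while still holding the (non-recursive) shared
    # lock would deadlock, which is the bug pointed out above.
    with parent.lock:
        successor = parent.successor
        parent.successor = None
    if successor:
        release_bitmap(successor)

lock = threading.Lock()
parent = Bitmap('bitmap', lock)
parent.successor = Bitmap('bitmap-successor', lock)
release_successor(parent)
```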
>   /**
>    * Truncates _all_ bitmaps attached to a BDS.
>    * Called with BQL taken.
> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
> index d00a7a4b7c..fa2f670050 100644
> --- a/include/block/dirty-bitmap.h
> +++ b/include/block/dirty-bitmap.h
> @@ -20,6 +20,8 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
>   BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
>                                              BdrvDirtyBitmap *bitmap,
>                                              Error **errp);
> +void bdrv_dirty_bitmap_release_successor(BlockDriverState *bs,
> +                                         BdrvDirtyBitmap *bitmap);
>   void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap);
>   BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
>                                           const char *name);


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-07-11 13:38     ` Vladimir Sementsov-Ogievskiy
@ 2017-08-21 23:34       ` John Snow
  2017-09-18 14:07         ` [Qemu-devel] ping " Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 34+ messages in thread
From: John Snow @ 2017-08-21 23:34 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Dr. David Alan Gilbert
  Cc: kwolf, peter.maydell, famz, lirans, qemu-block, quintela,
	qemu-devel, armbru, stefanha, den, pbonzini, mreitz



On 07/11/2017 09:38 AM, Vladimir Sementsov-Ogievskiy wrote:
> 11.07.2017 16:06, Dr. David Alan Gilbert wrote:
>> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>>> Split common postcopy stuff from ram postcopy stuff.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> I'm OK with this; I'm not sure I'd have bothered making the pings
>> dependent on it being RAM.
>>
>> (These first few are pretty much a separable series).
> 
> It would be great if you (or Juan?) could take them separately.
> 

Yes please. I don't think this ever happened, did it? Can we split off
1-3 and re-roll?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Qemu-devel] [PATCH v7 12/16] migration: add postcopy migration of dirty bitmaps
  2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 12/16] migration: add postcopy migration of dirty bitmaps Vladimir Sementsov-Ogievskiy
@ 2017-09-18 14:07   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-09-18 14:07 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbonzini, armbru, eblake, famz, stefanha, amit.shah, quintela,
	mreitz, kwolf, peter.maydell, dgilbert, den, jsnow, lirans

10.07.2017 19:30, Vladimir Sementsov-Ogievskiy wrote:
> Postcopy migration of dirty bitmaps. Only named dirty bitmaps,
> associated with root nodes and non-root named nodes are migrated.
>
> If the destination qemu already contains a dirty bitmap with the same name
> as a migrated bitmap (for the same node), then the migration will be done if
> their granularities are the same; otherwise an error will be generated.
>
> If the destination qemu doesn't contain such a bitmap, it will be created.
>
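The destination-side decision described above can be sketched as a small
hypothetical helper (names and structure are illustrative only, not the
actual QEMU code):

```python
def incoming_bitmap(existing, name, granularity):
    """Decide how to handle an arriving bitmap for one node.

    existing maps bitmap name -> granularity on the destination node.
    """
    if name in existing:
        if existing[name] != granularity:
            # Same name but incompatible granularity: migration fails.
            raise ValueError('granularity mismatch for bitmap %r' % name)
        return 'reuse'   # same name and granularity: migrate into it
    existing[name] = granularity
    return 'create'      # no such bitmap yet: create it

node_bitmaps = {'bitmap0': 65536}
```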
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   include/migration/misc.h       |   3 +
>   migration/Makefile.objs        |   1 +
>   migration/block-dirty-bitmap.c | 700 +++++++++++++++++++++++++++++++++++++++++
>   migration/migration.c          |   3 +
>   migration/migration.h          |   3 +
>   migration/savevm.c             |   2 +
>   migration/trace-events         |  14 +
>   vl.c                           |   1 +
>   8 files changed, 727 insertions(+)
>   create mode 100644 migration/block-dirty-bitmap.c
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 22551216bb..c53689eb0f 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -56,4 +56,7 @@ bool migration_in_postcopy_after_devices(MigrationState *);
>   void migration_only_migratable_set(void);
>   void migration_global_dump(Monitor *mon);
>   
> +/* migration/block-dirty-bitmap.c */
> +void dirty_bitmap_mig_init(void);
> +
>   #endif
> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
> index 1c7770da46..525cc8293e 100644
> --- a/migration/Makefile.objs
> +++ b/migration/Makefile.objs
> @@ -6,6 +6,7 @@ common-obj-y += qemu-file.o global_state.o
>   common-obj-y += qemu-file-channel.o
>   common-obj-y += xbzrle.o postcopy-ram.o
>   common-obj-y += qjson.o
> +common-obj-y += block-dirty-bitmap.o
>   
>   common-obj-$(CONFIG_RDMA) += rdma.o
>   
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> new file mode 100644
> index 0000000000..e4cd17bcc8
> --- /dev/null
> +++ b/migration/block-dirty-bitmap.c
> @@ -0,0 +1,700 @@
> +/*
> + * Block dirty bitmap postcopy migration
> + *
> + * Copyright IBM, Corp. 2009
> + * Copyright (C) 2016 Parallels IP Holdings GmbH. All rights reserved.
> + *
> + * Authors:
> + *  Liran Schour   <lirans@il.ibm.com>
> + *  Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + * This file is derived from migration/block.c, so its authors and the IBM
> + * copyright are here, although the content is quite different.
> + *
> + * Contributions after 2012-01-13 are licensed under the terms of the
> + * GNU GPL, version 2 or (at your option) any later version.
> + *
> + *                                ***
> + *
> + * Here postcopy migration of dirty bitmaps is implemented. Only named dirty
> + * bitmaps associated with root nodes and named non-root nodes are migrated.
> + *
> + * If the destination QEMU already contains a dirty bitmap with the same name
> + * as a migrated bitmap (for the same node), the migration proceeds if their
> + * granularities match; otherwise an error is generated.
> + *
> + * If the destination QEMU doesn't contain such a bitmap, it is created.
> + *
> + * format of migration:
> + *
> + * # Header (shared for different chunk types)
> + * 1, 2 or 4 bytes: flags (see qemu_{put,get}_bitmap_flags)
> + * [ 1 byte: node name size ] \  flags & DEVICE_NAME
> + * [ n bytes: node name     ] /
> + * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
> + * [ n bytes: bitmap name     ] /
> + *
> + * # Start of bitmap migration (flags & START)
> + * header
> + * be32: granularity
> + * 1 byte: bitmap enabled flag

note: this will be changed to 1 byte of flags, to also transfer the
persistence of bitmaps.

> + *
> + * # Complete of bitmap migration (flags & COMPLETE)
> + * header
> + *
> + * # Data chunk of bitmap migration
> + * header
> + * be64: start sector
> + * be32: number of sectors
> + * [ be64: buffer size  ] \ ! (flags & ZEROES)
> + * [ n bytes: buffer    ] /
> + *
> + * The last chunk in the stream should contain flags & EOS. A chunk may omit
> + * the device and/or bitmap name, in which case they are assumed to be the
> + * same as in the previous chunk.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "block/block.h"
> +#include "block/block_int.h"
> +#include "sysemu/block-backend.h"
> +#include "qemu/main-loop.h"
> +#include "qemu/error-report.h"
> +#include "migration/misc.h"
> +#include "migration/migration.h"
> +#include "migration/qemu-file.h"
> +#include "migration/vmstate.h"
> +#include "migration/register.h"
> +#include "qemu/hbitmap.h"
> +#include "sysemu/sysemu.h"
> +#include "qemu/cutils.h"
> +#include "qapi/error.h"
> +#include "trace.h"
> +
> +#define CHUNK_SIZE     (1 << 10)
> +
> +/* Flags occupy one, two or four bytes (Big Endian). The size is determined
> + * as follows:
> + * MSB of the first byte is clear  -->  one byte
> + * MSB of the first byte is set    -->  two or four bytes, depending on the
> + *                                      second byte:
> + *    | MSB of the second byte is clear  -->  two bytes
> + *    | MSB of the second byte is set    -->  four bytes
> + */
> +#define DIRTY_BITMAP_MIG_FLAG_EOS           0x01
> +#define DIRTY_BITMAP_MIG_FLAG_ZEROES        0x02
> +#define DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME   0x04
> +#define DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME   0x08
> +#define DIRTY_BITMAP_MIG_FLAG_START         0x10
> +#define DIRTY_BITMAP_MIG_FLAG_COMPLETE      0x20
> +#define DIRTY_BITMAP_MIG_FLAG_BITS          0x40
> +
> +#define DIRTY_BITMAP_MIG_EXTRA_FLAGS        0x80
> +
> +typedef struct DirtyBitmapMigBitmapState {
> +    /* Written during setup phase. */
> +    BlockDriverState *bs;
> +    const char *node_name;
> +    BdrvDirtyBitmap *bitmap;
> +    uint64_t total_sectors;
> +    uint64_t sectors_per_chunk;
> +    QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
> +
> +    /* For bulk phase. */
> +    bool bulk_completed;
> +    uint64_t cur_sector;
> +} DirtyBitmapMigBitmapState;
> +
> +typedef struct DirtyBitmapMigState {
> +    QSIMPLEQ_HEAD(dbms_list, DirtyBitmapMigBitmapState) dbms_list;
> +
> +    bool bulk_completed;
> +
> +    /* for send_bitmap_bits() */
> +    BlockDriverState *prev_bs;
> +    BdrvDirtyBitmap *prev_bitmap;
> +} DirtyBitmapMigState;
> +
> +typedef struct DirtyBitmapLoadState {
> +    uint32_t flags;
> +    char node_name[256];
> +    char bitmap_name[256];
> +    BlockDriverState *bs;
> +    BdrvDirtyBitmap *bitmap;
> +} DirtyBitmapLoadState;
> +
> +static DirtyBitmapMigState dirty_bitmap_mig_state;
> +
> +typedef struct DirtyBitmapLoadBitmapState {
> +    BlockDriverState *bs;
> +    BdrvDirtyBitmap *bitmap;
> +    bool migrated;
> +} DirtyBitmapLoadBitmapState;
> +static GSList *enabled_bitmaps;
> +QemuMutex finish_lock;
> +
> +void init_dirty_bitmap_incoming_migration(void)
> +{
> +    qemu_mutex_init(&finish_lock);
> +}
> +
> +static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
> +{
> +    uint32_t flags = qemu_get_byte(f); /* must be wider than a byte, or the
> +                                        * shifts below truncate */
> +    if (flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS) {
> +        flags = flags << 8 | qemu_get_byte(f);
> +        if (flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS) {
> +            flags = flags << 16 | qemu_get_be16(f);
> +        }
> +    }
> +
> +    return flags;
> +}
> +
> +static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
> +{
> +    /* The code currently does not send flags of more than one byte */
> +    assert(!(flags & (0xffffff00 | DIRTY_BITMAP_MIG_EXTRA_FLAGS)));
> +
> +    qemu_put_byte(f, flags);
> +}
> +
> +static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
> +                               uint32_t additional_flags)
> +{
> +    BlockDriverState *bs = dbms->bs;
> +    BdrvDirtyBitmap *bitmap = dbms->bitmap;
> +    uint32_t flags = additional_flags;
> +    trace_send_bitmap_header_enter();
> +
> +    if (bs != dirty_bitmap_mig_state.prev_bs) {
> +        dirty_bitmap_mig_state.prev_bs = bs;
> +        flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
> +    }
> +
> +    if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
> +        dirty_bitmap_mig_state.prev_bitmap = bitmap;
> +        flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
> +    }
> +
> +    qemu_put_bitmap_flags(f, flags);
> +
> +    if (flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
> +        qemu_put_counted_string(f, dbms->node_name);
> +    }
> +
> +    if (flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
> +        qemu_put_counted_string(f, bdrv_dirty_bitmap_name(bitmap));
> +    }
> +}
> +
> +static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
> +{
> +    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
> +    qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
> +    qemu_put_byte(f, bdrv_dirty_bitmap_enabled(dbms->bitmap));
> +}
> +
> +static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
> +{
> +    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
> +}
> +
> +static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
> +                             uint64_t start_sector, uint32_t nr_sectors)
> +{
> +    /* align for buffer_is_zero() */
> +    uint64_t align = 4 * sizeof(long);
> +    uint64_t unaligned_size =
> +        bdrv_dirty_bitmap_serialization_size(
> +            dbms->bitmap, start_sector << BDRV_SECTOR_BITS,
> +            (uint64_t)nr_sectors << BDRV_SECTOR_BITS);
> +    uint64_t buf_size = (unaligned_size + align - 1) & ~(align - 1);
> +    uint8_t *buf = g_malloc0(buf_size);
> +    uint32_t flags = DIRTY_BITMAP_MIG_FLAG_BITS;
> +
> +    bdrv_dirty_bitmap_serialize_part(
> +        dbms->bitmap, buf, start_sector << BDRV_SECTOR_BITS,
> +        (uint64_t)nr_sectors << BDRV_SECTOR_BITS);
> +
> +    if (buffer_is_zero(buf, buf_size)) {
> +        g_free(buf);
> +        buf = NULL;
> +        flags |= DIRTY_BITMAP_MIG_FLAG_ZEROES;
> +    }
> +
> +    trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);
> +
> +    send_bitmap_header(f, dbms, flags);
> +
> +    qemu_put_be64(f, start_sector);
> +    qemu_put_be32(f, nr_sectors);
> +
> +    /* If the chunk is all zeroes, flush here: network bandwidth is now much
> +     * higher than storage bandwidth, so queueing zero chunks would only
> +     * slow the migration down. */
> +    if (flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
> +        qemu_fflush(f);
> +    } else {
> +        qemu_put_be64(f, buf_size);
> +        qemu_put_buffer(f, buf, buf_size);
> +    }
> +
> +    g_free(buf);
> +}
> +
> +
> +/* Called with iothread lock taken.  */
> +
> +static int init_dirty_bitmap_migration(void)
> +{
> +    BlockDriverState *bs;
> +    BdrvDirtyBitmap *bitmap;
> +    DirtyBitmapMigBitmapState *dbms;
> +    BdrvNextIterator it;
> +
> +    dirty_bitmap_mig_state.bulk_completed = false;
> +    dirty_bitmap_mig_state.prev_bs = NULL;
> +    dirty_bitmap_mig_state.prev_bitmap = NULL;
> +
> +    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
> +        if (!bdrv_get_device_or_node_name(bs)) {
> +            /* skip unnamed non-root nodes */
> +            continue;
> +        }
> +
> +        for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap;
> +             bitmap = bdrv_dirty_bitmap_next(bs, bitmap)) {
> +            if (!bdrv_dirty_bitmap_name(bitmap)) {
> +                continue;
> +            }
> +
> +            if (bdrv_dirty_bitmap_frozen(bitmap)) {
> +                error_report("Can't migrate frozen dirty bitmap: '%s'",
> +                             bdrv_dirty_bitmap_name(bitmap));
> +                return -1;
> +            }
> +        }
> +    }
> +
> +    for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
> +        if (!bdrv_get_device_or_node_name(bs)) {
> +            /* skip unnamed non-root nodes */
> +            continue;
> +        }
> +
> +        for (bitmap = bdrv_dirty_bitmap_next(bs, NULL); bitmap;
> +             bitmap = bdrv_dirty_bitmap_next(bs, bitmap)) {
> +            if (!bdrv_dirty_bitmap_name(bitmap)) {
> +                continue;
> +            }
> +
> +            bdrv_ref(bs);
> +            bdrv_dirty_bitmap_set_frozen(bitmap, true);
> +
> +            dbms = g_new0(DirtyBitmapMigBitmapState, 1);
> +            dbms->bs = bs;
> +            dbms->node_name = bdrv_get_node_name(bs);
> +            if (!dbms->node_name || dbms->node_name[0] == '\0') {
> +                dbms->node_name = bdrv_get_device_name(bs);
> +            }
> +            dbms->bitmap = bitmap;
> +            dbms->total_sectors = bdrv_nb_sectors(bs);
> +            dbms->sectors_per_chunk = CHUNK_SIZE * 8 *
> +                bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
> +
> +            QSIMPLEQ_INSERT_TAIL(&dirty_bitmap_mig_state.dbms_list,
> +                                 dbms, entry);
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +/* Called with no lock taken.  */
> +static void bulk_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
> +{
> +    uint32_t nr_sectors = MIN(dbms->total_sectors - dbms->cur_sector,
> +                             dbms->sectors_per_chunk);
> +
> +    send_bitmap_bits(f, dbms, dbms->cur_sector, nr_sectors);
> +
> +    dbms->cur_sector += nr_sectors;
> +    if (dbms->cur_sector >= dbms->total_sectors) {
> +        dbms->bulk_completed = true;
> +    }
> +}
> +
> +/* Called with no lock taken.  */
> +static void bulk_phase(QEMUFile *f, bool limit)
> +{
> +    DirtyBitmapMigBitmapState *dbms;
> +
> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
> +        while (!dbms->bulk_completed) {
> +            bulk_phase_send_chunk(f, dbms);
> +            if (limit && qemu_file_rate_limit(f)) {
> +                return;
> +            }
> +        }
> +    }
> +
> +    dirty_bitmap_mig_state.bulk_completed = true;
> +}
> +
> +/* Called with iothread lock taken.  */
> +static void dirty_bitmap_mig_cleanup(void)
> +{
> +    DirtyBitmapMigBitmapState *dbms;
> +
> +    while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
> +        QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
> +        bdrv_dirty_bitmap_set_frozen(dbms->bitmap, false);
> +        bdrv_unref(dbms->bs);
> +        g_free(dbms);
> +    }
> +}
> +
> +/* for SaveVMHandlers */
> +static void dirty_bitmap_migration_cleanup(void *opaque)
> +{
> +    dirty_bitmap_mig_cleanup();
> +}
> +
> +static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
> +{
> +    trace_dirty_bitmap_save_iterate(migration_in_postcopy());
> +
> +    if (migration_in_postcopy() && !dirty_bitmap_mig_state.bulk_completed) {
> +        bulk_phase(f, true);
> +    }
> +
> +    qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
> +
> +    return dirty_bitmap_mig_state.bulk_completed;
> +}
> +
> +/* Called with iothread lock taken.  */
> +
> +static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
> +{
> +    DirtyBitmapMigBitmapState *dbms;
> +    trace_dirty_bitmap_save_complete_enter();
> +
> +    if (!dirty_bitmap_mig_state.bulk_completed) {
> +        bulk_phase(f, false);
> +    }
> +
> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
> +        send_bitmap_complete(f, dbms);
> +    }
> +
> +    qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
> +
> +    trace_dirty_bitmap_save_complete_finish();
> +
> +    dirty_bitmap_mig_cleanup();
> +    return 0;
> +}
> +
> +static void dirty_bitmap_save_pending(QEMUFile *f, void *opaque,
> +                                      uint64_t max_size,
> +                                      uint64_t *res_precopy_only,
> +                                      uint64_t *res_compatible,
> +                                      uint64_t *res_postcopy_only)
> +{
> +    DirtyBitmapMigBitmapState *dbms;
> +    uint64_t pending = 0;
> +
> +    qemu_mutex_lock_iothread();
> +
> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
> +        uint64_t gran = bdrv_dirty_bitmap_granularity(dbms->bitmap);
> +        uint64_t sectors = dbms->bulk_completed ? 0 :
> +                           dbms->total_sectors - dbms->cur_sector;
> +
> +        pending += (sectors * BDRV_SECTOR_SIZE + gran - 1) / gran;
> +    }
> +
> +    qemu_mutex_unlock_iothread();
> +
> +    trace_dirty_bitmap_save_pending(pending, max_size);
> +
> +    *res_postcopy_only += pending;
> +}
> +
> +/* First occurrence of this bitmap. It should be created if it doesn't exist */
> +static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
> +{
> +    Error *local_err = NULL;
> +    uint32_t granularity = qemu_get_be32(f);
> +    bool enabled = qemu_get_byte(f);
> +
> +    if (!s->bitmap) {
> +        s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
> +                                             s->bitmap_name, &local_err);
> +        if (!s->bitmap) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +    } else {
> +        uint32_t dest_granularity =
> +            bdrv_dirty_bitmap_granularity(s->bitmap);
> +        if (dest_granularity != granularity) {
> +            error_report("Error: "
> +                         "Migrated bitmap granularity (%" PRIu32 ") "
> +                         "doesn't match the destination bitmap '%s' "
> +                         "granularity (%" PRIu32 ")",
> +                         granularity,
> +                         bdrv_dirty_bitmap_name(s->bitmap),
> +                         dest_granularity);
> +            return -EINVAL;
> +        }
> +    }
> +
> +    bdrv_disable_dirty_bitmap(s->bitmap);
> +    if (enabled) {
> +        DirtyBitmapLoadBitmapState *b;
> +
> +        bdrv_dirty_bitmap_create_successor(s->bs, s->bitmap, &local_err);
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +
> +        b = g_new(DirtyBitmapLoadBitmapState, 1);
> +        b->bs = s->bs;
> +        b->bitmap = s->bitmap;
> +        b->migrated = false;
> +        enabled_bitmaps = g_slist_prepend(enabled_bitmaps, b);
> +    }
> +
> +    return 0;
> +}
> +
> +void dirty_bitmap_mig_before_vm_start(void)
> +{
> +    GSList *item;
> +
> +    qemu_mutex_lock(&finish_lock);
> +
> +    for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
> +        DirtyBitmapLoadBitmapState *b = item->data;
> +
> +        if (b->migrated) {
> +            bdrv_enable_dirty_bitmap(b->bitmap);
> +        } else {
> +            bdrv_dirty_bitmap_enable_successor(b->bitmap);
> +        }
> +
> +        g_free(b);
> +    }
> +
> +    g_slist_free(enabled_bitmaps);
> +    enabled_bitmaps = NULL;
> +
> +    qemu_mutex_unlock(&finish_lock);
> +}
> +
> +static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
> +{
> +    GSList *item;
> +    trace_dirty_bitmap_load_complete();
> +    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
> +
> +    qemu_mutex_lock(&finish_lock);
> +
> +    for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
> +        DirtyBitmapLoadBitmapState *b = item->data;
> +
> +        if (b->bitmap == s->bitmap) {
> +            b->migrated = true;
> +        }
> +    }
> +
> +    if (bdrv_dirty_bitmap_frozen(s->bitmap)) {
> +        if (enabled_bitmaps == NULL) {
> +            /* in postcopy */
> +            AioContext *aio_context = bdrv_get_aio_context(s->bs);
> +            aio_context_acquire(aio_context);
> +
> +            bdrv_reclaim_dirty_bitmap(s->bs, s->bitmap, &error_abort);
> +            bdrv_enable_dirty_bitmap(s->bitmap);
> +
> +            aio_context_release(aio_context);
> +        } else {
> +            /* target not started, successor is empty */
> +            bdrv_dirty_bitmap_release_successor(s->bs, s->bitmap);
> +        }
> +    }
> +
> +    qemu_mutex_unlock(&finish_lock);
> +}
> +
> +static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
> +{
> +    uint64_t first_byte = qemu_get_be64(f) << BDRV_SECTOR_BITS;
> +    uint64_t nr_bytes = qemu_get_be32(f) << BDRV_SECTOR_BITS;
> +    trace_dirty_bitmap_load_bits_enter(first_byte >> BDRV_SECTOR_BITS,
> +                                       nr_bytes >> BDRV_SECTOR_BITS);
> +
> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
> +        trace_dirty_bitmap_load_bits_zeroes();
> +        bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
> +                                             false);
> +    } else {
> +        uint8_t *buf;
> +        uint64_t buf_size = qemu_get_be64(f);
> +        uint64_t needed_size =
> +            bdrv_dirty_bitmap_serialization_size(s->bitmap,
> +                                                 first_byte, nr_bytes);
> +
> +        if (needed_size > buf_size) {
> +            error_report("Error: Migrated bitmap granularity doesn't "
> +                         "match the destination bitmap '%s' granularity",
> +                         bdrv_dirty_bitmap_name(s->bitmap));
> +            return -EINVAL;
> +        }
> +
> +        buf = g_malloc(buf_size);
> +        qemu_get_buffer(f, buf, buf_size);
> +        bdrv_dirty_bitmap_deserialize_part(s->bitmap, buf, first_byte, nr_bytes,
> +                                           false);
> +        g_free(buf);
> +    }
> +
> +    return 0;
> +}
> +
> +static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
> +{
> +    Error *local_err = NULL;
> +    s->flags = qemu_get_bitmap_flags(f);
> +    trace_dirty_bitmap_load_header(s->flags);
> +
> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
> +        if (!qemu_get_counted_string(f, s->node_name)) {
> +            error_report("Unable to read node name string");
> +            return -EINVAL;
> +        }
> +        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
> +        if (!s->bs) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +    } else if (!s->bs) {
> +        error_report("Error: block device name is not set");
> +        return -EINVAL;
> +    }
> +
> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
> +        if (!qemu_get_counted_string(f, s->bitmap_name)) {
> +            error_report("Unable to read bitmap name string");
> +            return -EINVAL;
> +        }
> +        s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
> +
> +        /* The bitmap may be NULL here: that is not an error if this is the
> +         * first occurrence of the bitmap (i.e. flags & START is set) */
> +        if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
> +            error_report("Error: unknown dirty bitmap "
> +                         "'%s' for block device '%s'",
> +                         s->bitmap_name, s->node_name);
> +            return -EINVAL;
> +        }
> +    } else if (!s->bitmap) {
> +        error_report("Error: dirty bitmap name is not set");
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
> +{
> +    static DirtyBitmapLoadState s;
> +    int ret = 0;
> +
> +    trace_dirty_bitmap_load_enter();
> +
> +    if (version_id != 1) {
> +        return -EINVAL;
> +    }
> +
> +    do {
> +        ret = dirty_bitmap_load_header(f, &s);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +
> +        if (s.flags & DIRTY_BITMAP_MIG_FLAG_START) {
> +            ret = dirty_bitmap_load_start(f, &s);
> +        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_COMPLETE) {
> +            dirty_bitmap_load_complete(f, &s);
> +        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_BITS) {
> +            ret = dirty_bitmap_load_bits(f, &s);
> +        }
> +
> +        if (!ret) {
> +            ret = qemu_file_get_error(f);
> +        }
> +
> +        if (ret) {
> +            return ret;
> +        }
> +    } while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
> +
> +    trace_dirty_bitmap_load_success();
> +    return 0;
> +}
> +
> +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
> +{
> +    DirtyBitmapMigBitmapState *dbms = NULL;
> +    if (init_dirty_bitmap_migration() < 0) {
> +        return -1;
> +    }
> +
> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
> +        send_bitmap_start(f, dbms);
> +    }
> +    qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
> +
> +    return 0;
> +}
> +
> +static bool dirty_bitmap_is_active(void *opaque)
> +{
> +    return migrate_dirty_bitmaps();
> +}
> +
> +static bool dirty_bitmap_is_active_iterate(void *opaque)
> +{
> +    return dirty_bitmap_is_active(opaque) && !runstate_is_running();
> +}
> +
> +static bool dirty_bitmap_has_postcopy(void *opaque)
> +{
> +    return true;
> +}
> +
> +static SaveVMHandlers savevm_dirty_bitmap_handlers = {
> +    .save_live_setup = dirty_bitmap_save_setup,
> +    .save_live_complete_postcopy = dirty_bitmap_save_complete,
> +    .save_live_complete_precopy = dirty_bitmap_save_complete,
> +    .has_postcopy = dirty_bitmap_has_postcopy,
> +    .save_live_pending = dirty_bitmap_save_pending,
> +    .save_live_iterate = dirty_bitmap_save_iterate,
> +    .is_active_iterate = dirty_bitmap_is_active_iterate,
> +    .load_state = dirty_bitmap_load,
> +    .cleanup = dirty_bitmap_migration_cleanup,
> +    .is_active = dirty_bitmap_is_active,
> +};
> +
> +void dirty_bitmap_mig_init(void)
> +{
> +    QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
> +
> +    register_savevm_live(NULL, "dirty-bitmap", 0, 1,
> +                         &savevm_dirty_bitmap_handlers,
> +                         &dirty_bitmap_mig_state);
> +}
> diff --git a/migration/migration.c b/migration/migration.c
> index 48adc812bc..5c49e68f8e 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -143,6 +143,9 @@ MigrationIncomingState *migration_incoming_get_current(void)
>           memset(&mis_current, 0, sizeof(MigrationIncomingState));
>           qemu_mutex_init(&mis_current.rp_mutex);
>           qemu_event_init(&mis_current.main_thread_load_event, false);
> +
> +        init_dirty_bitmap_incoming_migration();
> +
>           once = true;
>       }
>       return &mis_current;
> diff --git a/migration/migration.h b/migration/migration.h
> index 0c6ba56423..8d9f4cd41e 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -196,4 +196,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
>   void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
>                                 ram_addr_t start, size_t len);
>   
> +void dirty_bitmap_mig_before_vm_start(void);
> +void init_dirty_bitmap_incoming_migration(void);
> +
>   #endif
> diff --git a/migration/savevm.c b/migration/savevm.c
> index f03ee02f67..c67c1d9110 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1660,6 +1660,8 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
>   
>       trace_loadvm_postcopy_handle_run_vmstart();
>   
> +    dirty_bitmap_mig_before_vm_start();
> +
>       if (autostart) {
>           /* Hold onto your hats, starting the CPU */
>           vm_start();
> diff --git a/migration/trace-events b/migration/trace-events
> index 34cc2adafc..672551bd5a 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -224,3 +224,17 @@ colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
>   colo_send_message(const char *msg) "Send '%s' message"
>   colo_receive_message(const char *msg) "Receive '%s' message"
>   colo_failover_set_state(const char *new_state) "new state %s"
> +
> +# migration/block-dirty-bitmap.c
> +send_bitmap_header_enter(void) ""
> +send_bitmap_bits(uint32_t flags, uint64_t start_sector, uint32_t nr_sectors, uint64_t data_size) "\n   flags:        %x\n   start_sector: %" PRIu64 "\n   nr_sectors:   %" PRIu32 "\n   data_size:    %" PRIu64 "\n"
> +dirty_bitmap_save_iterate(int in_postcopy) "in postcopy: %d"
> +dirty_bitmap_save_complete_enter(void) ""
> +dirty_bitmap_save_complete_finish(void) ""
> +dirty_bitmap_save_pending(uint64_t pending, uint64_t max_size) "pending %" PRIu64 " max: %" PRIu64
> +dirty_bitmap_load_complete(void) ""
> +dirty_bitmap_load_bits_enter(uint64_t first_sector, uint32_t nr_sectors) "chunk: %" PRIu64 " %" PRIu32
> +dirty_bitmap_load_bits_zeroes(void) ""
> +dirty_bitmap_load_header(uint32_t flags) "flags %x"
> +dirty_bitmap_load_enter(void) ""
> +dirty_bitmap_load_success(void) ""
> diff --git a/vl.c b/vl.c
> index 36ff3f4345..369d8cecd0 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4523,6 +4523,7 @@ int main(int argc, char **argv, char **envp)
>   
>       blk_mig_init();
>       ram_mig_init();
> +    dirty_bitmap_mig_init();
>   
>       /* If the currently selected machine wishes to override the units-per-bus
>        * property of its default HBA interface type, do so now. */


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-08-21 23:34       ` John Snow
@ 2017-09-18 14:07         ` Vladimir Sementsov-Ogievskiy
  2017-09-19 19:06           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-09-18 14:07 UTC (permalink / raw)
  To: John Snow, Dr. David Alan Gilbert
  Cc: kwolf, peter.maydell, famz, lirans, qemu-block, quintela,
	qemu-devel, armbru, stefanha, den, pbonzini, mreitz

ping for 1-3
Can we merge them?

22.08.2017 02:34, John Snow wrote:
>
> On 07/11/2017 09:38 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 11.07.2017 16:06, Dr. David Alan Gilbert wrote:
>>> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>>>> Split common postcopy stuff from ram postcopy stuff.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> I'm OK with this; I'm not sure I'd have bothered making the ping's
>>> dependent on it being RAM.
>>>
>>> (These first few are pretty much a separable series).
>> It would be great if you (or Juan?) can take them separately.
>>
> Yes please. I don't think this ever happened, did it? Can we split off
> 1-3 and re-roll?



-- 
Best regards,
Vladimir


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-18 14:07         ` [Qemu-devel] ping " Vladimir Sementsov-Ogievskiy
@ 2017-09-19 19:06           ` Dr. David Alan Gilbert
  2017-09-20 11:45             ` Juan Quintela
  0 siblings, 1 reply; 34+ messages in thread
From: Dr. David Alan Gilbert @ 2017-09-19 19:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: John Snow, kwolf, peter.maydell, famz, lirans, qemu-block,
	quintela, qemu-devel, armbru, stefanha, den, pbonzini, mreitz

* Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> ping for 1-3
> Can we merge them?

I see all of them have R-b's; so lets try and put them in the next
migration merge.

Quintela: Sound good?

Dave

> 22.08.2017 02:34, John Snow wrote:
> > 
> > On 07/11/2017 09:38 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > 11.07.2017 16:06, Dr. David Alan Gilbert wrote:
> > > > * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> > > > > Split common postcopy stuff from ram postcopy stuff.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > > > I'm OK with this; I'm not sure I'd have bothered making the ping's
> > > > dependent on it being RAM.
> > > > 
> > > > (These first few are pretty much a separable series).
> > > It would be great if you (or Juan?) can take them separately.
> > > 
> > Yes please. I don't think this ever happened, did it? Can we split off
> > 1-3 and re-roll?
> 
> 
> 
> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-19 19:06           ` Dr. David Alan Gilbert
@ 2017-09-20 11:45             ` Juan Quintela
  2017-09-25 13:23               ` Kevin Wolf
  0 siblings, 1 reply; 34+ messages in thread
From: Juan Quintela @ 2017-09-20 11:45 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Vladimir Sementsov-Ogievskiy, kwolf, peter.maydell, famz,
	stefanha, qemu-block, qemu-devel, armbru, lirans, pbonzini, den,
	mreitz, John Snow

"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>> ping for 1-3
>> Can we merge them?
>
> I see all of them have R-b's; so lets try and put them in the next
> migration merge.
>
> Quintela: Sound good?

Yeap.

Later, Juan.


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-20 11:45             ` Juan Quintela
@ 2017-09-25 13:23               ` Kevin Wolf
  2017-09-25 14:31                 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2017-09-25 13:23 UTC (permalink / raw)
  To: Juan Quintela
  Cc: Dr. David Alan Gilbert, Vladimir Sementsov-Ogievskiy,
	peter.maydell, famz, stefanha, qemu-block, qemu-devel, armbru,
	lirans, pbonzini, den, mreitz, John Snow

Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> >> ping for 1-3
> >> Can we merge them?
> >
> > I see all of them have R-b's; so lets try and put them in the next
> > migration merge.
> >
> > Quintela: Sound good?
> 
> Yeap.

This patch broke qemu-iotests 181 ('Test postcopy live migration with
shared storage'):

--- /home/kwolf/source/qemu/tests/qemu-iotests/181.out  2017-06-16 19:19:53.000000000 +0200
+++ 181.out.bad 2017-09-25 15:20:40.787582000 +0200
@@ -21,18 +21,16 @@
 === Do some I/O on the destination ===
 
 QEMU X.Y.Z monitor - type 'help' for more information
-(qemu) qemu-io disk "read -P 0x55 0 64k"
+(qemu) QEMU_PROG: Expected vmdescription section, but got 0
+QEMU_PROG: Failed to get "write" lock
+Is another process using the image?
+qemu-io disk "read -P 0x55 0 64k"
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 (qemu) 
 (qemu) qemu-io disk "write -P 0x66 1M 64k"
-wrote 65536/65536 bytes at offset 1048576
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-
-=== Shut down and check image ===
-
-(qemu) quit
-(qemu) 
-(qemu) quit
-No errors were found on the image.
-*** done
+QEMU_PROG: block/io.c:1359: bdrv_aligned_pwritev: Assertion `child->perm & BLK_PERM_WRITE' failed.
+./common.config: Aborted                 (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
+echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
+fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
+Timeout waiting for ops/sec on handle 1


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-25 13:23               ` Kevin Wolf
@ 2017-09-25 14:31                 ` Vladimir Sementsov-Ogievskiy
  2017-09-25 14:58                   ` Kevin Wolf
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-09-25 14:31 UTC (permalink / raw)
  To: Kevin Wolf, Juan Quintela
  Cc: Dr. David Alan Gilbert, peter.maydell, famz, stefanha,
	qemu-block, qemu-devel, armbru, lirans, pbonzini, den, mreitz,
	John Snow

25.09.2017 16:23, Kevin Wolf wrote:
> Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>>>> ping for 1-3
>>>> Can we merge them?
>>> I see all of them have R-b's; so lets try and put them in the next
>>> migration merge.
>>>
>>> Quintela: Sound good?
>> Yeap.
> This patch broke qemu-iotests 181 ('Test postcopy live migration with
> shared storage'):
>
> --- /home/kwolf/source/qemu/tests/qemu-iotests/181.out  2017-06-16 19:19:53.000000000 +0200
> +++ 181.out.bad 2017-09-25 15:20:40.787582000 +0200
> @@ -21,18 +21,16 @@
>   === Do some I/O on the destination ===
>   
>   QEMU X.Y.Z monitor - type 'help' for more information
> -(qemu) qemu-io disk "read -P 0x55 0 64k"
> +(qemu) QEMU_PROG: Expected vmdescription section, but got 0
> +QEMU_PROG: Failed to get "write" lock
> +Is another process using the image?
> +qemu-io disk "read -P 0x55 0 64k"
>   read 65536/65536 bytes at offset 0
>   64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>   (qemu)
>   (qemu) qemu-io disk "write -P 0x66 1M 64k"
> -wrote 65536/65536 bytes at offset 1048576
> -64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -
> -=== Shut down and check image ===
> -
> -(qemu) quit
> -(qemu)
> -(qemu) quit
> -No errors were found on the image.
> -*** done
> +QEMU_PROG: block/io.c:1359: bdrv_aligned_pwritev: Assertion `child->perm & BLK_PERM_WRITE' failed.
> +./common.config: Aborted                 (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
> +echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
> +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
> +Timeout waiting for ops/sec on handle 1

I'm not sure about the locking part (I don't see this error on my old kernel
without OFD locking), but it looks like test 181 should be fixed to set the
postcopy-ram capability on the target too (which was considered the correct
way on the list).
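
For reference, the capability can be enabled via QMP on the destination
before the incoming migration starts; a minimal sketch (the exact plumbing
in the test harness may differ):

```json
{ "execute": "migrate-set-capabilities",
  "arguments": {
    "capabilities": [
      { "capability": "postcopy-ram", "state": true }
    ]
  }
}
```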

-- 
Best regards,
Vladimir


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-25 14:31                 ` Vladimir Sementsov-Ogievskiy
@ 2017-09-25 14:58                   ` Kevin Wolf
  2017-09-25 15:07                     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2017-09-25 14:58 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Juan Quintela, Dr. David Alan Gilbert, peter.maydell, famz,
	stefanha, qemu-block, qemu-devel, armbru, lirans, pbonzini, den,
	mreitz, John Snow

Am 25.09.2017 um 16:31 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 25.09.2017 16:23, Kevin Wolf wrote:
> > Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> > > > > ping for 1-3
> > > > > Can we merge them?
> > > > I see all of them have R-b's; so lets try and put them in the next
> > > > migration merge.
> > > > 
> > > > Quintela: Sound good?
> > > Yeap.
> > This patch broke qemu-iotests 181 ('Test postcopy live migration with
> > shared storage'):
> > 
> > --- /home/kwolf/source/qemu/tests/qemu-iotests/181.out  2017-06-16 19:19:53.000000000 +0200
> > +++ 181.out.bad 2017-09-25 15:20:40.787582000 +0200
> > @@ -21,18 +21,16 @@
> >   === Do some I/O on the destination ===
> >   QEMU X.Y.Z monitor - type 'help' for more information
> > -(qemu) qemu-io disk "read -P 0x55 0 64k"
> > +(qemu) QEMU_PROG: Expected vmdescription section, but got 0
> > +QEMU_PROG: Failed to get "write" lock
> > +Is another process using the image?
> > +qemu-io disk "read -P 0x55 0 64k"
> >   read 65536/65536 bytes at offset 0
> >   64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >   (qemu)
> >   (qemu) qemu-io disk "write -P 0x66 1M 64k"
> > -wrote 65536/65536 bytes at offset 1048576
> > -64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > -
> > -=== Shut down and check image ===
> > -
> > -(qemu) quit
> > -(qemu)
> > -(qemu) quit
> > -No errors were found on the image.
> > -*** done
> > +QEMU_PROG: block/io.c:1359: bdrv_aligned_pwritev: Assertion `child->perm & BLK_PERM_WRITE' failed.
> > +./common.config: Aborted                 (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
> > +echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
> > +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
> > +Timeout waiting for ops/sec on handle 1
> 
> Not sure about locking (don't see this error on my old kernel without OFD
> locking), but it looks like that
> 181 test should be fixed to set postcopy-ram capability on target too (which
> was considered as correct way on list)

Whatever you think the preferred way to set up postcopy migration is: If
something worked before this patch and doesn't after it, that's a
regression and breaks backwards compatibility.

If we were talking about a graceful failure, maybe we could discuss
whether carefully and deliberately breaking compatibility could be
justified in this specific case. But the breakage is neither mentioned
in the commit message nor is it graceful, so I can only call it a bug.

Kevin


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-25 14:58                   ` Kevin Wolf
@ 2017-09-25 15:07                     ` Vladimir Sementsov-Ogievskiy
  2017-09-25 15:27                       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 34+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-09-25 15:07 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Juan Quintela, Dr. David Alan Gilbert, peter.maydell, famz,
	stefanha, qemu-block, qemu-devel, armbru, lirans, pbonzini, den,
	mreitz, John Snow

25.09.2017 17:58, Kevin Wolf wrote:
> Am 25.09.2017 um 16:31 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 25.09.2017 16:23, Kevin Wolf wrote:
>>> Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
>>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>>> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>>>>>> ping for 1-3
>>>>>> Can we merge them?
>>>>> I see all of them have R-b's; so lets try and put them in the next
>>>>> migration merge.
>>>>>
>>>>> Quintela: Sound good?
>>>> Yeap.
>>> This patch broke qemu-iotests 181 ('Test postcopy live migration with
>>> shared storage'):
>>>
>>> --- /home/kwolf/source/qemu/tests/qemu-iotests/181.out  2017-06-16 19:19:53.000000000 +0200
>>> +++ 181.out.bad 2017-09-25 15:20:40.787582000 +0200
>>> @@ -21,18 +21,16 @@
>>>    === Do some I/O on the destination ===
>>>    QEMU X.Y.Z monitor - type 'help' for more information
>>> -(qemu) qemu-io disk "read -P 0x55 0 64k"
>>> +(qemu) QEMU_PROG: Expected vmdescription section, but got 0
>>> +QEMU_PROG: Failed to get "write" lock
>>> +Is another process using the image?
>>> +qemu-io disk "read -P 0x55 0 64k"
>>>    read 65536/65536 bytes at offset 0
>>>    64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>    (qemu)
>>>    (qemu) qemu-io disk "write -P 0x66 1M 64k"
>>> -wrote 65536/65536 bytes at offset 1048576
>>> -64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>> -
>>> -=== Shut down and check image ===
>>> -
>>> -(qemu) quit
>>> -(qemu)
>>> -(qemu) quit
>>> -No errors were found on the image.
>>> -*** done
>>> +QEMU_PROG: block/io.c:1359: bdrv_aligned_pwritev: Assertion `child->perm & BLK_PERM_WRITE' failed.
>>> +./common.config: Aborted                 (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
>>> +echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
>>> +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
>>> +Timeout waiting for ops/sec on handle 1
>> Not sure about locking (don't see this error on my old kernel without OFD
>> locking), but it looks like that
>> 181 test should be fixed to set postcopy-ram capability on target too (which
>> was considered as correct way on list)
> Whatever you think the preferred way to set up postcopy migration is: If
> something worked before this patch and doesn't after it, that's a
> regression and breaks backwards compatibility.
>
> If we were talking about a graceful failure, maybe we could discuss
> whether carefully and deliberately breaking compatibility could be
> justified in this specific case. But the breakage is neither mentioned
> in the commit message nor is it graceful, so I can only call it a bug.
>
> Kevin

It's of course my fault; I don't mean "it's a wrong test, so it's not my
problem". :) And I've already sent a patch.

-- 
Best regards,
Vladimir


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-25 15:07                     ` Vladimir Sementsov-Ogievskiy
@ 2017-09-25 15:27                       ` Dr. David Alan Gilbert
  2017-09-26 10:12                         ` Kevin Wolf
  0 siblings, 1 reply; 34+ messages in thread
From: Dr. David Alan Gilbert @ 2017-09-25 15:27 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Kevin Wolf, Juan Quintela, peter.maydell, famz, stefanha,
	qemu-block, qemu-devel, armbru, lirans, pbonzini, den, mreitz,
	John Snow

* Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> 25.09.2017 17:58, Kevin Wolf wrote:
> > Am 25.09.2017 um 16:31 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 25.09.2017 16:23, Kevin Wolf wrote:
> > > > Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
> > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > > > * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> > > > > > > ping for 1-3
> > > > > > > Can we merge them?
> > > > > > I see all of them have R-b's; so lets try and put them in the next
> > > > > > migration merge.
> > > > > > 
> > > > > > Quintela: Sound good?
> > > > > Yeap.
> > > > This patch broke qemu-iotests 181 ('Test postcopy live migration with
> > > > shared storage'):
> > > > 
> > > > --- /home/kwolf/source/qemu/tests/qemu-iotests/181.out  2017-06-16 19:19:53.000000000 +0200
> > > > +++ 181.out.bad 2017-09-25 15:20:40.787582000 +0200
> > > > @@ -21,18 +21,16 @@
> > > >    === Do some I/O on the destination ===
> > > >    QEMU X.Y.Z monitor - type 'help' for more information
> > > > -(qemu) qemu-io disk "read -P 0x55 0 64k"
> > > > +(qemu) QEMU_PROG: Expected vmdescription section, but got 0
> > > > +QEMU_PROG: Failed to get "write" lock
> > > > +Is another process using the image?
> > > > +qemu-io disk "read -P 0x55 0 64k"
> > > >    read 65536/65536 bytes at offset 0
> > > >    64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > > >    (qemu)
> > > >    (qemu) qemu-io disk "write -P 0x66 1M 64k"
> > > > -wrote 65536/65536 bytes at offset 1048576
> > > > -64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > > > -
> > > > -=== Shut down and check image ===
> > > > -
> > > > -(qemu) quit
> > > > -(qemu)
> > > > -(qemu) quit
> > > > -No errors were found on the image.
> > > > -*** done
> > > > +QEMU_PROG: block/io.c:1359: bdrv_aligned_pwritev: Assertion `child->perm & BLK_PERM_WRITE' failed.
> > > > +./common.config: Aborted                 (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
> > > > +echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
> > > > +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
> > > > +Timeout waiting for ops/sec on handle 1
> > > Not sure about locking (don't see this error on my old kernel without OFD
> > > locking), but it looks like that
> > > 181 test should be fixed to set postcopy-ram capability on target too (which
> > > was considered as correct way on list)
> > Whatever you think the preferred way to set up postcopy migration is: If
> > something worked before this patch and doesn't after it, that's a
> > regression and breaks backwards compatibility.
> > 
> > If we were talking about a graceful failure, maybe we could discuss
> > whether carefully and deliberately breaking compatibility could be
> > justified in this specific case. But the breakage is neither mentioned
> > in the commit message nor is it graceful, so I can only call it a bug.
> > 
> > Kevin
> 
> It's of course my fault, I don't mean "it's wrong test, so it's not my
> problem") And I've already sent a patch.

Why does this fail so badly, with asserts etc.? I was hoping for something
a bit more obvious from the migration code.

postcopy did originally work without the destination having the flag set,
but setting the flag on the destination was always good practice because it
detected early on whether the host support was there.
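
(For illustration, on the destination's HMP monitor that early check amounts
to something like:

```
(qemu) migrate_set_capability postcopy-ram on
```

which should fail up front if the host lacks the required support, e.g.
userfaultfd.)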

Dave

> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-25 15:27                       ` Dr. David Alan Gilbert
@ 2017-09-26 10:12                         ` Kevin Wolf
  2017-09-26 10:21                           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 34+ messages in thread
From: Kevin Wolf @ 2017-09-26 10:12 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Vladimir Sementsov-Ogievskiy, Juan Quintela, peter.maydell, famz,
	stefanha, qemu-block, qemu-devel, armbru, lirans, pbonzini, den,
	mreitz, John Snow

Am 25.09.2017 um 17:27 hat Dr. David Alan Gilbert geschrieben:
> > > Whatever you think the preferred way to set up postcopy migration is: If
> > > something worked before this patch and doesn't after it, that's a
> > > regression and breaks backwards compatibility.
> > > 
> > > If we were talking about a graceful failure, maybe we could discuss
> > > whether carefully and deliberately breaking compatibility could be
> > > justified in this specific case. But the breakage is neither mentioned
> > > in the commit message nor is it graceful, so I can only call it a bug.
> > > 
> > > Kevin
> > 
> > It's of course my fault, I don't mean "it's wrong test, so it's not my
> > problem") And I've already sent a patch.
> 
> Why does this fail so badly, asserts etc - I was hoping for something
> a bit more obvious from the migration code.
> 
> postcopy did originally work without the destination having the flag on
> but setting the flag on the destination was always good practice because
> it detected whether the host support was there early on.

So what does this mean for 2.11? Do you think it is acceptable to break
cases where the flag isn't set on the destination?

If so, just changing the test case is enough. But if not, I'd rather
keep the test case as it is and fix only the migration code.

Kevin


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-26 10:12                         ` Kevin Wolf
@ 2017-09-26 10:21                           ` Dr. David Alan Gilbert
  2017-09-26 12:32                             ` Kevin Wolf
  0 siblings, 1 reply; 34+ messages in thread
From: Dr. David Alan Gilbert @ 2017-09-26 10:21 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Vladimir Sementsov-Ogievskiy, Juan Quintela, peter.maydell, famz,
	stefanha, qemu-block, qemu-devel, armbru, lirans, pbonzini, den,
	mreitz, John Snow

* Kevin Wolf (kwolf@redhat.com) wrote:
> Am 25.09.2017 um 17:27 hat Dr. David Alan Gilbert geschrieben:
> > > > Whatever you think the preferred way to set up postcopy migration is: If
> > > > something worked before this patch and doesn't after it, that's a
> > > > regression and breaks backwards compatibility.
> > > > 
> > > > If we were talking about a graceful failure, maybe we could discuss
> > > > whether carefully and deliberately breaking compatibility could be
> > > > justified in this specific case. But the breakage is neither mentioned
> > > > in the commit message nor is it graceful, so I can only call it a bug.
> > > > 
> > > > Kevin
> > > 
> > > It's of course my fault, I don't mean "it's wrong test, so it's not my
> > > problem") And I've already sent a patch.
> > 
> > Why does this fail so badly, asserts etc - I was hoping for something
> > a bit more obvious from the migration code.
> > 
> > postcopy did originally work without the destination having the flag on
> > but setting the flag on the destination was always good practice because
> > it detected whether the host support was there early on.
> 
> So what does this mean for 2.11? Do you think it is acceptable breaking
> cases where the flag isn't set on the destination?

I think so, because we've always recommended setting it on the
destination for the early detection.

> If so, just changing the test case is enough. But if not, I'd rather
> keep the test case as it is and fix only the migration code.

I'd take the test case fix, but I also want to dig into why it fails so
badly; it would be nice just to get a clean failure telling you that
postcopy was expected.

Dave

> 
> Kevin
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
  2017-09-26 10:21                           ` Dr. David Alan Gilbert
@ 2017-09-26 12:32                             ` Kevin Wolf
  0 siblings, 0 replies; 34+ messages in thread
From: Kevin Wolf @ 2017-09-26 12:32 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Vladimir Sementsov-Ogievskiy, Juan Quintela, peter.maydell, famz,
	stefanha, qemu-block, qemu-devel, armbru, lirans, pbonzini, den,
	mreitz, John Snow

Am 26.09.2017 um 12:21 hat Dr. David Alan Gilbert geschrieben:
> * Kevin Wolf (kwolf@redhat.com) wrote:
> > Am 25.09.2017 um 17:27 hat Dr. David Alan Gilbert geschrieben:
> > > > > Whatever you think the preferred way to set up postcopy migration is: If
> > > > > something worked before this patch and doesn't after it, that's a
> > > > > regression and breaks backwards compatibility.
> > > > > 
> > > > > If we were talking about a graceful failure, maybe we could discuss
> > > > > whether carefully and deliberately breaking compatibility could be
> > > > > justified in this specific case. But the breakage is neither mentioned
> > > > > in the commit message nor is it graceful, so I can only call it a bug.
> > > > > 
> > > > > Kevin
> > > > 
> > > > It's of course my fault, I don't mean "it's wrong test, so it's not my
> > > > problem") And I've already sent a patch.
> > > 
> > > Why does this fail so badly, asserts etc - I was hoping for something
> > > a bit more obvious from the migration code.
> > > 
> > > postcopy did originally work without the destination having the flag on
> > > but setting the flag on the destination was always good practice because
> > > it detected whether the host support was there early on.
> > 
> > So what does this mean for 2.11? Do you think it is acceptable breaking
> > cases where the flag isn't set on the destination?
> 
> I think so, because we've always recommended setting it on the
> destination for the early detection.

Okay, I'll include the test case patch in my pull request today then.

> > If so, just changing the test case is enough. But if not, I'd rather
> > keep the test case as it is and fix only the migration code.
> 
> I'd take the test case fix, but I also want to dig why it fails so
> badly; it would be nice just to have a clean failure telling you
> that postcopy was expected.

Yes, that would be nice.

Kevin


end of thread, other threads:[~2017-09-26 12:32 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-10 16:30 [Qemu-devel] [PATCH v7 00/16] Dirty bitmaps postcopy migration Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 01/16] migration: add has_postcopy savevm handler Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 02/16] migration: fix ram_save_pending Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy Vladimir Sementsov-Ogievskiy
2017-07-11 13:06   ` Dr. David Alan Gilbert
2017-07-11 13:38     ` Vladimir Sementsov-Ogievskiy
2017-08-21 23:34       ` John Snow
2017-09-18 14:07         ` [Qemu-devel] ping " Vladimir Sementsov-Ogievskiy
2017-09-19 19:06           ` Dr. David Alan Gilbert
2017-09-20 11:45             ` Juan Quintela
2017-09-25 13:23               ` Kevin Wolf
2017-09-25 14:31                 ` Vladimir Sementsov-Ogievskiy
2017-09-25 14:58                   ` Kevin Wolf
2017-09-25 15:07                     ` Vladimir Sementsov-Ogievskiy
2017-09-25 15:27                       ` Dr. David Alan Gilbert
2017-09-26 10:12                         ` Kevin Wolf
2017-09-26 10:21                           ` Dr. David Alan Gilbert
2017-09-26 12:32                             ` Kevin Wolf
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 04/16] migration: introduce postcopy-only pending Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 05/16] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor() Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 06/16] qapi: add dirty-bitmaps migration capability Vladimir Sementsov-Ogievskiy
2017-07-10 21:03   ` Eric Blake
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 07/16] block/dirty-bitmap: add bdrv_dirty_bitmap_release_successor Vladimir Sementsov-Ogievskiy
2017-08-14  9:37   ` Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 08/16] migration: include migrate_dirty_bitmaps in migrate_postcopy Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 09/16] migration/qemu-file: add qemu_put_counted_string() Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 10/16] migration: add is_active_iterate handler Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 11/16] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 12/16] migration: add postcopy migration of dirty bitmaps Vladimir Sementsov-Ogievskiy
2017-09-18 14:07   ` Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 13/16] iotests: add add_incoming_migration to VM class Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 14/16] iotests: add default node-name Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 15/16] iotests: add dirty bitmap migration test Vladimir Sementsov-Ogievskiy
2017-07-10 16:30 ` [Qemu-devel] [PATCH v7 16/16] iotests: add dirty bitmap postcopy test Vladimir Sementsov-Ogievskiy
