* [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints
@ 2016-05-20  7:36 Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 01/10] unblock backup operations in backing file Changlong Xie
                   ` (11 more replies)
  0 siblings, 12 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

Block replication is a key feature for continuous checkpointing
(for example, COLO).

Detailed information about block replication is available here:
http://wiki.qemu.org/Features/BlockReplication

Usage:
Please refer to docs/block-replication.txt

You can get the patch here:
https://github.com/Pating/qemu/tree/changlox/block-replication-v19

You can get the patch with framework here:
https://github.com/Pating/qemu/tree/changlox/colo_framework_v18

TODO:
1. Continuous block replication. It will be started after basic functions
   are accepted.

Change Log:
V19:
1. Rebase to v2.6.0
2. Address comments from Stefan
p3: a new patch that exports interfaces for extra serialization
p8:
1. call replication_stop() before freeing s->top_id
2. check top_bs
3. reopen file readonly in error return paths
4. enable extra serialization between read and COW
p9: try to handle SIGABRT
V18:
p6: add local_err in all replication callbacks to prevent "errp == NULL"
p7: add missing qemu_iovec_destroy(xxx)
V17:
1. Rebase to the latest code
p2: refactor backup_do_checkpoint to address comments from Jeff Cody
p4: fix bugs in "drive_add buddy xxx" hmp commands
p6: add "since: 2.7"
p7: fix bug in replication_close(), add missing "qapi/error.h", add test-replication
p8: add "since: 2.7"
V16:
1. Rebase to the newest code
2. Address comments from Stefan & hailiang
p3: we don't need this patch now
p4: add "top-id" parameter for secondary
p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs,
add doc comments that explain the semantics of Replication
p7: Refactor AioContext for thread safety, remove unnecessary get_top_bs()
*Note*: I'm working on a replication test case now and will send it out in V17
V15:
1. Rebase to the newest code
2. Fix typos and coding style to address Eric's comments
3. Address Stefan's comments
   1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
   2) Update the message and description for [PATCH 4/9]
   3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
   4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
   5) Use BdrvChild instead of holding on to BlockDriverState * pointers
4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771  
5. Introduce replication_get_error_all to check replication status
6. Remove useless discard interface
V14:
1. Implement auto complete active commit
2. Implement active commit block job for replication.c
3. Address the comments from Stefan, add replication-specific API and data
   structure, also remove old block layer APIs
V13:
1. Rebase to the newest code
2. Remove redundant macros and semicolon in replication.c
3. Fix typos in block-replication.txt
V12:
1. Rebase to the newest code
2. Use backing reference to replace 'allow-write-backing-file'
V11:
1. Reopen the backing file when starting block replication if it is not
   opened in R/W mode
2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
   when opening backing file
3. Block the top BDS so there is only one block job for the top BDS and
   its backing chain.
V10:
1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
   reference.
2. Address the comments from Eric Blake
V9:
1. Update the error messages
2. Rebase to the newest qemu
3. Split child add/delete support. These patches are sent in another patchset.
V8:
1. Address Alberto Garcia's comments
V7:
1. Implement adding/removing quorum child. Remove the option non-connect.
2. Simplify the backing reference option according to Stefan Hajnoczi's suggestion
V6:
1. Rebase to the newest qemu.
V5:
1. Address the comments from Gong Lei
2. Speed the failover up. The secondary vm can take over very quickly even
   if there are too many I/O requests.
V4:
1. Introduce a new driver, replication, to avoid touching nbd and qcow2.
V3:
1. Use error_setg() instead of error_set()
2. Add a new block job API
3. Active disk, hidden disk and nbd target use the same AioContext
4. Add a testcase to test new hbitmap API
V2:
1. Redesign the secondary QEMU (use image-fleecing)
2. Use Error objects to return error message
3. Address the comments from Max Reitz and Eric Blake

Changlong Xie (3):
  Backup: export interfaces for extra serialization
  Introduce new APIs to do replication operation
  tests: add unit test case for replication

Wen Congyang (7):
  unblock backup operations in backing file
  Backup: clear all bitmap when doing block checkpoint
  Link backup into block core
  docs: block replication's description
  auto complete active commit
  Implement new driver for block replication
  support replication driver in blockdev-add

 Makefile.objs                |   1 +
 block.c                      |  17 ++
 block/Makefile.objs          |   3 +-
 block/backup.c               |  59 +++-
 block/mirror.c               |  13 +-
 block/replication.c          | 666 +++++++++++++++++++++++++++++++++++++++++++
 blockdev.c                   |   2 +-
 docs/block-replication.txt   | 239 ++++++++++++++++
 include/block/block_backup.h |  17 ++
 include/block/block_int.h    |   3 +-
 qapi/block-core.json         |  33 ++-
 qemu-img.c                   |   2 +-
 replication.c                | 105 +++++++
 replication.h                | 176 ++++++++++++
 tests/.gitignore             |   1 +
 tests/Makefile               |   4 +
 tests/test-replication.c     | 523 +++++++++++++++++++++++++++++++++
 17 files changed, 1847 insertions(+), 17 deletions(-)
 create mode 100644 block/replication.c
 create mode 100644 docs/block-replication.txt
 create mode 100644 include/block/block_backup.h
 create mode 100644 replication.c
 create mode 100644 replication.h
 create mode 100644 tests/test-replication.c

-- 
1.9.3

* [Qemu-devel] [PATCH v19 01/10] unblock backup operations in backing file
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 02/10] Backup: clear all bitmap when doing block checkpoint Changlong Xie
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 block.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
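
The effect of the two added bdrv_op_unblock() calls can be pictured with a toy
model. This is only a sketch under stated assumptions: it assumes the code
preceding this hunk has blocked every operation type on the backing node, and
it replaces QEMU's per-operation blocker lists with plain counters. All names
(toy_node, toy_op_*, toy_set_backing) are hypothetical and are not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of operation types, for illustration only. */
enum toy_op {
    TOY_OP_COMMIT_TARGET,
    TOY_OP_BACKUP_SOURCE,
    TOY_OP_BACKUP_TARGET,
    TOY_OP_RESIZE,
    TOY_OP_MAX
};

/* One counter per operation; a non-zero counter means "blocked". */
typedef struct toy_node {
    int blockers[TOY_OP_MAX];
} toy_node;

static void toy_op_block_all(toy_node *n)
{
    for (int op = 0; op < TOY_OP_MAX; op++) {
        n->blockers[op]++;
    }
}

static void toy_op_unblock(toy_node *n, enum toy_op op)
{
    assert(n->blockers[op] > 0);
    n->blockers[op]--;
}

static bool toy_op_is_blocked(const toy_node *n, enum toy_op op)
{
    return n->blockers[op] > 0;
}

/* Attach a node as backing file: block everything, then re-allow the
 * operations that must still work on a backing file. */
static void toy_set_backing(toy_node *backing)
{
    toy_op_block_all(backing);
    toy_op_unblock(backing, TOY_OP_COMMIT_TARGET);
    toy_op_unblock(backing, TOY_OP_BACKUP_SOURCE);   /* added by this patch */
    toy_op_unblock(backing, TOY_OP_BACKUP_TARGET);   /* added by this patch */
}
```

With this model, a backing node can act as backup source or target (case 3 in
the comment in the hunk), while unrelated operations stay blocked.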

diff --git a/block.c b/block.c
index 1205ef8..8c4c2c2 100644
--- a/block.c
+++ b/block.c
@@ -1271,6 +1271,23 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
     /* Otherwise we won't be able to commit due to check in bdrv_commit */
     bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
                     bs->backing_blocker);
+    /*
+     * We do backup in 3 ways:
+     * 1. drive backup
+     *    The target bs is newly opened, and the source is top BDS
+     * 2. blockdev backup
+     *    Both the source and the target are top BDSes.
+     * 3. internal backup(used for block replication)
+     *    Both the source and the target are backing files
+     *
+     * In case 1 and 2, neither the source nor the target is the backing file.
+     * In case 3, we will block the top BDS, so there is only one block job
+     * for the top BDS and its backing chain.
+     */
+    bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
+                    bs->backing_blocker);
+    bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_TARGET,
+                    bs->backing_blocker);
 out:
     bdrv_refresh_limits(bs, NULL);
 }
-- 
1.9.3

* [Qemu-devel] [PATCH v19 02/10] Backup: clear all bitmap when doing block checkpoint
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 01/10] unblock backup operations in backing file Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 03/10] Backup: export interfaces for extra serialization Changlong Xie
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 block/backup.c               | 18 ++++++++++++++++++
 include/block/block_backup.h |  3 +++
 2 files changed, 21 insertions(+)
 create mode 100644 include/block/block_backup.h
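
As a rough standalone illustration of what the checkpoint does: one bit tracks
each cluster that has already been copied, and a checkpoint clears every bit so
that the next write to any cluster triggers a fresh copy. The sketch below
assumes a plain unsigned long array in place of QEMU's bitmap helpers;
toy_backup_job, toy_mark_done, toy_is_done and toy_do_checkpoint are
hypothetical names, not the ones used in block/backup.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define TOY_BITS_PER_LONG ((int64_t)(sizeof(unsigned long) * CHAR_BIT))
#define TOY_MAX_CLUSTERS 1024

typedef struct toy_backup_job {
    int64_t len;           /* device length in bytes */
    int64_t cluster_size;  /* bytes per cluster */
    /* one bit per cluster: set once the cluster has been copied */
    unsigned long done_bitmap[TOY_MAX_CLUSTERS /
                              (sizeof(unsigned long) * CHAR_BIT)];
} toy_backup_job;

static void toy_mark_done(toy_backup_job *job, int64_t cluster)
{
    job->done_bitmap[cluster / TOY_BITS_PER_LONG] |=
        1UL << (cluster % TOY_BITS_PER_LONG);
}

static int toy_is_done(const toy_backup_job *job, int64_t cluster)
{
    return (int)((job->done_bitmap[cluster / TOY_BITS_PER_LONG] >>
                  (cluster % TOY_BITS_PER_LONG)) & 1UL);
}

/* Clear the bit of every cluster covered by the device; returns the
 * number of clusters, rounding a partial last cluster up. */
static int64_t toy_do_checkpoint(toy_backup_job *job)
{
    int64_t nr_clusters = DIV_ROUND_UP(job->len, job->cluster_size);
    int64_t nr_longs = DIV_ROUND_UP(nr_clusters, TOY_BITS_PER_LONG);

    assert(nr_clusters <= TOY_MAX_CLUSTERS);
    memset(job->done_bitmap, 0, (size_t)nr_longs * sizeof(unsigned long));
    return nr_clusters;
}
```

The rounding matters: a device whose length is not a multiple of the cluster
size still needs a bit for its partial last cluster.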

diff --git a/block/backup.c b/block/backup.c
index fec45e8..93bfd4c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -17,6 +17,7 @@
 #include "block/block.h"
 #include "block/block_int.h"
 #include "block/blockjob.h"
+#include "block/block_backup.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
@@ -250,6 +251,23 @@ static void backup_abort(BlockJob *job)
     }
 }
 
+void backup_do_checkpoint(BlockJob *job, Error **errp)
+{
+    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
+    int64_t len;
+
+    assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
+
+    if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
+        error_setg(errp, "The backup job only supports block checkpoint in"
+                   " sync=none mode");
+        return;
+    }
+
+    len = DIV_ROUND_UP(backup_job->common.len, backup_job->cluster_size);
+    bitmap_zero(backup_job->done_bitmap, len);
+}
+
 static const BlockJobDriver backup_job_driver = {
     .instance_size  = sizeof(BackupBlockJob),
     .job_type       = BLOCK_JOB_TYPE_BACKUP,
diff --git a/include/block/block_backup.h b/include/block/block_backup.h
new file mode 100644
index 0000000..3753bcb
--- /dev/null
+++ b/include/block/block_backup.h
@@ -0,0 +1,3 @@
+#include "block/block_int.h"
+
+void backup_do_checkpoint(BlockJob *job, Error **errp);
-- 
1.9.3

* [Qemu-devel] [PATCH v19 03/10] Backup: export interfaces for extra serialization
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 01/10] unblock backup operations in backing file Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 02/10] Backup: clear all bitmap when doing block checkpoint Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 04/10] Link backup into block core Changlong Xie
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

Normal backup (sync='none') workflow:
step 1. NBD performs an I/O write from client to server
   qcow2_co_writev
    bdrv_co_writev
     ...
       bdrv_aligned_pwritev
        notifier_with_return_list_notify -> backup_do_cow
         bdrv_driver_pwritev // write new contents

step 2. drive-backup sync=none
   backup_do_cow
   {
    wait_for_overlapping_requests
    cow_request_begin
    for(; start < end; start++) {
            bdrv_co_readv_no_serialising //read old contents from Secondary disk
            bdrv_co_writev // write old contents to hidden-disk
    }
    cow_request_end
   }

step 3. Then return to "step 1" to write new contents to Secondary disk.

And for replication, we must make sure that we only read the old contents from
Secondary disk in order to keep contents consistent.

1) Replication workflow of Secondary
                                                         virtio-blk
                                                              ^
------->  1 NBD                                               |
   ||     server                                       3 replication
   ||        ^                                                ^
   ||        |           backing                 backing      |
   ||  Secondary disk 6<-------- hidden-disk 5 <-------- active-disk 4
   ||        |                         ^
   ||        '-------------------------'
   ||           drive-backup sync=none 2

Hence, we need these interfaces to implement coarse-grained serialization between
COW of Secondary disk and the read operation of replication.

Example code showing how to use them:

#include "block/block_backup.h"

static coroutine_fn int xxx_co_readv()
{
        CowRequest req;
        BlockJob *job = secondary_disk->bs->job;
        int ret;

        if (job) {
              /* wait for in-flight COW on this range, then serialize */
              backup_wait_for_overlapping_requests(job, sector_num, nb_sectors);
              backup_cow_request_begin(&req, job, sector_num, nb_sectors);
              ret = bdrv_co_readv();
              backup_cow_request_end(&req);
              goto out;
        }
        ret = bdrv_co_readv();
out:
        return ret;
}

Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 block/backup.c               | 41 ++++++++++++++++++++++++++++++++++-------
 include/block/block_backup.h | 14 ++++++++++++++
 2 files changed, 48 insertions(+), 7 deletions(-)
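
The exported helpers take a sector range and convert it to a half-open cluster
range before serializing. A minimal standalone sketch of that conversion and of
an overlap test on the resulting ranges is shown here. The names toy_range,
toy_sectors_to_clusters and toy_ranges_overlap are hypothetical, and the overlap
test is an assumption about how two in-flight requests would be compared, not a
copy of the code in block/backup.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Half-open cluster range [start, end). */
typedef struct toy_range {
    int64_t start;
    int64_t end;
} toy_range;

/* Round the start down and the end up so that every cluster touched by
 * the sector range is covered. */
static toy_range toy_sectors_to_clusters(int64_t sector_num, int nb_sectors,
                                         int64_t sectors_per_cluster)
{
    toy_range r;

    r.start = sector_num / sectors_per_cluster;
    r.end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
    return r;
}

/* Two half-open ranges overlap when each starts before the other ends. */
static bool toy_ranges_overlap(toy_range a, toy_range b)
{
    return a.start < b.end && b.start < a.end;
}
```

Because the granularity is a whole cluster, a read is serialized against any
COW that touches the same cluster, even if the sector ranges themselves are
disjoint. This is the coarse-grained serialization the commit message refers to.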

diff --git a/block/backup.c b/block/backup.c
index 93bfd4c..57bcfa3 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -28,13 +28,6 @@
 #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
 #define SLICE_TIME 100000000ULL /* ns */
 
-typedef struct CowRequest {
-    int64_t start;
-    int64_t end;
-    QLIST_ENTRY(CowRequest) list;
-    CoQueue wait_queue; /* coroutines blocked on this request */
-} CowRequest;
-
 typedef struct BackupBlockJob {
     BlockJob common;
     BlockDriverState *target;
@@ -268,6 +261,40 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
     bitmap_zero(backup_job->done_bitmap, len);
 }
 
+void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num,
+                                          int nb_sectors)
+{
+    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
+    int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
+    int64_t start, end;
+
+    assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
+
+    start = sector_num / sectors_per_cluster;
+    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
+    wait_for_overlapping_requests(backup_job, start, end);
+}
+
+void backup_cow_request_begin(CowRequest *req, BlockJob *job,
+                              int64_t sector_num,
+                              int nb_sectors)
+{
+    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
+    int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
+    int64_t start, end;
+
+    assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
+
+    start = sector_num / sectors_per_cluster;
+    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
+    cow_request_begin(req, backup_job, start, end);
+}
+
+void backup_cow_request_end(CowRequest *req)
+{
+    cow_request_end(req);
+}
+
 static const BlockJobDriver backup_job_driver = {
     .instance_size  = sizeof(BackupBlockJob),
     .job_type       = BLOCK_JOB_TYPE_BACKUP,
diff --git a/include/block/block_backup.h b/include/block/block_backup.h
index 3753bcb..e0e7ce6 100644
--- a/include/block/block_backup.h
+++ b/include/block/block_backup.h
@@ -1,3 +1,17 @@
 #include "block/block_int.h"
 
+typedef struct CowRequest {
+    int64_t start;
+    int64_t end;
+    QLIST_ENTRY(CowRequest) list;
+    CoQueue wait_queue; /* coroutines blocked on this request */
+} CowRequest;
+
+void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num,
+                                          int nb_sectors);
+void backup_cow_request_begin(CowRequest *req, BlockJob *job,
+                              int64_t sector_num,
+                              int nb_sectors);
+void backup_cow_request_end(CowRequest *req);
+
 void backup_do_checkpoint(BlockJob *job, Error **errp);
-- 
1.9.3

* [Qemu-devel] [PATCH v19 04/10] Link backup into block core
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (2 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 03/10] Backup: export interfaces for extra serialization Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 05/10] docs: block replication's description Changlong Xie
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Move backup.o from common-obj-y to block-obj-y. Some programs that
depend on the backup code use the block layer directly.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
---
 block/Makefile.objs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 44a5416..fbfe647 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -22,12 +22,12 @@ block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o dirty-bitmap.o
 block-obj-y += write-threshold.o
+block-obj-y += backup.o
 
 block-obj-y += crypto.o
 
 common-obj-y += stream.o
 common-obj-y += commit.o
-common-obj-y += backup.o
 
 iscsi.o-cflags     := $(LIBISCSI_CFLAGS)
 iscsi.o-libs       := $(LIBISCSI_LIBS)
-- 
1.9.3

* [Qemu-devel] [PATCH v19 05/10] docs: block replication's description
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (3 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 04/10] Link backup into block core Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 06/10] auto complete active commit Changlong Xie
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 docs/block-replication.txt | 239 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 239 insertions(+)
 create mode 100644 docs/block-replication.txt
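
The buffering rules in the "Workflow" section of the document can be modelled
with a toy, in-memory Secondary. This is a sketch under stated assumptions, not
the driver's implementation: one byte stands for one sector, all toy_* names
are hypothetical, and the read path (Disk buffer first, Secondary disk
otherwise) is inferred from the backing chain shown in the "Architecture"
section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SECTORS 8

typedef struct toy_secondary {
    uint8_t disk[TOY_SECTORS];    /* Secondary disk contents */
    uint8_t buffer[TOY_SECTORS];  /* Disk buffer contents */
    bool buffered[TOY_SECTORS];   /* sector is present in the Disk buffer */
} toy_secondary;

/* Steps 2 and 3: save the original sector unless the buffer already
 * holds this sector, then write the new data to the Secondary disk. */
static void toy_primary_write(toy_secondary *s, int sector, uint8_t data)
{
    if (!s->buffered[sector]) {
        s->buffer[sector] = s->disk[sector];
        s->buffered[sector] = true;
    }
    s->disk[sector] = data;
}

/* Step 4: a Secondary write always overwrites the buffered content. */
static void toy_secondary_write(toy_secondary *s, int sector, uint8_t data)
{
    s->buffer[sector] = data;
    s->buffered[sector] = true;
}

/* Secondary VM read (assumption): buffer first, Secondary disk otherwise. */
static uint8_t toy_secondary_read(const toy_secondary *s, int sector)
{
    return s->buffered[sector] ? s->buffer[sector] : s->disk[sector];
}

/* Checkpoint: drop the Disk buffer so the Secondary sees the disk again. */
static void toy_checkpoint(toy_secondary *s)
{
    for (int i = 0; i < TOY_SECTORS; i++) {
        s->buffered[i] = false;
    }
}
```

Between checkpoints the Secondary VM keeps seeing its own view of the disk,
while the Secondary disk itself tracks the Primary. After a checkpoint both
views coincide.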

diff --git a/docs/block-replication.txt b/docs/block-replication.txt
new file mode 100644
index 0000000..c5fc18b
--- /dev/null
+++ b/docs/block-replication.txt
@@ -0,0 +1,239 @@
+Block replication
+----------------------------------------
+Copyright Fujitsu, Corp. 2016
+Copyright (c) 2016 Intel Corporation
+Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+Block replication is used for continuous checkpoints. It is designed
+for COLO (COarse-grain LOck-stepping) where the Secondary VM is running.
+It can also be applied to FT/HA (Fault-tolerance/High Availability) scenarios,
+where the Secondary VM is not running.
+
+This document gives an overview of block replication's design.
+
+== Background ==
+High availability solutions such as micro checkpoint and COLO will do
+consecutive checkpoints. The VM state of the Primary and Secondary VM is
+identical right after a VM checkpoint, but becomes different as the VM
+executes till the next checkpoint. To support disk contents checkpoint,
+the modified disk contents in the Secondary VM must be buffered, and are
+only dropped at next checkpoint time. To reduce the network transportation
+effort during a vmstate checkpoint, the disk modification operations of
+the Primary disk are asynchronously forwarded to the Secondary node.
+
+== Workflow ==
+The following diagram shows the block replication workflow:
+
+        +----------------------+            +------------------------+
+        |Primary Write Requests|            |Secondary Write Requests|
+        +----------------------+            +------------------------+
+                  |                                       |
+                  |                                      (4)
+                  |                                       V
+                  |                              /-------------\
+                  |      Copy and Forward        |             |
+                  |---------(1)----------+       | Disk Buffer |
+                  |                      |       |             |
+                  |                     (3)      \-------------/
+                  |                 speculative      ^
+                  |                write through    (2)
+                  |                      |           |
+                  V                      V           |
+           +--------------+           +----------------+
+           | Primary Disk |           | Secondary Disk |
+           +--------------+           +----------------+
+
+    1) Primary write requests will be copied and forwarded to Secondary
+       QEMU.
+    2) Before Primary write requests are written to Secondary disk, the
+       original sector content will be read from Secondary disk and
+       buffered in the Disk buffer, but it will not overwrite the existing
+       sector content (it could be from either "Secondary Write Requests" or
+       previous COW of "Primary Write Requests") in the Disk buffer.
+    3) Primary write requests will be written to Secondary disk.
+    4) Secondary write requests will be buffered in the Disk buffer and it
+       will overwrite the existing sector content in the buffer.
+
+== Architecture ==
+We are going to implement block replication from many basic
+blocks that are already in QEMU.
+
+         virtio-blk       ||
+             ^            ||                            .----------
+             |            ||                            | Secondary
+        1 Quorum          ||                            '----------
+         /      \         ||
+        /        \        ||
+   Primary    2 filter
+     disk         ^                                                             virtio-blk
+                  |                                                                  ^
+                3 NBD  ------->  3 NBD                                               |
+                client    ||     server                                          2 filter
+                          ||        ^                                                ^
+--------.                 ||        |                                                |
+Primary |                 ||  Secondary disk <--------- hidden-disk 5 <--------- active-disk 4
+--------'                 ||        |          backing        ^       backing
+                          ||        |                         |
+                          ||        |                         |
+                          ||        '-------------------------'
+                          ||           drive-backup sync=none 6
+
+1) The disk on the primary is represented by a block device with two
+children, providing replication between a primary disk and the host that
+runs the secondary VM. The read pattern (fifo) for quorum can be extended
+to make the primary always read from the local disk instead of going through
+NBD.
+
+2) The new block filter (the name is replication) will control the block
+replication.
+
+3) The secondary disk receives writes from the primary VM through QEMU's
+embedded NBD server (speculative write-through).
+
+4) The disk on the secondary is represented by a custom block device
+(called active-disk). It should start as an empty disk, and the format
+should support bdrv_make_empty() and backing file.
+
+5) The hidden-disk is created automatically. It buffers the original content
+that is modified by the primary VM. It should also start as an empty disk,
+and the driver supports bdrv_make_empty() and backing file.
+
+6) The drive-backup job (sync=none) is run to allow hidden-disk to buffer
+any state that would otherwise be lost by the speculative write-through
+of the NBD server into the secondary disk. So before block replication,
+the primary disk and secondary disk should contain the same data.
+
+== Failure Handling ==
+There are 7 internal errors when block replication is running:
+1. I/O error on primary disk
+2. Forwarding primary write requests failed
+3. Backup failed
+4. I/O error on secondary disk
+5. I/O error on active disk
+6. Making active disk or hidden disk empty failed
+7. Doing failover failed
+In case 1 and 5, we just report the error to the disk layer. In case 2, 3,
+4 and 6, we just report block replication's error to FT/HA manager (which
+decides when to do a new checkpoint, when to do failover).
+In case 7, if active commit failed, we use the replication failover failed
+state in Secondary's write operation (which decides which target to write).
+
+== New block driver interface ==
+We add four block driver interfaces to control block replication:
+a. replication_start_all()
+   Start block replication, called in migration/checkpoint thread.
+   We must call replication_start_all() in secondary QEMU before
+   calling replication_start_all() in primary QEMU. The caller
+   must hold the I/O mutex lock if it is in migration/checkpoint
+   thread.
+b. replication_do_checkpoint_all()
+   This interface is called after all VM state is transferred to
+   Secondary QEMU. The Disk buffer will be dropped in this interface.
+   The caller must hold the I/O mutex lock if it is in migration/checkpoint
+   thread.
+c. replication_get_error_all()
+   This interface is called to check if error happened in replication.
+   The caller must hold the I/O mutex lock if it is in migration/checkpoint
+   thread.
+d. replication_stop_all()
+   It is called on failover. We will flush the Disk buffer into
+   Secondary Disk and stop block replication. The vm should be stopped
+   before calling it if you use this API to shutdown the guest, or other
+   things except failover. The caller must hold the I/O mutex lock if it is
+   in migration/checkpoint thread.
+
+== Usage ==
+Primary:
+  -drive if=xxx,driver=quorum,read-pattern=fifo,id=colo1,vote-threshold=1,\
+         children.0.file.filename=1.raw,\
+         children.0.driver=raw
+
+  Run qmp command in primary qemu:
+    { 'execute': 'human-monitor-command',
+      'arguments': {
+          'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xxxx,file.port=xxxx,file.export=colo1,node-name=nbd_client1'
+      }
+    }
+    { 'execute': 'x-blockdev-change',
+      'arguments': {
+          'parent': 'colo1',
+          'node': 'nbd_client1'
+      }
+    }
+  Note:
+  1. There should be only one NBD Client for each primary disk.
+  2. host is the secondary physical machine's hostname or IP
+  3. Each disk must have its own export name.
+  4. It is all a single argument to -drive and you should ignore the
+     leading whitespace.
+  5. The qmp command line must be run after running qmp command line in
+     secondary qemu.
+  6. After failover we need to remove children.1 (replication driver).
+
+Secondary:
+  -drive if=none,driver=raw,file.filename=1.raw,id=colo1 \
+  -drive if=xxx,id=topxxx,driver=replication,mode=secondary,top-id=topxxx,\
+         file.file.filename=active_disk.qcow2,\
+         file.driver=qcow2,\
+         file.backing.file.filename=hidden_disk.qcow2,\
+         file.backing.driver=qcow2,\
+         file.backing.backing=colo1
+
+  Then run qmp command in secondary qemu:
+    { 'execute': 'nbd-server-start',
+      'arguments': {
+          'addr': {
+              'type': 'inet',
+              'data': {
+                  'host': 'xxx',
+                  'port': 'xxx'
+              }
+          }
+      }
+    }
+    { 'execute': 'nbd-server-add',
+      'arguments': {
+          'device': 'colo1',
+          'writable': true
+      }
+    }
+
+  Note:
+  1. The export name in secondary QEMU command line is the secondary
+     disk's id.
+  2. The export name for the same disk must be the same
+  3. The qmp command nbd-server-start and nbd-server-add must be run
+     before running the qmp command migrate on primary QEMU
+  4. Active disk, hidden disk and nbd target's length should be the
+     same.
+  5. It is better to put active disk and hidden disk in ramdisk.
+  6. It is all a single argument to -drive, and you should ignore
+     the leading whitespace.
+
+After Failover:
+Primary:
+  The secondary host is down, so we should run the following qmp command
+  to remove the nbd child from the quorum:
+  { 'execute': 'x-blockdev-change',
+    'arguments': {
+        'parent': 'colo1',
+        'child': 'children.1'
+    }
+  }
+  { 'execute': 'human-monitor-command',
+    'arguments': {
+        'command-line': 'drive_del xxxx'
+    }
+  }
+  Note: there is no qmp command to remove the blockdev now
+
+Secondary:
+  The primary host is down, so we should do the following thing:
+  { 'execute': 'nbd-server-stop' }
+
+TODO:
+1. Continuous block replication
+2. Shared disk
-- 
1.9.3

* [Qemu-devel] [PATCH v19 06/10] auto complete active commit
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (4 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 05/10] docs: block replication's description Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 07/10] Introduce new APIs to do replication operation Changlong Xie
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Auto-complete the mirror job in the background to avoid
blocking synchronously.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 block/mirror.c            | 13 +++++++++----
 blockdev.c                |  2 +-
 include/block/block_int.h |  3 ++-
 qemu-img.c                |  2 +-
 4 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index b9986d8..385b189 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -801,7 +801,8 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
                              BlockCompletionFunc *cb,
                              void *opaque, Error **errp,
                              const BlockJobDriver *driver,
-                             bool is_none_mode, BlockDriverState *base)
+                             bool is_none_mode, BlockDriverState *base,
+                             bool auto_complete)
 {
     MirrorBlockJob *s;
     BlockDriverState *replaced_bs;
@@ -850,6 +851,9 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
     s->granularity = granularity;
     s->buf_size = ROUND_UP(buf_size, granularity);
     s->unmap = unmap;
+    if (auto_complete) {
+        s->should_complete = true;
+    }
 
     s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
     if (!s->dirty_bitmap) {
@@ -886,14 +890,15 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
     mirror_start_job(bs, target, replaces,
                      speed, granularity, buf_size,
                      on_source_error, on_target_error, unmap, cb, opaque, errp,
-                     &mirror_job_driver, is_none_mode, base);
+                     &mirror_job_driver, is_none_mode, base, false);
 }
 
 void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
                          int64_t speed,
                          BlockdevOnError on_error,
                          BlockCompletionFunc *cb,
-                         void *opaque, Error **errp)
+                         void *opaque, Error **errp,
+                         bool auto_complete)
 {
     int64_t length, base_length;
     int orig_base_flags;
@@ -934,7 +939,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
     bdrv_ref(base);
     mirror_start_job(bs, base, NULL, speed, 0, 0,
                      on_error, on_error, false, cb, opaque, &local_err,
-                     &commit_active_job_driver, false, base);
+                     &commit_active_job_driver, false, base, auto_complete);
     if (local_err) {
         error_propagate(errp, local_err);
         goto error_restore_flags;
diff --git a/blockdev.c b/blockdev.c
index 40e4e6f..90de201 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3163,7 +3163,7 @@ void qmp_block_commit(const char *device,
             goto out;
         }
         commit_active_start(bs, base_bs, speed, on_error, block_job_cb,
-                            bs, &local_err);
+                            bs, &local_err, false);
     } else {
         commit_start(bs, base_bs, top_bs, speed, on_error, block_job_cb, bs,
                      has_backing_file ? backing_file : NULL, &local_err);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index b6f4755..4fc60f7 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -653,13 +653,14 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
  * @cb: Completion function for the job.
  * @opaque: Opaque pointer value passed to @cb.
  * @errp: Error object.
+ * @auto_complete: Auto complete the job.
  *
  */
 void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
                          int64_t speed,
                          BlockdevOnError on_error,
                          BlockCompletionFunc *cb,
-                         void *opaque, Error **errp);
+                         void *opaque, Error **errp, bool auto_complete);
 /*
  * mirror_start:
  * @bs: Block device to operate on.
diff --git a/qemu-img.c b/qemu-img.c
index 4792366..1e04771 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -911,7 +911,7 @@ static int img_commit(int argc, char **argv)
     };
 
     commit_active_start(bs, base_bs, 0, BLOCKDEV_ON_ERROR_REPORT,
-                        common_block_job_cb, &cbi, &local_err);
+                        common_block_job_cb, &cbi, &local_err, false);
     if (local_err) {
         goto done;
     }
-- 
1.9.3
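
The effect of the new auto_complete parameter can be illustrated with a toy
model. The sketch below is not QEMU code: ToyMirrorJob and the toy_job_*()
functions are made-up names, and the loop is reduced to one step. It assumes
that an explicit completion request (the role block-job-complete plays) works
by setting should_complete, which is the same flag the patch presets when
auto_complete is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the mirror job state; not QEMU's MirrorBlockJob. */
typedef struct ToyMirrorJob {
    bool should_complete;   /* finish once source and target have converged */
    bool synced;            /* all dirty data has been copied */
    bool finished;          /* the job has completed */
} ToyMirrorJob;

/* Mirrors the patch: auto_complete presets should_complete at start. */
static void toy_job_start(ToyMirrorJob *job, bool auto_complete)
{
    assert(job != NULL);
    job->should_complete = false;
    job->synced = false;
    job->finished = false;
    if (auto_complete) {
        job->should_complete = true;
    }
}

/* Explicit completion request from the user of the job. */
static void toy_job_complete(ToyMirrorJob *job)
{
    assert(job != NULL);
    job->should_complete = true;
}

/* One pass of the job loop after the remaining dirty data was copied. */
static void toy_job_iterate(ToyMirrorJob *job)
{
    assert(job != NULL);
    job->synced = true;
    if (job->should_complete) {
        job->finished = true;
    }
}
```

With auto_complete set, the job finishes on the first pass in which it is
synced. Without it, the job stays in the synced state until
toy_job_complete() is called, which is what the existing callers keep doing
by passing false.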


* [Qemu-devel] [PATCH v19 07/10] Introduce new APIs to do replication operation
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (5 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 06/10] auto complete active commit Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 Makefile.objs        |   1 +
 qapi/block-core.json |  13 ++++
 replication.c        | 105 ++++++++++++++++++++++++++++++
 replication.h        | 176 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 295 insertions(+)
 create mode 100644 replication.c
 create mode 100644 replication.h

diff --git a/Makefile.objs b/Makefile.objs
index 8f705f6..30d403f 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -15,6 +15,7 @@ block-obj-$(CONFIG_POSIX) += aio-posix.o
 block-obj-$(CONFIG_WIN32) += aio-win32.o
 block-obj-y += block/
 block-obj-y += qemu-io-cmds.o
+block-obj-y += replication.o
 
 block-obj-m = block/
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 98a20d2..e56cdf4 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2032,6 +2032,19 @@
             '*read-pattern': 'QuorumReadPattern' } }
 
 ##
+# @ReplicationMode
+#
+# An enumeration of replication modes.
+#
+# @primary: Primary mode, the vm's state will be sent to secondary QEMU.
+#
+# @secondary: Secondary mode, receive the vm's state from primary QEMU.
+#
+# Since: 2.7
+##
+{ 'enum' : 'ReplicationMode', 'data' : [ 'primary', 'secondary' ] }
+
+##
 # @BlockdevOptions
 #
 # Options for creating a block device.  Many options are available for all
diff --git a/replication.c b/replication.c
new file mode 100644
index 0000000..03f4a2b
--- /dev/null
+++ b/replication.c
@@ -0,0 +1,105 @@
+/*
+ * Replication filter
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 Intel Corporation
+ * Copyright (c) 2016 FUJITSU LIMITED
+ *
+ * Author:
+ *   Changlong Xie <xiecl.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "replication.h"
+
+static QLIST_HEAD(, ReplicationState) replication_states;
+
+ReplicationState *replication_new(void *opaque, ReplicationOps *ops)
+{
+    ReplicationState *rs;
+
+    assert(ops != NULL);
+    rs = g_new0(ReplicationState, 1);
+    rs->opaque = opaque;
+    rs->ops = ops;
+    QLIST_INSERT_HEAD(&replication_states, rs, node);
+
+    return rs;
+}
+
+void replication_remove(ReplicationState *rs)
+{
+    if (rs) {
+        QLIST_REMOVE(rs, node);
+        g_free(rs);
+    }
+}
+
+/*
+ * The caller of this function MUST make sure the VM is stopped
+ */
+void replication_start_all(ReplicationMode mode, Error **errp)
+{
+    ReplicationState *rs, *next;
+    Error *local_err = NULL;
+
+    QLIST_FOREACH_SAFE(rs, &replication_states, node, next) {
+        if (rs->ops && rs->ops->start) {
+            rs->ops->start(rs, mode, &local_err);
+        }
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
+
+void replication_do_checkpoint_all(Error **errp)
+{
+    ReplicationState *rs, *next;
+    Error *local_err = NULL;
+
+    QLIST_FOREACH_SAFE(rs, &replication_states, node, next) {
+        if (rs->ops && rs->ops->checkpoint) {
+            rs->ops->checkpoint(rs, &local_err);
+        }
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
+
+void replication_get_error_all(Error **errp)
+{
+    ReplicationState *rs, *next;
+    Error *local_err = NULL;
+
+    QLIST_FOREACH_SAFE(rs, &replication_states, node, next) {
+        if (rs->ops && rs->ops->get_error) {
+            rs->ops->get_error(rs, &local_err);
+        }
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
+
+void replication_stop_all(bool failover, Error **errp)
+{
+    ReplicationState *rs, *next;
+    Error *local_err = NULL;
+
+    QLIST_FOREACH_SAFE(rs, &replication_states, node, next) {
+        if (rs->ops && rs->ops->stop) {
+            rs->ops->stop(rs, failover, &local_err);
+        }
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+}
diff --git a/replication.h b/replication.h
new file mode 100644
index 0000000..d9db696
--- /dev/null
+++ b/replication.h
@@ -0,0 +1,176 @@
+/*
+ * Replication filter
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 Intel Corporation
+ * Copyright (c) 2016 FUJITSU LIMITED
+ *
+ * Author:
+ *   Changlong Xie <xiecl.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef REPLICATION_H
+#define REPLICATION_H
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "sysemu/sysemu.h"
+
+typedef struct ReplicationOps ReplicationOps;
+typedef struct ReplicationState ReplicationState;
+
+/**
+ * SECTION:replication.h
+ * @title:Base Replication System
+ * @short_description: interfaces for handling replication
+ *
+ * The Replication Model provides a framework for handling Replication
+ *
+ * <example>
+ *   <title>How to use replication interfaces</title>
+ *   <programlisting>
+ * #include "replication.h"
+ *
+ * typedef struct BDRVReplicationState {
+ *     ReplicationState *rs;
+ * } BDRVReplicationState;
+ *
+ * static void replication_start(ReplicationState *rs, ReplicationMode mode,
+ *                               Error **errp);
+ * static void replication_do_checkpoint(ReplicationState *rs, Error **errp);
+ * static void replication_get_error(ReplicationState *rs, Error **errp);
+ * static void replication_stop(ReplicationState *rs, bool failover,
+ *                              Error **errp);
+ *
+ * static ReplicationOps replication_ops = {
+ *     .start = replication_start,
+ *     .checkpoint = replication_do_checkpoint,
+ *     .get_error = replication_get_error,
+ *     .stop = replication_stop,
+ * };
+ *
+ * static int replication_open(BlockDriverState *bs, QDict *options,
+ *                             int flags, Error **errp)
+ * {
+ *     BDRVReplicationState *s = bs->opaque;
+ *     s->rs = replication_new(bs, &replication_ops);
+ *     return 0;
+ * }
+ *
+ * static void replication_close(BlockDriverState *bs)
+ * {
+ *     BDRVReplicationState *s = bs->opaque;
+ *     replication_remove(s->rs);
+ * }
+ *
+ * BlockDriver bdrv_replication = {
+ *     .format_name                = "replication",
+ *     .protocol_name              = "replication",
+ *     .instance_size              = sizeof(BDRVReplicationState),
+ *
+ *     .bdrv_open                  = replication_open,
+ *     .bdrv_close                 = replication_close,
+ * };
+ *
+ * static void bdrv_replication_init(void)
+ * {
+ *     bdrv_register(&bdrv_replication);
+ * }
+ *
+ * block_init(bdrv_replication_init);
+ *   </programlisting>
+ * </example>
+ *
+ * The example above shows how to use the replication interfaces.
+ * During migration, replication_(start/stop/do_checkpoint/get_error)_all
+ * can then be used to drive all replication operations.
+ */
+
+/**
+ * ReplicationState:
+ * @opaque: opaque pointer value passed to this ReplicationState
+ * @ops: replication operation of this ReplicationState
+ * @node: node that we will insert into @replication_states QLIST
+ */
+struct ReplicationState {
+    void *opaque;
+    ReplicationOps *ops;
+    QLIST_ENTRY(ReplicationState) node;
+};
+
+/**
+ * ReplicationOps:
+ * @start: callback to start replication
+ * @stop: callback to stop replication
+ * @checkpoint: callback to do checkpoint
+ * @get_error: callback to check whether an error occurred during replication
+ */
+struct ReplicationOps {
+    void (*start)(ReplicationState *rs, ReplicationMode mode, Error **errp);
+    void (*stop)(ReplicationState *rs, bool failover, Error **errp);
+    void (*checkpoint)(ReplicationState *rs, Error **errp);
+    void (*get_error)(ReplicationState *rs, Error **errp);
+};
+
+/**
+ * replication_new:
+ * @opaque: opaque pointer value passed to ReplicationState
+ * @ops: replication operation of the new relevant ReplicationState
+ *
+ * Called to create a new ReplicationState instance, and then insert it
+ * into @replication_states QLIST
+ *
+ * Returns: the new ReplicationState instance
+ */
+ReplicationState *replication_new(void *opaque, ReplicationOps *ops);
+
+/**
+ * replication_remove:
+ * @rs: the ReplicationState instance to remove
+ *
+ * Called to remove a ReplicationState instance, and then delete it from
+ * @replication_states QLIST
+ */
+void replication_remove(ReplicationState *rs);
+
+/**
+ * replication_start_all:
+ * @mode: replication mode that could be "primary" or "secondary"
+ * @errp: returns an error if this function fails
+ *
+ * Start replication, called in migration/checkpoint thread
+ *
+ * Note: the caller of this function MUST make sure the VM is stopped
+ */
+void replication_start_all(ReplicationMode mode, Error **errp);
+
+/**
+ * replication_do_checkpoint_all:
+ * @errp: returns an error if this function fails
+ *
+ * This interface is called after all VM state is transferred to Secondary QEMU
+ */
+void replication_do_checkpoint_all(Error **errp);
+
+/**
+ * replication_get_error_all:
+ * @errp: returns an error if this function fails
+ *
+ * This interface is called to check whether an error occurred during
+ * replication
+ */
+void replication_get_error_all(Error **errp);
+
+/**
+ * replication_stop_all:
+ * @failover: boolean value that indicates whether we need to do failover
+ * @errp: returns an error if this function fails
+ *
+ * It is called on failover. If you use this API for anything other than
+ * failover, such as shutting down the guest, stop the VM before calling it.
+ */
+void replication_stop_all(bool failover, Error **errp);
+
+#endif /* REPLICATION_H */
-- 
1.9.3
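
Two properties of the replication_*_all() helpers follow from the code above:
states are visited most recently registered first, because replication_new()
inserts at the head of the list, and the walk stops at the first callback
that reports an error, so the remaining states are not called. The sketch
below reproduces that behaviour in a self-contained form. It is an
illustration only: the Toy* names are made up, a hand-rolled singly linked
list replaces QLIST, and a plain string replaces QEMU's Error object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyState ToyState;

/* Toy counterpart of ReplicationOps, reduced to the start callback. */
typedef struct ToyOps {
    void (*start)(ToyState *rs, const char **errp);
} ToyOps;

struct ToyState {
    void *opaque;
    const ToyOps *ops;
    ToyState *next;
};

static ToyState *toy_states;    /* head of the global list */

static ToyState *toy_new(void *opaque, const ToyOps *ops)
{
    ToyState *rs;

    assert(ops != NULL);
    rs = calloc(1, sizeof(*rs));
    assert(rs != NULL);
    rs->opaque = opaque;
    rs->ops = ops;
    rs->next = toy_states;      /* insert at the head of the list */
    toy_states = rs;
    return rs;
}

static void toy_remove(ToyState *rs)
{
    ToyState **p;

    if (!rs) {
        return;
    }
    /* Unlink @rs if it is on the list, then free it. */
    for (p = &toy_states; *p; p = &(*p)->next) {
        if (*p == rs) {
            *p = rs->next;
            break;
        }
    }
    free(rs);
}

/* Call every start callback; stop at the first one that reports an error. */
static void toy_start_all(const char **errp)
{
    ToyState *rs, *next;

    for (rs = toy_states; rs; rs = next) {
        const char *local_err = NULL;

        next = rs->next;
        if (rs->ops->start) {
            rs->ops->start(rs, &local_err);
        }
        if (local_err) {
            if (errp) {
                *errp = local_err;
            }
            return;
        }
    }
}

/* Example callbacks: count invocations through the opaque pointer. */
static void toy_start_ok(ToyState *rs, const char **errp)
{
    (void)errp;
    (*(int *)rs->opaque)++;
}

static void toy_start_fail(ToyState *rs, const char **errp)
{
    (*(int *)rs->opaque)++;
    *errp = "start failed";
}
```

A caller that registers three states and lets the second-newest one fail
sees only the two newest callbacks run. Whether the states that were already
started need to be stopped again in that case is up to the caller.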


* [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (6 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 07/10] Introduce new APIs to do replication operation Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-30 18:14   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
                     ` (2 more replies)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication Changlong Xie
                   ` (3 subsequent siblings)
  11 siblings, 3 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 block/Makefile.objs |   1 +
 block/replication.c | 666 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 667 insertions(+)
 create mode 100644 block/replication.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index fbfe647..5e28b45 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -23,6 +23,7 @@ block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o dirty-bitmap.o
 block-obj-y += write-threshold.o
 block-obj-y += backup.o
+block-obj-y += replication.o
 
 block-obj-y += crypto.o
 
diff --git a/block/replication.c b/block/replication.c
new file mode 100644
index 0000000..5228c42
--- /dev/null
+++ b/block/replication.c
@@ -0,0 +1,666 @@
+/*
+ * Replication Block filter
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 Intel Corporation
+ * Copyright (c) 2016 FUJITSU LIMITED
+ *
+ * Author:
+ *   Wen Congyang <wency@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "block/nbd.h"
+#include "block/blockjob.h"
+#include "block/block_int.h"
+#include "block/block_backup.h"
+#include "sysemu/block-backend.h"
+#include "qapi/error.h"
+#include "replication.h"
+
+typedef struct BDRVReplicationState {
+    ReplicationMode mode;
+    int replication_state;
+    BdrvChild *active_disk;
+    BdrvChild *hidden_disk;
+    BdrvChild *secondary_disk;
+    char *top_id;
+    ReplicationState *rs;
+    Error *blocker;
+    int orig_hidden_flags;
+    int orig_secondary_flags;
+    int error;
+} BDRVReplicationState;
+
+enum {
+    BLOCK_REPLICATION_NONE,             /* block replication is not started */
+    BLOCK_REPLICATION_RUNNING,          /* block replication is running */
+    BLOCK_REPLICATION_FAILOVER,         /* failover is running in background */
+    BLOCK_REPLICATION_FAILOVER_FAILED,  /* failover failed */
+    BLOCK_REPLICATION_DONE,             /* block replication is done */
+};
+
+static void replication_start(ReplicationState *rs, ReplicationMode mode,
+                              Error **errp);
+static void replication_do_checkpoint(ReplicationState *rs, Error **errp);
+static void replication_get_error(ReplicationState *rs, Error **errp);
+static void replication_stop(ReplicationState *rs, bool failover,
+                             Error **errp);
+
+#define REPLICATION_MODE        "mode"
+#define REPLICATION_TOP_ID      "top-id"
+static QemuOptsList replication_runtime_opts = {
+    .name = "replication",
+    .head = QTAILQ_HEAD_INITIALIZER(replication_runtime_opts.head),
+    .desc = {
+        {
+            .name = REPLICATION_MODE,
+            .type = QEMU_OPT_STRING,
+        },
+        {
+            .name = REPLICATION_TOP_ID,
+            .type = QEMU_OPT_STRING,
+        },
+        { /* end of list */ }
+    },
+};
+
+static ReplicationOps replication_ops = {
+    .start = replication_start,
+    .checkpoint = replication_do_checkpoint,
+    .get_error = replication_get_error,
+    .stop = replication_stop,
+};
+
+static int replication_open(BlockDriverState *bs, QDict *options,
+                            int flags, Error **errp)
+{
+    int ret;
+    BDRVReplicationState *s = bs->opaque;
+    Error *local_err = NULL;
+    QemuOpts *opts = NULL;
+    const char *mode;
+    const char *top_id;
+
+    ret = -EINVAL;
+    opts = qemu_opts_create(&replication_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        goto fail;
+    }
+
+    mode = qemu_opt_get(opts, REPLICATION_MODE);
+    if (!mode) {
+        error_setg(&local_err, "Missing the option mode");
+        goto fail;
+    }
+
+    if (!strcmp(mode, "primary")) {
+        s->mode = REPLICATION_MODE_PRIMARY;
+    } else if (!strcmp(mode, "secondary")) {
+        s->mode = REPLICATION_MODE_SECONDARY;
+        top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
+        s->top_id = g_strdup(top_id);
+        if (!s->top_id) {
+            error_setg(&local_err, "Missing the option top-id");
+            goto fail;
+        }
+    } else {
+        error_setg(&local_err,
+                   "The option mode's value should be primary or secondary");
+        goto fail;
+    }
+
+    s->rs = replication_new(bs, &replication_ops);
+
+    ret = 0;
+
+fail:
+    qemu_opts_del(opts);
+    error_propagate(errp, local_err);
+
+    return ret;
+}
+
+static void replication_close(BlockDriverState *bs)
+{
+    BDRVReplicationState *s = bs->opaque;
+
+    if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
+        replication_stop(s->rs, false, NULL);
+    }
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        g_free(s->top_id);
+    }
+
+    replication_remove(s->rs);
+}
+
+static int64_t replication_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->file->bs);
+}
+
+static int replication_get_io_status(BDRVReplicationState *s)
+{
+    switch (s->replication_state) {
+    case BLOCK_REPLICATION_NONE:
+        return -EIO;
+    case BLOCK_REPLICATION_RUNNING:
+        return 0;
+    case BLOCK_REPLICATION_FAILOVER:
+        return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 0;
+    case BLOCK_REPLICATION_FAILOVER_FAILED:
+        return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 1;
+    case BLOCK_REPLICATION_DONE:
+        /*
+         * The active commit job has completed, and the active disk and
+         * secondary_disk are swapped, so we can operate on bs->file directly
+         */
+        return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 0;
+    default:
+        abort();
+    }
+}
+
+static int replication_return_value(BDRVReplicationState *s, int ret)
+{
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        return ret;
+    }
+
+    if (ret < 0) {
+        s->error = ret;
+        ret = 0;
+    }
+
+    return ret;
+}
+
+static coroutine_fn int replication_co_readv(BlockDriverState *bs,
+                                             int64_t sector_num,
+                                             int remaining_sectors,
+                                             QEMUIOVector *qiov)
+{
+    BDRVReplicationState *s = bs->opaque;
+    BdrvChild *child = s->secondary_disk;
+    BlockJob *job = NULL;
+    CowRequest req;
+    int ret;
+
+    if (s->mode == REPLICATION_MODE_PRIMARY) {
+        /* We only use it to forward primary write requests */
+        return -EIO;
+    }
+
+    ret = replication_get_io_status(s);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (child && child->bs) {
+        job = child->bs->job;
+    }
+
+    if (job) {
+        backup_wait_for_overlapping_requests(child->bs->job, sector_num,
+                                             remaining_sectors);
+        backup_cow_request_begin(&req, child->bs->job, sector_num,
+                                 remaining_sectors);
+        ret = bdrv_co_readv(bs->file->bs, sector_num, remaining_sectors,
+                            qiov);
+        backup_cow_request_end(&req);
+        goto out;
+    }
+
+    ret = bdrv_co_readv(bs->file->bs, sector_num, remaining_sectors, qiov);
+out:
+    return replication_return_value(s, ret);
+}
+
+static coroutine_fn int replication_co_writev(BlockDriverState *bs,
+                                              int64_t sector_num,
+                                              int remaining_sectors,
+                                              QEMUIOVector *qiov)
+{
+    BDRVReplicationState *s = bs->opaque;
+    QEMUIOVector hd_qiov;
+    uint64_t bytes_done = 0;
+    BdrvChild *top = bs->file;
+    BdrvChild *base = s->secondary_disk;
+    BlockDriverState *target;
+    int ret, n;
+
+    ret = replication_get_io_status(s);
+    if (ret < 0) {
+        goto out;
+    }
+
+    if (ret == 0) {
+        ret = bdrv_co_writev(top->bs, sector_num,
+                             remaining_sectors, qiov);
+        return replication_return_value(s, ret);
+    }
+
+    /*
+     * Failover failed, only write to active disk if the sectors
+     * have already been allocated in active disk/hidden disk.
+     */
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    while (remaining_sectors > 0) {
+        ret = bdrv_is_allocated_above(top->bs, base->bs, sector_num,
+                                      remaining_sectors, &n);
+        if (ret < 0) {
+            goto out1;
+        }
+
+        qemu_iovec_reset(&hd_qiov);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, n * BDRV_SECTOR_SIZE);
+
+        target = ret ? (top->bs) : (base->bs);
+        ret = bdrv_co_writev(target, sector_num, n, &hd_qiov);
+        if (ret < 0) {
+            goto out1;
+        }
+
+        remaining_sectors -= n;
+        sector_num += n;
+        bytes_done += n * BDRV_SECTOR_SIZE;
+    }
+
+out1:
+    qemu_iovec_destroy(&hd_qiov);
+out:
+    return ret;
+}
+
+static bool replication_recurse_is_first_non_filter(BlockDriverState *bs,
+                                                    BlockDriverState *candidate)
+{
+    return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate);
+}
+
+static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+{
+    Error *local_err = NULL;
+    int ret;
+
+    if (!s->secondary_disk->bs->job) {
+        error_setg(errp, "Backup job was cancelled unexpectedly");
+        return;
+    }
+
+    backup_do_checkpoint(s->secondary_disk->bs->job, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    ret = s->active_disk->bs->drv->bdrv_make_empty(s->active_disk->bs);
+    if (ret < 0) {
+        error_setg(errp, "Cannot make active disk empty");
+        return;
+    }
+
+    ret = s->hidden_disk->bs->drv->bdrv_make_empty(s->hidden_disk->bs);
+    if (ret < 0) {
+        error_setg(errp, "Cannot make hidden disk empty");
+        return;
+    }
+}
+
+static void reopen_backing_file(BDRVReplicationState *s, bool writable,
+                                Error **errp)
+{
+    BlockReopenQueue *reopen_queue = NULL;
+    int orig_hidden_flags, orig_secondary_flags;
+    int new_hidden_flags, new_secondary_flags;
+    Error *local_err = NULL;
+
+    if (writable) {
+        orig_hidden_flags = s->orig_hidden_flags =
+                                bdrv_get_flags(s->hidden_disk->bs);
+        new_hidden_flags = (orig_hidden_flags | BDRV_O_RDWR) &
+                                                    ~BDRV_O_INACTIVE;
+        orig_secondary_flags = s->orig_secondary_flags =
+                                bdrv_get_flags(s->secondary_disk->bs);
+        new_secondary_flags = (orig_secondary_flags | BDRV_O_RDWR) &
+                                                     ~BDRV_O_INACTIVE;
+    } else {
+        orig_hidden_flags = (s->orig_hidden_flags | BDRV_O_RDWR) &
+                                                    ~BDRV_O_INACTIVE;
+        new_hidden_flags = s->orig_hidden_flags;
+        orig_secondary_flags = (s->orig_secondary_flags | BDRV_O_RDWR) &
+                                                    ~BDRV_O_INACTIVE;
+        new_secondary_flags = s->orig_secondary_flags;
+    }
+
+    if (orig_hidden_flags != new_hidden_flags) {
+        reopen_queue = bdrv_reopen_queue(reopen_queue, s->hidden_disk->bs, NULL,
+                                         new_hidden_flags);
+    }
+
+    if (!(orig_secondary_flags & BDRV_O_RDWR)) {
+        reopen_queue = bdrv_reopen_queue(reopen_queue, s->secondary_disk->bs,
+                                         NULL, new_secondary_flags);
+    }
+
+    if (reopen_queue) {
+        bdrv_reopen_multiple(reopen_queue, &local_err);
+        error_propagate(errp, local_err);
+    }
+}
+
+static void backup_job_cleanup(BDRVReplicationState *s)
+{
+    BlockDriverState *top_bs;
+
+    top_bs = bdrv_lookup_bs(s->top_id, s->top_id, NULL);
+    if (!top_bs) {
+        return;
+    }
+    bdrv_op_unblock_all(top_bs, s->blocker);
+    error_free(s->blocker);
+    reopen_backing_file(s, false, NULL);
+}
+
+static void backup_job_completed(void *opaque, int ret)
+{
+    BDRVReplicationState *s = opaque;
+
+    if (s->replication_state != BLOCK_REPLICATION_FAILOVER) {
+        /* The backup job is cancelled unexpectedly */
+        s->error = -EIO;
+    }
+
+    backup_job_cleanup(s);
+}
+
+static bool check_top_bs(BlockDriverState *top_bs, BlockDriverState *bs)
+{
+    BdrvChild *child;
+
+    /* The bs itself is the top_bs */
+    if (top_bs == bs) {
+        return true;
+    }
+
+    /* Iterate over top_bs's children */
+    QLIST_FOREACH(child, &top_bs->children, next) {
+        if (child->bs == bs || check_top_bs(child->bs, bs)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+static void replication_start(ReplicationState *rs, ReplicationMode mode,
+                              Error **errp)
+{
+    BlockDriverState *bs = rs->opaque;
+    BDRVReplicationState *s;
+    BlockDriverState *top_bs;
+    int64_t active_length, hidden_length, disk_length;
+    AioContext *aio_context;
+    Error *local_err = NULL;
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+    s = bs->opaque;
+
+    if (s->replication_state != BLOCK_REPLICATION_NONE) {
+        error_setg(errp, "Block replication is running or done");
+        aio_context_release(aio_context);
+        return;
+    }
+
+    if (s->mode != mode) {
+        error_setg(errp, "The parameter mode's value is invalid, needs %d,"
+                   " but got %d", s->mode, mode);
+        aio_context_release(aio_context);
+        return;
+    }
+
+    switch (s->mode) {
+    case REPLICATION_MODE_PRIMARY:
+        break;
+    case REPLICATION_MODE_SECONDARY:
+        s->active_disk = bs->file;
+        if (!s->active_disk || !s->active_disk->bs ||
+                                    !s->active_disk->bs->backing) {
+            error_setg(errp, "Active disk doesn't have backing file");
+            aio_context_release(aio_context);
+            return;
+        }
+
+        s->hidden_disk = s->active_disk->bs->backing;
+        if (!s->hidden_disk->bs || !s->hidden_disk->bs->backing) {
+            error_setg(errp, "Hidden disk doesn't have backing file");
+            aio_context_release(aio_context);
+            return;
+        }
+
+        s->secondary_disk = s->hidden_disk->bs->backing;
+        if (!s->secondary_disk->bs || !bdrv_has_blk(s->secondary_disk->bs)) {
+            error_setg(errp, "The secondary disk doesn't have block backend");
+            aio_context_release(aio_context);
+            return;
+        }
+
+        /* verify the length */
+        active_length = bdrv_getlength(s->active_disk->bs);
+        hidden_length = bdrv_getlength(s->hidden_disk->bs);
+        disk_length = bdrv_getlength(s->secondary_disk->bs);
+        if (active_length < 0 || hidden_length < 0 || disk_length < 0 ||
+            active_length != hidden_length || hidden_length != disk_length) {
+            error_setg(errp, "active disk, hidden disk, secondary disk's length"
+                       " are not the same");
+            aio_context_release(aio_context);
+            return;
+        }
+
+        if (!s->active_disk->bs->drv->bdrv_make_empty ||
+            !s->hidden_disk->bs->drv->bdrv_make_empty) {
+            error_setg(errp,
+                       "active disk or hidden disk doesn't support make_empty");
+            aio_context_release(aio_context);
+            return;
+        }
+
+        /* reopen the backing file in r/w mode */
+        reopen_backing_file(s, true, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            aio_context_release(aio_context);
+            return;
+        }
+
+        /* start backup job now */
+        error_setg(&s->blocker,
+                   "block device is in use by internal backup job");
+
+        top_bs = bdrv_lookup_bs(s->top_id, s->top_id, errp);
+        if (!top_bs || !check_top_bs(top_bs, bs)) {
+            reopen_backing_file(s, false, NULL);
+            aio_context_release(aio_context);
+            return;
+        }
+        bdrv_op_block_all(top_bs, s->blocker);
+        bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
+
+        /*
+         * Must protect backup target if backup job was stopped/cancelled
+         * unexpectedly
+         */
+        bdrv_ref(s->hidden_disk->bs);
+
+        backup_start(s->secondary_disk->bs, s->hidden_disk->bs, 0,
+                     MIRROR_SYNC_MODE_NONE, NULL, BLOCKDEV_ON_ERROR_REPORT,
+                     BLOCKDEV_ON_ERROR_REPORT, backup_job_completed,
+                     s, NULL, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            backup_job_cleanup(s);
+            bdrv_unref(s->hidden_disk->bs);
+            aio_context_release(aio_context);
+            return;
+        }
+        break;
+    default:
+        aio_context_release(aio_context);
+        abort();
+    }
+
+    s->replication_state = BLOCK_REPLICATION_RUNNING;
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        secondary_do_checkpoint(s, errp);
+    }
+
+    s->error = 0;
+    aio_context_release(aio_context);
+}
+
+static void replication_do_checkpoint(ReplicationState *rs, Error **errp)
+{
+    BlockDriverState *bs = rs->opaque;
+    BDRVReplicationState *s;
+    AioContext *aio_context;
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+    s = bs->opaque;
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        secondary_do_checkpoint(s, errp);
+    }
+    aio_context_release(aio_context);
+}
+
+static void replication_get_error(ReplicationState *rs, Error **errp)
+{
+    BlockDriverState *bs = rs->opaque;
+    BDRVReplicationState *s;
+    AioContext *aio_context;
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+    s = bs->opaque;
+
+    if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
+        error_setg(errp, "Block replication is not running");
+        aio_context_release(aio_context);
+        return;
+    }
+
+    if (s->error) {
+        error_setg(errp, "I/O error occurred");
+        aio_context_release(aio_context);
+        return;
+    }
+    aio_context_release(aio_context);
+}
+
+static void replication_done(void *opaque, int ret)
+{
+    BlockDriverState *bs = opaque;
+    BDRVReplicationState *s = bs->opaque;
+
+    if (ret == 0) {
+        s->replication_state = BLOCK_REPLICATION_DONE;
+
+        /* refresh top bs's filename */
+        bdrv_refresh_filename(bs);
+        s->active_disk = NULL;
+        s->secondary_disk = NULL;
+        s->hidden_disk = NULL;
+        s->error = 0;
+    } else {
+        s->replication_state = BLOCK_REPLICATION_FAILOVER_FAILED;
+        s->error = -EIO;
+    }
+}
+
+static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
+{
+    BlockDriverState *bs = rs->opaque;
+    BDRVReplicationState *s;
+    AioContext *aio_context;
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+    s = bs->opaque;
+
+    if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
+        error_setg(errp, "Block replication is not running");
+        aio_context_release(aio_context);
+        return;
+    }
+
+    switch (s->mode) {
+    case REPLICATION_MODE_PRIMARY:
+        s->replication_state = BLOCK_REPLICATION_DONE;
+        s->error = 0;
+        break;
+    case REPLICATION_MODE_SECONDARY:
+        if (!failover) {
+            /*
+             * This BDS will be closed, and the job should be completed
+             * before the BDS is closed, because we will access hidden
+             * disk, secondary disk in backup_job_completed().
+             */
+            if (s->secondary_disk->bs->job) {
+                block_job_cancel_sync(s->secondary_disk->bs->job);
+            }
+            secondary_do_checkpoint(s, errp);
+            s->replication_state = BLOCK_REPLICATION_DONE;
+            aio_context_release(aio_context);
+            return;
+        }
+
+        s->replication_state = BLOCK_REPLICATION_FAILOVER;
+        if (s->secondary_disk->bs->job) {
+            block_job_cancel(s->secondary_disk->bs->job);
+        }
+
+        commit_active_start(s->active_disk->bs, s->secondary_disk->bs, 0,
+                            BLOCKDEV_ON_ERROR_REPORT, replication_done,
+                            bs, errp, true);
+        break;
+    default:
+        aio_context_release(aio_context);
+        abort();
+    }
+    aio_context_release(aio_context);
+}
+
+BlockDriver bdrv_replication = {
+    .format_name                = "replication",
+    .protocol_name              = "replication",
+    .instance_size              = sizeof(BDRVReplicationState),
+
+    .bdrv_open                  = replication_open,
+    .bdrv_close                 = replication_close,
+
+    .bdrv_getlength             = replication_getlength,
+    .bdrv_co_readv              = replication_co_readv,
+    .bdrv_co_writev             = replication_co_writev,
+
+    .is_filter                  = true,
+    .bdrv_recurse_is_first_non_filter = replication_recurse_is_first_non_filter,
+
+    .has_variable_length        = true,
+};
+
+static void bdrv_replication_init(void)
+{
+    bdrv_register(&bdrv_replication);
+}
+
+block_init(bdrv_replication_init);
-- 
1.9.3


* [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (7 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-27  1:46   ` Changlong Xie
  2016-05-30 17:34   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 10/10] support replication driver in blockdev-add Changlong Xie
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 tests/.gitignore         |   1 +
 tests/Makefile           |   4 +
 tests/test-replication.c | 523 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 528 insertions(+)
 create mode 100644 tests/test-replication.c

diff --git a/tests/.gitignore b/tests/.gitignore
index a06a8ba..d22ab06 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -58,6 +58,7 @@ test-qmp-introspect.[ch]
 test-qmp-marshal.c
 test-qmp-output-visitor
 test-rcu-list
+test-replication
 test-rfifolock
 test-string-input-visitor
 test-string-output-visitor
diff --git a/tests/Makefile b/tests/Makefile
index 9dddde6..9e6d31d 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -103,6 +103,7 @@ check-unit-y += tests/test-crypto-xts$(EXESUF)
 check-unit-y += tests/test-crypto-block$(EXESUF)
 gcov-files-test-logging-y = tests/test-logging.c
 check-unit-y += tests/test-logging$(EXESUF)
+check-unit-y += tests/test-replication$(EXESUF)
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 
@@ -448,6 +449,9 @@ tests/test-base64$(EXESUF): tests/test-base64.o \
 
 tests/test-logging$(EXESUF): tests/test-logging.o $(test-util-obj-y)
 
+tests/test-replication$(EXESUF): tests/test-replication.o $(test-util-obj-y) \
+	$(test-block-obj-y)
+
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/tests/qapi-schema/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
 	$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
diff --git a/tests/test-replication.c b/tests/test-replication.c
new file mode 100644
index 0000000..e998d46
--- /dev/null
+++ b/tests/test-replication.c
@@ -0,0 +1,523 @@
+/*
+ * Block replication tests
+ *
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Author: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "replication.h"
+#include "block/block_int.h"
+#include "sysemu/block-backend.h"
+
+#define IMG_SIZE (64 * 1024 * 1024)
+
+/* primary */
+#define P_LOCAL_DISK "/tmp/p_local_disk.XXXXXX"
+#define P_COMMAND "driver=replication,mode=primary,node-name=xxx,"\
+                  "file.driver=qcow2,file.file.filename="P_LOCAL_DISK
+
+/* secondary */
+#define S_LOCAL_DISK "/tmp/s_local_disk.XXXXXX"
+#define S_ACTIVE_DISK "/tmp/s_active_disk.XXXXXX"
+#define S_HIDDEN_DISK "/tmp/s_hidden_disk.XXXXXX"
+#define S_ID "secondary-id"
+#define S_LOCAL_DISK_ID "secondary-local-disk-id"
+#define S_COMMAND1 "file.filename="S_LOCAL_DISK",driver=qcow2"
+#define S_COMMAND2 "driver=replication,mode=secondary,top-id="S_ID","\
+                   "file.driver=qcow2,file.file.filename="S_ACTIVE_DISK","\
+                   "file.backing.driver=qcow2,file.backing.file.filename="\
+                   ""S_HIDDEN_DISK",file.backing.backing="S_LOCAL_DISK_ID
+
+/* FIXME: steal from blockdev.c */
+QemuOptsList qemu_drive_opts = {
+    .name = "drive",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_drive_opts.head),
+    .desc = {
+        { /* end of list */ }
+    },
+};
+
+static void io_read(BlockDriverState *bs, long pattern, int64_t pattern_offset,
+                    int64_t pattern_count, int64_t offset, int64_t count,
+                    bool expect_failed)
+{
+    char *buf;
+    void *cmp_buf;
+    int ret;
+
+    /* 1. alloc pattern buffer */
+    if (pattern) {
+        cmp_buf = g_malloc(pattern_count);
+        memset(cmp_buf, pattern, pattern_count);
+    }
+
+    /* 2. alloc read buffer */
+    buf = qemu_blockalign(bs, count);
+    memset(buf, 0xab, count);
+
+    /* 3. do read */
+    ret = bdrv_read(bs, offset >> 9, (uint8_t *)buf, count >> 9);
+
+    /* 4. assert and compare buf */
+    if (expect_failed) {
+        g_assert(ret < 0);
+    } else {
+        g_assert(ret >= 0);
+        if (pattern) {
+            g_assert(memcmp(buf + pattern_offset, cmp_buf, pattern_count) <= 0);
+            g_free(cmp_buf);
+        }
+    }
+    g_free(buf);
+}
+
+static void io_write(BlockDriverState *bs, long pattern, int64_t pattern_count,
+                     int64_t offset, int64_t count, bool expect_failed)
+{
+    void *pattern_buf;
+    int ret;
+
+    /* 1. alloc pattern buffer */
+    if (pattern) {
+        pattern_buf = g_malloc(pattern_count);
+        memset(pattern_buf, pattern, pattern_count);
+    }
+
+    /* 2. do write */
+    if (pattern) {
+        ret = bdrv_write(bs, offset >> 9, (uint8_t *)pattern_buf, count >> 9);
+    } else {
+        ret = bdrv_write_zeroes(bs, offset >> 9, count >> 9, 0);
+    }
+
+    /* 3. assert */
+    if (expect_failed) {
+        g_assert(ret < 0);
+    } else {
+        g_assert(ret >= 0);
+        g_free(pattern_buf);
+    }
+}
+
+static void prepare_imgs(void)
+{
+    Error *local_err = NULL;
+
+    /* Primary */
+    bdrv_img_create(P_LOCAL_DISK, "qcow2", NULL, NULL, NULL, IMG_SIZE,
+                    BDRV_O_RDWR, &local_err, true);
+    g_assert(!local_err);
+
+    /* Secondary */
+    bdrv_img_create(S_LOCAL_DISK, "qcow2", NULL, NULL, NULL, IMG_SIZE,
+                    BDRV_O_RDWR, &local_err, true);
+    g_assert(!local_err);
+    bdrv_img_create(S_ACTIVE_DISK, "qcow2", NULL, NULL, NULL, IMG_SIZE,
+                    BDRV_O_RDWR, &local_err, true);
+    g_assert(!local_err);
+    bdrv_img_create(S_HIDDEN_DISK, "qcow2", NULL, NULL, NULL, IMG_SIZE,
+                    BDRV_O_RDWR, &local_err, true);
+    g_assert(!local_err);
+}
+
+static void cleanup_imgs(void)
+{
+    /* Primary */
+    unlink(P_LOCAL_DISK);
+
+    /* Secondary */
+    unlink(S_LOCAL_DISK);
+    unlink(S_ACTIVE_DISK);
+    unlink(S_HIDDEN_DISK);
+}
+
+static BlockDriverState *start_primary(void)
+{
+    BlockDriverState *bs;
+    QemuOpts *opts;
+    QDict *qdict;
+    Error *local_err = NULL;
+    int ret;
+
+    /* init Primary BS without BB */
+    bs = bdrv_new();
+    g_assert(bs);
+
+    opts = qemu_opts_parse_noisily(&qemu_drive_opts, P_COMMAND, false);
+    qdict = qemu_opts_to_qdict(opts, NULL);
+
+    qdict_set_default_str(qdict, BDRV_OPT_CACHE_DIRECT, "off");
+    qdict_set_default_str(qdict, BDRV_OPT_CACHE_NO_FLUSH, "off");
+
+    ret = bdrv_open(&bs, NULL, NULL, qdict, BDRV_O_RDWR, &local_err);
+    g_assert(ret >= 0);
+    g_assert(!local_err);
+
+    qemu_opts_del(opts);
+
+    return bs;
+}
+
+static void teardown_primary(BlockDriverState *bs)
+{
+    /* only destroy BS, since we didn't initialize BB in Primary */
+    bdrv_unref(bs);
+}
+
+static void test_primary_read(void)
+{
+    BlockDriverState *bs;
+
+    bs = start_primary();
+    /* read from 0 to IMG_SIZE */
+    io_read(bs, 0, 0, IMG_SIZE, 0, IMG_SIZE, true);
+
+    teardown_primary(bs);
+}
+
+static void test_primary_write(void)
+{
+    BlockDriverState *bs;
+
+    bs = start_primary();
+    /* write from 0 to IMG_SIZE */
+    io_write(bs, 0, IMG_SIZE, 0, IMG_SIZE, true);
+
+    teardown_primary(bs);
+}
+
+static void test_primary_start(void)
+{
+    BlockDriverState *bs;
+    Error *local_err = NULL;
+
+    bs = start_primary();
+
+    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+    g_assert(!local_err);
+    /* read from 0 to IMG_SIZE */
+    io_read(bs, 0, 0, IMG_SIZE, 0, IMG_SIZE, true);
+
+    /* write 0x22 from 0 to IMG_SIZE */
+    io_write(bs, 0x22, IMG_SIZE, 0, IMG_SIZE, false);
+
+    teardown_primary(bs);
+}
+
+static void test_primary_stop(void)
+{
+    BlockDriverState *bs;
+    Error *local_err = NULL;
+    bool failover = true;
+
+    bs = start_primary();
+
+    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+    g_assert(!local_err);
+
+    replication_stop_all(failover, &local_err);
+    g_assert(!local_err);
+
+    teardown_primary(bs);
+}
+
+static void test_primary_do_checkpoint(void)
+{
+    BlockDriverState *bs;
+    Error *local_err = NULL;
+
+    bs = start_primary();
+
+    replication_do_checkpoint_all(&local_err);
+    g_assert(!local_err);
+
+    teardown_primary(bs);
+}
+
+static void test_primary_get_error(void)
+{
+    BlockDriverState *bs;
+    Error *local_err = NULL;
+
+    bs = start_primary();
+
+    replication_start_all(REPLICATION_MODE_PRIMARY, &local_err);
+    g_assert(!local_err);
+
+    replication_get_error_all(&local_err);
+    g_assert(!local_err);
+
+    teardown_primary(bs);
+}
+
+static BlockDriverState *start_secondary(void)
+{
+    QemuOpts *opts;
+    QDict *qdict;
+    BlockBackend *blk;
+    BlockDriverState *bs;
+    Error *local_err = NULL;
+
+    /* 1. add S_LOCAL_DISK and forge S_LOCAL_DISK_ID */
+    opts = qemu_opts_parse_noisily(&qemu_drive_opts, S_COMMAND1, false);
+    qdict = qemu_opts_to_qdict(opts, NULL);
+
+    qdict_set_default_str(qdict, BDRV_OPT_CACHE_DIRECT, "off");
+    qdict_set_default_str(qdict, BDRV_OPT_CACHE_NO_FLUSH, "off");
+
+    blk = blk_new_open(NULL, NULL, qdict, BDRV_O_RDWR, &local_err);
+    assert(blk);
+    monitor_add_blk(blk, S_LOCAL_DISK_ID, &local_err);
+    g_assert(!local_err);
+
+    /* 2. format S_LOCAL_DISK with pattern "0x11" */
+    bs = blk_bs(blk);
+    io_write(bs, 0x11, IMG_SIZE, 0, IMG_SIZE, false);
+
+    qemu_opts_del(opts);
+
+    /* 3. add S_(ACTIVE/HIDDEN)_DISK and forge S_ID */
+    opts = qemu_opts_parse_noisily(&qemu_drive_opts, S_COMMAND2, false);
+    qdict = qemu_opts_to_qdict(opts, NULL);
+
+    qdict_set_default_str(qdict, BDRV_OPT_CACHE_DIRECT, "off");
+    qdict_set_default_str(qdict, BDRV_OPT_CACHE_NO_FLUSH, "off");
+
+    blk = blk_new_open(NULL, NULL, qdict, BDRV_O_RDWR, &local_err);
+    assert(blk);
+    monitor_add_blk(blk, S_ID, &local_err);
+    g_assert(!local_err);
+
+    qemu_opts_del(opts);
+
+    /* return top bs */
+    return blk_bs(blk);
+}
+
+static void teardown_secondary(void)
+{
+    /* only need to destroy two BBs */
+    BlockBackend *blk;
+
+    /* 1. remove S_LOCAL_DISK_ID */
+    blk = blk_by_name(S_LOCAL_DISK_ID);
+    assert(blk);
+
+    monitor_remove_blk(blk);
+    blk_unref(blk);
+
+    /* 2. remove S_ID */
+    blk = blk_by_name(S_ID);
+    assert(blk);
+
+    monitor_remove_blk(blk);
+    blk_unref(blk);
+}
+
+static void test_secondary_read(void)
+{
+    BlockDriverState *top_bs;
+
+    top_bs = start_secondary();
+    /* read from 0 to IMG_SIZE */
+    io_read(top_bs, 0, 0, IMG_SIZE, 0, IMG_SIZE, true);
+
+    teardown_secondary();
+}
+
+static void test_secondary_write(void)
+{
+    BlockDriverState *bs;
+
+    bs = start_secondary();
+    /* write from 0 to IMG_SIZE */
+    io_write(bs, 0, IMG_SIZE, 0, IMG_SIZE, true);
+
+    teardown_secondary();
+}
+
+static void test_secondary_start(void)
+{
+    BlockBackend *blk;
+    BlockDriverState *top_bs, *local_bs;
+    Error *local_err = NULL;
+    bool failover = true;
+
+    top_bs = start_secondary();
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    g_assert(!local_err);
+
+    /* 1. read from S_LOCAL_DISK (0, IMG_SIZE) */
+    io_read(top_bs, 0x11, 0, IMG_SIZE, 0, IMG_SIZE, false);
+
+    /* 2. write 0x22 to S_LOCAL_DISK (IMG_SIZE / 2, IMG_SIZE) */
+    blk = blk_by_name(S_LOCAL_DISK_ID);
+    local_bs = blk_bs(blk);
+
+    io_write(local_bs, 0x22, IMG_SIZE / 2, IMG_SIZE / 2, IMG_SIZE / 2, false);
+
+    /* 2.1 replication will backup S_LOCAL_DISK to S_HIDDEN_DISK */
+    io_read(top_bs, 0x11, IMG_SIZE / 2, IMG_SIZE / 2, 0, IMG_SIZE, false);
+
+    /* 3. write 0x33 to S_ACTIVE_DISK (0, IMG_SIZE / 2) */
+    io_write(top_bs, 0x33, IMG_SIZE / 2, 0, IMG_SIZE / 2, false);
+
+    /* 3.1 read from S_ACTIVE_DISK (0, IMG_SIZE/2) */
+    io_read(top_bs, 0x33, 0, IMG_SIZE / 2, 0, IMG_SIZE / 2, false);
+
+    /* unblock top_bs */
+    replication_stop_all(failover, &local_err);
+    g_assert(!local_err);
+
+    teardown_secondary();
+}
+
+static void test_secondary_stop(void)
+{
+    BlockBackend *blk;
+    BlockDriverState *top_bs, *local_bs;
+    Error *local_err = NULL;
+    bool failover = true;
+
+    top_bs = start_secondary();
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    g_assert(!local_err);
+
+    /* 1. write 0x22 to S_LOCAL_DISK (IMG_SIZE / 2, IMG_SIZE) */
+    blk = blk_by_name(S_LOCAL_DISK_ID);
+    local_bs = blk_bs(blk);
+
+    io_write(local_bs, 0x22, IMG_SIZE / 2, IMG_SIZE / 2, IMG_SIZE / 2, false);
+
+    /* 2. replication will backup S_LOCAL_DISK to S_HIDDEN_DISK */
+    io_read(top_bs, 0x11, IMG_SIZE / 2, IMG_SIZE / 2, 0, IMG_SIZE, false);
+
+    /* 3. write 0x33 to S_ACTIVE_DISK (0, IMG_SIZE / 2) */
+    io_write(top_bs, 0x33, IMG_SIZE / 2, 0, IMG_SIZE / 2, false);
+
+    /* 4. do active commit */
+    replication_stop_all(failover, &local_err);
+    g_assert(!local_err);
+
+    /* 5. read from S_LOCAL_DISK (0, IMG_SIZE / 2) */
+    io_read(top_bs, 0x33, 0, IMG_SIZE / 2, 0, IMG_SIZE / 2, false);
+
+    /* 6. read from S_LOCAL_DISK (IMG_SIZE / 2, IMG_SIZE) */
+    io_read(top_bs, 0x22, IMG_SIZE / 2, IMG_SIZE / 2, 0, IMG_SIZE, false);
+
+    teardown_secondary();
+}
+
+static void test_secondary_do_checkpoint(void)
+{
+    BlockBackend *blk;
+    BlockDriverState *top_bs, *local_bs;
+    Error *local_err = NULL;
+    bool failover = true;
+
+    top_bs = start_secondary();
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    g_assert(!local_err);
+
+    /* 1. write 0x22 to S_LOCAL_DISK (IMG_SIZE / 2, IMG_SIZE) */
+    blk = blk_by_name(S_LOCAL_DISK_ID);
+    local_bs = blk_bs(blk);
+
+    io_write(local_bs, 0x22, IMG_SIZE / 2, IMG_SIZE / 2, IMG_SIZE / 2, false);
+
+    /* 2. replication will backup S_LOCAL_DISK to S_HIDDEN_DISK */
+    io_read(top_bs, 0x11, IMG_SIZE / 2, IMG_SIZE / 2, 0, IMG_SIZE, false);
+
+    replication_do_checkpoint_all(&local_err);
+    g_assert(!local_err);
+
+    /* 3. after checkpoint, read pattern 0x22 from S_LOCAL_DISK */
+    io_read(top_bs, 0x22, IMG_SIZE / 2, IMG_SIZE / 2, 0, IMG_SIZE, false);
+
+    /* unblock top_bs */
+    replication_stop_all(failover, &local_err);
+    g_assert(!local_err);
+
+    teardown_secondary();
+}
+
+static void test_secondary_get_error(void)
+{
+    Error *local_err = NULL;
+    bool failover = true;
+
+    start_secondary();
+    replication_start_all(REPLICATION_MODE_SECONDARY, &local_err);
+    g_assert(!local_err);
+
+    replication_get_error_all(&local_err);
+    g_assert(!local_err);
+
+    /* unblock top_bs */
+    replication_stop_all(failover, &local_err);
+    g_assert(!local_err);
+
+    teardown_secondary();
+}
+
+static void sigabrt_handler(int signo)
+{
+    cleanup_imgs();
+}
+
+static void setup_sigabrt_handler(void)
+{
+    struct sigaction sigact;
+
+    sigact = (struct sigaction){
+        .sa_handler = sigabrt_handler,
+        .sa_flags = SA_RESETHAND,
+    };
+    sigemptyset(&sigact.sa_mask);
+    sigaction(SIGABRT, &sigact, NULL);
+}
+
+int main(int argc, char **argv)
+{
+    int ret;
+    qemu_init_main_loop(&error_fatal);
+    bdrv_init();
+
+    do {} while (g_main_context_iteration(NULL, false));
+    g_test_init(&argc, &argv, NULL);
+    setup_sigabrt_handler();
+
+    prepare_imgs();
+
+    /* Primary */
+    g_test_add_func("/replication/primary/read",    test_primary_read);
+    g_test_add_func("/replication/primary/write",   test_primary_write);
+    g_test_add_func("/replication/primary/start",   test_primary_start);
+    g_test_add_func("/replication/primary/stop",    test_primary_stop);
+    g_test_add_func("/replication/primary/do_checkpoint",
+                    test_primary_do_checkpoint);
+    g_test_add_func("/replication/primary/get_error",
+                    test_primary_get_error);
+
+    /* Secondary */
+    g_test_add_func("/replication/secondary/read",  test_secondary_read);
+    g_test_add_func("/replication/secondary/write", test_secondary_write);
+    g_test_add_func("/replication/secondary/start", test_secondary_start);
+    g_test_add_func("/replication/secondary/stop",  test_secondary_stop);
+    g_test_add_func("/replication/secondary/do_checkpoint",
+                    test_secondary_do_checkpoint);
+    g_test_add_func("/replication/secondary/get_error",
+                    test_secondary_get_error);
+
+    ret = g_test_run();
+
+    cleanup_imgs();
+
+    return ret;
+}
-- 
1.9.3


* [Qemu-devel] [PATCH v19 10/10] support replication driver in blockdev-add
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (8 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication Changlong Xie
@ 2016-05-20  7:36 ` Changlong Xie
  2016-05-27  1:59 ` [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
  2016-05-30 18:20 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  11 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-20  7:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang,
	Changlong Xie

From: Wen Congyang <wency@cn.fujitsu.com>

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 qapi/block-core.json | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e56cdf4..b9f9839 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -248,6 +248,7 @@
 #       2.3: 'host_floppy' deprecated
 #       2.5: 'host_floppy' dropped
 #       2.6: 'luks' added
+#       2.7: 'replication' added
 #
 # @backing_file: #optional the name of the backing file (for copy-on-write)
 #
@@ -1632,6 +1633,7 @@
 # Drivers that are supported in block device operations.
 #
 # @host_device, @host_cdrom: Since 2.1
+# @replication: Since 2.7
 #
 # Since: 2.0
 ##
@@ -1639,8 +1641,8 @@
   'data': [ 'archipelago', 'blkdebug', 'blkverify', 'bochs', 'cloop',
             'dmg', 'file', 'ftp', 'ftps', 'host_cdrom', 'host_device',
             'http', 'https', 'luks', 'null-aio', 'null-co', 'parallels',
-            'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'tftp', 'vdi', 'vhdx',
-            'vmdk', 'vpc', 'vvfat' ] }
+            'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'replication', 'tftp',
+            'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
 
 ##
 # @BlockdevOptionsFile
@@ -2045,6 +2047,19 @@
 { 'enum' : 'ReplicationMode', 'data' : [ 'primary', 'secondary' ] }
 
 ##
+# @BlockdevOptionsReplication
+#
+# Driver specific block device options for replication
+#
+# @mode: the replication mode
+#
+# Since: 2.7
+##
+{ 'struct': 'BlockdevOptionsReplication',
+  'base': 'BlockdevOptionsGenericFormat',
+  'data': { 'mode': 'ReplicationMode'  } }
+
+##
 # @BlockdevOptions
 #
 # Options for creating a block device.  Many options are available for all
@@ -2125,6 +2140,7 @@
       'quorum':     'BlockdevOptionsQuorum',
       'raw':        'BlockdevOptionsGenericFormat',
 # TODO rbd: Wait for structured options
+      'replication':'BlockdevOptionsReplication',
 # TODO sheepdog: Wait for structured options
 # TODO ssh: Should take InetSocketAddress for 'host'?
       'tftp':       'BlockdevOptionsFile',
-- 
1.9.3


* Re: [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication Changlong Xie
@ 2016-05-27  1:46   ` Changlong Xie
  2016-05-30 17:34   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  1 sibling, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-27  1:46 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang

On 05/20/2016 03:36 PM, Changlong Xie wrote:
> +static void io_write(BlockDriverState *bs, long pattern, int64_t pattern_count,
> +                     int64_t offset, int64_t count, bool expect_failed)
> +{
> +    void *pattern_buf;

pattern_buf should be initialized to NULL to avoid the warning below:

tests/test-replication.c:104:15: error: ‘pattern_buf’ may be used 
uninitialized in this function [-Werror=maybe-uninitialized]
          g_free(pattern_buf);

I will fix this in the next version.
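
For illustration, a minimal sketch of that fix in plain C. It assumes
malloc()/free() as stand-ins for g_malloc()/g_free() (both free functions
treat NULL as a no-op), and the helper name fill_and_release() is
hypothetical, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for io_write(): the buffer is allocated only when
 * a non-zero pattern is requested, but it is released unconditionally.
 * Returns true if the buffer (when allocated) holds the requested pattern.
 */
static bool fill_and_release(long pattern, size_t pattern_count)
{
    void *pattern_buf = NULL;   /* NULL keeps the free() below well defined */
    bool ok = true;

    if (pattern && pattern_count > 0) {
        pattern_buf = malloc(pattern_count);
        assert(pattern_buf);    /* mirrors g_malloc(), which aborts on OOM */
        memset(pattern_buf, (int)pattern, pattern_count);
        ok = ((unsigned char *)pattern_buf)[pattern_count - 1] ==
             (unsigned char)pattern;
    }

    free(pattern_buf);          /* no-op when pattern_buf is still NULL */
    return ok;
}
```

With the initializer in place, the compiler can see that every path reaching
the free call has assigned the pointer, so -Wmaybe-uninitialized should no
longer fire.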

> +    int ret;
> +
> +    /* 1. alloc pattern buffer */
> +    if (pattern) {
> +        pattern_buf = g_malloc(pattern_count);
> +        memset(pattern_buf, pattern, pattern_count);
> +    }
> +
> +    /* 2. do write */
> +    if (pattern) {
> +        ret = bdrv_write(bs, offset >> 9, (uint8_t *)pattern_buf, count >> 9);
> +    } else {
> +        ret = bdrv_write_zeroes(bs, offset >> 9, count >> 9, 0);
> +    }
> +
> +    /* 3. assert */
> +    if (expect_failed) {
> +        g_assert(ret < 0);
> +    } else {
> +        g_assert(ret >= 0);
> +        g_free(pattern_buf);
> +    }
> +}


* Re: [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (9 preceding siblings ...)
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 10/10] support replication driver in blockdev-add Changlong Xie
@ 2016-05-27  1:59 ` Changlong Xie
  2016-05-27  7:23   ` Fam Zheng
  2016-05-30 18:20 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  11 siblings, 1 reply; 22+ messages in thread
From: Changlong Xie @ 2016-05-27  1:59 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang

Ping here : )

Hi Fam, do you have time to help review this patchset? Since we are in
the same time zone, that should speed up the code review process. Any
feedback will be appreciated.

Thanks
	-Xie

On 05/20/2016 03:36 PM, Changlong Xie wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints(for example: COLO).
>
> You can get the detailed information about block replication from here:
> http://wiki.qemu.org/Features/BlockReplication
>
> Usage:
> Please refer to docs/block-replication.txt
>
> You can get the patch here:
> https://github.com/Pating/qemu/tree/changlox/block-replication-v19
>
> You can get the patch with framework here:
> https://github.com/Pating/qemu/tree/changlox/colo_framework_v18
>
> TODO:
> 1. Continuous block replication. It will be started after basic functions


* Re: [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints
  2016-05-27  1:59 ` [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
@ 2016-05-27  7:23   ` Fam Zheng
  0 siblings, 0 replies; 22+ messages in thread
From: Fam Zheng @ 2016-05-27  7:23 UTC (permalink / raw)
  To: Changlong Xie; +Cc: qemu devel

On Fri, 05/27 09:59, Changlong Xie wrote:
> Hi Fam, do you have time to help review this patchset? Since we are in
> the same time zone, that should speed up the code review process. Any
> feedback will be appreciated.

Not today, but I will take a look at this series next Monday.

Fam


* Re: [Qemu-devel] [Qemu-block] [PATCH v19 09/10] tests: add unit test case for replication
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication Changlong Xie
  2016-05-27  1:46   ` Changlong Xie
@ 2016-05-30 17:34   ` Stefan Hajnoczi
  2016-05-31 10:21     ` Changlong Xie
  1 sibling, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2016-05-30 17:34 UTC (permalink / raw)
  To: Changlong Xie
  Cc: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf,
	Jeff Cody, Wen Congyang, zhanghailiang, qemu block,
	Jiang Yunhong, Dong Eddie, Dr. David Alan Gilbert,
	Markus Armbruster, Gonglei, Paolo Bonzini

On Fri, May 20, 2016 at 03:36:19PM +0800, Changlong Xie wrote:
> +/* primary */
> +#define P_LOCAL_DISK "/tmp/p_local_disk.XXXXXX"
> +#define P_COMMAND "driver=replication,mode=primary,node-name=xxx,"\
> +                  "file.driver=qcow2,file.file.filename="P_LOCAL_DISK
> +
> +/* secondary */
> +#define S_LOCAL_DISK "/tmp/s_local_disk.XXXXXX"
> +#define S_ACTIVE_DISK "/tmp/s_active_disk.XXXXXX"
> +#define S_HIDDEN_DISK "/tmp/s_hidden_disk.XXXXXX"

Please use unique filenames so that multiple instances of the test can
run in parallel on a single machine.  mkstemp(3) can be used to do this.
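
As a rough illustration of the mkstemp(3) suggestion, the sketch below
turns a template ending in "XXXXXX" into a unique, already-created file.
make_unique_disk() is a hypothetical helper name, not something from the
patch, and the sketch assumes a POSIX system.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a uniquely named file from a template ending in "XXXXXX".
 * On success the template is overwritten in place with the real name
 * and 0 is returned; on failure -1 is returned and errno is set.
 * make_unique_disk() is a hypothetical helper for illustration only.
 */
static int make_unique_disk(char *tmpl)
{
    /* mkstemp() replaces XXXXXX and creates the file exclusively */
    int fd = mkstemp(tmpl);

    if (fd < 0) {
        return -1;
    }
    /* Only the name is needed here; the test would reopen it by name */
    close(fd);
    return 0;
}
```

Because mkstemp() writes into its argument, the caller has to pass a
writable array such as char p_local_disk[] = "/tmp/p_local_disk.XXXXXX";
rather than a string literal, and should unlink() the file when done.
Since P_COMMAND in the quoted patch concatenates P_LOCAL_DISK at compile
time, the option string would presumably have to be built at run time
once the names are no longer constants.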

> +static void io_read(BlockDriverState *bs, long pattern, int64_t pattern_offset,
> +                    int64_t pattern_count, int64_t offset, int64_t count,
> +                    bool expect_failed)
> +{
> +    char *buf;
> +    void *cmp_buf;
> +    int ret;
> +
> +    /* 1. alloc pattern buffer */
> +    if (pattern) {
> +        cmp_buf = g_malloc(pattern_count);
> +        memset(cmp_buf, pattern, pattern_count);
> +    }
> +
> +    /* 2. alloc read buffer */
> +    buf = qemu_blockalign(bs, count);
> +    memset(buf, 0xab, count);
> +
> +    /* 3. do read */
> +    ret = bdrv_read(bs, offset >> 9, (uint8_t *)buf, count >> 9);
> +
> +    /* 4. assert and compare buf */
> +    if (expect_failed) {
> +        g_assert(ret < 0);
> +    } else {
> +        g_assert(ret >= 0);
> +        if (pattern) {
> +            g_assert(memcmp(buf + pattern_offset, cmp_buf, pattern_count) <= 0);
> +            g_free(cmp_buf);

if pattern && expect_failed then cmp_buf is leaked.  Probably best to
initialize cmp_buf = NULL and have an unconditional g_free(cmp_buf) at
the end of the function to avoid leaks.
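
A minimal sketch of that cleanup pattern, using plain malloc()/free() as
stand-ins for g_malloc()/g_free() so that it compiles on its own.
check_read() and its parameters are hypothetical: 'data' plays the role
of the buffer filled by bdrv_read() and 'ret' the role of its return
value. The comparison uses == 0, which is an assumption about the
intended check rather than what the quoted patch does.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns true when the read outcome matches the expectation.
 * cmp_buf starts out NULL and is freed exactly once at the end, so no
 * combination of 'pattern' and 'expect_failed' can leak it.
 */
static bool check_read(const char *data, int ret, int pattern,
                       size_t pattern_offset, size_t pattern_count,
                       bool expect_failed)
{
    void *cmp_buf = NULL;   /* NULL until allocated; free(NULL) is a no-op */
    bool ok;

    if (pattern && pattern_count > 0) {
        cmp_buf = malloc(pattern_count);
        if (!cmp_buf) {
            return false;
        }
        memset(cmp_buf, pattern, pattern_count);
    }

    if (expect_failed) {
        ok = ret < 0;
    } else {
        ok = ret >= 0;
        if (ok && cmp_buf) {
            ok = memcmp(data + pattern_offset, cmp_buf, pattern_count) == 0;
        }
    }

    free(cmp_buf);          /* unconditional: runs on every path */
    return ok;
}
```

g_free(), like free(), accepts NULL, so the same shape carries over to
the GLib allocator used in the test.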

> +        }
> +    }
> +    g_free(buf);

qemu_blockalign() memory is freed with qemu_vfree(), not g_free().

> +static void test_primary_do_checkpoint(void)
> +{
> +    BlockDriverState *bs;
> +    Error *local_err = NULL;
> +
> +    bs = start_primary();
> +
> +    replication_do_checkpoint_all(&local_err);
> +    g_assert(!local_err);
> +
> +    teardown_primary(bs);
> +}

Shouldn't replication_start_all() be called before
replication_do_checkpoint_all()?

> +int main(int argc, char **argv)
> +{
> +    int ret;
> +    qemu_init_main_loop(&error_fatal);
> +    bdrv_init();
> +
> +    do {} while (g_main_context_iteration(NULL, false));

Why is this necessary?

* Re: [Qemu-devel] [Qemu-block] [PATCH v19 08/10] Implement new driver for block replication
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
@ 2016-05-30 18:14   ` Stefan Hajnoczi
  2016-05-31  1:20     ` Changlong Xie
  2016-06-07  4:59   ` [Qemu-devel] " Changlong Xie
  2016-06-07  5:36   ` Changlong Xie
  2 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2016-05-30 18:14 UTC (permalink / raw)
  To: Changlong Xie
  Cc: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf,
	Jeff Cody, Wen Congyang, zhanghailiang, qemu block,
	Jiang Yunhong, Dong Eddie, Dr. David Alan Gilbert,
	Markus Armbruster, Gonglei, Paolo Bonzini

On Fri, May 20, 2016 at 03:36:18PM +0800, Changlong Xie wrote:
> +        /* start backup job now */
> +        error_setg(&s->blocker,
> +                   "block device is in use by internal backup job");
> +
> +        top_bs = bdrv_lookup_bs(s->top_id, s->top_id, errp);
> +        if (!top_bs || !check_top_bs(top_bs, bs)) {
> +            reopen_backing_file(s, false, NULL);
> +            aio_context_release(aio_context);
> +            return;
> +        }

Missing error_setg() with an error message when check_top_bs() fails.
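
The point can be sketched outside QEMU as follows. Everything here is a
stand-in: SketchError, sketch_error_set(), lookup_node() and check_node()
are hypothetical and only model the assumption that the lookup reports
its own error through the out-parameter while the check returns false
silently, so the caller has to split the combined condition and set a
message for the second case itself.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for an error out-parameter; not QEMU's Error type. */
typedef struct {
    char msg[128];
} SketchError;

static void sketch_error_set(SketchError *err, const char *msg)
{
    if (err) {
        snprintf(err->msg, sizeof(err->msg), "%s", msg);
    }
}

/* Hypothetical lookup: fails, and says why, when the id is unknown. */
static const char *lookup_node(const char *id, SketchError *err)
{
    if (!id || strcmp(id, "top") != 0) {
        sketch_error_set(err, "node not found");
        return NULL;
    }
    return id;
}

/* Hypothetical check that returns false without setting any error. */
static bool check_node(const char *node, bool is_parent)
{
    (void)node;
    return is_parent;
}

/* Returns true on success; every failure branch leaves a message in err. */
static bool start_with_top(const char *id, bool is_parent, SketchError *err)
{
    const char *node = lookup_node(id, err);

    if (!node) {
        return false;       /* message already set by lookup_node() */
    }
    if (!check_node(node, is_parent)) {
        /* the silent check needs a message from the caller */
        sketch_error_set(err, "node is not the top of this chain");
        return false;
    }
    return true;
}
```

With a single combined condition, as in the quoted hunk, the second
failure would return to the caller without any message.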

* Re: [Qemu-devel] [Qemu-block] [PATCH v19 00/10] Block replication for continuous checkpoints
  2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
                   ` (10 preceding siblings ...)
  2016-05-27  1:59 ` [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
@ 2016-05-30 18:20 ` Stefan Hajnoczi
  2016-05-31 10:25   ` Changlong Xie
  11 siblings, 1 reply; 22+ messages in thread
From: Stefan Hajnoczi @ 2016-05-30 18:20 UTC (permalink / raw)
  To: Changlong Xie
  Cc: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf,
	Jeff Cody, Wen Congyang, zhanghailiang, qemu block,
	Jiang Yunhong, Dong Eddie, Dr. David Alan Gilbert,
	Markus Armbruster, Gonglei, Paolo Bonzini

On Fri, May 20, 2016 at 03:36:10PM +0800, Changlong Xie wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints(for example: COLO).
> 
> You can get the detailed information about block replication from here:
> http://wiki.qemu.org/Features/BlockReplication
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> You can get the patch here:
> https://github.com/Pating/qemu/tree/changlox/block-replication-v19
> 
> You can get the patch with framework here:
> https://github.com/Pating/qemu/tree/changlox/colo_framework_v18
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>    are accepted.
> 
> Change Log:
> V19:
> 1. Rebase to v2.6.0
> 2. Address comments from stefan
> p3: a new patch that export interfaces for extra serialization
> p8: 
> 1. call replication_stop() before freeing s->top_id
> 2. check top_bs
> 3. reopen file readonly in error return paths
> 4. enable extra serialization between read and COW
> p9: try to handle SIGABRT
> V18:
> p6: add local_err in all replication callbacks to prevent "errp == NULL"
> p7: add missing qemu_iovec_destroy(xxx)
> V17:
> 1. Rebase to the latest codes
> p2: refactor backup_do_checkpoint addressed comments from Jeff Cody
> p4: fix bugs in "drive_add buddy xxx" hmp commands
> p6: add "since: 2.7"
> p7: fix bug in replication_close(), add missing "qapi/error.h", add test-replication 
> p8: add "since: 2.7"
> V16:
> 1. Rebase to the newest codes
> 2. Address comments from Stefan & hailiang
> p3: we don't need this patch now
> p4: add "top-id" parameters for secondary
> p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs, 
> add doc comments that explain the semantics of Replication
> p7: Refactor AioContext for thread-safe, remove unnecessary get_top_bs()
> *Note*: I'm working on replication testcase now, will send out in V17
> V15:
> 1. Rebase to the newest codes
> 2. Fix typos and coding style, addressing Eric's comments
> 3. Address Stefan's comments
>    1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
>    2) Update the message and description for [PATCH 4/9]
>    3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
>    4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
>    5) Use BdrvChild instead of holding on to BlockDriverState * pointers
> 4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771  
> 5. Introduce replication_get_error_all to check replication status
> 6. Remove useless discard interface
> V14:
> 1. Implement auto complete active commit
> 2. Implement active commit block job for replication.c
> 3. Address the comments from Stefan, add replication-specific API and data
>    structure, also remove old block layer APIs
> V13:
> 1. Rebase to the newest codes
> 2. Remove redundant macros and semicolon in replication.c
> 3. Fix typos in block-replication.txt
> V12:
> 1. Rebase to the newest codes
> 2. Use backing reference to replace 'allow-write-backing-file'
> V11:
> 1. Reopen the backing file when starting block replication if it is not
>    opened in R/W mode
> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>    when opening backing file
> 3. Block the top BDS so there is only one block job for the top BDS and
>    its backing chain.
> V10:
> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>    reference.
> 2. Address the comments from Eric Blake
> V9:
> 1. Update the error messages
> 2. Rebase to the newest qemu
> 3. Split child add/delete support. These patches are sent in another patchset.
> V8:
> 1. Address Alberto Garcia's comments
> V7:
> 1. Implement adding/removing quorum child. Remove the option non-connect.
> 2. Simplify the backing reference option according to Stefan Hajnoczi's suggestion
> V6:
> 1. Rebase to the newest qemu.
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>    if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> 
> Changlong Xie (3):
>   Backup: export interfaces for extra serialization
>   Introduce new APIs to do replication operation
>   tests: add unit test case for replication
> 
> Wen Congyang (7):
>   unblock backup operations in backing file
>   Backup: clear all bitmap when doing block checkpoint
>   Link backup into block core
>   docs: block replication's description
>   auto complete active commit
>   Implement new driver for block replication
>   support replication driver in blockdev-add
> 
>  Makefile.objs                |   1 +
>  block.c                      |  17 ++
>  block/Makefile.objs          |   3 +-
>  block/backup.c               |  59 +++-
>  block/mirror.c               |  13 +-
>  block/replication.c          | 666 +++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                   |   2 +-
>  docs/block-replication.txt   | 239 ++++++++++++++++
>  include/block/block_backup.h |  17 ++
>  include/block/block_int.h    |   3 +-
>  qapi/block-core.json         |  33 ++-
>  qemu-img.c                   |   2 +-
>  replication.c                | 105 +++++++
>  replication.h                | 176 ++++++++++++
>  tests/.gitignore             |   1 +
>  tests/Makefile               |   4 +
>  tests/test-replication.c     | 523 +++++++++++++++++++++++++++++++++
>  17 files changed, 1847 insertions(+), 17 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
>  create mode 100644 include/block/block_backup.h
>  create mode 100644 replication.c
>  create mode 100644 replication.h
>  create mode 100644 tests/test-replication.c

I have reviewed many revisions of this series.  The main mechanism in
this series makes sense to me.

I'm still concerned that checkpointing (vm_stop(), not in this series
but COLO in general) depends on bdrv_drain(), which can block forever if
I/O is hung.  That doesn't seem like a reasonable limitation for a high
availability feature since it may lead to the VM becoming unavailable.

I'd like Jeff and/or Kevin to review this series and merge it once they
are happy.

Stefan

* Re: [Qemu-devel] [Qemu-block] [PATCH v19 08/10] Implement new driver for block replication
  2016-05-30 18:14   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2016-05-31  1:20     ` Changlong Xie
  0 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-31  1:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf,
	Jeff Cody, Wen Congyang, zhanghailiang, qemu block,
	Jiang Yunhong, Dong Eddie, Dr. David Alan Gilbert,
	Markus Armbruster, Gonglei, Paolo Bonzini

On 05/31/2016 02:14 AM, Stefan Hajnoczi wrote:
> On Fri, May 20, 2016 at 03:36:18PM +0800, Changlong Xie wrote:
>> +        /* start backup job now */
>> +        error_setg(&s->blocker,
>> +                   "block device is in use by internal backup job");
>> +
>> +        top_bs = bdrv_lookup_bs(s->top_id, s->top_id, errp);
>> +        if (!top_bs || !check_top_bs(top_bs, bs)) {
>> +            reopen_backing_file(s, false, NULL);
>> +            aio_context_release(aio_context);
>> +            return;
>> +        }
>
> Missing error_setg() with an error message when check_top_bs() fails.
>

Will add.

Thanks
	-Xie

* Re: [Qemu-devel] [Qemu-block] [PATCH v19 09/10] tests: add unit test case for replication
  2016-05-30 17:34   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2016-05-31 10:21     ` Changlong Xie
  0 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-31 10:21 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf,
	Jeff Cody, Wen Congyang, zhanghailiang, qemu block,
	Jiang Yunhong, Dong Eddie, Dr. David Alan Gilbert,
	Markus Armbruster, Gonglei, Paolo Bonzini

On 05/31/2016 01:34 AM, Stefan Hajnoczi wrote:
> On Fri, May 20, 2016 at 03:36:19PM +0800, Changlong Xie wrote:
>> +/* primary */
>> +#define P_LOCAL_DISK "/tmp/p_local_disk.XXXXXX"
>> +#define P_COMMAND "driver=replication,mode=primary,node-name=xxx,"\
>> +                  "file.driver=qcow2,file.file.filename="P_LOCAL_DISK
>> +
>> +/* secondary */
>> +#define S_LOCAL_DISK "/tmp/s_local_disk.XXXXXX"
>> +#define S_ACTIVE_DISK "/tmp/s_active_disk.XXXXXX"
>> +#define S_HIDDEN_DISK "/tmp/s_hidden_disk.XXXXXX"
>
> Please use unique filenames so that multiple instances of the test can
> run in parallel on a single machine.  mkstemp(3) can be used to do this.
>

will fix in next version.

>> +static void io_read(BlockDriverState *bs, long pattern, int64_t pattern_offset,
>> +                    int64_t pattern_count, int64_t offset, int64_t count,
>> +                    bool expect_failed)
>> +{
>> +    char *buf;
>> +    void *cmp_buf;
>> +    int ret;
>> +
>> +    /* 1. alloc pattern buffer */
>> +    if (pattern) {
>> +        cmp_buf = g_malloc(pattern_count);
>> +        memset(cmp_buf, pattern, pattern_count);
>> +    }
>> +
>> +    /* 2. alloc read buffer */
>> +    buf = qemu_blockalign(bs, count);
>> +    memset(buf, 0xab, count);
>> +
>> +    /* 3. do read */
>> +    ret = bdrv_read(bs, offset >> 9, (uint8_t *)buf, count >> 9);
>> +
>> +    /* 4. assert and compare buf */
>> +    if (expect_failed) {
>> +        g_assert(ret < 0);
>> +    } else {
>> +        g_assert(ret >= 0);
>> +        if (pattern) {
>> +            g_assert(memcmp(buf + pattern_offset, cmp_buf, pattern_count) <= 0);
>> +            g_free(cmp_buf);
>
> if pattern && expect_failed then cmp_buf is leaked.  Probably best to
> initialize cmp_buf = NULL and have an unconditional g_free(cmp_buf) at
> the end of the function to avoid leaks.
>

Yes, you are right.

>> +        }
>> +    }
>> +    g_free(buf);
>
> qemu_blockalign() memory is freed with qemu_vfree(), not g_free().
>

will fix

>> +static void test_primary_do_checkpoint(void)
>> +{
>> +    BlockDriverState *bs;
>> +    Error *local_err = NULL;
>> +
>> +    bs = start_primary();
>> +
>> +    replication_do_checkpoint_all(&local_err);
>> +    g_assert(!local_err);
>> +
>> +    teardown_primary(bs);
>> +}
>
> Shouldn't replication_start_all() be called before
> replication_do_checkpoint_all()?
>

It seems I missed it.

>> +int main(int argc, char **argv)
>> +{
>> +    int ret;
>> +    qemu_init_main_loop(&error_fatal);
>> +    bdrv_init();
>> +
>> +    do {} while (g_main_context_iteration(NULL, false));
>
> Why is this necessary?

Will remove it.

>

* Re: [Qemu-devel] [Qemu-block] [PATCH v19 00/10] Block replication for continuous checkpoints
  2016-05-30 18:20 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2016-05-31 10:25   ` Changlong Xie
  0 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-05-31 10:25 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf,
	Jeff Cody, Wen Congyang, zhanghailiang, qemu block,
	Jiang Yunhong, Dong Eddie, Dr. David Alan Gilbert,
	Markus Armbruster, Gonglei, Paolo Bonzini

On 05/31/2016 02:20 AM, Stefan Hajnoczi wrote:
> On Fri, May 20, 2016 at 03:36:10PM +0800, Changlong Xie wrote:
>> Block replication is a very important feature which is used for
>> continuous checkpoints(for example: COLO).
>>
>> You can get the detailed information about block replication from here:
>> http://wiki.qemu.org/Features/BlockReplication
>>
>> Usage:
>> Please refer to docs/block-replication.txt
>>
>> You can get the patch here:
>> https://github.com/Pating/qemu/tree/changlox/block-replication-v19
>>
>> You can get the patch with framework here:
>> https://github.com/Pating/qemu/tree/changlox/colo_framework_v18
>>
>> TODO:
>> 1. Continuous block replication. It will be started after basic functions
>>     are accepted.
>>
>> Change Log:
>> V19:
>> 1. Rebase to v2.6.0
>> 2. Address comments from stefan
>> p3: a new patch that export interfaces for extra serialization
>> p8:
>> 1. call replication_stop() before freeing s->top_id
>> 2. check top_bs
>> 3. reopen file readonly in error return paths
>> 4. enable extra serialization between read and COW
>> p9: try to handle SIGABRT
>> V18:
>> p6: add local_err in all replication callbacks to prevent "errp == NULL"
>> p7: add missing qemu_iovec_destroy(xxx)
>> V17:
>> 1. Rebase to the latest codes
>> p2: refactor backup_do_checkpoint addressed comments from Jeff Cody
>> p4: fix bugs in "drive_add buddy xxx" hmp commands
>> p6: add "since: 2.7"
>> p7: fix bug in replication_close(), add missing "qapi/error.h", add test-replication
>> p8: add "since: 2.7"
>> V16:
>> 1. Rebase to the newest codes
>> 2. Address comments from Stefan & hailiang
>> p3: we don't need this patch now
>> p4: add "top-id" parameters for secondary
>> p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs,
>> add doc comments that explain the semantics of Replication
>> p7: Refactor AioContext for thread-safe, remove unnecessary get_top_bs()
>> *Note*: I'm working on replication testcase now, will send out in V17
>> V15:
>> 1. Rebase to the newest codes
>> 2. Fix typos and coding style, addressing Eric's comments
>> 3. Address Stefan's comments
>>     1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
>>     2) Update the message and description for [PATCH 4/9]
>>     3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
>>     4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
>>     5) Use BdrvChild instead of holding on to BlockDriverState * pointers
>> 4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771
>> 5. Introduce replication_get_error_all to check replication status
>> 6. Remove useless discard interface
>> V14:
>> 1. Implement auto complete active commit
>> 2. Implement active commit block job for replication.c
>> 3. Address the comments from Stefan, add replication-specific API and data
>>     structure, also remove old block layer APIs
>> V13:
>> 1. Rebase to the newest codes
>> 2. Remove redundant macros and semicolon in replication.c
>> 3. Fix typos in block-replication.txt
>> V12:
>> 1. Rebase to the newest codes
>> 2. Use backing reference to replace 'allow-write-backing-file'
>> V11:
>> 1. Reopen the backing file when starting block replication if it is not
>>     opened in R/W mode
>> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>>     when opening backing file
>> 3. Block the top BDS so there is only one block job for the top BDS and
>>     its backing chain.
>> V10:
>> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>>     reference.
>> 2. Address the comments from Eric Blake
>> V9:
>> 1. Update the error messages
>> 2. Rebase to the newest qemu
>> 3. Split child add/delete support. These patches are sent in another patchset.
>> V8:
>> 1. Address Alberto Garcia's comments
>> V7:
>> 1. Implement adding/removing quorum child. Remove the option non-connect.
>> 2. Simplify the backing reference option according to Stefan Hajnoczi's suggestion
>> V6:
>> 1. Rebase to the newest qemu.
>> V5:
>> 1. Address the comments from Gong Lei
>> 2. Speed the failover up. The secondary vm can take over very quickly even
>>     if there are too many I/O requests.
>> V4:
>> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
>> V3:
>> 1: use error_setg() instead of error_set()
>> 2. Add a new block job API
>> 3. Active disk, hidden disk and nbd target uses the same AioContext
>> 4. Add a testcase to test new hbitmap API
>> V2:
>> 1. Redesign the secondary qemu(use image-fleecing)
>> 2. Use Error objects to return error message
>> 3. Address the comments from Max Reitz and Eric Blake
>>
>> Changlong Xie (3):
>>    Backup: export interfaces for extra serialization
>>    Introduce new APIs to do replication operation
>>    tests: add unit test case for replication
>>
>> Wen Congyang (7):
>>    unblock backup operations in backing file
>>    Backup: clear all bitmap when doing block checkpoint
>>    Link backup into block core
>>    docs: block replication's description
>>    auto complete active commit
>>    Implement new driver for block replication
>>    support replication driver in blockdev-add
>>
>>   Makefile.objs                |   1 +
>>   block.c                      |  17 ++
>>   block/Makefile.objs          |   3 +-
>>   block/backup.c               |  59 +++-
>>   block/mirror.c               |  13 +-
>>   block/replication.c          | 666 +++++++++++++++++++++++++++++++++++++++++++
>>   blockdev.c                   |   2 +-
>>   docs/block-replication.txt   | 239 ++++++++++++++++
>>   include/block/block_backup.h |  17 ++
>>   include/block/block_int.h    |   3 +-
>>   qapi/block-core.json         |  33 ++-
>>   qemu-img.c                   |   2 +-
>>   replication.c                | 105 +++++++
>>   replication.h                | 176 ++++++++++++
>>   tests/.gitignore             |   1 +
>>   tests/Makefile               |   4 +
>>   tests/test-replication.c     | 523 +++++++++++++++++++++++++++++++++
>>   17 files changed, 1847 insertions(+), 17 deletions(-)
>>   create mode 100644 block/replication.c
>>   create mode 100644 docs/block-replication.txt
>>   create mode 100644 include/block/block_backup.h
>>   create mode 100644 replication.c
>>   create mode 100644 replication.h
>>   create mode 100644 tests/test-replication.c
>
> I have reviewed many revisions of this series.  The main mechanism in
> this series makes sense to me.
>

Thanks.

> I'm still concerned that checkpointing (vm_stop(), not in this series
> but COLO in general) depends on bdrv_drain(), which can block forever if
> I/O is hung.  That doesn't seem like a reasonable limitation for a high
> availability feature since it may lead to the VM becoming unavailable.

IIRC, this issue is mentioned in my older patchset.

>
> I'd like Jeff and/or Kevin to review this series and merge it once they
> are happy.
>

@Jeff, Kevin and all block maintainers, would you like to review this
patchset?

Thanks
	-Xie
> Stefan
>

* Re: [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
  2016-05-30 18:14   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2016-06-07  4:59   ` Changlong Xie
  2016-06-07  5:36   ` Changlong Xie
  2 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-06-07  4:59 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang

On 05/20/2016 03:36 PM, Changlong Xie wrote:
> +
> +        /*
> +         * Must protect backup target if backup job was stopped/cancelled
> +         * unexpectedly
> +         */
> +        bdrv_ref(s->hidden_disk->bs);
> +
> +        backup_start(s->secondary_disk->bs, s->hidden_disk->bs, 0,
> +                     MIRROR_SYNC_MODE_NONE, NULL, BLOCKDEV_ON_ERROR_REPORT,
> +                     BLOCKDEV_ON_ERROR_REPORT, backup_job_completed,
> +                     s, NULL, &local_err);
> +        if (local_err) {
> +            error_propagate(errp, local_err);
> +            backup_job_cleanup(s);
> +            bdrv_unref(s->hidden_disk->bs);
> +            aio_context_release(aio_context);
> +            return;
> +        }
> +        break;
> +    default:
> +        aio_context_release(aio_context);
> +        abort();
> +    }

Commit 5c438bc6 introduced BB for I/O, so we no longer need to protect the
backup target ourselves.

-        /*
-         * Must protect backup target if backup job was stopped/cancelled
-         * unexpectedly
-         */
-        bdrv_ref(s->hidden_disk->bs);
-
          backup_start(s->secondary_disk->bs, s->hidden_disk->bs, 0,
                       MIRROR_SYNC_MODE_NONE, NULL, BLOCKDEV_ON_ERROR_REPORT,
                       BLOCKDEV_ON_ERROR_REPORT, backup_job_completed,
@@ -508,7 +502,6 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
          if (local_err) {
              error_propagate(errp, local_err);
              backup_job_cleanup(s);
-            bdrv_unref(s->hidden_disk->bs);
              aio_context_release(aio_context);
              return;
          }

will update in next version.

* Re: [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication
  2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
  2016-05-30 18:14   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2016-06-07  4:59   ` [Qemu-devel] " Changlong Xie
@ 2016-06-07  5:36   ` Changlong Xie
  2 siblings, 0 replies; 22+ messages in thread
From: Changlong Xie @ 2016-06-07  5:36 UTC (permalink / raw)
  To: qemu devel, Stefan Hajnoczi, Fam Zheng, Max Reitz, Kevin Wolf, Jeff Cody
  Cc: qemu block, Paolo Bonzini, John Snow, Eric Blake,
	Markus Armbruster, Dr. David Alan Gilbert, Dong Eddie,
	Jiang Yunhong, zhanghailiang, Gonglei, Wen Congyang

On 05/20/2016 03:36 PM, Changlong Xie wrote:
> +        if (!failover) {
> +            /*
> +             * This BDS will be closed, and the job should be completed
> +             * before the BDS is closed, because we will access hidden
> +             * disk, secondary disk in backup_job_completed().
> +             */
> +            if (s->secondary_disk->bs->job) {
> +                block_job_cancel_sync(s->secondary_disk->bs->job);
> +            }
> +            secondary_do_checkpoint(s, errp);
> +            s->replication_state = BLOCK_REPLICATION_DONE;
> +            aio_context_release(aio_context);
> +            return;
> +        }
> +
> +        s->replication_state = BLOCK_REPLICATION_FAILOVER;
> +        if (s->secondary_disk->bs->job) {
> +            block_job_cancel(s->secondary_disk->bs->job);
> +        }

Since commit b6d2e599 "block: Convert block job core to BlockBackend",
blockjob uses a BB instead of bdrv_ref(), and this introduces an
unexpected segmentation fault with COLO.

The backtrace below shows the problem. During failover, s->target is
changed to an illegal value "0x1e1e1e1e1e1e1e1e" in backup_complete.
The active commit job, which also holds a pointer that refers to
s->target, then uses this illegal pointer. To avoid this, we should use
"block_job_cancel_sync" to ensure the backup job completes synchronously.

% MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
% export MALLOC_PERTURB_
% gdb --args ./tests/test-replication
(gdb) b backup_complete
(gdb) r
(gdb) n
(gdb) n
(gdb) watch s->target
(gdb) c
Old value = (BlockBackend *) 0x555555f1d990
New value = (BlockBackend *) 0x1e1e1e1e1e1e1e1e
0x00007ffff5a811eb in memset () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff5a811eb in memset () from /lib64/libc.so.6
#1  0x00007ffff5a7500e in _int_free () from /lib64/libc.so.6
#2  0x00007ffff705bf7f in g_free () from /lib64/libglib-2.0.so.0
#3  0x000055555557e924 in block_job_unref (job=0x555555f1d630) at blockjob.c:124
#4  0x000055555557e9da in block_job_completed_single (job=0x555555f1d630) at blockjob.c:143
#5  0x000055555557ecc8 in block_job_completed (job=0x555555f1d630, ret=0) at blockjob.c:215
#6  0x00005555555e6d49 in backup_complete (job=0x555555f1d630, opaque=0x555555f1dd50) at block/backup.c:325
#7  0x000055555557f5f4 in block_job_defer_to_main_loop_bh (opaque=0x5555596e1dc0) at blockjob.c:500
#8  0x00005555555747d7 in aio_bh_call (bh=0x5555596e1c30) at async.c:66
#9  0x0000555555574899 in aio_bh_poll (ctx=0x555555ce2d60) at async.c:94
#10 0x0000555555581d4d in aio_dispatch (ctx=0x555555ce2d60) at aio-posix.c:308
#11 0x00005555555823cb in aio_poll (ctx=0x555555ce2d60, blocking=false) at aio-posix.c:479
#12 0x00005555555d639b in bdrv_drain_poll (bs=0x555555dec210) at block/io.c:190
#13 0x00005555555d6566 in bdrv_drained_begin (bs=0x555555dec210) at block/io.c:240
#14 0x0000555555577261 in bdrv_child_cb_drained_begin (child=0x5555596e1ab0) at block.c:665
#15 0x00005555555d5e81 in bdrv_parent_drained_begin (bs=0x555555f0e130) at block/io.c:54
#16 0x00005555555d652b in bdrv_drained_begin (bs=0x555555f0e130) at block/io.c:232
#17 0x0000555555577261 in bdrv_child_cb_drained_begin (child=0x5555596e1810) at block.c:665
#18 0x00005555555d5e81 in bdrv_parent_drained_begin (bs=0x5555596d7030) at block/io.c:54
#19 0x00005555555d652b in bdrv_drained_begin (bs=0x5555596d7030) at block/io.c:232
#20 0x0000555555577261 in bdrv_child_cb_drained_begin (child=0x555555df8830) at block.c:665
#21 0x00005555555d5e81 in bdrv_parent_drained_begin (bs=0x555555df0bb0) at block/io.c:54
#22 0x00005555555d66ee in bdrv_drain_all () at block/io.c:301
#23 0x0000555555579f9f in bdrv_reopen_multiple (bs_queue=0x555555f1dd30, errp=0x7fffffffd768) at block.c:1953
#24 0x000055555557a169 in bdrv_reopen (bs=0x555555df0bb0, bdrv_flags=24578, errp=0x7fffffffd8e0) at block.c:1994
#25 0x00005555555d54a7 in commit_active_start (bs=0x555555f0e130, base=0x555555df0bb0, speed=0, on_error=BLOCKDEV_ON_ERROR_REPORT, cb=0x5555555e8b97 <replication_done>,
     opaque=0x555555dec210, errp=0x7fffffffd8e0, auto_complete=true) at block/mirror.c:901
#26 0x00005555555e8d98 in replication_stop (rs=0x55555746f7b0, failover=true, errp=0x7fffffffd8e0) at block/replication.c:623
#27 0x00005555555873d1 in replication_stop_all (failover=true, errp=0x7fffffffd928) at replication.c:98
#28 0x000055555557406d in test_secondary_start () at tests/test-replication.c:403
#29 0x00007ffff707a5e1 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#30 0x00007ffff707a7a6 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#31 0x00007ffff707a7a6 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#32 0x00007ffff707ab1b in g_test_run_suite () from /lib64/libglib-2.0.so.0
#33 0x00005555555746c8 in main (argc=1, argv=0x7fffffffdda8) at tests/test-replication.c:545
(gdb) d breakpoints
(gdb) c
Program received signal SIGSEGV, Segmentation fault.
0x00005555555c7d6c in blk_bs (blk=0x1e1e1e1e1e1e1e1e) at 
block/block-backend.c:389
389         return blk->root ? blk->root->bs : NULL;
(gdb) bt
#0  0x00005555555c7d6c in blk_bs (blk=0x1e1e1e1e1e1e1e1e) at 
block/block-backend.c:389
#1  0x00005555555c79e3 in bdrv_next (it=0x7fffffffd6a0) at 
block/block-backend.c:279
#2  0x00005555555d674d in bdrv_drain_all () at block/io.c:294
#3  0x0000555555579f9f in bdrv_reopen_multiple (bs_queue=0x555555f1dd30, 
errp=0x7fffffffd768) at block.c:1953
#4  0x000055555557a169 in bdrv_reopen (bs=0x555555df0bb0, 
bdrv_flags=24578, errp=0x7fffffffd8e0) at block.c:1994
#5  0x00005555555d54a7 in commit_active_start (bs=0x555555f0e130, 
base=0x555555df0bb0, speed=0, on_error=BLOCKDEV_ON_ERROR_REPORT, 
cb=0x5555555e8b97 <replication_done>,
     opaque=0x555555dec210, errp=0x7fffffffd8e0, auto_complete=true) at 
block/mirror.c:901
#6  0x00005555555e8d98 in replication_stop (rs=0x55555746f7b0, 
failover=true, errp=0x7fffffffd8e0) at block/replication.c:623
#7  0x00005555555873d1 in replication_stop_all (failover=true, 
errp=0x7fffffffd928) at replication.c:98
#8  0x000055555557406d in test_secondary_start () at 
tests/test-replication.c:403
#9  0x00007ffff707a5e1 in g_test_run_suite_internal () from 
/lib64/libglib-2.0.so.0
#10 0x00007ffff707a7a6 in g_test_run_suite_internal () from 
/lib64/libglib-2.0.so.0
#11 0x00007ffff707a7a6 in g_test_run_suite_internal () from 
/lib64/libglib-2.0.so.0
#12 0x00007ffff707ab1b in g_test_run_suite () from /lib64/libglib-2.0.so.0
#13 0x00005555555746c8 in main (argc=1, argv=0x7fffffffdda8) at 
tests/test-replication.c:545

> +
> +        commit_active_start(s->active_disk->bs, s->secondary_disk->bs, 0,
> +                            BLOCKDEV_ON_ERROR_REPORT, replication_done,
> +                            bs, errp, true);
> +        break;



Thread overview: 22+ messages
2016-05-20  7:36 [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 01/10] unblock backup operations in backing file Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 02/10] Backup: clear all bitmap when doing block checkpoint Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 03/10] Backup: export interfaces for extra serialization Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 04/10] Link backup into block core Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 05/10] docs: block replication's description Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 06/10] auto complete active commit Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 07/10] Introduce new APIs to do replication operation Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 08/10] Implement new driver for block replication Changlong Xie
2016-05-30 18:14   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-05-31  1:20     ` Changlong Xie
2016-06-07  4:59   ` [Qemu-devel] " Changlong Xie
2016-06-07  5:36   ` Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 09/10] tests: add unit test case for replication Changlong Xie
2016-05-27  1:46   ` Changlong Xie
2016-05-30 17:34   ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-05-31 10:21     ` Changlong Xie
2016-05-20  7:36 ` [Qemu-devel] [PATCH v19 10/10] support replication driver in blockdev-add Changlong Xie
2016-05-27  1:59 ` [Qemu-devel] [PATCH v19 00/10] Block replication for continuous checkpoints Changlong Xie
2016-05-27  7:23   ` Fam Zheng
2016-05-30 18:20 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
2016-05-31 10:25   ` Changlong Xie
