Date: Wed, 20 May 2015 20:18:28 +0100
From: Dr. David Alan Gilbert
To: Wen Congyang
Cc: Kevin Wolf, Fam Zheng, Lai Jiangshan, qemu block, Jiang Yunhong,
    Dong Eddie, qemu devel, Max Reitz, Stefan Hajnoczi, Paolo Bonzini,
    Yang Hongyang, zhang.zhanghailiang@huawei.com
Subject: Re: [Qemu-devel] [PATCH COLO v4 00/15] Block replication for continuous checkpoints
Message-ID: <20150520191827.GH2148@work-vm>
In-Reply-To: <1431076567-30371-1-git-send-email-wency@cn.fujitsu.com>
References: <1431076567-30371-1-git-send-email-wency@cn.fujitsu.com>

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints (for example: COLO).
>
> Usage:
> Please refer to docs/block-replication.txt
>
> You can get the patch here:
> https://github.com/wencongyang/qemu-colo/commits/block-replication-v4
>
> You can get the patch with the other COLO patches here:
> https://github.com/wencongyang/qemu-colo/tree/colo_huawei_v4.7

Hi,
  A couple of questions:
  1) I still trip the handle_aiocb_rw assertion occasionally; I see
     Kevin was asking for some detail on
     http://lists.nongnu.org/archive/html/qemu-devel/2015-01/msg04507.html
     is that still the right fix?

  2) The only stats I see on the block replication are the
     info block-jobs completed 'n' of 'm' - is 'n' there just the
     total number of blocks written? Are there any more stats about?

> TODO:
> 1. Test failover when the guest does too many I/O operations. If it
>    takes too much time, we need to optimize it.

Limiting the size of the state needed for commit might help that;
you can always trigger a new checkpoint if the size gets too big;
it would also make it more realistic since it will probably fail
at the moment if you fill up the RAM you're using to hold the
hidden/active disks with a big write.

Dave

> 2. Continuous block replication. It will be started after basic functions
>    are accepted.
>
> Change Log:
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
>
> Wen Congyang (15):
>   docs: block replication's description
>   allow writing to the backing file
>   Allow creating backup jobs when opening BDS
>   block: Parse "backing_reference" option to reference existing BDS
>   Backup: clear all bitmap when doing block checkpoint
>   Don't allow a disk use backing reference target
>   Add new block driver interface to connect/disconnect the remote target
>   NBD client: implement block driver interfaces to connect/disconnect
>     NBD server
>   Introduce a new -drive option to control whether to connect to remote
>     target
>   NBD client: connect to nbd server later
>   Add new block driver interfaces to control block replication
>   skip nbd_target when starting block replication
>   quorum: implement block driver interfaces for block replication
>   quorum: allow ignoring child errors
>   Implement new driver for block replication
>
>  block.c                    | 270 +++++++++++++++++++++++-
>  block/Makefile.objs        |   3 +-
>  block/backup.c             |  13 ++
>  block/nbd.c                |  67 ++++--
>  block/quorum.c             | 142 ++++++++++++-
>  block/replication.c        | 512 +++++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                 |   8 +
>  blockjob.c                 |  10 +
>  docs/block-replication.txt | 179 ++++++++++++++++
>  include/block/block.h      |  10 +
>  include/block/block_int.h  |  18 ++
>  include/block/blockjob.h   |  12 ++
>  qapi/block.json            |  16 ++
>  qemu-options.hx            |   4 +
>  tests/qemu-iotests/051     |  13 ++
>  tests/qemu-iotests/051.out |  13 ++
>  16 files changed, 1260 insertions(+), 30 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
>
> --
> 2.1.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
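
Illustrative note on the reply to TODO item 1: the suggestion there is to cap the amount of state that has to be committed at failover and to force a new checkpoint once the cap is reached. A minimal sketch of that accounting follows. It is not QEMU code; the class name `CheckpointBudget`, its methods, and the byte-based limit are all hypothetical, chosen only to show the idea.

```python
class CheckpointBudget:
    """Hypothetical helper: tracks bytes buffered since the last checkpoint.

    Assumption: the buffered state (what the hidden/active disks hold
    between checkpoints) can be measured in bytes written by the guest.
    """

    def __init__(self, limit_bytes):
        if limit_bytes <= 0:
            raise ValueError("limit_bytes must be positive")
        self.limit_bytes = limit_bytes
        self.pending_bytes = 0   # state accumulated since the last checkpoint
        self.checkpoints = 0     # number of checkpoints taken so far

    def record_write(self, nbytes):
        """Account for a guest write; return True if a checkpoint was forced."""
        self.pending_bytes += nbytes
        if self.pending_bytes >= self.limit_bytes:
            self.checkpoint()
            return True
        return False

    def checkpoint(self):
        """Stand-in for a real checkpoint: the buffered state is dropped."""
        self.pending_bytes = 0
        self.checkpoints += 1


budget = CheckpointBudget(limit_bytes=100)
for size in (40, 40, 40):
    forced = budget.record_write(size)
    print(size, forced, budget.pending_bytes)
```

With a cap like this, the work left to do at failover is bounded by the limit rather than by how much the guest happens to write between checkpoints, and one large write cannot grow the buffered state without bound.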