From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <56B17169.90601@cn.fujitsu.com>
Date: Wed, 3 Feb 2016 11:18:01 +0800
From: Changlong Xie
MIME-Version: 1.0
References: <1452676712-24239-1-git-send-email-xiecl.fnst@cn.fujitsu.com> <1452676712-24239-6-git-send-email-xiecl.fnst@cn.fujitsu.com> <56B0C945.60100@redhat.com>
In-Reply-To: <56B0C945.60100@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v14 5/8] docs: block replication's description
To: Eric Blake, qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini, Kevin Wolf, Stefan Hajnoczi
Cc: Gonglei, zhanghailiang, fnstml-hwcolo@cn.fujitsu.com

On 02/02/2016 11:20 PM, Eric Blake wrote:
> On 01/13/2016 02:18 AM, Changlong Xie wrote:
>> From: Wen Congyang
>>
>> Signed-off-by: Wen Congyang
>> Signed-off-by: zhanghailiang
>> Signed-off-by: Gonglei
>> Signed-off-by: Changlong Xie
>> ---
>>  docs/block-replication.txt | 229 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 229 insertions(+)
>>  create mode 100644 docs/block-replication.txt
>>
>> diff --git a/docs/block-replication.txt b/docs/block-replication.txt
>> new file mode 100644
>> index 0000000..d1a231e
>> --- /dev/null
>> +++ b/docs/block-replication.txt
>> @@ -0,0 +1,229 @@
>> +Block replication
>> +----------------------------------------
>> +Copyright Fujitsu, Corp. 2015
>> +Copyright (c) 2015 Intel Corporation
>> +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
>
> Do you want to claim 2016 for any of this?

Will update in the next version.

>
>> +
>> +This work is licensed under the terms of the GNU GPL, version 2 or later.
>> +See the COPYING file in the top-level directory.
>> +
>> +Block replication is used for continuous checkpoints. It is designed
>> +for COLO (COarse-grain LOck-stepping) where the Secondary VM is running.
>> +It can also be applied for FT/HA (Fault-tolerance/High Assurance) scenario,
>> +where the Secondary VM is not running.
>> +
>> +This document gives an overview of block replication's design.
>> +
>> +== Background ==
>> +High availability solutions such as micro checkpoint and COLO will do
>> +consecutive checkpoints. The VM state of Primary VM and Secondary VM is
>
> s/of Primary/of the Primary/

OK

>
>> +identical right after a VM checkpoint, but becomes different as the VM
>> +executes till the next checkpoint. To support disk contents checkpoint,
>> +the modified disk contents in the Secondary VM must be buffered, and are
>> +only dropped at next checkpoint time. To reduce the network transportation
>> +effort at the time of checkpoint, the disk modification operations of
>
> s/at the time of/during a vmstate/
> s/operations of/operations of the/

OK

>
>> +Primary disk are asynchronously forwarded to the Secondary node.
>> +
>> +== Workflow ==
>
>> +== Architecture ==
>
>> +
>> +6) The drive-backup job(sync=none) is run to allow hidden-disk to buffer
>
> Space before ( in English description.

OK

>
>> +any state that would otherwise be lost by the speculative write-through
>> +of the NBD server into the secondary disk. So before block replication,
>> +the primary disk and secondary disk should contain the same data.
>> +
>> +== Failure Handling ==
>
>> +== Usage ==
>> +Primary:
>> +  -drive if=xxx,driver=quorum,read-pattern=fifo,id=colo1,vote-threshold=1,\
>> +         children.0.file.filename=1.raw,\
>> +         children.0.driver=raw
>> +
>> +  Run qmp command in primary qemu:
>> +    { 'execute': 'human-monitor-command',
>> +      'arguments': {
>> +          'command-line': 'drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=xxxx,file.port=xxxx,file.export=colo1,node-name=nbd_client1,if=none'
>
> Eww. We shouldn't ever have to pack a command line as a single QMP
> string that needs reparsing. Instead, you should pass the information
> as a nested QMP dictionary, something like:
>
> 'arguments': {
>   'remote-command': { 'command': 'drive_add', 'name': 'buddy',
>     'driver': 'replication', 'mode': 'primary',
>     'file': { 'driver': 'nbd', 'host': 'xxxx',

Hi Eric

What is 'remote-command' here? I just tried the command below, but got
an error.

{'execute': 'human-monitor-command',
 'arguments': {
   'command-line': {
     'command': 'drive_add',
     'name': 'buddy',
     'driver': 'replication',
     'mode': 'primary',
     'if': 'none',
     'node-name': 'primary_nbd_node',
     'file': {
       'driver': 'nbd',
       'host': '192.168.3.2',
       'port': '8889',
       'export': 'colo-disk',
     }
   }
 }
}
{"error": {"class": "GenericError", "desc": "Invalid JSON syntax"}}

'blockdev-add' doesn't support 'nbd', so we use 'drive_add' here, which
is an HMP command. If I'm right, there seems to be just one way to
execute HMP commands in QMP:

================================================================
EQMP

    {
        .name       = "human-monitor-command",
        .args_type  = "command-line:s,cpu-index:i?",
        .mhandler.cmd_new = qmp_marshal_human_monitor_command,
    },

SQMP
human-monitor-command
---------------------

Execute a Human Monitor command.

Arguments:

- command-line: the command name and its arguments, just like the
                Human Monitor's shell (json-string)
- cpu-index: select the CPU number to be used by commands which access CPU
             data, like 'info registers'. The Monitor selects CPU 0 if this
             argument is not provided (json-int, optional)

Example:

-> { "execute": "human-monitor-command", "arguments": { "command-line": "info kvm" } }
<- { "return": "kvm support: enabled\r\n" }
==================================================================

Thanks
	-Xie

> ... } } }
>
>> +      }
>> +    }
>> +    { 'execute': 'x-blockdev-change',
>> +      'arguments': {
>> +          'parent': 'colo1',
>> +          'node': 'nbd_client1'
>> +      }
>> +    }
>> +  Note:
>> +  1. There should be only one NBD Client for each primary disk.
>> +  2. host is the secondary physical machine's hostname or IP
>> +  3. Each disk must have its own export name.
>> +  4. It is all a single argument to -drive and you should ignore the
>> +     leading whitespace.
>> +  5. The qmp command line must be run after running qmp command line in
>> +     secondary qemu.
>> +
>> +Secondary:
>> +  -drive if=none,driver=raw,file.filename=1.raw,id=colo1 \
>> +  -drive if=xxx,driver=replication,mode=secondary,\
>> +         file.file.filename=active_disk.qcow2,\
>> +         file.driver=qcow2,\
>> +         file.backing.file.filename=hidden_disk.qcow2,\
>> +         file.backing.driver=qcow2,\
>> +         file.backing.backing=colo1
>> +
>> +  Then run qmp command in secondary qemu:
>> +    { 'execute': 'nbd-server-start',
>> +      'arguments': {
>> +          'addr': {
>> +              'type': 'inet',
>> +              'data': {
>> +                  'host': 'xxx',
>> +                  'port': 'xxx'
>> +              }
>> +          }
>> +      }
>> +    }
>> +    { 'execute': 'nbd-server-add',
>> +      'arguments': {
>> +          'device': 'colo1',
>> +          'writable': true
>> +      }
>> +    }
>> +
>> +  Note:
>> +  1. The export name in secondary QEMU command line is the secondary
>> +     disk's id.
>> +  2. The export name for the same disk must be the same
>> +  3. The qmp command nbd-server-start and nbd-server-add must be run
>> +     before running the qmp command migrate on primary QEMU
>> +  4. Active disk, hidden disk and nbd target's length should be the
>> +     same.
>> +  5. It is better to put active disk and hidden disk in ramdisk.
>> +  6. It is all a single argument to -drive, and you should ignore
>> +     the leading whitespace.
>> +
>> +After Failover:
>> +Primary:
>> +  The secondary host is down, so we should run the following qmp command
>> +  to remove the nbd child from the quorum:
>> +  { 'execute': 'x-blockdev-change',
>> +    'arguments': {
>> +        'parent': 'colo1',
>> +        'child': 'children.1'
>> +    }
>> +  }
>> +  Note: there is no qmp command to remove the blockdev now
>> +
>> +Secondary:
>> +  The primary host is down, so we should do the following thing:
>> +  { 'execute': 'nbd-server-stop' }
>> +
>> +TODO:
>> +1. Continuous block replication
>> +2. Shared disk
>>
>
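
For reference, here is a rough sketch of how the human-monitor-command
request discussed above could be generated instead of typed by hand. It
relies only on what the quoted documentation says: 'command-line' is a
json-string, so the drive_add options are flattened into one string
first. The helper names (hmp_via_qmp, drive_add_line) are made up for
illustration and are not part of QEMU, and the host, port and export
values are just the placeholders from my test above. Serializing with
json.dumps also avoids hand-written syntax slips such as a trailing
comma, which strict JSON parsers reject.

```python
import json


def hmp_via_qmp(command_line, cpu_index=None):
    """Wrap an HMP command line in a QMP human-monitor-command request.

    Hypothetical helper: only the request layout comes from the quoted
    QMP documentation ('command-line' string, optional 'cpu-index' int).
    """
    if not isinstance(command_line, str):
        raise TypeError("command-line must be a string (json-string)")
    arguments = {"command-line": command_line}
    if cpu_index is not None:
        arguments["cpu-index"] = cpu_index
    return {"execute": "human-monitor-command", "arguments": arguments}


def drive_add_line(bus, options):
    """Flatten an options dict into 'drive_add <bus> key=value,...' form.

    Hypothetical helper; dicts keep insertion order, so the options
    appear in the order they are written below.
    """
    opts = ",".join(f"{key}={value}" for key, value in options.items())
    return f"drive_add {bus} {opts}"


# Placeholder values copied from the example earlier in this mail.
options = {
    "driver": "replication",
    "mode": "primary",
    "file.driver": "nbd",
    "file.host": "192.168.3.2",
    "file.port": "8889",
    "file.export": "colo-disk",
    "node-name": "primary_nbd_node",
    "if": "none",
}

request = hmp_via_qmp(drive_add_line("buddy", options))
wire = json.dumps(request)  # the text that would be sent on the QMP socket
print(wire)
```

This only builds the request text. Sending it still needs a QMP
connection that has already completed the qmp_capabilities handshake,
and I have not checked whether drive_add accepts every option above.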