From: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
To: zhanghailiang <zhang.zhanghailiang@huawei.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: stefanha@redhat.com, kwolf@redhat.com, mreitz@redhat.com,
	pbonzini@redhat.com, wency@cn.fujitsu.com,
	Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH RFC 1/7] docs/block-replication: Add description for shared-disk case
Date: Tue, 25 Oct 2016 17:03:26 +0800
Message-ID: <580F1FDE.8050401@cn.fujitsu.com>
In-Reply-To: <1476971860-20860-2-git-send-email-zhang.zhanghailiang@huawei.com>

On 10/20/2016 09:57 PM, zhanghailiang wrote:
> Introuduce the scenario of shared-disk block replication
> and how to use it.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   docs/block-replication.txt | 131 +++++++++++++++++++++++++++++++++++++++++++--
>   1 file changed, 127 insertions(+), 4 deletions(-)
>
> diff --git a/docs/block-replication.txt b/docs/block-replication.txt
> index 6bde673..97fcfc1 100644
> --- a/docs/block-replication.txt
> +++ b/docs/block-replication.txt
> @@ -24,7 +24,7 @@ only dropped at next checkpoint time. To reduce the network transportation
>   effort during a vmstate checkpoint, the disk modification operations of
>   the Primary disk are asynchronously forwarded to the Secondary node.
>
> -== Workflow ==
> +== Non-shared disk workflow ==
>   The following is the image of block replication workflow:
>
>           +----------------------+            +------------------------+
> @@ -57,7 +57,7 @@ The following is the image of block replication workflow:
>       4) Secondary write requests will be buffered in the Disk buffer and it
>          will overwrite the existing sector content in the buffer.
>
> -== Architecture ==
> +== None-shared disk architecture ==

s/None-shared/Non-shared/g

>   We are going to implement block replication from many basic
>   blocks that are already in QEMU.
>
> @@ -106,6 +106,74 @@ any state that would otherwise be lost by the speculative write-through
>   of the NBD server into the secondary disk. So before block replication,
>   the primary disk and secondary disk should contain the same data.
>
> +== Shared Disk Mode Workflow ==
> +The following is the image of block replication workflow:
> +
> +        +----------------------+            +------------------------+
> +        |Primary Write Requests|            |Secondary Write Requests|
> +        +----------------------+            +------------------------+
> +                  |                                       |
> +                  |                                      (4)
> +                  |                                       V
> +                  |                              /-------------\
> +                  | (2)Forward and write through |             |
> +                  | +--------------------------> | Disk Buffer |
> +                  | |                            |             |
> +                  | |                            \-------------/
> +                  | |(1)read                           |
> +                  | |                                  |
> +       (3)write   | |                                  | backing file
> +                  V |                                  |
> +                 +-----------------------------+       |
> +                 | Shared Disk                 | <-----+
> +                 +-----------------------------+
> +
> +    1) Primary writes will read original data and forward it to Secondary
> +       QEMU.
> +    2) Before Primary write requests are written to Shared disk, the
> +       original sector content will be read from Shared disk and
> +       forwarded and buffered in the Disk buffer on the secondary site,
> +       but it will not overwrite the existing

There are extra spaces at the end of this line.

> +       sector content(it could be from either "Secondary Write Requests" or

Need a space before "(" for better style.

> +       previous COW of "Primary Write Requests") in the Disk buffer.
> +    3) Primary write requests will be written to Shared disk.
> +    4) Secondary write requests will be buffered in the Disk buffer and it
> +       will overwrite the existing sector content in the buffer.
> +
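
For readers following steps 1) to 4), here is a minimal Python sketch of the
copy-before-write behaviour as I read it from the text above. It is a toy
model, not QEMU code: the class name, the method names and the dict-based
"disks" are all hypothetical, and secondary_read() is my assumption about how
the "backing file" arrow in the diagram resolves reads.

```python
class SharedDiskReplication:
    """Toy model of the shared-disk workflow; a 'disk' is a dict of sectors."""

    def __init__(self, shared_disk):
        self.shared_disk = shared_disk  # visible to Primary and Secondary
        self.disk_buffer = {}           # lives on the Secondary side

    def primary_write(self, sector, data):
        # (1) read the original content from the shared disk
        original = self.shared_disk.get(sector)
        # (2) forward it to the Disk buffer, but never overwrite content that
        #     is already there (a Secondary write or an earlier COW)
        self.disk_buffer.setdefault(sector, original)
        # (3) only now does the Primary write reach the shared disk
        self.shared_disk[sector] = data

    def secondary_write(self, sector, data):
        # (4) Secondary writes always overwrite the buffered content
        self.disk_buffer[sector] = data

    def secondary_read(self, sector):
        # Assumption: reads hit the buffer first, then fall through to the
        # shared disk, which acts as the backing file.
        if sector in self.disk_buffer:
            return self.disk_buffer[sector]
        return self.shared_disk.get(sector)


rep = SharedDiskReplication({0: b"old0", 1: b"old1"})
rep.primary_write(0, b"new0")    # sector 0: original content is preserved
rep.secondary_write(1, b"sec1")  # sector 1: Secondary gets there first
rep.primary_write(1, b"new1")    # must not clobber the Secondary's data
```

After these three calls the shared disk holds only Primary data, while the
Secondary still reads the pre-write content of sector 0 and its own write to
sector 1.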
> +== Shared Disk Mode Architecture ==
> +We are going to implement block replication from many basic
> +blocks that are already in QEMU.
> +         virtio-blk                     ||                               .----------
> +             /                          ||                               | Secondary
> +            /                           ||                               '----------
> +           /                            ||                                 virtio-blk
> +          /                             ||                                      |
> +          |                             ||                               replication(5)
> +          |                    NBD  -------->   NBD   (2)                       |
> +          |                  client     ||    server ---> hidden disk <-- active disk(4)
> +          |                     ^       ||                      |
> +          |              replication(1) ||                      |
> +          |                     |       ||                      |
> +          |   +-----------------'       ||                      |
> +         (3)  |drive-backup sync=none   ||                      |
> +--------. |   +-----------------+       ||                      |
> +Primary | |                     |       ||           backing    |
> +--------' |                     |       ||                      |
> +          V                     |                               |
> +       +-------------------------------------------+            |
> +       |               shared disk                 | <----------+
> +       +-------------------------------------------+
> +
> +
> +    1) Primary writes will read original data and forward it to Secondary
> +       QEMU.
> +    2) The hidden-disk buffers the original content that is modified by the
> +       primary VM. It should also be an empty disk, and

There are extra spaces at the end of this line.

> +       the driver supports bdrv_make_empty() and backing file.
> +    3) Primary write requests will be written to Shared disk.
> +    4) Secondary write requests will be buffered in the active disk and it
> +       will overwrite the existing sector content in the buffer.
> +
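
The backing relationship in the diagram (active disk -> hidden disk -> shared
disk) can be pictured as a layered lookup. The sketch below uses Python's
collections.ChainMap purely as an illustration; the variable and function
names are hypothetical, and emptying both the active and the hidden disk in
checkpoint() is my assumption, based on the bdrv_make_empty() requirement in
item 2).

```python
from collections import ChainMap

shared_disk = {0: b"A", 1: b"B"}  # written by the Primary
hidden_disk = {}                  # buffers original content (item 2)
active_disk = {}                  # buffers Secondary writes (item 4)

# Lookups search active, then hidden, then shared; assignments through a
# ChainMap always land in the first mapping, i.e. the active disk.
secondary_view = ChainMap(active_disk, hidden_disk, shared_disk)


def primary_write(sector, data):
    # Save the original content in the hidden disk once, then update the
    # shared disk (items 1 and 3).
    hidden_disk.setdefault(sector, shared_disk.get(sector))
    shared_disk[sector] = data


def checkpoint():
    # Assumption: both buffers are emptied at checkpoint time.
    active_disk.clear()
    hidden_disk.clear()


primary_write(0, b"A2")           # the Secondary must still see b"A"
secondary_view[1] = b"S"          # Secondary write, stays in the active disk
before = (secondary_view[0], secondary_view[1])
checkpoint()
after = (secondary_view[0], secondary_view[1])
```

Here "before" captures what the Secondary sees between checkpoints and
"after" what it sees once the buffers have been emptied.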
>   == Failure Handling ==
>   There are 7 internal errors when block replication is running:
>   1. I/O error on primary disk
> @@ -145,7 +213,7 @@ d. replication_stop_all()
>      things except failover. The caller must hold the I/O mutex lock if it is
>      in migration/checkpoint thread.
>
> -== Usage ==
> +== Non-shared disk usage ==
>   Primary:
>     -drive if=xxx,driver=quorum,read-pattern=fifo,id=colo1,vote-threshold=1,\
>            children.0.file.filename=1.raw,\
> @@ -234,6 +302,61 @@ Secondary:
>     The primary host is down, so we should do the following thing:
>     { 'execute': 'nbd-server-stop' }
>
> +== Shared disk usage ==

Keeping the same coding style as the "== Non-shared disk usage ==" part
would be good.

> +Primary:
> + -drive if=virtio,id=primary_disk0,file.filename=1.raw,driver=raw
> +
> +Issue qmp command:
> + {'execute': 'human-monitor-command',

Use two-space indentation for the whole "{...}" part.

> +    'arguments': {
> +        'command-line': 'drive_add-nbuddydriver=replication,

Missing spaces: this should read 'drive_add -n buddy driver=replication,'

> +        mode=primary,
> +        file.driver=nbd,
> +        file.host=9.42.3.17,
> +        file.port=9998,
> +        file.export=hidden_disk0,
> +        shared-disk-id=primary_disk0,
> +        shared-disk=on,
> +        node-name=rep'

Keep the whole command after "command-line" on one line, otherwise it
cannot be executed correctly, IIRC.

> +    }
> + }
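
One way to avoid the line-break problem is to generate the QMP command instead
of typing it by hand. The following is only a sketch: drive_add_command() is a
hypothetical helper, the option values are copied from the patch, and the
"drive_add -n buddy" prefix assumes the same form as in the non-shared usage
section.

```python
import json


def drive_add_command(options):
    """Build a human-monitor-command whose command-line is a single line."""
    opts = ",".join(f"{key}={value}" for key, value in options.items())
    return {
        "execute": "human-monitor-command",
        "arguments": {"command-line": f"drive_add -n buddy {opts}"},
    }


# Option values taken from the quoted patch; dicts keep insertion order.
primary_options = {
    "driver": "replication",
    "mode": "primary",
    "file.driver": "nbd",
    "file.host": "9.42.3.17",
    "file.port": "9998",
    "file.export": "hidden_disk0",
    "shared-disk-id": "primary_disk0",
    "shared-disk": "on",
    "node-name": "rep",
}

command = drive_add_command(primary_options)
print(json.dumps(command, indent=2))
```

The printed JSON can be pasted into a QMP session; the "command-line" value
contains no line breaks or stray whitespace between options.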

> +Secondary:

> + -drive if=none,driver=qcow2,file.filename=/mnt/ramfs/hidden_disk.img,id=hidden_disk0,\
> +        backing.driver=raw,backing.file.filename=1.raw \
> + -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
> +        file.driver=qcow2,top-id=active-disk0,\
> +        file.file.filename=/mnt/ramfs/active_disk.img,\
> +        file.backing=hidden_disk0,shared-disk=on
> +
> +Issue qmp command:
> +1. {'execute': 'nbd-server-start',
> +    'arguments': {
> +        'addr': {
> +            'type': 'inet',
> +            'data': {
> +                'host': '0',

s/0/9.42.3.17/g, since you use a designated IP address above.

> +                'port': '9998'
> +            }
> +        }
> +    }
> +   }
> +2. {
> +    'execute': 'nbd-server-add',
> +    'arguments': {
> +        'device': 'hidden_disk0',
> +        'writable': true
> +    }
> +  }
> +
> +After Failover:
> +Primary:
> +{'execute': 'human-monitor-command',
> +    'arguments': {
> +        'command-line': 'drive_delrep'

drive_del rep

> +    }
> +}
> +
> +Secondary:
> +  {'execute': 'nbd-server-stop' }
> +
>   TODO:
>   1. Continuous block replication
> -2. Shared disk
>
