From: Stefan Hajnoczi <stefanha@gmail.com>
To: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Cc: qemu devel <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Fam Zheng <famz@redhat.com>, Max Reitz <mreitz@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, Jeff Cody <jcody@redhat.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	qemu block <qemu-block@nongnu.org>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Gonglei <arei.gonglei@huawei.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v19 00/10] Block replication for continuous checkpoints
Date: Mon, 30 May 2016 11:20:16 -0700
Message-ID: <20160530182016.GC1366@stefanha-x1.localdomain>
In-Reply-To: <1463729780-31982-1-git-send-email-xiecl.fnst@cn.fujitsu.com>

On Fri, May 20, 2016 at 03:36:10PM +0800, Changlong Xie wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints (for example: COLO).
> 
> You can get the detailed information about block replication from here:
> http://wiki.qemu.org/Features/BlockReplication
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> You can get the patch here:
> https://github.com/Pating/qemu/tree/changlox/block-replication-v19
> 
> You can get the patch with framework here:
> https://github.com/Pating/qemu/tree/changlox/colo_framework_v18
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>    are accepted.
> 
> Change Log:
> V19:
> 1. Rebase to v2.6.0
> 2. Address comments from stefan
> p3: a new patch that exports interfaces for extra serialization
> p8: 
> 1. call replication_stop() before freeing s->top_id
> 2. check top_bs
> 3. reopen file readonly in error return paths
> 4. enable extra serialization between read and COW
> p9: try to handle SIGABRT
> V18:
> p6: add local_err in all replication callbacks to prevent "errp == NULL"
> p7: add missing qemu_iovec_destroy(xxx)
> V17:
> 1. Rebase to the latest code
> p2: refactor backup_do_checkpoint, addressing comments from Jeff Cody
> p4: fix bugs in "drive_add buddy xxx" hmp commands
> p6: add "since: 2.7"
> p7: fix bug in replication_close(), add missing "qapi/error.h", add test-replication 
> p8: add "since: 2.7"
> V16:
> 1. Rebase to the newest code
> 2. Address comments from Stefan & hailiang
> p3: we don't need this patch now
> p4: add "top-id" parameters for secondary
> p6: fix NULL pointer in replication callbacks, remove unnecessary typedefs, 
> add doc comments that explain the semantics of Replication
> p7: Refactor AioContext for thread safety, remove unnecessary get_top_bs()
> *Note*: I'm working on replication testcase now, will send out in V17
> V15:
> 1. Rebase to the newest code
> 2. Fix typos and coding style, addressing Eric's comments
> 3. Address Stefan's comments
>    1) Make backup_do_checkpoint public, drop the changes on BlockJobDriver
>    2) Update the message and description for [PATCH 4/9]
>    3) Make replication_(start/stop/do_checkpoint)_all as global interfaces
>    4) Introduce AioContext lock to protect start/stop/do_checkpoint callbacks
>    5) Use BdrvChild instead of holding on to BlockDriverState * pointers
> 4. Clear BDRV_O_INACTIVE for hidden disk's open_flags since commit 09e0c771  
> 5. Introduce replication_get_error_all to check replication status
> 6. Remove useless discard interface
> V14:
> 1. Implement auto complete active commit
> 2. Implement active commit block job for replication.c
> 3. Address the comments from Stefan, add replication-specific API and data
>    structure, also remove old block layer APIs
> V13:
> 1. Rebase to the newest code
> 2. Remove redundant macros and semicolon in replication.c
> 3. Fix typos in block-replication.txt
> V12:
> 1. Rebase to the newest code
> 2. Use backing reference to replace 'allow-write-backing-file'
> V11:
> 1. Reopen the backing file when starting block replication if it is not
>    opened in R/W mode
> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>    when opening backing file
> 3. Block the top BDS so there is only one block job for the top BDS and
>    its backing chain.
> V10:
> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>    reference.
> 2. Address the comments from Eric Blake
> V9:
> 1. Update the error messages
> 2. Rebase to the newest qemu
> 3. Split child add/delete support. These patches are sent in another patchset.
> V8:
> 1. Address Alberto Garcia's comments
> V7:
> 1. Implement adding/removing quorum child. Remove the option non-connect.
> 2. Simplify the backing reference option according to Stefan Hajnoczi's suggestion
> V6:
> 1. Rebase to the newest qemu.
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>    if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touching nbd and qcow2.
> V3:
> 1. Use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target use the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu (use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> 
> Changlong Xie (3):
>   Backup: export interfaces for extra serialization
>   Introduce new APIs to do replication operation
>   tests: add unit test case for replication
> 
> Wen Congyang (7):
>   unblock backup operations in backing file
>   Backup: clear all bitmap when doing block checkpoint
>   Link backup into block core
>   docs: block replication's description
>   auto complete active commit
>   Implement new driver for block replication
>   support replication driver in blockdev-add
> 
>  Makefile.objs                |   1 +
>  block.c                      |  17 ++
>  block/Makefile.objs          |   3 +-
>  block/backup.c               |  59 +++-
>  block/mirror.c               |  13 +-
>  block/replication.c          | 666 +++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                   |   2 +-
>  docs/block-replication.txt   | 239 ++++++++++++++++
>  include/block/block_backup.h |  17 ++
>  include/block/block_int.h    |   3 +-
>  qapi/block-core.json         |  33 ++-
>  qemu-img.c                   |   2 +-
>  replication.c                | 105 +++++++
>  replication.h                | 176 ++++++++++++
>  tests/.gitignore             |   1 +
>  tests/Makefile               |   4 +
>  tests/test-replication.c     | 523 +++++++++++++++++++++++++++++++++
>  17 files changed, 1847 insertions(+), 17 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
>  create mode 100644 include/block/block_backup.h
>  create mode 100644 replication.c
>  create mode 100644 replication.h
>  create mode 100644 tests/test-replication.c

I have reviewed many revisions of this series.  The main mechanism in
this series makes sense to me.

I'm still concerned that checkpointing (vm_stop(), not in this series
but COLO in general) depends on bdrv_drain(), which can block forever if
I/O is hung.  That doesn't seem like a reasonable limitation for a high
availability feature since it may lead to the VM becoming unavailable.

I'd like Jeff and/or Kevin to review this series and merge it once they
are happy.

Stefan
