From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Fam Zheng <famz@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 15/18] block/mirror: Add active mirroring
Date: Wed, 11 Oct 2017 14:33:45 +0200
Message-ID: <2def1cd1-e1ca-772f-b026-235bae9bfd9d@redhat.com>
In-Reply-To: <20171010101622.GH4177@dhcp-200-186.str.redhat.com>

On 2017-10-10 12:16, Kevin Wolf wrote:
> On 18.09.2017 at 18:26, Max Reitz wrote:
>> On 2017-09-18 12:06, Stefan Hajnoczi wrote:
>>> On Sat, Sep 16, 2017 at 03:58:01PM +0200, Max Reitz wrote:
>>>> On 2017-09-14 17:57, Stefan Hajnoczi wrote:
>>>>> On Wed, Sep 13, 2017 at 08:19:07PM +0200, Max Reitz wrote:
>>>>>> This patch implements active synchronous mirroring.  In active mode, the
>>>>>> passive mechanism will still be in place and is used to copy all
>>>>>> initially dirty clusters off the source disk; but every write request
>>>>>> will write data both to the source and the target disk, so the source
>>>>>> cannot be dirtied faster than data is mirrored to the target.  Also,
>>>>>> once the block job has converged (BLOCK_JOB_READY sent), source and
>>>>>> target are guaranteed to stay in sync (unless an error occurs).
>>>>>>
>>>>>> Optionally, dirty data can be copied to the target disk on read
>>>>>> operations, too.
>>>>>>
>>>>>> Active mode is completely optional and currently disabled at runtime.  A
>>>>>> later patch will add a way for users to enable it.
>>>>>>
>>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>>> ---
>>>>>>  qapi/block-core.json |  23 +++++++
>>>>>>  block/mirror.c       | 187 +++++++++++++++++++++++++++++++++++++++++++++++++--
>>>>>>  2 files changed, 205 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>>>>>> index bb11815608..e072cfa67c 100644
>>>>>> --- a/qapi/block-core.json
>>>>>> +++ b/qapi/block-core.json
>>>>>> @@ -938,6 +938,29 @@
>>>>>>    'data': ['top', 'full', 'none', 'incremental'] }
>>>>>>  
>>>>>>  ##
>>>>>> +# @MirrorCopyMode:
>>>>>> +#
>>>>>> +# An enumeration whose values tell the mirror block job when to
>>>>>> +# trigger writes to the target.
>>>>>> +#
>>>>>> +# @passive: copy data in background only.
>>>>>> +#
>>>>>> +# @active-write: when data is written to the source, write it
>>>>>> +#                (synchronously) to the target as well.  In addition,
>>>>>> +#                data is copied in background just like in @passive
>>>>>> +#                mode.
>>>>>> +#
>>>>>> +# @active-read-write: write data to the target (synchronously) both
>>>>>> +#                     when it is read from and written to the source.
>>>>>> +#                     In addition, data is copied in background just
>>>>>> +#                     like in @passive mode.
>>>>>
>>>>> I'm not sure the terms "active"/"passive" are helpful.  "Active commit"
>>>>> means committing the top-most BDS while the guest is accessing it.  The
>>>>> "passive" mirror block still works on the top-most BDS while the guest
>>>>> is accessing it.
>>>>>
>>>>> Calling it "asynchronous" and "synchronous" is clearer to me.  It's also
>>>>> the terminology used in disk replication (e.g. DRBD).
>>>>
>>>> I'd be OK with that, too, but I think I remember that in the past at
>>>> least Kevin made a clear distinction between active/passive and
>>>> sync/async when it comes to mirroring.
>>>>
>>>>> Ideally the user wouldn't have to worry about async vs sync because QEMU
>>>>> would switch modes as appropriate in order to converge.  That way
>>>>> libvirt also doesn't have to worry about this.
>>>>
>>>> So here you mean async/sync in the way I meant it, i.e., whether the
>>>> mirror operations themselves are async/sync?
>>>
>>> The meaning I had in mind is:
>>>
>>> Sync mirroring means a guest write waits until the target write
>>> completes.
>>
>> I.e. active-sync, ...
>>
>>> Async mirroring means guest writes complete independently of target
>>> writes.
>>
>> ... i.e. passive or active-async in the future.
> 
> So we already have at least three different modes, sync/async doesn't
> quite cut it anyway. There's a reason why we have been talking about
> both active/passive and sync/async.
> 
> When I was looking at the code, it actually occurred to me that there
> are more possible different modes than I thought there were: This patch
> waits for successful completion on the source before it even attempts to
> write to the destination.
> 
> Wouldn't it be generally (i.e. in the success case) more useful if we
> start both requests at the same time and only wait for both to complete,
> to avoid doubling the latency? If the source write fails, we're out of
> sync, obviously, so we'd have to mark the block dirty again.

I've thought about it, but my issues were:

(1) What to do when something fails
and
(2) I didn't really want to start coroutines from coroutines...

As for (1)...  My notes actually say I've come to a conclusion: If the
target write fails, that's pretty much OK, because then the source is
newer than the target, which is normal for mirroring.  If the source
write fails, we can just consider the target outdated, too (as you've
said).  Also, we'll give an error to the guest, so it's clear that
something has gone wrong.

So (2) was the reason I didn't do it in this series.  I think it's OK to
add this later on and let future me worry about how to coordinate both
requests.

I guess I'd start e.g. the target operation as a new coroutine, then
continue the source operation in the original one, and finally yield
until the target operation has finished?
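
A minimal sketch of that idea, using Python's asyncio as a stand-in for
QEMU coroutines.  This is not QEMU code: `Disk`, `mirrored_write` and the
`dirty` set are hypothetical names for illustration only.  It starts the
target write as a separate task, continues the source write in the
original coroutine, then waits for the target, and applies the failure
handling from (1): any failure marks the range dirty again, and only a
source failure is reported to the guest.

```python
import asyncio


class Disk:
    """Hypothetical in-memory stand-in for a block device."""

    def __init__(self, size, fail=False):
        self.data = bytearray(size)
        self.fail = fail  # if True, every write fails

    async def pwrite(self, offset, buf):
        await asyncio.sleep(0)  # pretend the I/O takes a while
        if self.fail:
            raise OSError("simulated I/O error")
        self.data[offset:offset + len(buf)] = buf


async def mirrored_write(source, target, dirty, offset, buf):
    # Start the target operation as a new task (the "new coroutine").
    target_op = asyncio.create_task(target.pwrite(offset, buf))

    # Continue the source operation in the original coroutine.
    source_err = None
    try:
        await source.pwrite(offset, buf)
    except OSError as err:
        source_err = err

    # Finally, wait until the target operation has finished.
    try:
        await target_op
        target_ok = True
    except OSError:
        target_ok = False

    # If either side failed, source and target may differ, so the
    # range has to be picked up by the background copy again.
    if source_err is not None or not target_ok:
        dirty.add((offset, len(buf)))

    # Only a source failure is an error from the guest's point of view.
    if source_err is not None:
        raise source_err
```

With a failing target, `mirrored_write()` returns normally and leaves the
range in `dirty`; with a failing source, it raises and also marks the
range dirty.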

> By the way, what happens when the guest modifies the RAM during the
> request? Is it acceptable even for writes if source and target differ
> after a successful write operation? Don't we need a bounce buffer
> anyway?

Sometimes I think that maybe I shouldn't keep my thoughts to myself
after I've come to the conclusion "...naah, it's all bad anyway". :-)

When Stefan mentioned this for reads, I thought about the write
situation, yes.  My conclusion was that the guest would be required (by
protocol) to keep the write buffer constant while the operation is
running, because otherwise the guest has no idea what is going to be on
disk.  So it would be stupid for the guest to modify the write buffer then.

But (1) depending on the emulated hardware, maybe the guest does have an
idea (e.g. some register that tells the guest which offset is currently
being written) -- but with the structure of the block layer, I doubt
that's possible in QEMU,

and (2) maybe the guest wants to be stupid.  Even if the guest doesn't
know what will end up on disk, we have to make sure that it's the same
on both source and target.

So, yeah, a bounce buffer would be good in all cases.
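
To illustrate why, here is a small sketch, again in Python's asyncio
rather than QEMU code.  `Disk`, `mirrored_write` and `demo` are
hypothetical names, and the events exist only to force the unlucky
ordering: the source write completes, the guest then scribbles over its
buffer, and only afterwards does the target write complete.

```python
import asyncio


class Disk:
    """Hypothetical in-memory disk whose write completes when allowed to."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.gate = asyncio.Event()     # set to let a pending write finish
        self.written = asyncio.Event()  # set once the data has been stored

    async def pwrite(self, offset, buf):
        await self.gate.wait()
        # The buffer is read only now, at completion time.
        self.data[offset:offset + len(buf)] = buf
        self.written.set()


async def mirrored_write(source, target, offset, guest_buf, bounce):
    # With a bounce buffer, both writes use a private snapshot of the data.
    buf = bytes(guest_buf) if bounce else guest_buf
    await asyncio.gather(source.pwrite(offset, buf),
                         target.pwrite(offset, buf))


async def demo(bounce):
    source, target = Disk(4), Disk(4)
    guest_buf = bytearray(b"AAAA")
    op = asyncio.create_task(
        mirrored_write(source, target, 0, guest_buf, bounce))

    source.gate.set()             # let the source write complete first
    await source.written.wait()
    guest_buf[:] = b"BBBB"        # guest modifies the buffer mid-request
    target.gate.set()             # now let the target write complete
    await op
    return bytes(source.data), bytes(target.data)
```

Without the snapshot, `demo(bounce=False)` ends with different contents
on source and target; with it, `demo(bounce=True)` leaves both holding
the data as it was when the request was submitted.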

Max

