From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47202)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <xiecl.fnst@cn.fujitsu.com>) id 1aOzUm-0004Ii-On
	for qemu-devel@nongnu.org; Thu, 28 Jan 2016 22:13:13 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <xiecl.fnst@cn.fujitsu.com>) id 1aOzUj-0006v2-IG
	for qemu-devel@nongnu.org; Thu, 28 Jan 2016 22:13:12 -0500
Received: from [59.151.112.132] (port=54693 helo=heian.cn.fujitsu.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <xiecl.fnst@cn.fujitsu.com>) id 1aOzUi-0006te-LS
	for qemu-devel@nongnu.org; Thu, 28 Jan 2016 22:13:09 -0500
Message-ID: <56AAD8E6.8020208@cn.fujitsu.com>
Date: Fri, 29 Jan 2016 11:13:42 +0800
From: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
MIME-Version: 1.0
References: <1452676712-24239-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
	<1452676712-24239-8-git-send-email-xiecl.fnst@cn.fujitsu.com>
	<20160127144644.GL26163@stefanha-x1.localdomain>
	<56A96B34.9090906@cn.fujitsu.com>
	<20160128151543.GH9825@stefanha-x1.localdomain>
In-Reply-To: <20160128151543.GH9825@stefanha-x1.localdomain>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v14 7/8] Implement new driver for block
	replication
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Stefan Hajnoczi <stefanha@redhat.com>, Wen Congyang <wency@cn.fujitsu.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>, zhanghailiang <zhang.zhanghailiang@huawei.com>, fnstml-hwcolo@cn.fujitsu.com, qemu devel <qemu-devel@nongnu.org>, Max Reitz <mreitz@redhat.com>, Gonglei <arei.gonglei@huawei.com>, Paolo Bonzini <pbonzini@redhat.com>

On 01/28/2016 11:15 PM, Stefan Hajnoczi wrote:
> On Thu, Jan 28, 2016 at 09:13:24AM +0800, Wen Congyang wrote:
>> On 01/27/2016 10:46 PM, Stefan Hajnoczi wrote:
>>> On Wed, Jan 13, 2016 at 05:18:31PM +0800, Changlong Xie wrote:
>>>> +static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
>>>> +{
>>>> +    Error *local_err = NULL;
>>>> +    int ret;
>>>> +
>>>> +    if (!s->secondary_disk->job) {
>>>> +        error_setg(errp, "Backup job is cancelled unexpectedly");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    block_job_do_checkpoint(s->secondary_disk->job, &local_err);
>>>> +    if (local_err) {
>>>> +        error_propagate(errp, local_err);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    ret = s->active_disk->drv->bdrv_make_empty(s->active_disk);
>>>
>>> What happens to in-flight requests to the active and hidden disks?
>>
>> we MUST call do_checkpoint() when the vm is stopped.
>
> Please document the environment under which the block replication
> callback functions run.

OK

>
> I'm concerned that the bdrv_drain_all() in vm_stop() can take a long
> time if the disk is slow/failing.  bdrv_drain_all() blocks until all
> in-flight I/O requests have completed.  What does the Primary do if the
> Secondary becomes unresponsive?

Actually, we knew this problem. But currently, there seems no better way 
to resolve it. If you have any ideas?

>
>>>> +    switch (s->mode) {
>>>> +    case REPLICATION_MODE_PRIMARY:
>>>> +        break;
>>>> +    case REPLICATION_MODE_SECONDARY:
>>>> +        s->active_disk = bs->file->bs;
>>>> +        if (!bs->file->bs->backing) {
>>>> +            error_setg(errp, "Active disk doesn't have backing file");
>>>> +            return;
>>>> +        }
>>>> +
>>>> +        s->hidden_disk = s->active_disk->backing->bs;
>>>> +        if (!s->hidden_disk->backing) {
>>>> +            error_setg(errp, "Hidden disk doesn't have backing file");
>>>> +            return;
>>>> +        }
>>>> +
>>>> +        s->secondary_disk = s->hidden_disk->backing->bs;
>>>> +        if (!s->secondary_disk->blk) {
>>>> +            error_setg(errp, "The secondary disk doesn't have block backend");
>>>> +            return;
>>>> +        }
>>>
>>> Kevin: Is code allowed to stash away BlockDriverState pointers for
>>> convenience or should it keep the BdrvChild pointers instead?  In order
>>> for replication to work as expected, the graph shouldn't change but for
>>> consistency maybe BdrvChild is best.
>
> I asked Kevin about this on IRC and he agreed that BdrvChild should be
> used instead of holding on to BlockDriverState * pointers.  Although
> these pointers will not change during replication (if the op blockers
> are set up correctly), it's more consistent and certainly safer to go
> through BdrvChild.

Ok

>
>>>> +        /* start backup job now */
>>>> +        error_setg(&s->blocker,
>>>> +                   "block device is in use by internal backup job");
>>>> +        bdrv_op_block_all(s->top_bs, s->blocker);
>>>> +        bdrv_op_unblock(s->top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
>>>> +        bdrv_ref(s->hidden_disk);
>>>
>>> Why is the explicit reference to hidden_disk (but not secondary_disk or
>>> active_disk) is necessary?
>>
>> IIRC, we should reference the backup target before calling backup_start(),
>> and we will reference the backup source in backup_start().
>
> I'm not sure why this is necessary since they are part of the backing
> chain.
>

Just as Wen said, we should reference the backup target before calling 
backup_start() to protect it from destroying, if backup job is stopped 
unexpectedly.

> If it is necessary, please add a comment so it's clear why the reference
> is being taken.
>

Ok

> Stefan
>