From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36950)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <famz@redhat.com>) id 1ZuDJe-0001e7-55
	for qemu-devel@nongnu.org; Thu, 05 Nov 2015 00:42:31 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <famz@redhat.com>) id 1ZuDJc-0006Ho-Vd
	for qemu-devel@nongnu.org; Thu, 05 Nov 2015 00:42:30 -0500
Date: Thu, 5 Nov 2015 13:42:19 +0800
From: Fam Zheng <famz@redhat.com>
Message-ID: <20151105054219.GG24893@ad.usersys.redhat.com>
References: <1433742974-20128-1-git-send-email-famz@redhat.com>
	<1433742974-20128-4-git-send-email-famz@redhat.com>
	<20151104183526.GA8620@noname.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151104183526.GA8620@noname.redhat.com>
Subject: Re: [Qemu-devel] [PATCH v7 3/8] mirror: Do zero write on target if
 sectors not allocated
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org, rjones@redhat.com, Jeff Cody <jcody@redhat.com>, qemu-devel@nongnu.org, qemu-stable@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>, pbonzini@redhat.com, jsnow@redhat.com, wangxiaolong@web.ucloud.cn

On Wed, 11/04 19:35, Kevin Wolf wrote:
> On 08.06.2015 at 07:56, Fam Zheng wrote:
> > If the guest discards a source cluster, mirroring it with bdrv_aio_readv is
> > overkill. Some protocols zero the data upon discard, in which case it's best
> > to use bdrv_aio_write_zeroes; otherwise, bdrv_aio_discard is enough.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block/mirror.c | 20 ++++++++++++++++++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/mirror.c b/block/mirror.c
> > index d2515c7..3c38695 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -164,6 +164,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
> >      int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
> >      uint64_t delay_ns = 0;
> >      MirrorOp *op;
> > +    int pnum;
> > +    int64_t ret;
> >  
> >      s->sector_num = hbitmap_iter_next(&s->hbi);
> >      if (s->sector_num < 0) {
> > @@ -290,8 +292,22 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
> >      s->in_flight++;
> >      s->sectors_in_flight += nb_sectors;
> >      trace_mirror_one_iteration(s, sector_num, nb_sectors);
> > -    bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
> > -                   mirror_read_complete, op);
> > +
> > +    ret = bdrv_get_block_status_above(source, NULL, sector_num,
> > +                                      nb_sectors, &pnum);
> > +    if (ret < 0 || pnum < nb_sectors ||
> 
> Earlier today I told Richard Jones that qemu-img commit should really
> be using zero cluster support in the backing file since 2.4 because I
> remembered this commit. Turns out it doesn't actually use it but writes
> explicit zeros instead.
> 
> The reason is the condition 'pnum < nb_sectors' here, which makes mirror
> fall back to explicit writes if bdrv_get_block_status_above() doesn't
> return enough sectors (enough being relatively large here, I think in
> qemu-img commit it's always the full 10 MB buffer).
> 
> In other words, we are ignoring any zero areas smaller than 10 MB!
> 
> (What made this worse is that qcow2 had a bug where it reported only a
> single zero cluster at a time, so it would never report more than 10 MB,
> even if the image was completely zeroed. I've sent a fix for that one.)
> 
> In order to fix this, we'll probably need to move the call to
> bdrv_get_block_status_above() before actually allocating memory and
> all that for the full nb_chunks. We should detect zeros on the usual
> block job granularity (64k by default, I think).
> 
> > +            (ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
> > +        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
> > +                       mirror_read_complete, op);
> > +    } else if (ret & BDRV_BLOCK_ZERO) {
> > +        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> > +                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> > +                              mirror_write_complete, op);
> > +    } else {
> > +        assert(!(ret & BDRV_BLOCK_DATA));
> > +        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
> > +                         mirror_write_complete, op);
> > +    }
> >      return delay_ns;
> >  }
> 
> Paolo also noticed that there's no reason at all to allocate buffers
> and a qiov for the write_zeroes and discard cases.

I'll write a patch to address these. Thanks!
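
As a rough illustration of the direction discussed above (a standalone toy
model, not the actual block/mirror.c change): instead of falling back to a
read whenever the status query covers fewer sectors than requested, split the
request into runs of chunks that share the same action, and only set up a
buffer for runs that really need a read. The names BLK_DATA, BLK_ZERO,
ToySource, next_run and action_needs_buffer are made up for this sketch and
are not QEMU APIs; the per-chunk status array stands in for whatever
bdrv_get_block_status_above() would report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BDRV_BLOCK_DATA / BDRV_BLOCK_ZERO bits. */
#define BLK_DATA 0x01
#define BLK_ZERO 0x02

typedef enum { OP_READ, OP_WRITE_ZEROES, OP_DISCARD } MirrorAction;

/* Same three-way decision as in the quoted patch, minus the pnum check. */
static MirrorAction action_for_status(int64_t ret)
{
    if (ret < 0 || ((ret & BLK_DATA) && !(ret & BLK_ZERO))) {
        return OP_READ;             /* error or real data: copy it */
    } else if (ret & BLK_ZERO) {
        return OP_WRITE_ZEROES;     /* reads as zero: zero the target */
    }
    assert(!(ret & BLK_DATA));
    return OP_DISCARD;              /* unallocated: discard on the target */
}

/* Only the read path needs a bounce buffer and a qiov. */
static int action_needs_buffer(MirrorAction act)
{
    return act == OP_READ;
}

/* Toy source: one status word per granularity-sized chunk. */
typedef struct {
    const int *status;
    int nb_chunks;
} ToySource;

/*
 * Starting at chunk 'start', return the action for that chunk and store in
 * *run the number of consecutive chunks (at most max_chunks) that share it.
 */
static MirrorAction next_run(const ToySource *src, int start,
                             int max_chunks, int *run)
{
    MirrorAction act = action_for_status(src->status[start]);
    int n = 1;

    while (n < max_chunks && start + n < src->nb_chunks &&
           action_for_status(src->status[start + n]) == act) {
        n++;
    }
    *run = n;
    return act;
}
```

With the default 64k granularity, a caller would loop over next_run() and
issue one request per run, so a zero area much smaller than the 10 MB buffer
would still be written as zeroes rather than copied explicitly.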

Fam