From: Eric Blake <eblake@redhat.com>
To: John Snow <jsnow@redhat.com>, Leo Luan <leoluan@gmail.com>,
	qemu-devel@nongnu.org, Qemu-block <qemu-block@nongnu.org>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: Avoid copying unallocated clusters during full backup
Date: Fri, 17 Apr 2020 15:24:51 -0500
Message-ID: <7c722a98-29ab-ba65-2f19-088628ce8f00@redhat.com>
In-Reply-To: <ba8ff0c2-2e56-c8d7-a13a-4af48372f172@redhat.com>

On 4/17/20 3:11 PM, John Snow wrote:

>> +
>> +    if (s->sync_mode == MIRROR_SYNC_MODE_FULL &&
>> +       s->bcs->target->bs->drv != NULL &&
>> +       strncmp(s->bcs->target->bs->drv->format_name, "qcow2", 5) == 0 &&
>> +       s->bcs->source->bs->backing_file[0] == '\0')
> 
> This isn't going to suffice upstream; the backup job can't be performing
> format introspection to determine behavior on the fly.

Agreed.  The idea is right (we NEED to make backup operations smarter 
based on knowledge about both source and destination block status), but 
the implementation is not (a strncmp() check for "qcow2" is not ideal).

> 
> I think what you're really after is something like
> bdrv_unallocated_blocks_are_zero().

The fact that qemu-img already has a lot of optimizations makes me 
wonder what we can salvage from there into reusable code that both 
qemu-img and block backup can share, so that we're not reimplementing 
block status handling in multiple places.

> So the basic premise is that if you are copying a qcow2 file and the
> unallocated portions as defined by the qcow2 metadata are zero, it's
> safe to skip those, so you can treat it like SYNC_MODE_TOP.
> 
> I think you *also* have to know if the *source* needs those regions
> explicitly zeroed, and it's not always safe to just skip them at the
> manifest level.
> 
> I thought there was code that handled this to some extent already, but I
> don't know. I think Vladimir has worked on it recently and can probably
> let you know where I am mistaken :)

Yes, I'm hoping Vladimir (or his other buddies at Virtuozzo) can chime
in.  Meanwhile, I'm working on v2 of some patches that will improve
qemu's ability to tell if a destination qcow2 file already reads as all
zeroes, and we already have bdrv_block_status() for telling which
portions of a source image already read as all zeroes.  Whether or not
that is due to the region being unallocated, the goal is that we should
NOT have to copy anything that reads as zero on the source over to the
destination if the destination already starts life reading as all zero.
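
To make that rule concrete, here is a minimal sketch in plain C.  It is
not QEMU code: the Extent struct and both functions are hypothetical
names made up for illustration, standing in for whatever per-extent
information bdrv_block_status() would report on the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one source extent; not a QEMU type. */
typedef struct Extent {
    int64_t offset;     /* start of the extent, in bytes */
    int64_t bytes;      /* length of the extent, in bytes */
    bool reads_zero;    /* source reads as zero here, allocated or not */
} Extent;

/*
 * An extent can be skipped only when BOTH sides agree: the source
 * reads as zero and the target is already known to read as zero.
 */
static bool extent_needs_copy(const Extent *e, bool target_reads_zero)
{
    assert(e->bytes > 0);
    return !(e->reads_zero && target_reads_zero);
}

/* Total number of bytes a full backup would still have to write. */
static int64_t bytes_to_copy(const Extent *map, int n, bool target_reads_zero)
{
    int64_t total = 0;

    for (int i = 0; i < n; i++) {
        if (extent_needs_copy(&map[i], target_reads_zero)) {
            total += map[i].bytes;
        }
    }
    return total;
}
```

Note that the decision depends only on "reads as zero", never on the
format name or on whether the cluster is allocated, which is the point
of moving away from the strncmp() check.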

And if nothing else, qemu 5.0 just added 'qemu-img convert 
--target-is-zero' as a last-ditch means of telling qemu to assume the 
destination reads as all zeroes, even if it cannot quickly prove it; we 
probably want to add a similar knob into the QMP commands for initiating 
block backup, for the same reasons.
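
As a rough sketch of how such a knob could combine with automatic
detection (again plain C with hypothetical names, not an existing QMP
parameter or QEMU function): the user's assertion wins, and without it
we only skip zeroes when the probe positively proved the target is zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of probing the target; not a QEMU type. */
typedef enum ProbeResult {
    PROBE_ZERO,      /* target proven to read as all zeroes */
    PROBE_NONZERO,   /* target proven to contain non-zero data */
    PROBE_UNKNOWN    /* could not tell quickly */
} ProbeResult;

/*
 * Decide whether the backup may assume a zero target.
 * user_says_zero models a "target-is-zero"-style override: the user
 * takes responsibility, so the probe result is not consulted.
 */
static bool treat_target_as_zero(ProbeResult probe, bool user_says_zero)
{
    assert(probe == PROBE_ZERO || probe == PROBE_NONZERO ||
           probe == PROBE_UNKNOWN);
    if (user_says_zero) {
        return true;
    }
    /* PROBE_UNKNOWN falls back to the safe answer: copy everything. */
    return probe == PROBE_ZERO;
}
```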

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
