* [Qemu-devel] Estimation of qcow2 image size converted from raw image
@ 2017-02-13 15:46 Maor Lipchuk
  2017-02-13 17:03 ` John Snow
  2017-02-15 15:14 ` Stefan Hajnoczi
  0 siblings, 2 replies; 17+ messages in thread
From: Maor Lipchuk @ 2017-02-13 15:46 UTC (permalink / raw)
  To: qemu-devel, qemu-discuss; +Cc: Nir Soffer, Kevin Wolf, Allon Mureinik

Hi all,

I was wondering if it is possible to provide a new API that estimates the
size of a qcow2 image converted from a raw image. We could use this new API
to allocate the size more precisely before the convert operation.

What are we trying to do:
- Convert a raw sparse image from NFS or from a block device to a qcow2
  image on a thin provisioned block device

- In oVirt, a thin provisioned block device is a regular LV, and we would
  like to allocate only the required size for the qcow2 file.

Our current (stupid) solution is to allocate the entire LV using the
size of the raw image.

Here is an example flow:

    $ truncate -s 10G test.raw

We don't know what the size of the qcow2 image on the block storage will be,
so we allocate the entire LV:

    $ lvcreate --size 10G vg/lv

Then we convert the file to the new LV:

    $ qemu-img convert -f raw -O qcow2 test.raw /dev/vg/lv

After the copy we can check the actual size:

    $ qemu-img check /dev/vg/lv

And reduce the LV:

    $ lvreduce -L128m vg/lv

But we would like to avoid this over-allocation, and allocate only the
needed size before we convert the image.

We found that if we create a file with one byte written in each cluster
(64K), the qcow2 file will be bigger than the raw file:

Creating worst case raw file:

    with open("worst.raw", "wb") as f:
        # Write one byte at the end of every 64 KiB cluster of a 10 GiB image.
        for offset in range(64 * 1024 - 1, 10 * 1024**3, 64 * 1024):
            f.seek(offset)
            f.write(b"x")

$ ls -lh worst.raw
-rw-rw-r--. 1 user user 10G Feb 13 16:43 worst.raw

$ du -sh worst.raw
642M worst.raw

$ ls -lh worst.qcow2
-rw-r--r--. 1 user user 11G Feb 13 17:10 worst.qcow2

Now compare that to the best case:

    with open("best.raw", "wb") as f:
        # Write the same number of 4 KiB blocks contiguously from the start.
        for i in range(10 * 1024**3 // (64 * 1024)):
            f.write(b"x" * 4096)

$ ls -lh best.raw
-rw-rw-r--. 1 user user 10G Feb 13 17:18 best.raw

$ du -sh best.raw
641M best.raw

$ qemu-img convert -p -f raw -O qcow2 best.raw best.qcow2

$ ls -lh best.qcow2
-rw-r--r--. 1 user user 641M Feb 13 17:21 best.qcow2

$ du -sh best.qcow2
641M best.qcow2

So it seems that to estimate the size of the qcow2 file, we need
to check not only the number of blocks but also their location.

We can probably use qemu-img map to estimate:

$ qemu-img map worst.raw --output json
[{ "start": 0, "length": 61440, "depth": 0, "zero": true, "data":
false, "offset": 0},
{ "start": 61440, "length": 4096, "depth": 0, "zero": false, "data":
true, "offset": 61440},
{ "start": 65536, "length": 61440, "depth": 0, "zero": true, "data":
false, "offset": 65536},
{ "start": 126976, "length": 4096, "depth": 0, "zero": false, "data":
true, "offset": 126976},
{ "start": 131072, "length": 61440, "depth": 0, "zero": true, "data":
false, "offset": 131072},
...

$ qemu-img map best.raw --output json
[{ "start": 0, "length": 671088640, "depth": 0, "zero": false, "data":
true, "offset": 0},
{ "start": 671088640, "length": 10066325504, "depth": 0, "zero": true,
"data": false, "offset": 671088640},
{ "start": 10737414144, "length": 4096, "depth": 0, "zero": false,
"data": true, "offset": 10737414144}]
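
A rough sketch of such an estimate, assuming the extents come from
"qemu-img map --output json" sorted by "start", and that every 64K cluster
touched by a "data": true extent ends up allocated in the qcow2 file. The
function name is only illustrative, and metadata overhead is not included:

```python
import json

CLUSTER_SIZE = 64 * 1024  # qcow2 default cluster size (assumed)


def count_data_clusters(extents, cluster_size=CLUSTER_SIZE):
    """Count clusters touched by at least one extent with "data": true.

    Assumes the extents are sorted by "start", as in the map output above.
    """
    count = 0
    last = -1  # index of the last cluster already counted
    for ext in extents:
        if not ext["data"] or ext["length"] == 0:
            continue
        first = ext["start"] // cluster_size
        end = (ext["start"] + ext["length"] - 1) // cluster_size
        # Do not count a cluster twice when two extents share it.
        first = max(first, last + 1)
        if end >= first:
            count += end - first + 1
            last = end
    return count


# The best.raw map output from above, pasted as a JSON string.
best_map = json.loads("""
[{ "start": 0, "length": 671088640, "depth": 0, "zero": false, "data": true, "offset": 0},
 { "start": 671088640, "length": 10066325504, "depth": 0, "zero": true, "data": false, "offset": 671088640},
 { "start": 10737414144, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": 10737414144}]
""")
best_clusters = count_data_clusters(best_map)
```

For best.raw this counts the 640M of contiguous data plus the single
cluster at the end of the image.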

But this means we have to include qcow2 allocation logic in our code, and
the calculation will break when qcow2 changes the format.

We think that the best way to solve this issue is to return this info
from qemu-img, maybe as a flag to qemu-img convert that will
calculate the size of the converted image without doing any writes.


See also:
https://bugzilla.redhat.com/1358717 - Export of vm with thin provision
disk from NFS Data domain and Import to Block Data domain makes
virtual and Actual size of disk same.

https://bugzilla.redhat.com/1419240 - Creating a Clone vm from
template with Format "QCOW2" and Target "block based storage" has a
disk with same actual and virtual size.


Regards,
Maor

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-13 15:46 [Qemu-devel] Estimation of qcow2 image size converted from raw image Maor Lipchuk
@ 2017-02-13 17:03 ` John Snow
  2017-02-13 17:16   ` Daniel P. Berrange
  2017-02-15 15:14 ` Stefan Hajnoczi
  1 sibling, 1 reply; 17+ messages in thread
From: John Snow @ 2017-02-13 17:03 UTC (permalink / raw)
  To: Maor Lipchuk, qemu-devel, qemu-discuss
  Cc: Nir Soffer, Kevin Wolf, Allon Mureinik, Qemu-block

CCing qemu-block;

On 02/13/2017 10:46 AM, Maor Lipchuk wrote:
> Hi all,
> 
> I was wondering if that is possible to provide a new API that
> estimates the size of
> qcow2 image converted from a raw image. We could use this new API to
> allocate the
> size more precisely before the convert operation.
> 

I'm not sure you'd need an API to do this; you could estimate it
yourself pretty effectively.

Naively, just loop 64KiB at a time and check if each 64KiB chunk is zero
or not. If it isn't, add +64KiB to the filesize estimate. If it is, skip
that chunk.
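
A minimal sketch of that naive scan, assuming a 64KiB cluster size and a
raw image that can be read as a regular file (the function name is just
for illustration):

```python
CLUSTER_SIZE = 64 * 1024  # assumed qcow2 cluster size


def count_nonzero_clusters(path, cluster_size=CLUSTER_SIZE):
    """Return the number of cluster-sized chunks containing any non-zero byte."""
    zero = bytes(cluster_size)
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(cluster_size)
            if not chunk:
                break
            # The final chunk may be short; compare against an equally
            # short run of zeroes.
            if chunk != zero[:len(chunk)]:
                count += 1
    return count
```

The data part of the estimate is then count_nonzero_clusters(path) *
CLUSTER_SIZE. Note that this reads the whole image, holes included.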

On filesystems that already support sparse allocation, you can do
something more clever: use lseek with SEEK_DATA or SEEK_HOLE to find
where the data and the zeroes are, and estimate that way.
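
A sketch of the lseek approach, assuming Linux and a Python that exposes
os.SEEK_DATA and os.SEEK_HOLE. The function names are only illustrative.
On a filesystem without hole support the whole file is reported as one
data extent, so the result is still a safe (if pessimistic) count:

```python
import errno
import os


def data_extents(path):
    """Return a list of (start, length) for each data region of the file."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # no more data after this offset
                raise
            # There is always an implicit hole at end of file.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end - start))
            offset = end
    finally:
        os.close(fd)
    return extents


def count_clusters(extents, cluster_size=64 * 1024):
    """Count the clusters touched by the given sorted data extents."""
    count = 0
    last = -1  # index of the last cluster already counted
    for start, length in extents:
        first = max(start // cluster_size, last + 1)
        end = (start + length - 1) // cluster_size
        if end >= first:
            count += end - first + 1
            last = end
    return count
```

Unlike the naive loop, this never reads the data itself, but it cannot
detect clusters that are allocated on disk yet contain only zeroes.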

Then you'll add a certain number of metadata blocks to finish, and
you'll have a pretty solid estimate.
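
As a rough sketch of the metadata part, assuming the qcow2 defaults
(64KiB clusters, 8-byte L1/L2 entries, 16-bit refcounts, one header
cluster), no internal snapshots and no compression. It pessimistically
reserves an L2 table for every part of the virtual disk. The function
name is made up for this example, and a real image may lay things out
differently:

```python
def div_round_up(n, d):
    return (n + d - 1) // d


def estimate_qcow2_size(virtual_size, data_clusters,
                        cluster_size=64 * 1024, refcount_bits=16):
    """Estimate the qcow2 file size in bytes for a given number of data clusters."""
    header_clusters = 1
    virtual_clusters = div_round_up(virtual_size, cluster_size)

    # Each L2 table is one cluster of 8-byte entries; assume all are needed.
    l2_tables = div_round_up(virtual_clusters, cluster_size // 8)
    # The L1 table holds one 8-byte entry per L2 table.
    l1_clusters = div_round_up(l2_tables * 8, cluster_size)

    clusters = header_clusters + l1_clusters + l2_tables + data_clusters

    # Refcount blocks must also cover themselves and the refcount table,
    # so iterate until the numbers stop changing.
    refs_per_block = cluster_size * 8 // refcount_bits
    refblocks = reftable = 0
    while True:
        total = clusters + refblocks + reftable
        new_refblocks = div_round_up(total, refs_per_block)
        new_reftable = div_round_up(new_refblocks * 8, cluster_size)
        if (new_refblocks, new_reftable) == (refblocks, reftable):
            break
        refblocks, reftable = new_refblocks, new_reftable

    return total * cluster_size
```

For a fully allocated 10GiB image this comes out slightly above 10GiB,
which agrees with the 11G that ls -lh reported for worst.qcow2.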

> What are we trying to do:
> - Convert raw sparse image from NFS or from block device to qcow2 image
>   on thin provisioned block device
> 
> - In ovirt thin provisioned block device is a regular lv, and we like
>   allocate only the required size for the the qcow file.
> 
> Our current (stupid) solution is to allocate the entire LV using the
> size of the raw image.
> 
> Here is an example flow:
> 
>     $ truncate -s 10G test.raw
> 
> We don't know what will be the size of the qcow on the block storage,
> so we allocate the entire LV:
> 
>     $ lvcreate --size 10G vg/lv
> 
> Then we convert the file to the new LV:
> 
>     $ qemu-img convert -f raw -O qcow2 test.raw /dev/vg/lv
> 
> After the copy we can check the actual size:
> 
>     $ qemu-img check /dev/vg/lv
> 
> And reduce the LV:
> 
>     $ lvreduce -L128m vg/lv
> 
> But we like to avoid the allocation, and allocate only the needed size
> before we convert the image.
> 
> We found that if we create a file with one byte for each cluster (64K),
> qcow2 file will be bigger than the raw file:
> 

Makes sense. You essentially allocate the entire file this way (no
unallocated clusters) and then you have to pay the metadata tax on top
of it.

> Creating worst case raw file:
> 
>     with open("worst.raw", "wb") as f:
>         for offset in range(64 * 1024 - 1, 10 * 1024**3, 64 * 1024):
>             f.seek(offset)
>             f.write("x")
> 
> $ ls -lh worst.raw
> -rw-rw-r--. 1 user user 10G Feb 13 16:43 worst.raw
> 
> $ du -sh worst.raw
> 642M worst.raw
> 
> $ ls -lh worst.qcow2
> -rw-r--r--. 1 user user 11G Feb 13 17:10 worst.qcow2
> 
> Now compare that to the best case:
> 
>     with open("best.raw", "wb") as f:
>         for i in range(10 * 1024**3 / (64*1024)):
>             f.write("x" * 4096)
> 

So, 640MiB of "x" contiguously from the beginning of the file?

You're only touching about ... 10,240 clusters that way. Makes sense
that the qcow2 is nearly the same size as the written data.

> $ ls -lh best.raw
> -rw-rw-r--. 1 user user 10G Feb 13 17:18 best.raw
> 
> $ du -sh best.raw
> 641M best.raw
> 
> $ qemu-img convert -p -f raw -O qcow2 best.raw best.qcow2
> 
> $ ls -lh best.qcow2
> -rw-r--r--. 1 user user 641M Feb 13 17:21 best.qcow2
> 
> $ du -sh best.qcow2
> 641M best.qcow2
> 
> So it seems that to estimate the size of the qcow2 file, we need
> to check not only the number of blocks but the location of the blocks.
> 

Well, you need to check the number of allocated clusters. "blocks" are
not a meaningful concept to qcow2 exactly. One byte written to every
single cluster will fully allocate the file.

> We can probably use qemu-img map to estimate:
> 
> $ qemu-img map worst.raw --output json
> [{ "start": 0, "length": 61440, "depth": 0, "zero": true, "data":
> false, "offset": 0},
> { "start": 61440, "length": 4096, "depth": 0, "zero": false, "data":
> true, "offset": 61440},
> { "start": 65536, "length": 61440, "depth": 0, "zero": true, "data":
> false, "offset": 65536},
> { "start": 126976, "length": 4096, "depth": 0, "zero": false, "data":
> true, "offset": 126976},
> { "start": 131072, "length": 61440, "depth": 0, "zero": true, "data":
> false, "offset": 131072},
> ...
> 
> $ qemu-img map best.raw --output json
> [{ "start": 0, "length": 671088640, "depth": 0, "zero": false, "data":
> true, "offset": 0},
> { "start": 671088640, "length": 10066325504, "depth": 0, "zero": true,
> "data": false, "offset": 671088640},
> { "start": 10737414144, "length": 4096, "depth": 0, "zero": false,
> "data": true, "offset": 10737414144}]
> 
> But this means we have to include qcow2 allocation logic in our code, and
> the calculation will break when qcow2 changes the format.
> 

Not likely to change considerably, but fair enough of a point.

Also keep in mind that changing the cluster size will give you different
answers, too -- but that different cluster sizes will affect the runtime
performance of the image as well.

> We think that the best way to solve this issue is to return this info
> from qemu-img, maybe as a flag to qemu-img convert that will
> calculate the size of the converted image without doing any writes.
> 

Might not be too hard to add, but it wouldn't necessarily be any more
accurate than if you implemented the same logic, I think.

Still, it'd be up to us to keep it up to date, but I don't know what
guarantees we could provide about the accuracy of the estimate or
preventing it from bitrot if there are format changes.

> 
> See also:
> https://bugzilla.redhat.com/1358717 - Export of vm with thin provision
> disk from NFS Data domain and Import to Block Data domain makes
> virtual and Actual size of disk same.
> 
> https://bugzilla.redhat.com/1419240 - Creating a Clone vm from
> template with Format "QCOW2" and Target "block based storage" has a
> disk with same actual and virtual size.
> 
> 
> Regards,
> Maor
> 

--js

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-13 17:03 ` John Snow
@ 2017-02-13 17:16   ` Daniel P. Berrange
  2017-02-13 18:26     ` John Snow
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel P. Berrange @ 2017-02-13 17:16 UTC (permalink / raw)
  To: John Snow
  Cc: Maor Lipchuk, qemu-devel, qemu-discuss, Nir Soffer, Kevin Wolf,
	Allon Mureinik, Qemu-block

On Mon, Feb 13, 2017 at 12:03:35PM -0500, John Snow wrote:
> Also keep in mind that changing the cluster size will give you different
> answers, too -- but that different cluster sizes will affect the runtime
> performance of the image as well.

This means that apps trying to figure out this future usage have to
understand fine internal impl details of qcow2 to correctly calculate
it.

> > We think that the best way to solve this issue is to return this info
> > from qemu-img, maybe as a flag to qemu-img convert that will
> > calculate the size of the converted image without doing any writes.
> > 
> 
> Might not be too hard to add, but it wouldn't necessarily be any more
> accurate than if you implemented the same logic, I think.
> 
> Still, it'd be up to us to keep it up to date, but I don't know what
> guarantees we could provide about the accuracy of the estimate or
> preventing it from bitrot if there are format changes..

As opposed to every application trying to implement the logic
themselves...it'll likely bitrot even worse in 3rd party apps
as their maintainers won't notice format changes until they
see a bug report.  Likewise, app developers aren't in a much
better position wrt accuracy - if anything they'll do a worse
job at calculating it since they might miss subtle nuances of
the qcow2 format that qemu developers would more likely get right.

This isn't just a problem wrt the usage scenario mentioned in
this thread. For active VMs, consider you want to determine whether
you are at risk of overcommitting the filesystem or not. You cannot
simply sum up the image capacity - you need to know the largest
size that the qcow2 file is going to grow to in future[1] - this
again requires the app to calculate overhead of qcow2 metadata to
understand what they've committed to providing in terms of storage.

Regards,
Daniel

[1] There is no upper limit if internal snapshots are used, but if
    we assume use of external snapshots, we should be able to
    calculate the file size commitment.
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-13 17:16   ` Daniel P. Berrange
@ 2017-02-13 18:26     ` John Snow
  0 siblings, 0 replies; 17+ messages in thread
From: John Snow @ 2017-02-13 18:26 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Maor Lipchuk, qemu-devel, qemu-discuss, Nir Soffer, Kevin Wolf,
	Allon Mureinik, Qemu-block



On 02/13/2017 12:16 PM, Daniel P. Berrange wrote:
> On Mon, Feb 13, 2017 at 12:03:35PM -0500, John Snow wrote:
>> Also keep in mind that changing the cluster size will give you different
>> answers, too -- but that different cluster sizes will affect the runtime
>> performance of the image as well.
> 
> This means that apps trying to figure out this future usage have to
> understand fine internal impl details of qcow2 to correctly calculate
> it.
> 

Well, as long as they just want an *estimate* ...

Plus, the spec for qcow2 is open source! :) What internal details? O:-)

>>> We think that the best way to solve this issue is to return this info
>>> from qemu-img, maybe as a flag to qemu-img convert that will
>>> calculate the size of the converted image without doing any writes.
>>>
>>
>> Might not be too hard to add, but it wouldn't necessarily be any more
>> accurate than if you implemented the same logic, I think.
>>
>> Still, it'd be up to us to keep it up to date, but I don't know what
>> guarantees we could provide about the accuracy of the estimate or
>> preventing it from bitrot if there are format changes..
> 
> As opposed to every application trying to implement the logic
> themselves...it'll likely bitrot even worse in 3rd party apps
> as their maintainers won't notice format changes until they
> see a bug report.  Likewise, app developers aren't in a much
> better position wrt accuracy - if anything they'll do a worse
> job at calculating it since they might miss subtle nuances of
> the qcow2 format that qemu developers would more likely get right.
> 

Sure, just cautioning against the idea that we'll be able to provide
anything better than an *estimate*, for all the same reasons it would be
difficult for anyone else to provide anything better than an educated guess.

Was not seriously campaigning against us adding it -- just offering a
pathway to not have to wait for us to do it, since ours likely won't be
much more accurate or stable in any meaningful sense.

--js

> This isn't just a problem wrt to the usage scenario mentioned in
> this thread. For active VMs, consider you want to determine whether
> you are at risk of overcommitting the filesystem or not. You cannot
> simply sum up the image capacity - you need to know the largest
> size that the qcow2 file is going to grow to in future[1] - this
> again requires the app to calculate overhead of qcow2 metadata to
> understand what they've committed to providing in terms of storage
> 
> Regards,
> Daniel
> 
> [1] There is no upper limit if internal snapshots are used, but if
>     we assume use of external snapshots, we should be able to
>     calculate the file size commitment.
> 

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-13 15:46 [Qemu-devel] Estimation of qcow2 image size converted from raw image Maor Lipchuk
  2017-02-13 17:03 ` John Snow
@ 2017-02-15 15:14 ` Stefan Hajnoczi
  2017-02-15 15:20   ` Daniel P. Berrange
  2017-02-15 15:49   ` Nir Soffer
  1 sibling, 2 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2017-02-15 15:14 UTC (permalink / raw)
  To: Maor Lipchuk
  Cc: qemu-devel, qemu-discuss, Nir Soffer, Kevin Wolf, Allon Mureinik,
	Max Reitz

On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
> I was wondering if that is possible to provide a new API that
> estimates the size of
> qcow2 image converted from a raw image. We could use this new API to
> allocate the
> size more precisely before the convert operation.
> 
[...]
> We think that the best way to solve this issue is to return this info
> from qemu-img, maybe as a flag to qemu-img convert that will
> calculate the size of the converted image without doing any writes.

Sounds reasonable.  qcow2 actually already does some of this calculation
internally for image preallocation in qcow2_create2().

Let's try this syntax:

  $ qemu-img query-max-size -f raw -O qcow2 input.raw
  1234678000

As John explained, it is only an estimate.  But it will be a
conservative maximum.

Internally BlockDriver needs a new interface:

struct BlockDriver {
    /*
     * Return a conservative estimate of the maximum host file size
     * required by a new image given an existing BlockDriverState (not
     * necessarily opened with this BlockDriver).
     */
    uint64_t (*bdrv_query_max_size)(BlockDriverState *other_bs,
                                    Error **errp);
};

This interface allows individual block drivers to probe other_bs in
whatever way necessary (e.g. querying block allocation status).

Since this is a conservative max estimate there's no need to read all
data to check for zero regions.  We should give the best estimate that
can be generated quickly.

Stefan

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:14 ` Stefan Hajnoczi
@ 2017-02-15 15:20   ` Daniel P. Berrange
  2017-02-15 15:34     ` Eric Blake
  2017-02-15 15:57     ` Nir Soffer
  2017-02-15 15:49   ` Nir Soffer
  1 sibling, 2 replies; 17+ messages in thread
From: Daniel P. Berrange @ 2017-02-15 15:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Maor Lipchuk, Kevin Wolf, Allon Mureinik, qemu-devel, Max Reitz,
	Nir Soffer, qemu-discuss

On Wed, Feb 15, 2017 at 03:14:19PM +0000, Stefan Hajnoczi wrote:
> On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
> > I was wondering if that is possible to provide a new API that
> > estimates the size of
> > qcow2 image converted from a raw image. We could use this new API to
> > allocate the
> > size more precisely before the convert operation.
> > 
> [...]
> > We think that the best way to solve this issue is to return this info
> > from qemu-img, maybe as a flag to qemu-img convert that will
> > calculate the size of the converted image without doing any writes.
> 
> Sounds reasonable.  qcow2 actually already does some of this calculation
> internally for image preallocation in qcow2_create2().
> 
> Let's try this syntax:
> 
>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>   1234678000
> 
> As John explained, it is only an estimate.  But it will be a
> conservative maximum.

This forces you to have an input file. It would be nice to be able to
get the same information by merely giving the desired capacity, e.g.

  $ qemu-img query-max-size -O qcow2 20G


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:20   ` Daniel P. Berrange
@ 2017-02-15 15:34     ` Eric Blake
  2017-02-15 15:57     ` Nir Soffer
  1 sibling, 0 replies; 17+ messages in thread
From: Eric Blake @ 2017-02-15 15:34 UTC (permalink / raw)
  To: Daniel P. Berrange, Stefan Hajnoczi
  Cc: Kevin Wolf, Allon Mureinik, qemu-devel, Max Reitz, Nir Soffer,
	Maor Lipchuk, qemu-discuss

On 02/15/2017 09:20 AM, Daniel P. Berrange wrote:
>> Let's try this syntax:
>>
>>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>>   1234678000
>>
>> As John explained, it is only an estimate.  But it will be a
>> conservative maximum.
> 
> This forces you to have an input file. It would be nice to be able to
> get the same information by merely giving the desired capacity e.g
> 
>   $ qemu-img query-max-size -O qcow2 20G

We'd need an option to tell the difference between size and pre-existing
file, if we want to support both ways.  Maybe:

qemu-img query-max-size -O qcow2 --size 20G
qemu-img query-max-size -O qcow2 -f raw input.raw

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:14 ` Stefan Hajnoczi
  2017-02-15 15:20   ` Daniel P. Berrange
@ 2017-02-15 15:49   ` Nir Soffer
  2017-02-20 11:07     ` Stefan Hajnoczi
  1 sibling, 1 reply; 17+ messages in thread
From: Nir Soffer @ 2017-02-15 15:49 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Maor Lipchuk, qemu-devel, qemu-discuss, Kevin Wolf,
	Allon Mureinik, Max Reitz

On Wed, Feb 15, 2017 at 5:14 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
>> I was wondering if that is possible to provide a new API that
>> estimates the size of
>> qcow2 image converted from a raw image. We could use this new API to
>> allocate the
>> size more precisely before the convert operation.
>>
> [...]
>> We think that the best way to solve this issue is to return this info
>> from qemu-img, maybe as a flag to qemu-img convert that will
>> calculate the size of the converted image without doing any writes.
>
> Sounds reasonable.  qcow2 actually already does some of this calculation
> internally for image preallocation in qcow2_create2().
>
> Let's try this syntax:
>
>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>   1234678000

This is a little bit verbose compared to other commands
(e.g. info, check, convert).

Since this is needed only during convert, maybe this can be
a convert flag?

    qemu-img convert -f xxx -O yyy src dst --estimate-size --output json
    {
        "estimated size": 1234678000
    }

> As John explained, it is only an estimate.  But it will be a
> conservative maximum.
>
> Internally BlockDriver needs a new interface:
>
> struct BlockDriver {
>     /*
>      * Return a conservative estimate of the maximum host file size
>      * required by a new image given an existing BlockDriverState (not
>      * necessarily opened with this BlockDriver).
>      */
>     uint64_t (*bdrv_query_max_size)(BlockDriverState *other_bs,
>                                     Error **errp);
> };
>
> This interface allows individual block drivers to probe other_bs in
> whatever way necessary (e.g. querying block allocation status).
>
> Since this is a conservative max estimate there's no need to read all
> data to check for zero regions.  We should give the best estimate that
> can be generated quickly.

I think we need to check allocation (e.g. with SEEK_DATA); I hope this
is what you mean by not reading all data.

Nir

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:20   ` Daniel P. Berrange
  2017-02-15 15:34     ` Eric Blake
@ 2017-02-15 15:57     ` Nir Soffer
  2017-02-15 16:05       ` [Qemu-devel] [Qemu-discuss] " Alberto Garcia
  2017-02-15 16:07       ` [Qemu-devel] " Daniel P. Berrange
  1 sibling, 2 replies; 17+ messages in thread
From: Nir Soffer @ 2017-02-15 15:57 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Stefan Hajnoczi, Maor Lipchuk, Kevin Wolf, Allon Mureinik,
	qemu-devel, Max Reitz, qemu-discuss

On Wed, Feb 15, 2017 at 5:20 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
> On Wed, Feb 15, 2017 at 03:14:19PM +0000, Stefan Hajnoczi wrote:
>> On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
>> > I was wondering if that is possible to provide a new API that
>> > estimates the size of
>> > qcow2 image converted from a raw image. We could use this new API to
>> > allocate the
>> > size more precisely before the convert operation.
>> >
>> [...]
>> > We think that the best way to solve this issue is to return this info
>> > from qemu-img, maybe as a flag to qemu-img convert that will
>> > calculate the size of the converted image without doing any writes.
>>
>> Sounds reasonable.  qcow2 actually already does some of this calculation
>> internally for image preallocation in qcow2_create2().
>>
>> Let's try this syntax:
>>
>>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>>   1234678000
>>
>> As John explained, it is only an estimate.  But it will be a
>> conservative maximum.
>
> This forces you to have an input file. It would be nice to be able to
> get the same information by merely giving the desired capacity e.g
>
>   $ qemu-img query-max-size -O qcow2 20G

Without a file, this will have to assume that all clusters will be allocated.

Do you have a use case for not using an existing file?

For oVirt we need this when converting a file from one storage to another;
the capabilities of the storage matter in both cases.

(Adding all)

Nir

* Re: [Qemu-devel] [Qemu-discuss] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:57     ` Nir Soffer
@ 2017-02-15 16:05       ` Alberto Garcia
  2017-02-15 16:11         ` Daniel P. Berrange
  2017-02-15 16:07       ` [Qemu-devel] " Daniel P. Berrange
  1 sibling, 1 reply; 17+ messages in thread
From: Alberto Garcia @ 2017-02-15 16:05 UTC (permalink / raw)
  To: Nir Soffer, Daniel P. Berrange
  Cc: Kevin Wolf, Allon Mureinik, Stefan Hajnoczi, qemu-devel,
	Max Reitz, Maor Lipchuk, qemu-discuss

On Wed 15 Feb 2017 04:57:12 PM CET, Nir Soffer wrote:
>>> Let's try this syntax:
>>>
>>>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
>>>   1234678000
>>>
>>> As John explained, it is only an estimate.  But it will be a
>>> conservative maximum.
>>
>> This forces you to have an input file. It would be nice to be able to
>> get the same information by merely giving the desired capacity e.g
>>
>>   $ qemu-img query-max-size -O qcow2 20G
>
> Without a file, this will have to assume that all clusters will be
> allocated.

...and that there are no internal snapshots. I'm not sure if this is
very useful in general.

Berto

* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:57     ` Nir Soffer
  2017-02-15 16:05       ` [Qemu-devel] [Qemu-discuss] " Alberto Garcia
@ 2017-02-15 16:07       ` Daniel P. Berrange
  2017-02-20 11:12         ` Stefan Hajnoczi
  1 sibling, 1 reply; 17+ messages in thread
From: Daniel P. Berrange @ 2017-02-15 16:07 UTC (permalink / raw)
  To: Nir Soffer
  Cc: Stefan Hajnoczi, Maor Lipchuk, Kevin Wolf, Allon Mureinik,
	qemu-devel, Max Reitz, qemu-discuss

On Wed, Feb 15, 2017 at 05:57:12PM +0200, Nir Soffer wrote:
> On Wed, Feb 15, 2017 at 5:20 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
> > On Wed, Feb 15, 2017 at 03:14:19PM +0000, Stefan Hajnoczi wrote:
> >> On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
> >> > I was wondering if that is possible to provide a new API that
> >> > estimates the size of
> >> > qcow2 image converted from a raw image. We could use this new API to
> >> > allocate the
> >> > size more precisely before the convert operation.
> >> >
> >> [...]
> >> > We think that the best way to solve this issue is to return this info
> >> > from qemu-img, maybe as a flag to qemu-img convert that will
> >> > calculate the size of the converted image without doing any writes.
> >>
> >> Sounds reasonable.  qcow2 actually already does some of this calculation
> >> internally for image preallocation in qcow2_create2().
> >>
> >> Let's try this syntax:
> >>
> >>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
> >>   1234678000
> >>
> >> As John explained, it is only an estimate.  But it will be a
> >> conservative maximum.
> >
> > This forces you to have an input file. It would be nice to be able to
> > get the same information by merely giving the desired capacity e.g
> >
> >   $ qemu-img query-max-size -O qcow2 20G
> 
> Without a file, this will have to assume that all clusters will be allocated.
> 
> Do you have a use case for not using existing file?

If you want to format a new qcow2 file in a pre-created block device you
want to know how big the block device should be. Or you want to validate
that the filesystem you're about to create it in will not become
overcommitted wrt pre-existing guests. So you need to consider the FS
free space, vs query-max-size for all existing guests, combined with
query-max-size for the new disk you want to create.
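
A sketch of that check, assuming the per-image maximum sizes are already
known (e.g. from the proposed query-max-size, which does not exist yet)
and that each existing image is described by its current and maximum file
size in bytes. All names here are hypothetical:

```python
def remaining_growth(images):
    """Sum how much the given images can still grow.

    images: iterable of (current_file_size, max_file_size) in bytes.
    """
    return sum(max(0, max_size - current) for current, max_size in images)


def would_overcommit(free_bytes, existing_images, new_image_max_size=0):
    """Return True if worst-case growth plus the new image exceeds free space."""
    needed = remaining_growth(existing_images) + new_image_max_size
    return needed > free_bytes
```

free_bytes could come from shutil.disk_usage(path).free for the
filesystem holding the images.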


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

* Re: [Qemu-devel] [Qemu-discuss] Estimation of qcow2 image size converted from raw image
  2017-02-15 16:05       ` [Qemu-devel] [Qemu-discuss] " Alberto Garcia
@ 2017-02-15 16:11         ` Daniel P. Berrange
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel P. Berrange @ 2017-02-15 16:11 UTC (permalink / raw)
  To: Alberto Garcia
  Cc: Nir Soffer, Kevin Wolf, Allon Mureinik, Stefan Hajnoczi,
	qemu-devel, Max Reitz, Maor Lipchuk, qemu-discuss

On Wed, Feb 15, 2017 at 05:05:04PM +0100, Alberto Garcia wrote:
> On Wed 15 Feb 2017 04:57:12 PM CET, Nir Soffer wrote:
> >>> Let's try this syntax:
> >>>
> >>>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
> >>>   1234678000
> >>>
> >>> As John explained, it is only an estimate.  But it will be a
> >>> conservative maximum.
> >>
> >> This forces you to have an input file. It would be nice to be able to
> >> get the same information by merely giving the desired capacity e.g
> >>
> >>   $ qemu-img query-max-size -O qcow2 20G
> >
> > Without a file, this will have to assume that all clusters will be
> > allocated.
> 
> ...and that there are no internal snapshots. I'm not sure if this is
> very useful in general.

As long as the caveat is documented it is fine. Internal snapshots are
often completely ignored by apps since they have many downsides compared
to using external snapshots.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|


* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 15:49   ` Nir Soffer
@ 2017-02-20 11:07     ` Stefan Hajnoczi
       [not found]       ` <CAJ1JNOdzD7DHTHGJEO2YQANDPq0kY-PEh6J1jBkP7hUW0Kvy9w@mail.gmail.com>
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Hajnoczi @ 2017-02-20 11:07 UTC (permalink / raw)
  To: Nir Soffer
  Cc: Maor Lipchuk, qemu-devel, qemu-discuss, Kevin Wolf,
	Allon Mureinik, Max Reitz


On Wed, Feb 15, 2017 at 05:49:58PM +0200, Nir Soffer wrote:
> On Wed, Feb 15, 2017 at 5:14 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
> >> I was wondering if that is possible to provide a new API that
> >> estimates the size of
> >> qcow2 image converted from a raw image. We could use this new API to
> >> allocate the
> >> size more precisely before the convert operation.
> >>
> > [...]
> >> We think that the best way to solve this issue is to return this info
> >> from qemu-img, maybe as a flag to qemu-img convert that will
> >> calculate the size of the converted image without doing any writes.
> >
> > Sounds reasonable.  qcow2 actually already does some of this calculation
> > internally for image preallocation in qcow2_create2().
> >
> > Let's try this syntax:
> >
> >   $ qemu-img query-max-size -f raw -O qcow2 input.raw
> >   1234678000
> 
> This is little bit verbose compared to other commands
> (e.g. info, check, convert)
> 
> Since this is needed only during convert, maybe this can be
> a convert flag?
> 
>     qemu-img convert -f xxx -O yyy src dst --estimate-size --output json
>     {
>         "estimated size": 1234678000
>     }

What is dst?  It's a dummy argument.

Let's not try to shoehorn this new sub-command into qemu-img convert.

> > As John explained, it is only an estimate.  But it will be a
> > conservative maximum.
> >
> > Internally BlockDriver needs a new interface:
> >
> > struct BlockDriver {
> >     /*
> >      * Return a conservative estimate of the maximum host file size
> >      * required by a new image given an existing BlockDriverState (not
> >      * necessarily opened with this BlockDriver).
> >      */
> >     uint64_t (*bdrv_query_max_size)(BlockDriverState *other_bs,
> >                                     Error **errp);
> > };
> >
> > This interface allows individual block drivers to probe other_bs in
> > whatever way necessary (e.g. querying block allocation status).
> >
> > Since this is a conservative max estimate there's no need to read all
> > data to check for zero regions.  We should give the best estimate that
> > can be generated quickly.
> 
> I think we need to check allocation (e.g. with SEEK_DATA), I hope this
> is what you mean by not read all data.

Yes, allocation data must be checked.  But it will not read data
clusters from disk to check if they contain only zeroes.
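
To illustrate what checking allocation without reading data can look like
for a raw file on a local filesystem, here is a minimal sketch using
lseek() with SEEK_DATA and SEEK_HOLE through Python's os module. The
function name data_extents is hypothetical, and this is not a description
of how qemu-img does it internally; it only shows that allocated ranges
can be listed without reading any data clusters.

```python
import errno
import os


def data_extents(path):
    """Yield (start, length) for each allocated extent of a file.

    Only file offsets are queried; no file data is read.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next allocated byte at or after offset.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    # No more data after offset: the rest is a hole.
                    break
                raise
            # Find where this data extent ends (EOF counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            yield start, end - start
            offset = end
    finally:
        os.close(fd)
```

On filesystems that do not track holes, the whole file is typically
reported as one data extent, so the result stays a conservative maximum.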

Stefan



* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-15 16:07       ` [Qemu-devel] " Daniel P. Berrange
@ 2017-02-20 11:12         ` Stefan Hajnoczi
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2017-02-20 11:12 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Nir Soffer, Maor Lipchuk, Kevin Wolf, Allon Mureinik, qemu-devel,
	Max Reitz, qemu-discuss


On Wed, Feb 15, 2017 at 04:07:43PM +0000, Daniel P. Berrange wrote:
> On Wed, Feb 15, 2017 at 05:57:12PM +0200, Nir Soffer wrote:
> > On Wed, Feb 15, 2017 at 5:20 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
> > > On Wed, Feb 15, 2017 at 03:14:19PM +0000, Stefan Hajnoczi wrote:
> > >> On Mon, Feb 13, 2017 at 05:46:19PM +0200, Maor Lipchuk wrote:
> > >> > I was wondering if that is possible to provide a new API that
> > >> > estimates the size of
> > >> > qcow2 image converted from a raw image. We could use this new API to
> > >> > allocate the
> > >> > size more precisely before the convert operation.
> > >> >
> > >> [...]
> > >> > We think that the best way to solve this issue is to return this info
> > >> > from qemu-img, maybe as a flag to qemu-img convert that will
> > >> > calculate the size of the converted image without doing any writes.
> > >>
> > >> Sounds reasonable.  qcow2 actually already does some of this calculation
> > >> internally for image preallocation in qcow2_create2().
> > >>
> > >> Let's try this syntax:
> > >>
> > >>   $ qemu-img query-max-size -f raw -O qcow2 input.raw
> > >>   1234678000
> > >>
> > >> As John explained, it is only an estimate.  But it will be a
> > >> conservative maximum.
> > >
> > > This forces you to have an input file. It would be nice to be able to
> > > get the same information by merely giving the desired capacity e.g
> > >
> > >   $ qemu-img query-max-size -O qcow2 20G
> > 
> > Without a file, this will have to assume that all clusters will be allocated.
> > 
> > Do you have a use case for not using existing file?
> 
> If you want to format a new qcow2 file in a pre-created block device you
> want to know how big the block device should be. Or you want to validate
> that the filesystem you're about to create it in will not become
> overcommitted wrt pre-existing guests. So you need to consider the FS
> free space, vs query-max-size for all existing guests, combined with
> query-max-size for the new disk you want to create.

QEMU can certainly provide a --size 20G mode which returns data size
(20G) + metadata size.  Of course, an empty qcow2 file will start off
much smaller since no data clusters are in use yet.

It's worth remembering that operations like savevm and internal snapshots
can increase image size beyond the conservative max estimate.  So this
estimate isn't an upper bound for the future, just an upper bound for
qemu-img convert.

Stefan



* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
       [not found]         ` <CAMRbyyssi_rspwDJTtWM1Ju5CTZ15z1xBikRDONrS84rx+B8Qg@mail.gmail.com>
@ 2017-02-22 16:15           ` Maor Lipchuk
  2017-02-22 22:06             ` Maor Lipchuk
  2017-02-28  9:19             ` Stefan Hajnoczi
  0 siblings, 2 replies; 17+ messages in thread
From: Maor Lipchuk @ 2017-02-22 16:15 UTC (permalink / raw)
  To: qemu-discuss, qemu-devel
  Cc: Allon Mureinik, Kevin Wolf, Nir Soffer, John Snow Huston

Hi all,

Thank you very much for your help, it was very helpful.
We adopted John Snow's advice and implemented our own calculation so we
can resolve the issue now.
We plan to drop this code once we can get this estimate from qemu-img.

This is the link to the patch introducing the calculation:
https://gerrit.ovirt.org/#/c/65039/14/lib/vdsm/storage/qcow2.py

And here is the link to the tests that we added:
https://gerrit.ovirt.org/#/c/65039/14/tests/storage_qcow2_test.py

Here is how the calculation goes:

We first use qemuimg map to get the used clusters and count all the
clusters for each run returned from qemuimg.map(filename):

 def count_clusters(runs):
     count = 0
     last = -1
     for r in runs:
         # Find the clusters where start and end are located.
         start = r["start"] // CLUSTER_SIZE
         end = (r["start"] + r["length"]) // CLUSTER_SIZE
         if r["data"]:
             if start == end:
                 # This run is smaller than a cluster. If we have several runs
                 # in the same cluster, we want to count the cluster only once.
                 if start != last:
                     count += 1
             else:
                 # This run spans multiple clusters - we want to count all
                 # the clusters this run touches.
                 count += end - start
             last = end
     return count
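
A possible variant of the counting step, offered only as a sketch: the
version above computes end from the offset just past the run, so a run
that starts at offset 0 with a length of one cluster plus one byte gives
end - start == 1 although it touches two clusters. The hypothetical
function below uses the index of the last byte in the run instead, and
assumes that runs are sorted by start and do not overlap, as the output
of qemuimg.map() appears to be.

```python
CLUSTER_SIZE = 64 * 1024


def count_data_clusters(runs, cluster_size=CLUSTER_SIZE):
    """Count the clusters touched by data runs, each cluster once.

    Assumes runs are sorted by "start" and do not overlap.
    """
    count = 0
    last_counted = -1  # index of the highest cluster counted so far
    for r in runs:
        if not r["data"] or r["length"] == 0:
            continue
        first = r["start"] // cluster_size
        # Cluster holding the last byte of the run (inclusive end).
        last = (r["start"] + r["length"] - 1) // cluster_size
        # Skip a cluster that a previous run already accounted for.
        first = max(first, last_counted + 1)
        if last >= first:
            count += last - first + 1
        last_counted = max(last_counted, last)
    return count
```

Rounding up this way can only increase the count compared with the
version above, which keeps the estimate on the conservative side.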


The following calculation is based on Kevin's comments on the original
bug and on the qcow2 spec:
https://github.com/qemu/qemu/blob/master/docs/specs/qcow2.txt

     header_size = 3 * CLUSTER_SIZE

     virtual_size = os.stat(filename).st_size

     # Each 512MiB has one l2 table (64K)
     l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
     l2_tables_size = l2_tables * CLUSTER_SIZE

     # Each cluster has a refcount entry (16 bits) in the refcount tables. It
     # is not clear what the size of the refcount table is, so let's assume
     # it is the same size as the l2 tables.
     refcounts_tables_size = l2_tables_size


After we calculate the estimated size, we apply the following logic and
multiply the result by 1.1:

     chunk_size = config.getint("irs",
                                "volume_utilization_chunk_mb")
     chunk_size = chunk_size * sc.MEGAB
     newsize = (estimate_size + chunk_size) / sc.BLOCK_SIZE
     self.log.debug("Estimated allocation for qcow2 volume:"
                    "%d bytes", newsize)
     newsize = newsize * 1.1


Please let me know if that calculation is acceptable and makes sense
for this use case.

Thanks,
Maor





* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-22 16:15           ` Maor Lipchuk
@ 2017-02-22 22:06             ` Maor Lipchuk
  2017-02-28  9:19             ` Stefan Hajnoczi
  1 sibling, 0 replies; 17+ messages in thread
From: Maor Lipchuk @ 2017-02-22 22:06 UTC (permalink / raw)
  To: qemu-discuss, qemu-devel
  Cc: Allon Mureinik, Kevin Wolf, Nir Soffer, John Snow Huston

Hi,

I added a few clarifications inline.
The relevant calculation is:
     header_size + l2_tables_size + refcounts_tables_size + data_size

and it is also described inline.
I would be happy if you could confirm that this calculation is acceptable.

Thanks,
Maor

On Wed, Feb 22, 2017 at 6:15 PM, Maor Lipchuk <mlipchuk@redhat.com> wrote:
> Hi all,
>
> Thank you very much for your help, it was very helpful.
> We adopted John Snow's advice and implemented our own calculation so we
> can resolve the issue now.
> We plan to drop this code once we can get this estimate from qemu-img.
>
> This is the link to the patch introducing the calculation:
> https://gerrit.ovirt.org/#/c/65039/14/lib/vdsm/storage/qcow2.py
>
> And here is the link to the tests that we added:
> https://gerrit.ovirt.org/#/c/65039/14/tests/storage_qcow2_test.py
>
> Here is how the calculation goes:
>
> We first use qemuimg map to get the used clusters and count all the
> clusters for each run returned from qemuimg.map(filename):
>
>  def count_clusters(runs):

Just a clarification:
runs are the output of qemuimg.map(filename).
Here is the code from github that implements it:
https://github.com/oVirt/vdsm/blob/master/lib/vdsm/qemuimg.py

>      count = 0
>      last = -1
>      for r in runs:
>          # Find the cluster when start and end are located.
>          start = r["start"] // CLUSTER_SIZE
>          end = (r["start"] + r["length"]) // CLUSTER_SIZE
>          if r["data"]:
>              if start == end:
>                  # This run is smaller then a cluster. If we have several runs
>                  # in the same cluster, we want to count the cluster only once.
>                  if start != last:
>                      count += 1
>              else:
>                  # This run span over multiple clusters - we want to count all
>                  # the clusters this run touch.
>                  count += end - start
>              last = end
>      return count
>
>
> The following calculation is based on Kevin's comments on the original
> bug, and qcow2 spec:
> https://github.com/qemu/qemu/blob/master/docs/specs/qcow2.txt:
>
>      header_size = 3 * CLUSTER_SIZE
>
>      virtual_size = os.stat(filename).st_size
>
>      # Each 512MiB has one l2 table (64K)
>      l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
>      l2_tables_size = l2_tables * CLUSTER_SIZE
>
>      # Each cluster have a refcount entry (16 bits) in the refcount tables. It
>      # is not clear what is the size of the refcount table, lets assume it is
>      # the same size as the l2 tables.
>      refcounts_tables_size = l2_tables_size

The calculation is missing two more lines:

     data_size = used_clusters * CLUSTER_SIZE
     return header_size + l2_tables_size + refcounts_tables_size + data_size
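
Putting the pieces above together, the whole estimate could look like
the sketch below. The function name estimate_qcow2_size is hypothetical,
the virtual size is passed in as a parameter instead of being read with
os.stat(), and the refcount size is still the rough assumption described
above (same size as the L2 tables), so this illustrates the formula
rather than a verified upper bound.

```python
CLUSTER_SIZE = 64 * 1024
# Guest bytes mapped by one 64K L2 table, as stated in the thread.
L2_COVERAGE = 512 * 1024**2


def estimate_qcow2_size(virtual_size, used_clusters):
    """Estimate the qcow2 file size for a raw image.

    virtual_size: size of the guest disk in bytes.
    used_clusters: number of 64K clusters that contain data.
    """
    header_size = 3 * CLUSTER_SIZE

    # One L2 table (one cluster) for every 512 MiB of virtual size.
    l2_tables = (virtual_size + L2_COVERAGE - 1) // L2_COVERAGE
    l2_tables_size = l2_tables * CLUSTER_SIZE

    # Assumption from the thread: refcounts take as much as the L2 tables.
    refcounts_tables_size = l2_tables_size

    data_size = used_clusters * CLUSTER_SIZE
    return header_size + l2_tables_size + refcounts_tables_size + data_size
```

For the 10G worst-case file from the start of the thread, every one of
the 163840 clusters holds data, so data_size alone already equals the
raw file size and the metadata is added on top of it.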


>
>
> After we calculate the estimated size we do the following logic and
> multiply it with 1.1:
>
>      chunk_size = config.getint("irs",
>                                 "volume_utilization_chunk_mb")
>      chunk_size = chunk_size * sc.MEGAB
>      newsize = (estimate_size + chunk_size) / sc.BLOCK_SIZE
>      self.log.debug("Estimated allocation for qcow2 volume:"
>                     "%d bytes", newsize)
>      newsize = newsize * 1.1
>


This last calculation can be ignored; it is mainly a safety margin that we add.


>
> Please let me know if that calculation is acceptable and makes sense
> for this use case.
>
> Thanks,
> Maor
>
>
>


* Re: [Qemu-devel] Estimation of qcow2 image size converted from raw image
  2017-02-22 16:15           ` Maor Lipchuk
  2017-02-22 22:06             ` Maor Lipchuk
@ 2017-02-28  9:19             ` Stefan Hajnoczi
  1 sibling, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2017-02-28  9:19 UTC (permalink / raw)
  To: Maor Lipchuk
  Cc: qemu-discuss, qemu-devel, Allon Mureinik, Kevin Wolf, Nir Soffer,
	John Snow Huston


On Wed, Feb 22, 2017 at 06:15:47PM +0200, Maor Lipchuk wrote:
> Hi all,
> 
> Thank you very much for your help, it was very helpful.
> We adopted John Snow's advice and implemented our own calculation so we
> can resolve the issue now.
> We plan to drop this code once we can get this estimate from qemu-img.
> 
> This is the link to the patch introducing the calculation:
> https://gerrit.ovirt.org/#/c/65039/14/lib/vdsm/storage/qcow2.py
> 
> And here is the link to the tests that we added:
> https://gerrit.ovirt.org/#/c/65039/14/tests/storage_qcow2_test.py
> 
> Here is how the calculation goes:
> 
> We first use qemuimg map to get the used clusters and count all the
> clusters for each run returned from qemuimg.map(filename):
> 
>  def count_clusters(runs):
>      count = 0
>      last = -1
>      for r in runs:
>          # Find the cluster when start and end are located.
>          start = r["start"] // CLUSTER_SIZE
>          end = (r["start"] + r["length"]) // CLUSTER_SIZE
>          if r["data"]:
>              if start == end:
>                  # This run is smaller then a cluster. If we have several runs
>                  # in the same cluster, we want to count the cluster only once.
>                  if start != last:
>                      count += 1
>              else:
>                  # This run span over multiple clusters - we want to count all
>                  # the clusters this run touch.
>                  count += end - start
>              last = end
>      return count
> 
> 
> The following calculation is based on Kevin's comments on the original
> bug, and qcow2 spec:
> https://github.com/qemu/qemu/blob/master/docs/specs/qcow2.txt:
> 
>      header_size = 3 * CLUSTER_SIZE

Are you including the L1 table in these 3 clusters?

>      virtual_size = os.stat(filename).st_size

This assumes the input file is in raw format.  If the input file is in
vmdk, vhdx, qcow2, etc then the POSIX file size does not represent the
virtual disk size.

> 
>      # Each 512MiB has one l2 table (64K)
>      l2_tables = (virtual_size + (512 * 1024**2) - 1) // (512 * 1024**2)
>      l2_tables_size = l2_tables * CLUSTER_SIZE
> 
>      # Each cluster have a refcount entry (16 bits) in the refcount tables. It
>      # is not clear what is the size of the refcount table, lets assume it is
>      # the same size as the l2 tables.
>      refcounts_tables_size = l2_tables_size

There is a formula for calculating refcount blocks in qcow2_create2():

  /* total size of refcount blocks
   *
   * note: every host cluster is reference-counted, including metadata
   * (even refcount blocks are recursively included).
   * Let:
   *   a = total_size (this is the guest disk size)
   *   m = meta size not including refcount blocks and refcount tables
   *   c = cluster size
   *   y1 = number of refcount blocks entries
   *   y2 = meta size including everything
   *   rces = refcount entry size in bytes
   * then,
   *   y1 = (y2 + a)/c
   *   y2 = y1 * rces + y1 * rces * sizeof(u64) / c + m
   * we can get y1:
   *   y1 = (a + m) / (c - rces - rces * sizeof(u64) / c)
   */
  nrefblocke = (aligned_total_size + meta_size + cluster_size)
             / (cluster_size - rces - rces * sizeof(uint64_t)
                                           / cluster_size);
  meta_size += DIV_ROUND_UP(nrefblocke, refblock_size) * cluster_size;
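
For reference, the formula above can be transcribed into Python roughly
as follows. This is a sketch only: the function name
refcount_blocks_size and its parameters are hypothetical, meta_size is
expected to hold the metadata size without refcount blocks and refcount
tables (m in the comment above), and refblock_size is taken to be the
number of refcount entries that fit in one cluster.

```python
U64_SIZE = 8  # sizeof(uint64_t) in the C formula


def refcount_blocks_size(total_size, meta_size,
                         cluster_size=64 * 1024, refcount_bits=16):
    """Bytes needed for refcount blocks, following the quoted formula."""
    # rces: refcount entry size in bytes (2 for 16-bit refcounts).
    rces = refcount_bits / 8.0
    # Number of refcount entries that fit in one refcount block.
    refblock_size = cluster_size * 8 // refcount_bits
    # Round the guest disk size up to a whole number of clusters.
    aligned_total_size = -(-total_size // cluster_size) * cluster_size

    # Truncate to an integer, as the C code does when storing the result.
    nrefblocke = int((aligned_total_size + meta_size + cluster_size)
                     / (cluster_size - rces - rces * U64_SIZE / cluster_size))
    # DIV_ROUND_UP(nrefblocke, refblock_size) * cluster_size
    return -(-nrefblocke // refblock_size) * cluster_size
```

With 64K clusters and 16-bit refcounts, one refcount block holds 32768
entries and therefore covers 2 GiB of host file, which is far less than
one cluster of refcounts per 512 MiB.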



Thread overview: 17+ messages
2017-02-13 15:46 [Qemu-devel] Estimation of qcow2 image size converted from raw image Maor Lipchuk
2017-02-13 17:03 ` John Snow
2017-02-13 17:16   ` Daniel P. Berrange
2017-02-13 18:26     ` John Snow
2017-02-15 15:14 ` Stefan Hajnoczi
2017-02-15 15:20   ` Daniel P. Berrange
2017-02-15 15:34     ` Eric Blake
2017-02-15 15:57     ` Nir Soffer
2017-02-15 16:05       ` [Qemu-devel] [Qemu-discuss] " Alberto Garcia
2017-02-15 16:11         ` Daniel P. Berrange
2017-02-15 16:07       ` [Qemu-devel] " Daniel P. Berrange
2017-02-20 11:12         ` Stefan Hajnoczi
2017-02-15 15:49   ` Nir Soffer
2017-02-20 11:07     ` Stefan Hajnoczi
     [not found]       ` <CAJ1JNOdzD7DHTHGJEO2YQANDPq0kY-PEh6J1jBkP7hUW0Kvy9w@mail.gmail.com>
     [not found]         ` <CAMRbyyssi_rspwDJTtWM1Ju5CTZ15z1xBikRDONrS84rx+B8Qg@mail.gmail.com>
2017-02-22 16:15           ` Maor Lipchuk
2017-02-22 22:06             ` Maor Lipchuk
2017-02-28  9:19             ` Stefan Hajnoczi
