Re: Bug? qemu-img convert to preallocated image makes it sparse

From: Max Reitz <mreitz@redhat.com>
To: Maxim Levitsky <mlevitsk@redhat.com>, Kevin Wolf <kwolf@redhat.com>
Cc: sgarzare@redhat.com, "Richard W.M. Jones" <rjones@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: Re: Bug? qemu-img convert to preallocated image makes it sparse
Date: Thu, 16 Jan 2020 16:56:15 +0100	[thread overview]
Message-ID: <4d6ca271-d666-4ba2-4115-5d60f94e4c32@redhat.com> (raw)
In-Reply-To: <03ebf1f7ad780fca65dfc7486e860beb33c71d20.camel@redhat.com>

[-- Attachment #1.1: Type: text/plain, Size: 4672 bytes --]

On 16.01.20 16:38, Maxim Levitsky wrote:
> On Thu, 2020-01-16 at 15:55 +0100, Max Reitz wrote:
>> On 16.01.20 15:50, Kevin Wolf wrote:
>>> Am 16.01.2020 um 15:37 hat Max Reitz geschrieben:
>>>> On 16.01.20 15:13, Richard W.M. Jones wrote:
>>>>> I'm not necessarily saying this is a bug, but a change in behaviour in
>>>>> qemu has caused virt-v2v to fail.  The reproducer is quite simple.
>>>>>
>>>>> Create sparse and preallocated qcow2 files of the same size:
>>>>>
>>>>>   $ qemu-img create -f qcow2 sparse.qcow2 50M
>>>>>   Formatting 'sparse.qcow2', fmt=qcow2 size=52428800 cluster_size=65536 lazy_refcounts=off refcount_bits=16
>>>>>
>>>>>   $ qemu-img create -f qcow2 prealloc.qcow2 50M -o preallocation=falloc,compat=1.1
>>>>>   Formatting 'prealloc.qcow2', fmt=qcow2 size=52428800 compat=1.1 cluster_size=65536 preallocation=falloc lazy_refcounts=off refcount_bits=16
>>>>>
>>>>>   $ du -m sparse.qcow2 prealloc.qcow2 
>>>>>   1 sparse.qcow2
>>>>>   51	prealloc.qcow2
>>>>>
>>>>> Now copy the sparse file into the preallocated file using the -n
>>>>> option so qemu-img doesn't create the target:
>>>>>
>>>>>   $ qemu-img convert -p -n -f qcow2 -O qcow2 sparse.qcow2 prealloc.qcow2
>>>>>       (100.00/100%)
>>>>>
>>>>> In new qemu that makes the target file sparse:
>>>>>
>>>>>   $ du -m sparse.qcow2 prealloc.qcow2 
>>>>>   1 sparse.qcow2
>>>>>   1 prealloc.qcow2         <-- should still be 51
>>>>>
>>>>> In old qemu the target file remained preallocated, which is what
>>>>> I and virt-v2v are expecting.
>>>>>
>>>>> I bisected this to the following commit:
>>>>>
>>>>> 4d7c487eac1652dfe4498fe84f32900ad461d61b is the first bad commit
>>>>> commit 4d7c487eac1652dfe4498fe84f32900ad461d61b
>>>>> Author: Max Reitz <mreitz@redhat.com>
>>>>> Date:   Wed Jul 24 19:12:29 2019 +0200
>>>>>
>>>>>     qemu-img: Fix bdrv_has_zero_init() use in convert
>>>>>     
>>>>>     bdrv_has_zero_init() only has meaning for newly created images or image
>>>>>     areas.  If qemu-img convert did not create the image itself, it cannot
>>>>>     rely on bdrv_has_zero_init()'s result to carry any meaning.
>>>>>     
>>>>>     Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>>     Message-id: 20190724171239.8764-2-mreitz@redhat.com
>>>>>     Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>>>     Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>>>>>     Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>>
>>>>>  qemu-img.c | 11 ++++++++---
>>>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>>>
>>>>> Reverting this commit on the current master branch restores the
>>>>> expected behaviour.
>>>>
>>>> The commit changed the behavior because now qemu-img realizes that it
>>>> cannot skip writing to areas that are supposed to be zero when it
>>>> converts to an existing image (because we have no idea what data that
>>>> existing image contains).  So that’s a bug fix, and I don’t think we can
>>>> undo it without being wrong.
>>>>
>>>> The problem is that qemu-img will try to be quickthat about making these
>>>> areas zero, and so it does zero writes (actually, it even zeroes the
>>>> whole image) and in the process it will of course discard all preallocation.
>>>>
>>>> Now, about fixing the problem I’m not so sure.
>>>
>>> Wouldn't just passing -S 0 solve the problem? It should tell qemu-img
>>> convert that you don't want it to sparsify anything.
>>
>> But it would also convert the falloc preallocation to a full one.
>>
>> (I had a section to this effect in my first draft, but then I
>> accidentally deleted it and forgot it in my second version...)
>>
>> Max
>>
> 
> How about doing write zeros without discard only in this particular case (convert to existing image)
> Basically omitting the BDRV_REQ_MAY_UNMAP flag to blk_co_pwrite_zeroes.
> It will be slow, but maybe for this particular case, it is acceptable?

I don’t think so, because AFAIA one relatively common case is to convert
to an existing block device, and we want to zero that quickly.

Or if the target image actually stores some data, I can imagine that
people actually want a discarding unmap so the image doesn’t use more
space than necessary.

(Also, if we resolve this without any new kind of flag, I’m not sure
whether we can guarantee to keep the desired behavior in the future,
because maybe something else will force us to forego existing
preallocation unless we know that the user really wants to keep it.  I
think there are just different use cases that convert -n is used for and
it makes sense to add a flag to distinguish between them.)

Max

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]