All of lore.kernel.org
 help / color / mirror / Atom feed
* qcow2: Zero-initialization of external data files
@ 2020-02-17 16:56 Max Reitz
  2020-04-06 22:22 ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Max Reitz @ 2020-02-17 16:56 UTC (permalink / raw)
  To: Qemu-block, Kevin Wolf; +Cc: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1372 bytes --]

Hi,

AFAIU, external data files with data_file_raw=on are supposed to return
the same data as the qcow2 file when read.  But we still use the qcow2
metadata structures (which are by default initialized to “everything
unallocated”), even though we never ensure that the external data file
is zero, too, so this can happen:

$ dd if=/dev/urandom of=foo.raw 64M
[...]

$ sudo losetup -f --show foo.raw
/dev/loop0

$ sudo ./qemu-img create -f qcow2 -o \
    data_file=/dev/loop0,data_file_raw=on foo.qcow2 64M
[...]

$ sudo ./qemu-io -c 'read -P 0 0 64M' foo.qcow2
read 67108864/67108864 bytes at offset 0
64 MiB, 1 ops; 00.00 sec (25.036 GiB/sec and 400.5751 ops/sec)

$ sudo ./qemu-io -c 'read -P 0 0 64M' -f raw foo.raw
Pattern verification failed at offset 0, 67108864 bytes
read 67108864/67108864 bytes at offset 0
64 MiB, 1 ops; 00.01 sec (5.547 GiB/sec and 88.7484 ops/sec)

I suppose this behavior is fine for blockdev-create because I guess it’s
the user’s responsibility to ensure that the external data file is zero.
 But maybe it isn’t, so that’s my first question: Is it really the
user’s responsibility or should we always ensure it’s zero?

My second question is: If we decide that this is fine for
blockdev-create, should at least qcow2_co_create_opts() ensure the data
file it just created is zero?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-02-17 16:56 qcow2: Zero-initialization of external data files Max Reitz
@ 2020-04-06 22:22 ` Eric Blake
  2020-04-09 13:05   ` Max Reitz
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-04-06 22:22 UTC (permalink / raw)
  To: Max Reitz, Qemu-block, Kevin Wolf; +Cc: qemu-devel

On 2/17/20 10:56 AM, Max Reitz wrote:
> Hi,
> 
> AFAIU, external data files with data_file_raw=on are supposed to return
> the same data as the qcow2 file when read.  But we still use the qcow2
> metadata structures (which are by default initialized to “everything
> unallocated”), even though we never ensure that the external data file
> is zero, too, so this can happen:
> 
> $ dd if=/dev/urandom of=foo.raw 64M
> [...]
> 
> $ sudo losetup -f --show foo.raw
> /dev/loop0
> 
> $ sudo ./qemu-img create -f qcow2 -o \
>      data_file=/dev/loop0,data_file_raw=on foo.qcow2 64M
> [...]
> 
> $ sudo ./qemu-io -c 'read -P 0 0 64M' foo.qcow2
> read 67108864/67108864 bytes at offset 0
> 64 MiB, 1 ops; 00.00 sec (25.036 GiB/sec and 400.5751 ops/sec)

This looks like a bug (and we should fix it for 5.0 if possible) - read 
of a data_file_raw=on should not treat unallocated clusters as reading 
0, but rather as reading whatever the raw data contains.

> 
> $ sudo ./qemu-io -c 'read -P 0 0 64M' -f raw foo.raw
> Pattern verification failed at offset 0, 67108864 bytes
> read 67108864/67108864 bytes at offset 0
> 64 MiB, 1 ops; 00.01 sec (5.547 GiB/sec and 88.7484 ops/sec)
> 
> I suppose this behavior is fine for blockdev-create because I guess it’s
> the user’s responsibility to ensure that the external data file is zero.
>   But maybe it isn’t, so that’s my first question: Is it really the
> user’s responsibility or should we always ensure it’s zero?

I'd argue that requiring the user to pre-zero the raw data file is 
undesirable; and that we should instead fix our code to not report the 
image as reading all zeroes when creating with data_file_raw=on.

> 
> My second question is: If we decide that this is fine for
> blockdev-create, should at least qcow2_co_create_opts() ensure the data
> file it just created is zero?

Having an option to make qemu force-zero the raw image during 
qcow2_co_create_opts seems reasonable, but for performance reasons, I 
don't think the flag should be on by default.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-06 22:22 ` Eric Blake
@ 2020-04-09 13:05   ` Max Reitz
  2020-04-09 13:42     ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Max Reitz @ 2020-04-09 13:05 UTC (permalink / raw)
  To: Eric Blake, Qemu-block, Kevin Wolf; +Cc: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 2468 bytes --]

On 07.04.20 00:22, Eric Blake wrote:
> On 2/17/20 10:56 AM, Max Reitz wrote:
>> Hi,
>>
>> AFAIU, external data files with data_file_raw=on are supposed to return
>> the same data as the qcow2 file when read.  But we still use the qcow2
>> metadata structures (which are by default initialized to “everything
>> unallocated”), even though we never ensure that the external data file
>> is zero, too, so this can happen:
>>
>> $ dd if=/dev/urandom of=foo.raw 64M
>> [...]
>>
>> $ sudo losetup -f --show foo.raw
>> /dev/loop0
>>
>> $ sudo ./qemu-img create -f qcow2 -o \
>>      data_file=/dev/loop0,data_file_raw=on foo.qcow2 64M
>> [...]
>>
>> $ sudo ./qemu-io -c 'read -P 0 0 64M' foo.qcow2
>> read 67108864/67108864 bytes at offset 0
>> 64 MiB, 1 ops; 00.00 sec (25.036 GiB/sec and 400.5751 ops/sec)
> 
> This looks like a bug (and we should fix it for 5.0 if possible)

It seems a bit difficult for 5.0 now.  (But I don’t think it’d be a
regression, so that shouldn’t be too bad.)

> read
> of a data_file_raw=on should not treat unallocated clusters as reading
> 0, but rather as reading whatever the raw data contains.
> 
>>
>> $ sudo ./qemu-io -c 'read -P 0 0 64M' -f raw foo.raw
>> Pattern verification failed at offset 0, 67108864 bytes
>> read 67108864/67108864 bytes at offset 0
>> 64 MiB, 1 ops; 00.01 sec (5.547 GiB/sec and 88.7484 ops/sec)
>>
>> I suppose this behavior is fine for blockdev-create because I guess it’s
>> the user’s responsibility to ensure that the external data file is zero.
>>   But maybe it isn’t, so that’s my first question: Is it really the
>> user’s responsibility or should we always ensure it’s zero?
> 
> I'd argue that requiring the user to pre-zero the raw data file is
> undesirable; and that we should instead fix our code to not report the
> image as reading all zeroes when creating with data_file_raw=on.

OK.  I think that could be achieved by just enforcing @preallocation to
be at least “metadata” whenever @data-file-raw is set.  Would that make
sense?

Max

>> My second question is: If we decide that this is fine for
>> blockdev-create, should at least qcow2_co_create_opts() ensure the data
>> file it just created is zero?
> 
> Having an option to make qemu force-zero the raw image during
> qcow2_co_create_opts seems reasonable, but for performance reasons, I
> don't think the flag should be on by default.
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 13:05   ` Max Reitz
@ 2020-04-09 13:42     ` Eric Blake
  2020-04-09 13:47       ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-04-09 13:42 UTC (permalink / raw)
  To: Max Reitz, Qemu-block, Kevin Wolf; +Cc: qemu-devel

On 4/9/20 8:05 AM, Max Reitz wrote:

>>> $ sudo ./qemu-img create -f qcow2 -o \
>>>       data_file=/dev/loop0,data_file_raw=on foo.qcow2 64M
>>> [...]
>>>
>>> $ sudo ./qemu-io -c 'read -P 0 0 64M' foo.qcow2
>>> read 67108864/67108864 bytes at offset 0
>>> 64 MiB, 1 ops; 00.00 sec (25.036 GiB/sec and 400.5751 ops/sec)
>>
>> This looks like a bug (and we should fix it for 5.0 if possible)
> 
> It seems a bit difficult for 5.0 now.  (But I don’t think it’d be a
> regression, so that shouldn’t be too bad.)

So, you're arguing that since 4.2 has the same bug, slipping the fix to 
5.1 instead of 5.0 is not bad because it's not a regression new to 5.0. 
Yes, that's a reasonable answer, if a fix is not fast.


>>> I suppose this behavior is fine for blockdev-create because I guess it’s
>>> the user’s responsibility to ensure that the external data file is zero.
>>>    But maybe it isn’t, so that’s my first question: Is it really the
>>> user’s responsibility or should we always ensure it’s zero?
>>
>> I'd argue that requiring the user to pre-zero the raw data file is
>> undesirable; and that we should instead fix our code to not report the
>> image as reading all zeroes when creating with data_file_raw=on.
> 
> OK.  I think that could be achieved by just enforcing @preallocation to
> be at least “metadata” whenever @data-file-raw is set.  Would that make
> sense?

Is a preallocation of metadata sufficient to report things correctly? 
If so, it seems like a reasonable compromise to me.  I was more 
envisioning a fix elsewhere: if we are reporting block status of what 
looks like an unallocated cluster, but data-file-raw is set, we change 
our answer to instead report it as allocated with unknown contents.  But 
with preallocation, you either force the qcow2 file to list no cluster 
as unallocated (which matches the fact that the raw image really is 
fully allocated) while not touching the raw image, or you can go one 
step further and request full preallocation to wipe the raw image to 0 
in the process.

> 
> Max
> 
>>> My second question is: If we decide that this is fine for
>>> blockdev-create, should at least qcow2_co_create_opts() ensure the data
>>> file it just created is zero?
>>
>> Having an option to make qemu force-zero the raw image during
>> qcow2_co_create_opts seems reasonable, but for performance reasons, I
>> don't think the flag should be on by default.

And by mentioning preallocation, you've managed to convince me that we 
may already have exactly the option I was envisioning.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 13:42     ` Eric Blake
@ 2020-04-09 13:47       ` Eric Blake
  2020-04-09 14:10         ` Max Reitz
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-04-09 13:47 UTC (permalink / raw)
  To: Max Reitz, Qemu-block, Kevin Wolf; +Cc: qemu-devel

On 4/9/20 8:42 AM, Eric Blake wrote:

>>> I'd argue that requiring the user to pre-zero the raw data file is
>>> undesirable; and that we should instead fix our code to not report the
>>> image as reading all zeroes when creating with data_file_raw=on.
>>
>> OK.  I think that could be achieved by just enforcing @preallocation to
>> be at least “metadata” whenever @data-file-raw is set.  Would that make
>> sense?
> 
> Is a preallocation of metadata sufficient to report things correctly? If 
> so, it seems like a reasonable compromise to me.  I was more envisioning 
> a fix elsewhere: if we are reporting block status of what looks like an 
> unallocated cluster, but data-file-raw is set, we change our answer to 
> instead report it as allocated with unknown contents.  But with 
> preallocation, you either force the qcow2 file to list no cluster as 
> unallocated (which matches the fact that the raw image really is fully 
> allocated) while not touching the raw image, or you can go one step 
> further and request full preallocation to wipe the raw image to 0 in the 
> process.

What happens when an operation attempts to unmap things?  Do we reject 
all unmap operations when data-file-raw is set (thus leaving a cluster 
marked as allocated at all times, if we can first guarantee that 
preallocation set things up that way)?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 13:47       ` Eric Blake
@ 2020-04-09 14:10         ` Max Reitz
  2020-04-09 14:32           ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Max Reitz @ 2020-04-09 14:10 UTC (permalink / raw)
  To: Eric Blake, Qemu-block, Kevin Wolf; +Cc: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1858 bytes --]

On 09.04.20 15:47, Eric Blake wrote:
> On 4/9/20 8:42 AM, Eric Blake wrote:
> 
>>>> I'd argue that requiring the user to pre-zero the raw data file is
>>>> undesirable; and that we should instead fix our code to not report the
>>>> image as reading all zeroes when creating with data_file_raw=on.
>>>
>>> OK.  I think that could be achieved by just enforcing @preallocation to
>>> be at least “metadata” whenever @data-file-raw is set.  Would that make
>>> sense?
>>
>> Is a preallocation of metadata sufficient to report things correctly?
>> If so, it seems like a reasonable compromise to me.  I was more
>> envisioning a fix elsewhere: if we are reporting block status of what
>> looks like an unallocated cluster, but data-file-raw is set, we change
>> our answer to instead report it as allocated with unknown contents. 
>> But with preallocation, you either force the qcow2 file to list no
>> cluster as unallocated (which matches the fact that the raw image
>> really is fully allocated) while not touching the raw image, or you
>> can go one step further and request full preallocation to wipe the raw
>> image to 0 in the process.
> 
> What happens when an operation attempts to unmap things?  Do we reject
> all unmap operations when data-file-raw is set (thus leaving a cluster
> marked as allocated at all times, if we can first guarantee that
> preallocation set things up that way)?
No, unmap operations currently work.  qcow2_free_any_clusters() passes
them through to the external data file.

The problem is that the unmap also zeroes the L2 entry, so if you then
write data to the raw file, it won’t be visible from the qcow2 side of
things.  However, I’m not sure whether we support modifications of a raw
file when it is already “in use” by a qcow2 image, so maybe that’s fine.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 14:10         ` Max Reitz
@ 2020-04-09 14:32           ` Eric Blake
  2020-04-09 15:01             ` Max Reitz
  2020-04-14 12:28             ` Kevin Wolf
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Blake @ 2020-04-09 14:32 UTC (permalink / raw)
  To: Max Reitz, Qemu-block, Kevin Wolf; +Cc: qemu-devel

On 4/9/20 9:10 AM, Max Reitz wrote:

>>
>> What happens when an operation attempts to unmap things?  Do we reject
>> all unmap operations when data-file-raw is set (thus leaving a cluster
>> marked as allocated at all times, if we can first guarantee that
>> preallocation set things up that way)?
> No, unmap operations currently work.  qcow2_free_any_clusters() passes
> them through to the external data file.
> 
> The problem is that the unmap also zeroes the L2 entry, so if you then
> write data to the raw file, it won’t be visible from the qcow2 side of
> things.  However, I’m not sure whether we support modifications of a raw
> file when it is already “in use” by a qcow2 image, so maybe that’s fine.

We don't support concurrent modification. But if the guest is running 
and unmaps things, then shuts off, then we edit the raw file offline, 
then we restart the guest, the guest should see the results of those 
offline edits.

We have to special-case the qcow2 code to either treat unallocated 
cluster + raw-data-file as not really unallocated (so that we can see 
those edits), or we have to special case it to handle unmap + 
raw-data-file specially (pass unmap to the raw file, but do NOT mark the 
cluster unallocated in the qcow2 wrapper).  Which special case we choose 
for unmap may in turn affect whether it is easier to require 
preallocation=metadata (because we can now guarantee that no cluster 
will ever be marked unallocated in qcow2) or whether the default 
preallocation of leaving clusters "unallocated" in qcow2 still sees the 
proper guest data.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 14:32           ` Eric Blake
@ 2020-04-09 15:01             ` Max Reitz
  2020-04-09 15:46               ` Eric Blake
  2020-04-14 12:28             ` Kevin Wolf
  1 sibling, 1 reply; 12+ messages in thread
From: Max Reitz @ 2020-04-09 15:01 UTC (permalink / raw)
  To: Eric Blake, Qemu-block, Kevin Wolf; +Cc: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 1285 bytes --]

On 09.04.20 16:32, Eric Blake wrote:
> On 4/9/20 9:10 AM, Max Reitz wrote:
> 
>>>
>>> What happens when an operation attempts to unmap things?  Do we reject
>>> all unmap operations when data-file-raw is set (thus leaving a cluster
>>> marked as allocated at all times, if we can first guarantee that
>>> preallocation set things up that way)?
>> No, unmap operations currently work.  qcow2_free_any_clusters() passes
>> them through to the external data file.
>>
>> The problem is that the unmap also zeroes the L2 entry, so if you then
>> write data to the raw file, it won’t be visible from the qcow2 side of
>> things.  However, I’m not sure whether we support modifications of a raw
>> file when it is already “in use” by a qcow2 image, so maybe that’s fine.
> 
> We don't support concurrent modification. But if the guest is running
> and unmaps things, then shuts off, then we edit the raw file offline,
> then we restart the guest, the guest should see the results of those
> offline edits.

Should it?  The specification doesn’t say anything about that.

In fact, I think we have always said we explicitly discourage that
because this might lead to outdated metadata; even though we usually
meant “dirty bitmaps” by that.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 15:01             ` Max Reitz
@ 2020-04-09 15:46               ` Eric Blake
  2020-04-09 15:56                 ` Eric Blake
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-04-09 15:46 UTC (permalink / raw)
  To: Max Reitz, Qemu-block, Kevin Wolf; +Cc: qemu-devel

On 4/9/20 10:01 AM, Max Reitz wrote:
> On 09.04.20 16:32, Eric Blake wrote:
>> On 4/9/20 9:10 AM, Max Reitz wrote:
>>
>>>>
>>>> What happens when an operation attempts to unmap things?  Do we reject
>>>> all unmap operations when data-file-raw is set (thus leaving a cluster
>>>> marked as allocated at all times, if we can first guarantee that
>>>> preallocation set things up that way)?
>>> No, unmap operations currently work.  qcow2_free_any_clusters() passes
>>> them through to the external data file.
>>>
>>> The problem is that the unmap also zeroes the L2 entry, so if you then
>>> write data to the raw file, it won’t be visible from the qcow2 side of
>>> things.  However, I’m not sure whether we support modifications of a raw
>>> file when it is already “in use” by a qcow2 image, so maybe that’s fine.
>>
>> We don't support concurrent modification. But if the guest is running
>> and unmaps things, then shuts off, then we edit the raw file offline,
>> then we restart the guest, the guest should see the results of those
>> offline edits.
> 
> Should it?  The specification doesn’t say anything about that.
> 
> In fact, I think we have always said we explicitly discourage that
> because this might lead to outdated metadata; even though we usually
> meant “dirty bitmaps” by that.

Hmm.  Kevin, I'd really like your opinion here.  The point of the 
raw-external-data flag is to state that "qemu MUST ensure that whatever 
is done to this image while the guest is running is reflected through to 
the raw file, so that after the guest stops, the raw file alone is still 
viable to see what the guest saw".  But as you say, there's a difference 
between "the raw file will read what the guest saw" and "we can now edit 
the raw file without regards to the qcow2 wrapper but later reuse of the 
qcow2 wrapper won't be corrupted by those edits".

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 15:46               ` Eric Blake
@ 2020-04-09 15:56                 ` Eric Blake
  2020-04-14 12:34                   ` Kevin Wolf
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2020-04-09 15:56 UTC (permalink / raw)
  To: Max Reitz, Qemu-block, Kevin Wolf; +Cc: qemu-devel

On 4/9/20 10:46 AM, Eric Blake wrote:

>>> We don't support concurrent modification. But if the guest is running
>>> and unmaps things, then shuts off, then we edit the raw file offline,
>>> then we restart the guest, the guest should see the results of those
>>> offline edits.
>>
>> Should it?  The specification doesn’t say anything about that.
>>
>> In fact, I think we have always said we explicitly discourage that
>> because this might lead to outdated metadata; even though we usually
>> meant “dirty bitmaps” by that.
> 
> Hmm.  Kevin, I'd really like your opinion here.  The point of the 
> raw-external-data flag is to state that "qemu MUST ensure that whatever 
> is done to this image while the guest is running is reflected through to 
> the raw file, so that after the guest stops, the raw file alone is still 
> viable to see what the guest saw".  But as you say, there's a difference 
> between "the raw file will read what the guest saw" and "we can now edit 
> the raw file without regards to the qcow2 wrapper but later reuse of the 
> qcow2 wrapper won't be corrupted by those edits".

Another random thought: Should we add a header extension that records 
the timestamps of an external data file?  That way, if the timestamps of 
the file have changed from what we recorded in our optional header, then 
we can flag to the user that our metadata may be stale because of what 
appears to be external edits.  But that's not always going to save us - 
timestamps on a block device don't behave the same as timestamps on a 
POSIX file, and just because timestamps change (such as when copying a 
file from one place to another) does not imply that contents have 
changed.  My personal take - unless adding such a header can definitely 
add safety, it may not be worth the cost of complicating the standard - 
this was more just documenting an idea I had even if we don't choose to 
pursue it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 14:32           ` Eric Blake
  2020-04-09 15:01             ` Max Reitz
@ 2020-04-14 12:28             ` Kevin Wolf
  1 sibling, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2020-04-14 12:28 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, Qemu-block, Max Reitz

Am 09.04.2020 um 16:32 hat Eric Blake geschrieben:
> On 4/9/20 9:10 AM, Max Reitz wrote:
> 
> > > 
> > > What happens when an operation attempts to unmap things?  Do we reject
> > > all unmap operations when data-file-raw is set (thus leaving a cluster
> > > marked as allocated at all times, if we can first guarantee that
> > > preallocation set things up that way)?
> > No, unmap operations currently work.  qcow2_free_any_clusters() passes
> > them through to the external data file.
> > 
> > The problem is that the unmap also zeroes the L2 entry, so if you then
> > write data to the raw file, it won’t be visible from the qcow2 side of
> > things.  However, I’m not sure whether we support modifications of a raw
> > file when it is already “in use” by a qcow2 image, so maybe that’s fine.
> 
> We don't support concurrent modification. But if the guest is running and
> unmaps things, then shuts off, then we edit the raw file offline, then we
> restart the guest, the guest should see the results of those offline edits.

If you write to the external data file other than through qcow2, you
have invalidated the qcow2 layer. You can only use the image as a raw
image after this.

There is no point in bending over backwards in qcow2 to allow something
like this. If you're not interested in valid metadata, but want to
ignore all of it, you can just use raw from the start.

Kevin



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: qcow2: Zero-initialization of external data files
  2020-04-09 15:56                 ` Eric Blake
@ 2020-04-14 12:34                   ` Kevin Wolf
  0 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2020-04-14 12:34 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, Qemu-block, Max Reitz

Am 09.04.2020 um 17:56 hat Eric Blake geschrieben:
> On 4/9/20 10:46 AM, Eric Blake wrote:
> 
> > > > We don't support concurrent modification. But if the guest is running
> > > > and unmaps things, then shuts off, then we edit the raw file offline,
> > > > then we restart the guest, the guest should see the results of those
> > > > offline edits.
> > > 
> > > Should it?  The specification doesn’t say anything about that.
> > > 
> > > In fact, I think we have always said we explicitly discourage that
> > > because this might lead to outdated metadata; even though we usually
> > > meant “dirty bitmaps” by that.
> > 
> > Hmm.  Kevin, I'd really like your opinion here.  The point of the
> > raw-external-data flag is to state that "qemu MUST ensure that whatever
> > is done to this image while the guest is running is reflected through to
> > the raw file, so that after the guest stops, the raw file alone is still
> > viable to see what the guest saw".  But as you say, there's a difference
> > between "the raw file will read what the guest saw" and "we can now edit
> > the raw file without regards to the qcow2 wrapper but later reuse of the
> > qcow2 wrapper won't be corrupted by those edits".
> 
> Another random thought: Should we add a header extension that records the
> timestamps of an external data file?  That way, if the timestamps of the
> file have changed from what we recorded in our optional header, then we can
> flag to the user that our metadata may be stale because of what appears to
> be external edits.  But that's not always going to save us - timestamps on a
> block device don't behave the same as timestamps on a POSIX file, and just
> because timestamps change (such as when copying a file from one place to
> another) does not imply that contents have changed.  My personal take -
> unless adding such a header can definitely add safety, it may not be worth
> the cost of complicating the standard - this was more just documenting an
> idea I had even if we don't choose to pursue it.

In the context of verifying backing file links, Jeff Cody once brought
up an idea where we would have something like a generation counter in
the header that would be increased every time you open the image
read-write. Then you could store that counter value in the backing file
and external data file links and detect if someone else wrote to the
child image and invalidated it.

Though obviously raw images still won't have a counter, and if they had
one, nobody would increase it when writing to it externally, so it
doesn't actually work for real-world external data files...

I wouldn't rely on file timestamps, you already mentioned some good
reasons.

Kevin



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-04-14 12:36 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-17 16:56 qcow2: Zero-initialization of external data files Max Reitz
2020-04-06 22:22 ` Eric Blake
2020-04-09 13:05   ` Max Reitz
2020-04-09 13:42     ` Eric Blake
2020-04-09 13:47       ` Eric Blake
2020-04-09 14:10         ` Max Reitz
2020-04-09 14:32           ` Eric Blake
2020-04-09 15:01             ` Max Reitz
2020-04-09 15:46               ` Eric Blake
2020-04-09 15:56                 ` Eric Blake
2020-04-14 12:34                   ` Kevin Wolf
2020-04-14 12:28             ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.