QEMU-Devel Archive on lore.kernel.org
* [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again
@ 2019-07-25 15:55 Max Reitz
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images Max Reitz
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Max Reitz @ 2019-07-25 15:55 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Hi,

69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
would only go down to the protocol layer if the format layer returned
BDRV_BLOCK_RECURSE, thus indicating that it does not have sufficient
information to tell whether a given range in the image is zero or not.
Generally, this is because the image is preallocated and thus all ranges
appear as zeroes.

However, it only implemented this preallocation detection for qcow2.
There are more formats that support preallocation, though: vdi, vhdx,
vmdk, vpc.  (Funny how they all start with “v”.)

For vdi, vmdk, and vpc, the fix is rather simple, because they really
have different subformats depending on whether an image is preallocated
or not, so checking the subformat is enough.

vhdx is more like qcow2, where after the image has been created, it
isn’t clear whether it’s been preallocated or everything is allocated
because everything was already written to.  69f47505ee added a heuristic
to qcow2 to get around this, but I think that’s too much for vhdx.  I
just left it unfixed, because I don’t care that much, honestly (and I
don’t think anyone else does).


Max Reitz (3):
  vdi: Make block_status recurse for fixed images
  vmdk: Make block_status recurse for flat extents
  vpc: Do not return RAW from block_status

 block/vdi.c  | 3 ++-
 block/vmdk.c | 3 +++
 block/vpc.c  | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.21.0




* [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images
  2019-07-25 15:55 [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again Max Reitz
@ 2019-07-25 15:55 ` Max Reitz
  2019-08-12 14:47   ` Vladimir Sementsov-Ogievskiy
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 2/3] vmdk: Make block_status recurse for flat extents Max Reitz
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Max Reitz @ 2019-07-25 15:55 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/vdi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/vdi.c b/block/vdi.c
index b9845a4cbd..40d40c34d5 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -542,7 +542,8 @@ static int coroutine_fn vdi_co_block_status(BlockDriverState *bs,
     *map = s->header.offset_data + (uint64_t)bmap_entry * s->block_size +
         index_in_block;
     *file = bs->file->bs;
-    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
+    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID |
+        (s->header.image_type == VDI_TYPE_STATIC ? BDRV_BLOCK_RECURSE : 0);
 }
 
 static int coroutine_fn
-- 
2.21.0




* [Qemu-devel] [PATCH 2/3] vmdk: Make block_status recurse for flat extents
  2019-07-25 15:55 [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again Max Reitz
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images Max Reitz
@ 2019-07-25 15:55 ` Max Reitz
  2019-08-12 14:59   ` Vladimir Sementsov-Ogievskiy
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status Max Reitz
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Max Reitz @ 2019-07-25 15:55 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/vmdk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/vmdk.c b/block/vmdk.c
index bd36ece125..fd78fd0ccf 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1692,6 +1692,9 @@ static int coroutine_fn vmdk_co_block_status(BlockDriverState *bs,
         if (!extent->compressed) {
             ret |= BDRV_BLOCK_OFFSET_VALID;
             *map = cluster_offset + index_in_cluster;
+            if (extent->flat) {
+                ret |= BDRV_BLOCK_RECURSE;
+            }
         }
         *file = extent->file->bs;
         break;
-- 
2.21.0




* [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-07-25 15:55 [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again Max Reitz
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images Max Reitz
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 2/3] vmdk: Make block_status recurse for flat extents Max Reitz
@ 2019-07-25 15:55 ` Max Reitz
  2019-08-12 15:33   ` Vladimir Sementsov-Ogievskiy
  2019-08-12 18:39 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again John Snow
  2019-08-15 15:49 ` [Qemu-devel] " Max Reitz
  4 siblings, 1 reply; 18+ messages in thread
From: Max Reitz @ 2019-07-25 15:55 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

vpc is not really a passthrough driver, even when using the fixed
subformat (where host and guest offsets are equal).  It should handle
preallocation like all other drivers do, namely by returning
DATA | RECURSE instead of RAW.

There is no tangible difference but the fact that bdrv_is_allocated() no
longer falls through to the protocol layer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/vpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/vpc.c b/block/vpc.c
index d4776ee8a5..b25aab0425 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -737,7 +737,7 @@ static int coroutine_fn vpc_co_block_status(BlockDriverState *bs,
         *pnum = bytes;
         *map = offset;
         *file = bs->file->bs;
-        return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
+        return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_RECURSE;
     }
 
     qemu_co_mutex_lock(&s->lock);
-- 
2.21.0




* Re: [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images Max Reitz
@ 2019-08-12 14:47   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 18+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-08-12 14:47 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

25.07.2019 18:55, Max Reitz wrote:
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Sorry for the delay, I thought that the maintainers of the formats would approve these patches ;)

I don't know the vdi code, but this is what I suggested and it seems to be the right thing to do:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/vdi.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/block/vdi.c b/block/vdi.c
> index b9845a4cbd..40d40c34d5 100644
> --- a/block/vdi.c
> +++ b/block/vdi.c
> @@ -542,7 +542,8 @@ static int coroutine_fn vdi_co_block_status(BlockDriverState *bs,
>       *map = s->header.offset_data + (uint64_t)bmap_entry * s->block_size +
>           index_in_block;
>       *file = bs->file->bs;
> -    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
> +    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID |
> +        (s->header.image_type == VDI_TYPE_STATIC ? BDRV_BLOCK_RECURSE : 0);
>   }
>   
>   static int coroutine_fn
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH 2/3] vmdk: Make block_status recurse for flat extents
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 2/3] vmdk: Make block_status recurse for flat extents Max Reitz
@ 2019-08-12 14:59   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 18+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-08-12 14:59 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

25.07.2019 18:55, Max Reitz wrote:
> Fixes: 69f47505ee66afaa513305de0c1895a224e52c45
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Again, I don't know the vmdk code, but from a brief look at it (and at the vmdk spec) I
see that "extents" are files, and a flat extent is a raw file without any special
format. It is allocated by blk_truncate(.. PREALLOC_MODE_OFF ..), so it really
looks like metadata preallocation.

Anyway, there should be no real damage, as the patch simply brings back the old behavior
for one case.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/vmdk.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index bd36ece125..fd78fd0ccf 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -1692,6 +1692,9 @@ static int coroutine_fn vmdk_co_block_status(BlockDriverState *bs,
>           if (!extent->compressed) {
>               ret |= BDRV_BLOCK_OFFSET_VALID;
>               *map = cluster_offset + index_in_cluster;
> +            if (extent->flat) {
> +                ret |= BDRV_BLOCK_RECURSE;
> +            }
>           }
>           *file = extent->file->bs;
>           break;
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status Max Reitz
@ 2019-08-12 15:33   ` Vladimir Sementsov-Ogievskiy
  2019-08-12 15:56     ` Max Reitz
  0 siblings, 1 reply; 18+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-08-12 15:33 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

25.07.2019 18:55, Max Reitz wrote:
> vpc is not really a passthrough driver, even when using the fixed
> subformat (where host and guest offsets are equal).  It should handle
> preallocation like all other drivers do, namely by returning
> DATA | RECURSE instead of RAW.
> 
> There is no tangible difference but the fact that bdrv_is_allocated() no
> longer falls through to the protocol layer.

Hmm. Isn't this a real bug (fixed by this patch)?

Assume vpc->file is a qcow2 image with a backing file, and it has an "unallocated" region
that is backed by actual data in the backing file.

So this region will be reported as not allocated and will be skipped by any copying
loop using block-status? Is it a bug in BDRV_BLOCK_RAW itself? Or am I misunderstanding
something?

> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block/vpc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/vpc.c b/block/vpc.c
> index d4776ee8a5..b25aab0425 100644
> --- a/block/vpc.c
> +++ b/block/vpc.c
> @@ -737,7 +737,7 @@ static int coroutine_fn vpc_co_block_status(BlockDriverState *bs,
>           *pnum = bytes;
>           *map = offset;
>           *file = bs->file->bs;
> -        return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
> +        return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_RECURSE;
>       }
>   
>       qemu_co_mutex_lock(&s->lock);
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-08-12 15:33   ` Vladimir Sementsov-Ogievskiy
@ 2019-08-12 15:56     ` Max Reitz
  2019-08-12 16:50       ` Vladimir Sementsov-Ogievskiy
  2019-08-13 10:38       ` Kevin Wolf
  0 siblings, 2 replies; 18+ messages in thread
From: Max Reitz @ 2019-08-12 15:56 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 12.08.19 17:33, Vladimir Sementsov-Ogievskiy wrote:
> 25.07.2019 18:55, Max Reitz wrote:
>> vpc is not really a passthrough driver, even when using the fixed
>> subformat (where host and guest offsets are equal).  It should handle
>> preallocation like all other drivers do, namely by returning
>> DATA | RECURSE instead of RAW.
>>
>> There is no tangible difference but the fact that bdrv_is_allocated() no
>> longer falls through to the protocol layer.
> 
> Hmm. Isn't a real bug (fixed by this patch) ?
> 
> Assume vpc->file is qcow2 with backing, which have "unallocated" region, which is
> backed by actual data in backing file.

Come on now.

> So, this region will be reported as not allocated and will be skipped by any copying
> loop using block-status? Is it a bug of BDRV_BLOCK_RAW itself? Or I don't understand
> something..

I think what you don’t understand is that if you have a vpc file inside
of a qcow2 file, you’re doing basically everything wrong. ;-)

But maybe we should drop BDRV_BLOCK_RAW...  Does it do anything good for
us in the raw driver?  Shouldn’t it too just return DATA | RECURSE?

Max



* Re: [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-08-12 15:56     ` Max Reitz
@ 2019-08-12 16:50       ` Vladimir Sementsov-Ogievskiy
  2019-08-12 19:07         ` Max Reitz
  2019-08-13 10:38       ` Kevin Wolf
  1 sibling, 1 reply; 18+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-08-12 16:50 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

12.08.2019 18:56, Max Reitz wrote:
> On 12.08.19 17:33, Vladimir Sementsov-Ogievskiy wrote:
>> 25.07.2019 18:55, Max Reitz wrote:
>>> vpc is not really a passthrough driver, even when using the fixed
>>> subformat (where host and guest offsets are equal).  It should handle
>>> preallocation like all other drivers do, namely by returning
>>> DATA | RECURSE instead of RAW.
>>>
>>> There is no tangible difference but the fact that bdrv_is_allocated() no
>>> longer falls through to the protocol layer.
>>
>> Hmm. Isn't a real bug (fixed by this patch) ?
>>
>> Assume vpc->file is qcow2 with backing, which have "unallocated" region, which is
>> backed by actual data in backing file.
> 
> Come on now.
> 
>> So, this region will be reported as not allocated and will be skipped by any copying
>> loop using block-status? Is it a bug of BDRV_BLOCK_RAW itself? Or I don't understand
>> something..
> 
> I think what you don’t understand is that if you have a vpc file inside
> of a qcow2 file, you’re doing basically everything wrong. ;-)
> 
> But maybe we should drop BDRV_BLOCK_RAW...  Does it do anything good for
> us in the raw driver?  Shouldn’t it too just return DATA | RECURSE?
> 

And if I have the raw driver on top of qcow2, it will not work either, as I described above.


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again
  2019-07-25 15:55 [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again Max Reitz
                   ` (2 preceding siblings ...)
  2019-07-25 15:55 ` [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status Max Reitz
@ 2019-08-12 18:39 ` John Snow
  2019-08-12 19:11   ` Max Reitz
  2019-08-15 15:49 ` [Qemu-devel] " Max Reitz
  4 siblings, 1 reply; 18+ messages in thread
From: John Snow @ 2019-08-12 18:39 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel



On 7/25/19 11:55 AM, Max Reitz wrote:
> Hi,
> 
> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
> would only go down to the protocol layer if the format layer returned
> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
> information whether a given range in the image is zero or not.
> Generally, this is because the image is preallocated and thus all ranges
> appear as zeroes.
> 
> However, it only implemented this preallocation detection for qcow2.
> There are more formats that support preallocation, though: vdi, vhdx,
> vmdk, vpc.  (Funny how they all start with “v”.)
> 
> For vdi, vmdk, and vpc, the fix is rather simple, because they really
> have different subformats depending on whether an image is preallocated
> or not.  This makes the check very simple.
> 
> vhdx is more like qcow2, where after the image has been created, it
> isn’t clear whether it’s been preallocated or everything is allocated
> because everything was already written to.  69f47505ee added a heuristic
> to qcow2 to get around this, but I think that’s too much for vhdx.  I
> just left it unfixed, because I don’t care that much, honestly (and I
> don’t think anyone else does).
> 

What's the practical outcome of that, and is the limitation documented
somewhere?

(I'm fine with not fixing it, I just want it documented somehow.)

> 
> Max Reitz (3):
>   vdi: Make block_status recurse for fixed images
>   vmdk: Make block_status recurse for flat extents
>   vpc: Do not return RAW from block_status
> 
>  block/vdi.c  | 3 ++-
>  block/vmdk.c | 3 +++
>  block/vpc.c  | 2 +-
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 



* Re: [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-08-12 16:50       ` Vladimir Sementsov-Ogievskiy
@ 2019-08-12 19:07         ` Max Reitz
  0 siblings, 0 replies; 18+ messages in thread
From: Max Reitz @ 2019-08-12 19:07 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel

On 12.08.19 18:50, Vladimir Sementsov-Ogievskiy wrote:
> 12.08.2019 18:56, Max Reitz wrote:
>> On 12.08.19 17:33, Vladimir Sementsov-Ogievskiy wrote:
>>> 25.07.2019 18:55, Max Reitz wrote:
>>>> vpc is not really a passthrough driver, even when using the fixed
>>>> subformat (where host and guest offsets are equal).  It should handle
>>>> preallocation like all other drivers do, namely by returning
>>>> DATA | RECURSE instead of RAW.
>>>>
>>>> There is no tangible difference but the fact that bdrv_is_allocated() no
>>>> longer falls through to the protocol layer.
>>>
>>> Hmm. Isn't a real bug (fixed by this patch) ?
>>>
>>> Assume vpc->file is qcow2 with backing, which have "unallocated" region, which is
>>> backed by actual data in backing file.
>>
>> Come on now.
>>
>>> So, this region will be reported as not allocated and will be skipped by any copying
>>> loop using block-status? Is it a bug of BDRV_BLOCK_RAW itself? Or I don't understand
>>> something..
>>
>> I think what you don’t understand is that if you have a vpc file inside
>> of a qcow2 file, you’re doing basically everything wrong. ;-)
>>
>> But maybe we should drop BDRV_BLOCK_RAW...  Does it do anything good for
>> us in the raw driver?  Shouldn’t it too just return DATA | RECURSE?
>>
> 
> And if I have raw driver above qcow2, it will not work, like I've described above..

Yep.  That’s why I was wondering.  (This is a more likely case, because
maybe you really want to use raw’s offset capability on top of qcow2.)

Max



* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again
  2019-08-12 18:39 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again John Snow
@ 2019-08-12 19:11   ` Max Reitz
  2019-08-12 21:45     ` John Snow
  0 siblings, 1 reply; 18+ messages in thread
From: Max Reitz @ 2019-08-12 19:11 UTC (permalink / raw)
  To: John Snow, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel

On 12.08.19 20:39, John Snow wrote:
> 
> 
> On 7/25/19 11:55 AM, Max Reitz wrote:
>> Hi,
>>
>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
>> would only go down to the protocol layer if the format layer returned
>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
>> information whether a given range in the image is zero or not.
>> Generally, this is because the image is preallocated and thus all ranges
>> appear as zeroes.
>>
>> However, it only implemented this preallocation detection for qcow2.
>> There are more formats that support preallocation, though: vdi, vhdx,
>> vmdk, vpc.  (Funny how they all start with “v”.)
>>
>> For vdi, vmdk, and vpc, the fix is rather simple, because they really
>> have different subformats depending on whether an image is preallocated
>> or not.  This makes the check very simple.
>>
>> vhdx is more like qcow2, where after the image has been created, it
>> isn’t clear whether it’s been preallocated or everything is allocated
>> because everything was already written to.  69f47505ee added a heuristic
>> to qcow2 to get around this, but I think that’s too much for vhdx.  I
>> just left it unfixed, because I don’t care that much, honestly (and I
>> don’t think anyone else does).
>>
> 
> What's the practical outcome of that, and is the limitation documented
> somewhere?

The outcome is that if you preallocate a vhdx image
(subformat=fixed), you’ll see that all sectors contain data, even if
they may be zero sectors on the filesystem level.

I don’t think it’s user-visible whatsoever.

> (I'm fine with not fixing it, I just want it documented somehow.)

I am really not inclined to start any documentation on the
particularities with which qemu handles vhdx images.

(Especially so considering we don’t even have any documentation on the
qcow2 case.  The stress in my paragraph was “heuristic”.  If you
preallocate a qcow2 image, but then discard enough sectors that the
heuristic thinks you didn’t, you’ll have the same effect.  Or if you
grow a preallocated image without preallocating the new area.)

Max

>>
>> Max Reitz (3):
>>   vdi: Make block_status recurse for fixed images
>>   vmdk: Make block_status recurse for flat extents
>>   vpc: Do not return RAW from block_status
>>
>>  block/vdi.c  | 3 ++-
>>  block/vmdk.c | 3 +++
>>  block/vpc.c  | 2 +-
>>  3 files changed, 6 insertions(+), 2 deletions(-)
>>




* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again
  2019-08-12 19:11   ` Max Reitz
@ 2019-08-12 21:45     ` John Snow
  2019-08-13 14:48       ` Max Reitz
  0 siblings, 1 reply; 18+ messages in thread
From: John Snow @ 2019-08-12 21:45 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel



On 8/12/19 3:11 PM, Max Reitz wrote:
> On 12.08.19 20:39, John Snow wrote:
>>
>>
>> On 7/25/19 11:55 AM, Max Reitz wrote:
>>> Hi,
>>>
>>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
>>> would only go down to the protocol layer if the format layer returned
>>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
>>> information whether a given range in the image is zero or not.
>>> Generally, this is because the image is preallocated and thus all ranges
>>> appear as zeroes.
>>>
>>> However, it only implemented this preallocation detection for qcow2.
>>> There are more formats that support preallocation, though: vdi, vhdx,
>>> vmdk, vpc.  (Funny how they all start with “v”.)
>>>
>>> For vdi, vmdk, and vpc, the fix is rather simple, because they really
>>> have different subformats depending on whether an image is preallocated
>>> or not.  This makes the check very simple.
>>>
>>> vhdx is more like qcow2, where after the image has been created, it
>>> isn’t clear whether it’s been preallocated or everything is allocated
>>> because everything was already written to.  69f47505ee added a heuristic
>>> to qcow2 to get around this, but I think that’s too much for vhdx.  I
>>> just left it unfixed, because I don’t care that much, honestly (and I
>>> don’t think anyone else does).
>>>
>>
>> What's the practical outcome of that, and is the limitation documented
>> somewhere?
> 
> The outcome is that if you preallocate a vhdx image
> (subformat=fixed), you’ll see that all sectors contain data, even if
> they may be zero sectors on the filesystem level.
> 
> I don’t think it’s user-visible whatsoever.
> 

But it might mean that doing things with sync=top might over-allocate
data depending on the destination, wouldn't it?

That's not crucial, but it's possibly visible, no?

>> (I'm fine with not fixing it, I just want it documented somehow.)
> 
> I am really not inclined to start any documentation on the
> particularities with which qemu handles vhdx images.
> 
> (Especially so considering we don’t even have any documentation on the
> qcow2 case.  The stress in my paragraph was “heuristic”.  If you
> preallocate a qcow2 image, but then discard enough sectors that the
> heuristic thinks you didn’t, you’ll have the same effect.  Or if you
> grow a preallocated image without preallocating the new area.)
> 
> Max
> 

"But our qcow2 docs are also bad" is the kind of argument I can't
*really* disagree with, but...

(I wish we did have a documentation manual per-format that mentioned
some gotchas and general info about each format, but I can't really ask
you to do that now: I just worry when I see patches like this that the
knowledge or memory that there ever was a quirk will vanish immediately.)

--js



* Re: [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-08-12 15:56     ` Max Reitz
  2019-08-12 16:50       ` Vladimir Sementsov-Ogievskiy
@ 2019-08-13 10:38       ` Kevin Wolf
  2019-08-13 14:49         ` Max Reitz
  1 sibling, 1 reply; 18+ messages in thread
From: Kevin Wolf @ 2019-08-13 10:38 UTC (permalink / raw)
  To: Max Reitz; +Cc: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block

On 12.08.2019 at 17:56, Max Reitz wrote:
> On 12.08.19 17:33, Vladimir Sementsov-Ogievskiy wrote:
> > 25.07.2019 18:55, Max Reitz wrote:
> >> vpc is not really a passthrough driver, even when using the fixed
> >> subformat (where host and guest offsets are equal).  It should handle
> >> preallocation like all other drivers do, namely by returning
> >> DATA | RECURSE instead of RAW.
> >>
> >> There is no tangible difference but the fact that bdrv_is_allocated() no
> >> longer falls through to the protocol layer.
> > 
> > Hmm. Isn't a real bug (fixed by this patch) ?
> > 
> > Assume vpc->file is qcow2 with backing, which have "unallocated" region, which is
> > backed by actual data in backing file.
> 
> Come on now.
> 
> > So, this region will be reported as not allocated and will be skipped by any copying
> > loop using block-status? Is it a bug of BDRV_BLOCK_RAW itself? Or I don't understand
> > something..
> 
> I think what you don’t understand is that if you have a vpc file inside
> of a qcow2 file, you’re doing basically everything wrong. ;-)
> 
> But maybe we should drop BDRV_BLOCK_RAW...  Does it do anything good for
> us in the raw driver?  Shouldn’t it too just return DATA | RECURSE?

DATA | RECURSE is still DATA, i.e. marks the block as allocated. If you
do that unconditionally, we will never consider a block unallocated.
RECURSE doesn't undo this; the only thing it might do is set ZERO
additionally.

So I would say unconditionally returning DATA | RECURSE is almost always
wrong.
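
To illustrate the point, here is a minimal sketch (a toy model, not the real
block layer code; the ST_* flags and model_is_allocated() are hypothetical
names made up for this example) of how an "is allocated" query would treat
the two return values:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flags for this sketch only; not QEMU's real definitions. */
enum {
    ST_DATA    = 1 << 0,  /* driver says the range holds data */
    ST_RAW     = 1 << 1,  /* driver defers entirely to its child */
    ST_RECURSE = 1 << 2,  /* driver asks for an extra zero check below */
};

/*
 * Hypothetical model of an "is this range allocated?" query.
 * child_allocated is what the child node would answer for the same range.
 */
static bool model_is_allocated(int driver_ret, bool child_allocated)
{
    /* RAW and DATA are treated as mutually exclusive in this model. */
    assert(!((driver_ret & ST_RAW) && (driver_ret & ST_DATA)));

    if (driver_ret & ST_RAW) {
        /* Passthrough: the child's answer is the answer. */
        return child_allocated;
    }

    /* DATA marks the range allocated; RECURSE never clears that. */
    return (driver_ret & ST_DATA) != 0;
}
```

With ST_DATA | ST_RECURSE the result is true no matter what the child says,
whereas ST_RAW lets an unallocated range in the child stay unallocated.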

Kevin


* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again
  2019-08-12 21:45     ` John Snow
@ 2019-08-13 14:48       ` Max Reitz
  2019-08-13 22:35         ` John Snow
  0 siblings, 1 reply; 18+ messages in thread
From: Max Reitz @ 2019-08-13 14:48 UTC (permalink / raw)
  To: John Snow, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel

On 12.08.19 23:45, John Snow wrote:
> 
> 
> On 8/12/19 3:11 PM, Max Reitz wrote:
>> On 12.08.19 20:39, John Snow wrote:
>>>
>>>
>>> On 7/25/19 11:55 AM, Max Reitz wrote:
>>>> Hi,
>>>>
>>>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
>>>> would only go down to the protocol layer if the format layer returned
>>>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
>>>> information whether a given range in the image is zero or not.
>>>> Generally, this is because the image is preallocated and thus all ranges
>>>> appear as zeroes.
>>>>
>>>> However, it only implemented this preallocation detection for qcow2.
>>>> There are more formats that support preallocation, though: vdi, vhdx,
>>>> vmdk, vpc.  (Funny how they all start with “v”.)
>>>>
>>>> For vdi, vmdk, and vpc, the fix is rather simple, because they really
>>>> have different subformats depending on whether an image is preallocated
>>>> or not.  This makes the check very simple.
>>>>
>>>> vhdx is more like qcow2, where after the image has been created, it
>>>> isn’t clear whether it’s been preallocated or everything is allocated
>>>> because everything was already written to.  69f47505ee added a heuristic
>>>> to qcow2 to get around this, but I think that’s too much for vhdx.  I
>>>> just left it unfixed, because I don’t care that much, honestly (and I
>>>> don’t think anyone else does).
>>>>
>>>
>>> What's the practical outcome of that, and is the limitation documented
>>> somewhere?
>>
>> The outcome is that if you preallocate a vhdx image
>> (subformat=fixed), you’ll see that all sectors contain data, even if
>> they may be zero sectors on the filesystem level.
>>
>> I don’t think it’s user-visible whatsoever.
>>
> 
> But it might mean that doing things with sync=top might over-allocate
> data depending on the destination, wouldn't it?
> 
> That's not crucial, but it's possibly visible, no?

I don’t think it has anything to do with sync=top because whether a
block is zero on the protocol level has nothing to do with whether it is
allocated on the format level.

It may make a difference for convert which uses block_status to inquire
the zero status.  However, it also does zero-detection, so...
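
For illustration only, zero detection on a buffer that has already been read
can be as simple as the following naive scan (naive_buffer_is_zero() is a
hypothetical helper written for this example; it is not the routine qemu-img
uses, which is assumed to be considerably more optimized):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if every byte in buf[0..len) is zero.
 * An empty buffer counts as all-zero.
 */
static bool naive_buffer_is_zero(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}
```

A copy loop that runs such a check on data it has read can still skip or
sparsify zero ranges even when block_status reported them as plain data, at
the cost of reading them first.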

>>> (I'm fine with not fixing it, I just want it documented somehow.)
>>
>> I am really not inclined to start any documentation on the
>> particularities with which qemu handles vhdx images.
>>
>> (Especially so considering we don’t even have any documentation on the
>> qcow2 case.  The stress in my paragraph was “heuristic”.  If you
>> preallocate a qcow2 image, but then discard enough sectors that the
>> heuristic thinks you didn’t, you’ll have the same effect.  Or if you
>> grow a preallocated image without preallocating the new area.)
>>
>> Max
>>
> 
> "But our qcow2 docs are also bad" is the kind of argument I can't
> *really* disagree with, but...

My main argument is that nobody would read the vhdx docs anyway.

Max

> (I wish we did have a documentation manual per-format that mentioned
> some gotchas and general info about each format, but I can't really ask
> you to do that now: I just worry when I see patches like this that the
> knowledge or memory that there ever was a quirk will vanish immediately.)
> 
> --js
> 




* Re: [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status
  2019-08-13 10:38       ` Kevin Wolf
@ 2019-08-13 14:49         ` Max Reitz
  0 siblings, 0 replies; 18+ messages in thread
From: Max Reitz @ 2019-08-13 14:49 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Vladimir Sementsov-Ogievskiy, qemu-devel, qemu-block

On 13.08.19 12:38, Kevin Wolf wrote:
> Am 12.08.2019 um 17:56 hat Max Reitz geschrieben:
>> On 12.08.19 17:33, Vladimir Sementsov-Ogievskiy wrote:
>>> 25.07.2019 18:55, Max Reitz wrote:
>>>> vpc is not really a passthrough driver, even when using the fixed
>>>> subformat (where host and guest offsets are equal).  It should handle
>>>> preallocation like all other drivers do, namely by returning
>>>> DATA | RECURSE instead of RAW.
>>>>
>>>> There is no tangible difference but the fact that bdrv_is_allocated() no
>>>> longer falls through to the protocol layer.
>>>
>>> Hmm. Isn't this a real bug (fixed by this patch)?
>>>
>>> Assume vpc->file is qcow2 with a backing file, and it has an "unallocated" region
>>> that is backed by actual data in the backing file.
>>
>> Come on now.
>>
>>> So, this region will be reported as not allocated and will be skipped by any copying
>>> loop using block-status? Is it a bug of BDRV_BLOCK_RAW itself? Or am I misunderstanding
>>> something?
>>
>> I think what you don’t understand is that if you have a vpc file inside
>> of a qcow2 file, you’re doing basically everything wrong. ;-)
>>
>> But maybe we should drop BDRV_BLOCK_RAW...  Does it do anything good for
>> us in the raw driver?  Shouldn’t it too just return DATA | RECURSE?
> 
> DATA | RECURSE is still DATA, i.e. marks the block as allocated. If you
> do that unconditionally, we will never consider a block unallocated.

Which is correct for formats that do not have backing files.

> RECURSE doesn't undo this, the only thing it might do is setting ZERO
> additionally.
> 
> So I would say unconditionally returning DATA | RECURSE is almost always
> wrong.

I would say it’s always right when it is a format driver and there is no
backing file.

Max
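
The disagreement above can be made concrete with a small model.  This is a
sketch only, not qemu's block_status code: the sk_ names and the flag values
are assumptions chosen for illustration.  It models the two points made in
the exchange: a DATA result counts as allocated regardless of RECURSE, and
RECURSE can at most add ZERO from the protocol layer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; names follow the thread, values are assumptions. */
enum {
    SK_BLOCK_DATA    = 1u << 0,
    SK_BLOCK_ZERO    = 1u << 1,
    SK_BLOCK_RECURSE = 1u << 2,
};

/*
 * Combine the format layer's status with the protocol layer's status.
 * The protocol layer is only consulted when the format layer asked for
 * it with RECURSE, and the only thing it can contribute is ZERO.
 */
static uint32_t sk_fold_status(uint32_t format_status,
                               uint32_t protocol_status)
{
    uint32_t ret = format_status;

    if ((format_status & SK_BLOCK_RECURSE) &&
        (protocol_status & SK_BLOCK_ZERO)) {
        ret |= SK_BLOCK_ZERO;
    }
    return ret;
}

/* In this model a range is allocated if it is reported as DATA or ZERO. */
static bool sk_is_allocated(uint32_t status)
{
    return (status & (SK_BLOCK_DATA | SK_BLOCK_ZERO)) != 0;
}
```

With this model, a driver that always returns DATA | RECURSE never reports a
range as unallocated, which is the behaviour Kevin describes; whether that is
wrong depends on whether the format can have a backing file, which is Max's
point.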



* Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again
  2019-08-13 14:48       ` Max Reitz
@ 2019-08-13 22:35         ` John Snow
  0 siblings, 0 replies; 18+ messages in thread
From: John Snow @ 2019-08-13 22:35 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel



On 8/13/19 10:48 AM, Max Reitz wrote:
> On 12.08.19 23:45, John Snow wrote:
>>
>>
>> On 8/12/19 3:11 PM, Max Reitz wrote:
>>> On 12.08.19 20:39, John Snow wrote:
>>>>
>>>>
>>>> On 7/25/19 11:55 AM, Max Reitz wrote:
>>>>> Hi,
>>>>>
>>>>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
>>>>> would only go down to the protocol layer if the format layer returned
>>>>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
>>>>> information whether a given range in the image is zero or not.
>>>>> Generally, this is because the image is preallocated and thus all ranges
>>>>> appear as zeroes.
>>>>>
>>>>> However, it only implemented this preallocation detection for qcow2.
>>>>> There are more formats that support preallocation, though: vdi, vhdx,
>>>>> vmdk, vpc.  (Funny how they all start with “v”.)
>>>>>
>>>>> For vdi, vmdk, and vpc, the fix is rather simple, because they really
>>>>> have different subformats depending on whether an image is preallocated
>>>>> or not.  This makes the check very simple.
>>>>>
>>>>> vhdx is more like qcow2, where after the image has been created, it
>>>>> isn’t clear whether it’s been preallocated or everything is allocated
>>>>> because everything was already written to.  69f47505ee added a heuristic
>>>>> to qcow2 to get around this, but I think that’s too much for vhdx.  I
>>>>> just left it unfixed, because I don’t care that much, honestly (and I
>>>>> don’t think anyone else does).
>>>>>
>>>>
>>>> What's the practical outcome of that, and is the limitation documented
>>>> somewhere?
>>>
>>> The outcome is that if you preallocate a vhdx image
>>> (subformat=fixed), you’ll see that all sectors contain data, even if
>>> they may be zero sectors on the filesystem level.
>>>
>>> I don’t think it’s user-visible whatsoever.
>>>
>>
>> But it might mean that doing things with sync=top might over-allocate
>> data depending on the destination, wouldn't it?
>>
>> That's not crucial, but it's possibly visible, no?
> 
> I don’t think it has anything to do with sync=top because whether a
> block is zero on the protocol level has nothing to do with whether it is
> allocated on the format level.
> 
> It may make a difference for convert which uses block_status to inquire
> the zero status.  However, it also does zero-detection, so...
> 

Oh, okay then. Probably... fine, but I have a nagging doubt relating to
some of the fallbacks in e.g. qcow2 that tend to inflate zeroes in some
cases (or used to. Maybe it's been fixed since.)

...but I can't point to anything, so it's fine, and I'm just drawing
things out for no reason.

Reviewed-by: John Snow <jsnow@redhat.com>

>>>> (I'm fine with not fixing it, I just want it documented somehow.)
>>>
>>> I am really not inclined to start any documentation on the
>>> particularities with which qemu handles vhdx images.
>>>
>>> (Especially so considering we don’t even have any documentation on the
>>> qcow2 case.  The stress in my paragraph was “heuristic”.  If you
>>> preallocate a qcow2 image, but then discard enough sectors that the
>>> heuristic thinks you didn’t, you’ll have the same effect.  Or if you
>>> grow a preallocated image without preallocating the new area.)
>>>
>>> Max
>>>
>>
>> "But our qcow2 docs are also bad" is the kind of argument I can't
>> *really* disagree with, but...
> 
> My main argument is that nobody would read the vhdx docs anyway.
> 
> Max
> 

That's the sort of thing I'd like to change, but I guess I haven't
really made good on that desire in any way, so what good is that?

--js



* Re: [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again
  2019-07-25 15:55 [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again Max Reitz
                   ` (3 preceding siblings ...)
  2019-08-12 18:39 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again John Snow
@ 2019-08-15 15:49 ` " Max Reitz
  4 siblings, 0 replies; 18+ messages in thread
From: Max Reitz @ 2019-08-15 15:49 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel

On 25.07.19 17:55, Max Reitz wrote:
> Hi,
> 
> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it
> would only go down to the protocol layer if the format layer returned
> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient
> information whether a given range in the image is zero or not.
> Generally, this is because the image is preallocated and thus all ranges
> appear as zeroes.
> 
> However, it only implemented this preallocation detection for qcow2.
> There are more formats that support preallocation, though: vdi, vhdx,
> vmdk, vpc.  (Funny how they all start with “v”.)
> 
> For vdi, vmdk, and vpc, the fix is rather simple, because they really
> have different subformats depending on whether an image is preallocated
> or not.  This makes the check very simple.
> 
> vhdx is more like qcow2, where after the image has been created, it
> isn’t clear whether it’s been preallocated or everything is allocated
> because everything was already written to.  69f47505ee added a heuristic
> to qcow2 to get around this, but I think that’s too much for vhdx.  I
> just left it unfixed, because I don’t care that much, honestly (and I
> don’t think anyone else does).
> 
> 
> Max Reitz (3):
>   vdi: Make block_status recurse for fixed images
>   vmdk: Make block_status recurse for flat extents
>   vpc: Do not return RAW from block_status
> 
>  block/vdi.c  | 3 ++-
>  block/vmdk.c | 3 +++
>  block/vpc.c  | 2 +-
>  3 files changed, 6 insertions(+), 2 deletions(-)

Thanks for the reviews, applied to my block-next branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block-next

Max



Thread overview: 18+ messages
2019-07-25 15:55 [Qemu-devel] [PATCH 0/3] block: Make various formats' block_status recurse again Max Reitz
2019-07-25 15:55 ` [Qemu-devel] [PATCH 1/3] vdi: Make block_status recurse for fixed images Max Reitz
2019-08-12 14:47   ` Vladimir Sementsov-Ogievskiy
2019-07-25 15:55 ` [Qemu-devel] [PATCH 2/3] vmdk: Make block_status recurse for flat extents Max Reitz
2019-08-12 14:59   ` Vladimir Sementsov-Ogievskiy
2019-07-25 15:55 ` [Qemu-devel] [PATCH 3/3] vpc: Do not return RAW from block_status Max Reitz
2019-08-12 15:33   ` Vladimir Sementsov-Ogievskiy
2019-08-12 15:56     ` Max Reitz
2019-08-12 16:50       ` Vladimir Sementsov-Ogievskiy
2019-08-12 19:07         ` Max Reitz
2019-08-13 10:38       ` Kevin Wolf
2019-08-13 14:49         ` Max Reitz
2019-08-12 18:39 ` [Qemu-devel] [Qemu-block] [PATCH 0/3] block: Make various formats' block_status recurse again John Snow
2019-08-12 19:11   ` Max Reitz
2019-08-12 21:45     ` John Snow
2019-08-13 14:48       ` Max Reitz
2019-08-13 22:35         ` John Snow
2019-08-15 15:49 ` [Qemu-devel] " Max Reitz
