All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
@ 2015-12-14 17:43 Vladimir Sementsov-Ogievskiy
  2015-12-14 20:05 ` Max Reitz
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-12-14 17:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, famz, mreitz, stefanha, den, jsnow

The new feature for qcow2: storing dirty bitmaps.

Only dirty bitmaps relative to this qcow2 image should be stored in it.

Strings started from +# are RFC-strings, not to be commited of course.


Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

 docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 150 insertions(+), 1 deletion(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 121dfc8..3c89580 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -103,7 +103,17 @@ in the description of a field.
                     write to an image with unknown auto-clear features if it
                     clears the respective bits from this field first.
 
-                    Bits 0-63:  Reserved (set to 0)
+                    Bit 0:      Dirty bitmaps bit.
+                                This bit is responsible for Dirty bitmaps
+                                extension consistency.
+                                If it is set, but there is no Dirty bitmaps
+                                extensions, this should be considered as an
+                                error.
+                                If it is not set, but there is a Dirty bitmaps
+                                extension, its data should be considered as
+                                inconsistent.
+
+                    Bits 1-63:  Reserved (set to 0)
 
          96 -  99:  refcount_order
                     Describes the width of a reference count block entry (width
@@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
                         0x6803f857 - Feature name table
+                        0x23852875 - Dirty bitmaps
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
                     terminated if it has full length)
 
 
+== Dirty bitmaps ==
+
+Dirty bitmaps is an optional header extension. It provides an ability to store
+dirty bitmaps in a qcow2 image. The data of this extension should be considered
+as consistent only if corresponding auto-clear feature bit is set (see
+autoclear_features above).
+The fields of Dirty bitmaps extension are:
+
+          0 -  3:  nb_dirty_bitmaps
+                   The number of dirty bitmaps contained in the image. Valid
+                   values: 1 - 65535.
+# Let's be strict, the feature should be deleted with deleting last bitmap.
+
+          4 -  7:  dirty_bitmap_directory_size
+                   Size of the Dirty Bitmap Directory in bytes. It should be
+                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
+                   headers.
+# This field is necessary to effectively read Dirty Bitmap Directory, because
+# it's entries (which are dirty bitmap headers) may have different lengths.
+
+          8 - 15:  dirty_bitmap_directory_offset
+                   Offset into the image file at which the Dirty Bitmap
+                   Directory starts. Must be aligned to a cluster boundary.
+
+
 == Host cluster management ==
 
 qcow2 manages the allocation of host clusters by maintaining a reference count
@@ -360,3 +396,116 @@ Snapshot table entry:
 
         variable:   Padding to round up the snapshot table entry size to the
                     next multiple of 8.
+
+
+== Dirty bitmaps ==
+
+The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
+are relating to the virtual disk, stored in this image.
+
+=== Dirty Bitmap Directory ===
+
+Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
+entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
+starting offset and length are given by the header extension fields
+dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
+the bitmap directory have variable length, depending on the length of the
+bitmap name. These entries are also called dirty bitmap headers.
+
+Dirty Bitmap Directory Entry:
+
+    Byte 0 -  7:    dirty_bitmap_table_offset
+                    Offset into the image file at which the Dirty Bitmap Table
+                    (described below) for the bitmap starts. Must be aligned to
+                    a cluster boundary.
+
+         8 - 11:    dirty_bitmap_table_size
+                    Number of entries in the Dirty Bitmap Table of the bitmap.
+
+        12 - 15:    flags
+                    Bit
+                      0: in_use
+                         The bitmap was not saved correctly and may be
+                         inconsistent.
+
+                      1: auto
+                         The bitmap should be autoloaded as block dirty bitmap
+                         and tracking should be started. Type of the bitmap
+                         should be 'Dirty Tracking Bitmap'.
+
+                    Bits 2 - 31 are reserved and must be 0.
+
+        16 - 17:    name_size
+                    Size of the bitmap name. Valid values: 1 - 1023.
+
+             18:    type
+                    This field describes the sort of the bitmap.
+                    Values:
+                      0: Dirty Tracking Bitmap
+
+                    Values 1 - 255 are reserved.
+# This is mostly for error checking and information in qemu-img info output.
+# The other types may be, for example, "Backup Bitmap" - to make it possible
+# stop backup job on vm stop and resume it later. The another one is "Sector
+# Alloction Bitmap" (Fam, John, please comment).
+
+             19:    granularity_bits
+                    Granularity bits. Valid values are: 0 - 31.
+# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
+# there are no reasons of increasing it.
+
+                    Granularity is calculated as
+                        granularity = 1 << granularity_bits
+
+                    Granularity of the bitmap is how many bytes of the image
+                    accounts for one bit of the bitmap.
+# To be closer to qcow2 and its reality, I've decided to use byte-granularity
+# here, not sector-granularity.
+
+        variable:   The name of the bitmap (not null terminated). Should be
+                    unique among all dirty bitmap names within the Dirty
+                    bitmaps extension.
+
+        variable:   Padding to round up the Dirty Bitmap Directory Entry size
+                    to the next multiple of 8.
+
+=== Dirty Bitmap Table ===
+
+Dirty bitmaps are stored using a one-level (not two-level like refcounts and
+guest clusters mapping) structure for the mapping of bitmaps to host clusters.
+It is called Dirty Bitmap Table.
+
+Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
+Directory Entry) and may use multiple clusters, however it must be contiguous
+in the image file.
+
+Given an offset (in bytes) into the bitmap, the offset into the image file can
+be obtained as follows:
+
+    byte_offset =
+        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
+
+Taking into account the granularity of the bitmap, an offset in bits into the
+image file, corresponding to byte number byte_nr of the image can be calculated
+like this:
+
+    bit_offset =
+        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
+
+Note: the last formula is only for understanding the things, it is unlikely for
+it to be useful in a program.
+
+Dirty Bitmap Table entry:
+
+    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
+                    If bits 9 - 55 are zero:
+                      0: Cluster should be read as all zeros.
+                      1: Cluster should be read as all ones.
+
+         1 -  8:    Reserved and must be zero.
+
+         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
+                    cluster boundary. If the offset is 0, the cluster is
+                    unallocated, see bit 0 description.
+
+        56 - 63:    Reserved, must be 0.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 17:43 [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
@ 2015-12-14 20:05 ` Max Reitz
  2015-12-14 20:45   ` John Snow
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Max Reitz @ 2015-12-14 20:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, famz, stefanha, den, jsnow

[-- Attachment #1: Type: text/plain, Size: 13017 bytes --]

On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
> The new feature for qcow2: storing dirty bitmaps.
> 
> Only dirty bitmaps relative to this qcow2 image should be stored in it.
> 
> Strings started from +# are RFC-strings, not to be commited of course.
> 
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 
>  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 150 insertions(+), 1 deletion(-)

Overall: Looks better to me. Good enough for me to ACK it, but I still
have some issues with it.

Let's evaluate the main point of critique I had: I really want this not
to be qemu-specific but potentially useful to all programs.

Pretty good: You do implicitly describe what a (dirty) bitmap looks like
by describing how to obtain the bit offset of a certain byte guest
offset. So it's not an opaque binary data dump anymore.

(Why only "pretty good"? I find the description to be a bit too
"implicit", I think a separate section describing the bitmap structure
would be better.)

Good: The bitmap actually describes the qcow2 file.

Not so good: While now any program knows how to read the bitmap and that
it does refer to this qcow2 file, it's interpretation is not so easy
still. Generally, a dirty bitmap has some reference point, that is the
state of the disk when the bitmap was cleared or created. For instance,
for incremental backups, whenever you create a backup based on a dirty
bitmap, the dirty bitmap is cleared and the backup target is then said
reference point.
I think it would be nice to put that reference point (i.e. the name of
an image file that contains the clean image) into the dirty bitmap
header, if possible.


(Note: I won't comment on orthography, because I feel like that is
something a native speaker should do. O:-))

> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 121dfc8..3c89580 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,17 @@ in the description of a field.
>                      write to an image with unknown auto-clear features if it
>                      clears the respective bits from this field first.
>  
> -                    Bits 0-63:  Reserved (set to 0)
> +                    Bit 0:      Dirty bitmaps bit.
> +                                This bit is responsible for Dirty bitmaps
> +                                extension consistency.
> +                                If it is set, but there is no Dirty bitmaps
> +                                extensions, this should be considered as an
> +                                error.
> +                                If it is not set, but there is a Dirty bitmaps
> +                                extension, its data should be considered as
> +                                inconsistent.
> +
> +                    Bits 1-63:  Reserved (set to 0)
>  
>           96 -  99:  refcount_order
>                      Describes the width of a reference count block entry (width
> @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0x23852875 - Dirty bitmaps
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Dirty bitmaps ==
> +
> +Dirty bitmaps is an optional header extension. It provides an ability to store
> +dirty bitmaps in a qcow2 image. The data of this extension should be considered
> +as consistent only if corresponding auto-clear feature bit is set (see
> +autoclear_features above).
> +The fields of Dirty bitmaps extension are:
> +
> +          0 -  3:  nb_dirty_bitmaps
> +                   The number of dirty bitmaps contained in the image. Valid
> +                   values: 1 - 65535.

Again, I don't see a reason for why we should impose a strict upper
limit here. I'd prefer "Note that qemu currently only supports up to
65535 dirty bitmaps per image."

> +# Let's be strict, the feature should be deleted with deleting last bitmap.
> +
> +          4 -  7:  dirty_bitmap_directory_size
> +                   Size of the Dirty Bitmap Directory in bytes. It should be
> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
> +                   headers.

No, it "should" not be equal, it *must* be equal. But I think you can
just omit that last sentence, that would be just as fine.

> +# This field is necessary to effectively read Dirty Bitmap Directory, because
> +# it's entries (which are dirty bitmap headers) may have different lengths.
> +
> +          8 - 15:  dirty_bitmap_directory_offset
> +                   Offset into the image file at which the Dirty Bitmap
> +                   Directory starts. Must be aligned to a cluster boundary.
> +
> +
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> @@ -360,3 +396,116 @@ Snapshot table entry:
>  
>          variable:   Padding to round up the snapshot table entry size to the
>                      next multiple of 8.
> +
> +
> +== Dirty bitmaps ==
> +
> +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
> +are relating to the virtual disk, stored in this image.
> +
> +=== Dirty Bitmap Directory ===
> +
> +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
> +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
> +starting offset and length are given by the header extension fields
> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
> +the bitmap directory have variable length, depending on the length of the
> +bitmap name. These entries are also called dirty bitmap headers.
> +
> +Dirty Bitmap Directory Entry:
> +
> +    Byte 0 -  7:    dirty_bitmap_table_offset
> +                    Offset into the image file at which the Dirty Bitmap Table
> +                    (described below) for the bitmap starts. Must be aligned to
> +                    a cluster boundary.
> +
> +         8 - 11:    dirty_bitmap_table_size
> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
> +
> +        12 - 15:    flags
> +                    Bit
> +                      0: in_use
> +                         The bitmap was not saved correctly and may be
> +                         inconsistent.
> +
> +                      1: auto
> +                         The bitmap should be autoloaded as block dirty bitmap
> +                         and tracking should be started. Type of the bitmap
> +                         should be 'Dirty Tracking Bitmap'.

I find the wording a bit too qemu-specific. How about this:

This bitmap is the default dirty bitmap for the virtual disk represented
by this qcow2 image. It should track all write accesses immediately
after the image has been opened.

And I find the "should" in "Type of the bitmap should be..." a bit too
weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.

> +
> +                    Bits 2 - 31 are reserved and must be 0.
> +
> +        16 - 17:    name_size
> +                    Size of the bitmap name. Valid values: 1 - 1023.
> +
> +             18:    type
> +                    This field describes the sort of the bitmap.
> +                    Values:
> +                      0: Dirty Tracking Bitmap

If we allow different kinds of bitmaps, it should not be called "dirty
bitmap" everywhere anymore.

> +
> +                    Values 1 - 255 are reserved.
> +# This is mostly for error checking and information in qemu-img info output.
> +# The other types may be, for example, "Backup Bitmap" - to make it possible
> +# stop backup job on vm stop and resume it later. The another one is "Sector
> +# Alloction Bitmap" (Fam, John, please comment).

I'm waiting for their comments because that sounds like "refcount table
with refcount_bits=1" to me. :-)

> +             19:    granularity_bits
> +                    Granularity bits. Valid values are: 0 - 31.
> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
> +# there are no reasons of increasing it.

Good (implicit) question. I can't imagine any reason for wanting to have
a coarser granularity than 2 GB, but I do think there may be a need in
the future for some people.

Once again, I think we should discriminate between what is generally a
useful limitation and what is simply due to qemu not supporting anything
else right now.

Thus, I think it would be better to increase the range to 0 - 63 and
make a note that qemu only supports values up to 31 right now.

> +
> +                    Granularity is calculated as
> +                        granularity = 1 << granularity_bits
> +
> +                    Granularity of the bitmap is how many bytes of the image
> +                    accounts for one bit of the bitmap.
> +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
> +# here, not sector-granularity.

I like that. But do note that qcow2 does align everything at least to
512 bytes, so having used sector granularity wouldn't have been too bad.

> +
> +        variable:   The name of the bitmap (not null terminated). Should be
> +                    unique among all dirty bitmap names within the Dirty
> +                    bitmaps extension.
> +
> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
> +                    to the next multiple of 8.

What I'd like here is variable additional information based on the
bitmap type. For some types, this may be absolutely necessary; for dirty
tracking bitmaps it depends on what we do about the reference point thing.

The reference point thing is the following: As mentioned at the top, I'd
like there to be some kind of description of what the clean state was.
As far as I know, this is generally a backup in the form of a file. In
that case, we could put that filename here.

I don't think not having a reference point description is a serious show
stopper. qemu itself does rely on the management layer to know which
bitmap to use when. But I think it would be pretty nice to have it here.

> +
> +=== Dirty Bitmap Table ===
> +
> +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
> +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
> +It is called Dirty Bitmap Table.
> +
> +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> +Directory Entry) and may use multiple clusters, however it must be contiguous
> +in the image file.
> +
> +Given an offset (in bytes) into the bitmap, the offset into the image file can
> +be obtained as follows:
> +
> +    byte_offset =
> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
> +
> +Taking into account the granularity of the bitmap, an offset in bits into the
> +image file, corresponding to byte number byte_nr of the image can be calculated
> +like this:
> +
> +    bit_offset =
> +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
> +
> +Note: the last formula is only for understanding the things, it is unlikely for
> +it to be useful in a program.

I think this note is superfluous. All the pseudo-code in this file is
only that, pseudo-code. ;-)

Apart from that, I think this last formula should be in its own section
("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
bitmap. Putting this term there should basically suffice.

I was about to say I'd like it to define the bit order, too (i.e. "bit
offset 0 is the LSb"), but, well, it just uses the bit order used
everywhere in qcow2.

> +
> +Dirty Bitmap Table entry:
> +
> +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
> +                    If bits 9 - 55 are zero:
> +                      0: Cluster should be read as all zeros.
> +                      1: Cluster should be read as all ones.
> +
> +         1 -  8:    Reserved and must be zero.
> +
> +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
> +                    cluster boundary. If the offset is 0, the cluster is
> +                    unallocated, see bit 0 description.
> +
> +        56 - 63:    Reserved, must be 0.
> 

Looks good.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 20:05 ` Max Reitz
@ 2015-12-14 20:45   ` John Snow
  2015-12-14 21:44     ` Max Reitz
  2015-12-21 13:41     ` Vladimir Sementsov-Ogievskiy
  2015-12-15  4:18   ` Fam Zheng
  2015-12-15  9:58   ` Kevin Wolf
  2 siblings, 2 replies; 14+ messages in thread
From: John Snow @ 2015-12-14 20:45 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, famz, stefanha



On 12/14/2015 03:05 PM, Max Reitz wrote:
> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>> The new feature for qcow2: storing dirty bitmaps.
>>
>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>
>> Strings started from +# are RFC-strings, not to be commited of course.
>>
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>>  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 150 insertions(+), 1 deletion(-)
> 
> Overall: Looks better to me. Good enough for me to ACK it, but I still
> have some issues with it.
> 
> Let's evaluate the main point of critique I had: I really want this not
> to be qemu-specific but potentially useful to all programs.
> 
> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
> by describing how to obtain the bit offset of a certain byte guest
> offset. So it's not an opaque binary data dump anymore.
> 
> (Why only "pretty good"? I find the description to be a bit too
> "implicit", I think a separate section describing the bitmap structure
> would be better.)
> 
> Good: The bitmap actually describes the qcow2 file.
> 
> Not so good: While now any program knows how to read the bitmap and that
> it does refer to this qcow2 file, it's interpretation is not so easy
> still. Generally, a dirty bitmap has some reference point, that is the
> state of the disk when the bitmap was cleared or created. For instance,
> for incremental backups, whenever you create a backup based on a dirty
> bitmap, the dirty bitmap is cleared and the backup target is then said
> reference point.
> I think it would be nice to put that reference point (i.e. the name of
> an image file that contains the clean image) into the dirty bitmap
> header, if possible.
> 

This starts to get a little spooky, because QEMU doesn't necessarily
know where (or what) the reference is. QEMU doesn't even know where the
last incremental is.

It might be hard to store something meaningful here.

I suppose the management application could do it itself if it wants to.

> 
> (Note: I won't comment on orthography, because I feel like that is
> something a native speaker should do. O:-))
> 
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..3c89580 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,17 @@ in the description of a field.
>>                      write to an image with unknown auto-clear features if it
>>                      clears the respective bits from this field first.
>>  
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Dirty bitmaps bit.
>> +                                This bit is responsible for Dirty bitmaps
>> +                                extension consistency.
>> +                                If it is set, but there is no Dirty bitmaps
>> +                                extensions, this should be considered as an
>> +                                error.
>> +                                If it is not set, but there is a Dirty bitmaps
>> +                                extension, its data should be considered as
>> +                                inconsistent.
>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>  
>>           96 -  99:  refcount_order
>>                      Describes the width of a reference count block entry (width
>> @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
>>                          0x00000000 - End of the header extension area
>>                          0xE2792ACA - Backing file format name
>>                          0x6803f857 - Feature name table
>> +                        0x23852875 - Dirty bitmaps
>>                          other      - Unknown header extension, can be safely
>>                                       ignored
>>  
>> @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
>>                      terminated if it has full length)
>>  
>>  
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>> +dirty bitmaps in a qcow2 image. The data of this extension should be considered
>> +as consistent only if corresponding auto-clear feature bit is set (see
>> +autoclear_features above).
>> +The fields of Dirty bitmaps extension are:
>> +
>> +          0 -  3:  nb_dirty_bitmaps
>> +                   The number of dirty bitmaps contained in the image. Valid
>> +                   values: 1 - 65535.
> 
> Again, I don't see a reason for why we should impose a strict upper
> limit here. I'd prefer "Note that qemu currently only supports up to
> 65535 dirty bitmaps per image."
> 

After discussing this with Eric, I agree.

It's hard to present justification for arbitrary limits if our only
concern is "That's just too many."

Let's list the pragmatic limit we intend to impose in QEMU, but allow
the format to use the fill width of this field.

>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
>> +
>> +          4 -  7:  dirty_bitmap_directory_size
>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>> +                   headers.
> 
> No, it "should" not be equal, it *must* be equal. But I think you can
> just omit that last sentence, that would be just as fine.
> 

It's informative, though. Just another clarifying detail that the bitmap
directory is the collection of all the dirty bitmap headers.

Replacing "should" with "must" is sufficient.

>> +# This field is necessary to effectively read Dirty Bitmap Directory, because
>> +# it's entries (which are dirty bitmap headers) may have different lengths.
>> +
>> +          8 - 15:  dirty_bitmap_directory_offset
>> +                   Offset into the image file at which the Dirty Bitmap
>> +                   Directory starts. Must be aligned to a cluster boundary.
>> +
>> +
>>  == Host cluster management ==
>>  
>>  qcow2 manages the allocation of host clusters by maintaining a reference count
>> @@ -360,3 +396,116 @@ Snapshot table entry:
>>  
>>          variable:   Padding to round up the snapshot table entry size to the
>>                      next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
>> +are relating to the virtual disk, stored in this image.
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
>> +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
>> +starting offset and length are given by the header extension fields
>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>> +the bitmap directory have variable length, depending on the length of the
>> +bitmap name. These entries are also called dirty bitmap headers.
>> +
>> +Dirty Bitmap Directory Entry:
>> +
>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>> +                    Offset into the image file at which the Dirty Bitmap Table
>> +                    (described below) for the bitmap starts. Must be aligned to
>> +                    a cluster boundary.
>> +
>> +         8 - 11:    dirty_bitmap_table_size
>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>> +
>> +        12 - 15:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap was not saved correctly and may be
>> +                         inconsistent.
>> +
>> +                      1: auto
>> +                         The bitmap should be autoloaded as block dirty bitmap
>> +                         and tracking should be started. Type of the bitmap
>> +                         should be 'Dirty Tracking Bitmap'.
> 
> I find the wording a bit too qemu-specific. How about this:
> 
> This bitmap is the default dirty bitmap for the virtual disk represented
> by this qcow2 image. It should track all write accesses immediately
> after the image has been opened.
> 

Closer; though we can certainly have more than one, so "a" may not be
appropriate.

"This bitmap is a dirty tracking bitmap for the virtual disk represented
by this qcow2 image."

And to avoid "should" again:

"All writes to this file must also be represented in this bitmap."

> And I find the "should" in "Type of the bitmap should be..." a bit too
> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
> 

Sure.

>> +
>> +                    Bits 2 - 31 are reserved and must be 0.
>> +
>> +        16 - 17:    name_size
>> +                    Size of the bitmap name. Valid values: 1 - 1023.
>> +
>> +             18:    type
>> +                    This field describes the sort of the bitmap.
>> +                    Values:
>> +                      0: Dirty Tracking Bitmap
> 
> If we allow different kinds of bitmaps, it should not be called "dirty
> bitmap" everywhere anymore.
> 

Agreed.

>> +
>> +                    Values 1 - 255 are reserved.
>> +# This is mostly for error checking and information in qemu-img info output.
>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>> +# Alloction Bitmap" (Fam, John, please comment).
> 
> I'm waiting for their comments because that sounds like "refcount table
> with refcount_bits=1" to me. :-)
> 

The idea is that we may allow for bitmaps to store other kinds of
information, like "allocated in this layer." That information is not
necessarily useful to qcow2, but it might be for other image formats. If
we ever do add such subtypes, we can always add a new reserved entry:

# 1: Reserved - Invalid for qcow2
# 2: Backup Progress Bitmap ...
# 3-255: Reserved

Backup progress bitmaps may indeed be useful and sane information to
store in a qcow2, though.

Fam may have more opinions as he's been working on this area of thought
recently.

>> +             19:    granularity_bits
>> +                    Granularity bits. Valid values are: 0 - 31.
>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>> +# there are no reasons of increasing it.
> 
> Good (implicit) question. I can't imagine any reason for wanting to have
> a coarser granularity than 2 GB, but I do think there may be a need in
> the future for some people.
> 
> Once again, I think we should discriminate between what is generally a
> useful limitation and what is simply due to qemu not supporting anything
> else right now.
> 
> Thus, I think it would be better to increase the range to 0 - 63 and
> make a note that qemu only supports values up to 31 right now.
> 

I suppose that won't hurt anything.

(I look forward to the future where the hard drives are so big and the
network bandwidth so bountiful that 2GB granularity is seen as "too
fine-grained!")

>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
>> +
>> +                    Granularity of the bitmap is how many bytes of the image
>> +                    accounts for one bit of the bitmap.
>> +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
>> +# here, not sector-granularity.
> 
> I like that. But do note that qcow2 does align everything at least to
> 512 bytes, so having used sector granularity wouldn't have been too bad.
> 
>> +
>> +        variable:   The name of the bitmap (not null terminated). Should be
>> +                    unique among all dirty bitmap names within the Dirty
>> +                    bitmaps extension.
>> +
>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>> +                    to the next multiple of 8.
> 
> What I'd like here is variable additional information based on the
> bitmap type. For some types, this may be absolutely necessary; for dirty
> tracking bitmaps it depends on what we do about the reference point thing.
> 
> The reference point thing is the following: As mentioned at the top, I'd
> like there to be some kind of description of what the clean state was.
> As far as I know, this is generally a backup in the form of a file. In
> that case, we could put that filename here.
> 

We may also have exported that backup to an NBD server and we're not
sure (on the local end) where that data is anymore, though.

For local utility usage, when a reference is possible, we might be able
to list it as an optional nice thing, but I think requiring it might be
difficult.

> I don't think not having a reference point description is a serious show
> stopper. qemu itself does rely on the management layer to know which
> bitmap to use when. But I think it would be pretty nice to have it here.
> 

I'm not opposed to listing some "nice" information when available.

last_backup /path/to/incremental.5.qcow2
base_backup /path/to/reference.qcow2

>> +
>> +=== Dirty Bitmap Table ===
>> +
>> +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
>> +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
>> +It is called Dirty Bitmap Table.
>> +
>> +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>> +be obtained as follows:
>> +
>> +    byte_offset =
>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>> +
>> +Taking into account the granularity of the bitmap, an offset in bits into the
>> +image file, corresponding to byte number byte_nr of the image can be calculated
>> +like this:
>> +
>> +    bit_offset =
>> +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
>> +
>> +Note: the last formula is only for understanding the things, it is unlikely for
>> +it to be useful in a program.
> 
> I think this note is superfluous. All the pseudo-code in this file is
> only that, pseudo-code. ;-)
> 
> Apart from that, I think this last formula should be in its own section
> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
> bitmap. Putting this term there should basically suffice.
> 
> I was about to say I'd like it to define the bit order, too (i.e. "bit
> offset 0 is the LSb"), but, well, it just uses the bit order used
> everywhere in qcow2.
> 
>> +
>> +Dirty Bitmap Table entry:
>> +
>> +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
>> +                    If bits 9 - 55 are zero:
>> +                      0: Cluster should be read as all zeros.
>> +                      1: Cluster should be read as all ones.
>> +
>> +         1 -  8:    Reserved and must be zero.
>> +
>> +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
>> +                    cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated, see bit 0 description.
>> +
>> +        56 - 63:    Reserved, must be 0.
>>
> 
> Looks good.
> 
> Max
> 

Thank you, Max & Vladimir.

--js

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 20:45   ` John Snow
@ 2015-12-14 21:44     ` Max Reitz
  2015-12-14 22:26       ` John Snow
  2015-12-21 13:41     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 14+ messages in thread
From: Max Reitz @ 2015-12-14 21:44 UTC (permalink / raw)
  To: John Snow, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, famz, stefanha

[-- Attachment #1: Type: text/plain, Size: 11457 bytes --]

On 14.12.2015 21:45, John Snow wrote:
> 
> 
> On 12/14/2015 03:05 PM, Max Reitz wrote:
>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing dirty bitmaps.
>>>
>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>
>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>>  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 150 insertions(+), 1 deletion(-)
>>
>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>> have some issues with it.
>>
>> Let's evaluate the main point of critique I had: I really want this not
>> to be qemu-specific but potentially useful to all programs.
>>
>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>> by describing how to obtain the bit offset of a certain byte guest
>> offset. So it's not an opaque binary data dump anymore.
>>
>> (Why only "pretty good"? I find the description to be a bit too
>> "implicit", I think a separate section describing the bitmap structure
>> would be better.)
>>
>> Good: The bitmap actually describes the qcow2 file.
>>
>> Not so good: While now any program knows how to read the bitmap and that
>> it does refer to this qcow2 file, it's interpretation is not so easy
>> still. Generally, a dirty bitmap has some reference point, that is the
>> state of the disk when the bitmap was cleared or created. For instance,
>> for incremental backups, whenever you create a backup based on a dirty
>> bitmap, the dirty bitmap is cleared and the backup target is then said
>> reference point.
>> I think it would be nice to put that reference point (i.e. the name of
>> an image file that contains the clean image) into the dirty bitmap
>> header, if possible.
>>
> 
> This starts to get a little spooky, because QEMU doesn't necessarily
> know where (or what) the reference is. QEMU doesn't even know where the
> last incremental is.
> 
> It might be hard to store something meaningful here.

Yes, I thought as much, that's where the "if possible" comes in.

I don't think it would hurt to include the field even if we're unsure
how much use we can make of it. If we can store something nice there,
great! If we can't, too bad.

> I suppose the management application could do it itself if it wants to.

Considering the qcow2 image is closed by qemu anyway just after the
bitmaps are written into it, this might indeed be possible. But I
wouldn't count on it.

>>
>> (Note: I won't comment on orthography, because I feel like that is
>> something a native speaker should do. O:-))
>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..3c89580 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt

[...]

>>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
>>> +
>>> +          4 -  7:  dirty_bitmap_directory_size
>>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>>> +                   headers.
>>
>> No, it "should" not be equal, it *must* be equal. But I think you can
>> just omit that last sentence, that would be just as fine.
>>
> 
> It's informative, though. Just another clarifying detail that the bitmap
> directory is the collection of all the dirty bitmap headers.
> 
> Replacing "should" with "must" is sufficient.

Of course.

[...]

>>> +=== Dirty Bitmap Directory ===

[...]

>>> +        12 - 15:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap was not saved correctly and may be
>>> +                         inconsistent.
>>> +
>>> +                      1: auto
>>> +                         The bitmap should be autoloaded as block dirty bitmap
>>> +                         and tracking should be started. Type of the bitmap
>>> +                         should be 'Dirty Tracking Bitmap'.
>>
>> I find the wording a bit too qemu-specific. How about this:
>>
>> This bitmap is the default dirty bitmap for the virtual disk represented
>> by this qcow2 image. It should track all write accesses immediately
>> after the image has been opened.
>>
> 
> Closer; though we can certainly have more than one, so "a" may not be
> appropriate.
> 
> "This bitmap is a dirty tracking bitmap for the virtual disk represented
> by this qcow2 image."

Hm, aren't all dirty tracking bitmaps of this image... Well, dirty
tracking bitmaps of this image?

We currently don't have anything but dirty tracking bitmaps, and the
flag actually is explicitly valid only for that kind of bitmaps (for
now); also, in this revision, all dirty bitmaps do refer to this very
qcow2 file anyway.

Maybe I just interpreted the sentence wrong and put too much weight into
the first part ("should be autoloaded as block dirty bitmap") so I felt
the need to translate it to an explicit "is the default".

So I guess this bit simply means that the bitmap should be active in
that it tracks new writes?

> And to avoid "should" again:
> 
> "All writes to this file must also be represented in this bitmap."

:-)

I wasn't sure here. I felt "should" appropriate because maybe you do
have a very compelling reason not to do so and to let a certain write
access slip by on purpose.

So this is up to you, if it's "must", then it's "must". If it's
"should", then it's "should".

>> And I find the "should" in "Type of the bitmap should be..." a bit too
>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>
> 
> Sure.
> 

[...]

>>> +# This is mostly for error checking and information in qemu-img info output.
>>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>>> +# Alloction Bitmap" (Fam, John, please comment).
>>
>> I'm waiting for their comments because that sounds like "refcount table
>> with refcount_bits=1" to me. :-)
>>
> 
> The idea is that we may allow for bitmaps to store other kinds of
> information, like "allocated in this layer." That information is not
> necessarily useful to qcow2, but it might be for other image formats. If
> we ever do add such subtypes, we can always add a new reserved entry:
> 
> # 1: Reserved - Invalid for qcow2
> # 2: Backup Progress Bitmap ...
> # 3-255: Reserved
> 
> Backup progress bitmaps may indeed be useful and sane information to
> store in a qcow2, though.

OK. I was wondering because I want to (again) make sure that these other
bitmaps are not plain dumps of some obscure bitmap qemu has, but that we
can imagine bitmaps that are generally useful.

First: OK, it doesn't really matter. Having this field definitely is
good, even if we wouldn't actually use it.

<offtopic reason="This series is about dirty bitmaps and not about other
types of bitmaps">
Now, my thoughts about backup progress bitmaps are the following:

If we put the backup progress into the source file, that's a bit
strange. You aren't modifying this file, so why would you put it here?
Also, no program but the one doing the backup can do anything with that
bitmap. Others will just say "Nice to know how far you're in your
backup, but, well, am I supposed do something about it?"

Instead, we could put it into the target. That would be nicer, because
then you open the target and you can see "Oh, there's a backup progress
bitmap! I guess I'll need to copy these clusters from the backup source
[given in the backup bitmap header]."

Strictly speaking, the same can be achived by combining the dirty bitmap
used for the backup (in case of an incremental backup) and the refcount
information for the backup target. However, finding the right dirty
bitmap may be a bit cumbersome, so I do think having such a backup
progress bitmap (in the target image) is justified.
</offtopic>

(If you've already decided to put the backup progress into the target
image, you may find that offtopic block not very helpful; however, it
may help you get an idea of when I consider some information to be
beneficial to a qcow2 file.)

> Fam may have more opinions as he's been working on this area of thought
> recently.
> 
>>> +             19:    granularity_bits
>>> +                    Granularity bits. Valid values are: 0 - 31.
>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>> +# there are no reasons of increasing it.
>>
>> Good (implicit) question. I can't imagine any reason for wanting to have
>> a coarser granularity than 2 GB, but I do think there may be a need in
>> the future for some people.
>>
>> Once again, I think we should discriminate between what is generally a
>> useful limitation and what is simply due to qemu not supporting anything
>> else right now.
>>
>> Thus, I think it would be better to increase the range to 0 - 63 and
>> make a note that qemu only supports values up to 31 right now.
>>
> 
> I suppose that won't hurt anything.
> 
> (I look forward to the future where the hard drives are so big and the
> network bandwidth so bountiful that 2GB granularity is seen as "too
> fine-grained!")

640k... ;-)

[...]

>>> +
>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>> +                    unique among all dirty bitmap names within the Dirty
>>> +                    bitmaps extension.
>>> +
>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>> +                    to the next multiple of 8.
>>
>> What I'd like here is variable additional information based on the
>> bitmap type. For some types, this may be absolutely necessary; for dirty
>> tracking bitmaps it depends on what we do about the reference point thing.
>>
>> The reference point thing is the following: As mentioned at the top, I'd
>> like there to be some kind of description of what the clean state was.
>> As far as I know, this is generally a backup in the form of a file. In
>> that case, we could put that filename here.
>>
> 
> We may also have exported that backup to an NBD server and we're not
> sure (on the local end) where that data is anymore, though.
> 
> For local utility usage, when a reference is possible, we might be able
> to list it as an optional nice thing, but I think requiring it might be
> difficult.

You're right, we don't need to require it.

>> I don't think not having a reference point description is a serious show
>> stopper. qemu itself does rely on the management layer to know which
>> bitmap to use when. But I think it would be pretty nice to have it here.
>>
> 
> I'm not opposed to listing some "nice" information when available.
> 
> last_backup /path/to/incremental.5.qcow2
> base_backup /path/to/reference.qcow2
> 

I don't think we need the base_backup since you can get that by walking
through the backing chain of the reference point (the backup target),
but it probably won't hurt if you can make a good general semantic
connection to the dirty bitmap.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 21:44     ` Max Reitz
@ 2015-12-14 22:26       ` John Snow
  2015-12-15  4:34         ` Fam Zheng
  2015-12-15 14:10         ` Max Reitz
  0 siblings, 2 replies; 14+ messages in thread
From: John Snow @ 2015-12-14 22:26 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, famz, stefanha



On 12/14/2015 04:44 PM, Max Reitz wrote:
> On 14.12.2015 21:45, John Snow wrote:
>>
>>
>> On 12/14/2015 03:05 PM, Max Reitz wrote:
>>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>>> The new feature for qcow2: storing dirty bitmaps.
>>>>
>>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>>
>>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>>
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>
>>>>  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 150 insertions(+), 1 deletion(-)
>>>
>>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>>> have some issues with it.
>>>
>>> Let's evaluate the main point of critique I had: I really want this not
>>> to be qemu-specific but potentially useful to all programs.
>>>
>>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>>> by describing how to obtain the bit offset of a certain byte guest
>>> offset. So it's not an opaque binary data dump anymore.
>>>
>>> (Why only "pretty good"? I find the description to be a bit too
>>> "implicit", I think a separate section describing the bitmap structure
>>> would be better.)
>>>
>>> Good: The bitmap actually describes the qcow2 file.
>>>
>>> Not so good: While now any program knows how to read the bitmap and that
>>> it does refer to this qcow2 file, it's interpretation is not so easy
>>> still. Generally, a dirty bitmap has some reference point, that is the
>>> state of the disk when the bitmap was cleared or created. For instance,
>>> for incremental backups, whenever you create a backup based on a dirty
>>> bitmap, the dirty bitmap is cleared and the backup target is then said
>>> reference point.
>>> I think it would be nice to put that reference point (i.e. the name of
>>> an image file that contains the clean image) into the dirty bitmap
>>> header, if possible.
>>>
>>
>> This starts to get a little spooky, because QEMU doesn't necessarily
>> know where (or what) the reference is. QEMU doesn't even know where the
>> last incremental is.
>>
>> It might be hard to store something meaningful here.
> 
> Yes, I thought as much, that's where the "if possible" comes in.
> 
> I don't think it would hurt to include the field even if we're unsure
> how much use we can make of it. If we can store something nice there,
> great! If we can't, too bad.
> 
>> I suppose the management application could do it itself if it wants to.
> 
> Considering the qcow2 image is closed by qemu anyway just after the
> bitmaps are written into it, this might indeed be possible. But I
> wouldn't count on it.
> 
>>>
>>> (Note: I won't comment on orthography, because I feel like that is
>>> something a native speaker should do. O:-))
>>>
>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>> index 121dfc8..3c89580 100644
>>>> --- a/docs/specs/qcow2.txt
>>>> +++ b/docs/specs/qcow2.txt
> 
> [...]
> 
>>>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
>>>> +
>>>> +          4 -  7:  dirty_bitmap_directory_size
>>>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>>>> +                   headers.
>>>
>>> No, it "should" not be equal, it *must* be equal. But I think you can
>>> just omit that last sentence, that would be just as fine.
>>>
>>
>> It's informative, though. Just another clarifying detail that the bitmap
>> directory is the collection of all the dirty bitmap headers.
>>
>> Replacing "should" with "must" is sufficient.
> 
> Of course.
> 
> [...]
> 
>>>> +=== Dirty Bitmap Directory ===
> 
> [...]
> 
>>>> +        12 - 15:    flags
>>>> +                    Bit
>>>> +                      0: in_use
>>>> +                         The bitmap was not saved correctly and may be
>>>> +                         inconsistent.
>>>> +
>>>> +                      1: auto
>>>> +                         The bitmap should be autoloaded as block dirty bitmap
>>>> +                         and tracking should be started. Type of the bitmap
>>>> +                         should be 'Dirty Tracking Bitmap'.
>>>
>>> I find the wording a bit too qemu-specific. How about this:
>>>
>>> This bitmap is the default dirty bitmap for the virtual disk represented
>>> by this qcow2 image. It should track all write accesses immediately
>>> after the image has been opened.
>>>
>>
>> Closer; though we can certainly have more than one, so "a" may not be
>> appropriate.
>>
>> "This bitmap is a dirty tracking bitmap for the virtual disk represented
>> by this qcow2 image."
> 
> Hm, aren't all dirty tracking bitmaps of this image... Well, dirty
> tracking bitmaps of this image?
> 

Tch, sorry. The hyper-generalization is setting in again. Yes, of
course. I was trying to just remove "the" in favor of "a/an."

> We currently don't have anything but dirty tracking bitmaps, and the
> flag actually is explicitly valid only for that kind of bitmaps (for
> now); also, in this revision, all dirty bitmaps do refer to this very
> qcow2 file anyway.
> 

You're right, though using the explicit language "This bitmap [which is
/definitely/ a dirty tracking type...]" is just to cement the idea.

> Maybe I just interpreted the sentence wrong and put too much weight into
> the first part ("should be autoloaded as block dirty bitmap") so I felt
> the need to translate it to an explicit "is the default".
> 
> So I guess this bit simply means that the bitmap should be active in
> that it tracks new writes?
> 

Yes: the idea is of course that QEMU will see this and know that it
needs to load this bitmap as active. It's useful for the "default"
bitmap, but there's no strict reason why there can't be multiple.

(Why is that useful? In my thoughts, it's for different backup paradigms
that backup to different places on different schedules. Weekly vs.
daily, for instance.)

>> And to avoid "should" again:
>>
>> "All writes to this file must also be represented in this bitmap."
> 
> :-)
> 
> I wasn't sure here. I felt "should" appropriate because maybe you do
> have a very compelling reason not to do so and to let a certain write
> access slip by on purpose.
> 

For the dirty type, I don't think so -- unless you have since decided to
disable the bitmap and discontinue its use (but keep the data, for
whatever reason that may be), but at that point you should have also
toggled off the 'auto' bit, so I don't think so.

> So this is up to you, if it's "must", then it's "must". If it's
> "should", then it's "should".
> 

If the auto bit is on, it's a must.

Let's try again:

"1: auto
    This bitmap is of the Dirty Tracking Type and must accurately
    reflect all writes to the virtual disk image by any application
    that would write to it. This bitmap should not lose the auto bit
    except by user intervention."

Or something like that? The idea is that it will indeed be "autoloaded"
and made active, but some bitmaps might be stored in various
frozen/unactive states where this is not desirable.

>>> And I find the "should" in "Type of the bitmap should be..." a bit too
>>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>>
>>
>> Sure.
>>
> 
> [...]
> 
>>>> +# This is mostly for error checking and information in qemu-img info output.
>>>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>>>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>>>> +# Alloction Bitmap" (Fam, John, please comment).
>>>
>>> I'm waiting for their comments because that sounds like "refcount table
>>> with refcount_bits=1" to me. :-)
>>>
>>
>> The idea is that we may allow for bitmaps to store other kinds of
>> information, like "allocated in this layer." That information is not
>> necessarily useful to qcow2, but it might be for other image formats. If
>> we ever do add such subtypes, we can always add a new reserved entry:
>>
>> # 1: Reserved - Invalid for qcow2
>> # 2: Backup Progress Bitmap ...
>> # 3-255: Reserved
>>
>> Backup progress bitmaps may indeed be useful and sane information to
>> store in a qcow2, though.
> 
> OK. I was wondering because I want to (again) make sure that these other
> bitmaps are not plain dumps of some obscure bitmap qemu has, but that we
> can imagine bitmaps that are generally useful.
> 
> First: OK, it doesn't really matter. Having this field definitely is
> good, even if we wouldn't actually use it.
> 
> <offtopic reason="This series is about dirty bitmaps and not about other
> types of bitmaps">
> Now, my thoughts about backup progress bitmaps are the following:
> 
> If we put the backup progress into the source file, that's a bit
> strange. You aren't modifying this file, so why would you put it here?
> Also, no program but the one doing the backup can do anything with that
> bitmap. Others will just say "Nice to know how far you're in your
> backup, but, well, am I supposed do something about it?"
> 

You're not wrong.

> Instead, we could put it into the target. That would be nicer, because
> then you open the target and you can see "Oh, there's a backup progress
> bitmap! I guess I'll need to copy these clusters from the backup source
> [given in the backup bitmap header]."
> 

Sure! I had not thought through the feature in any great detail -- we're
only generally aware that we may want to store more than one kind of
bitmap in the future.

There's even an RFE for this exact type of bitmap, complete with a
hilariously wrong estimate for when I'd have this done by:

https://bugzilla.redhat.com/show_bug.cgi?id=905123

> Strictly speaking, the same can be achived by combining the dirty bitmap
> used for the backup (in case of an incremental backup) and the refcount
> information for the backup target. However, finding the right dirty
> bitmap may be a bit cumbersome, so I do think having such a backup
> progress bitmap (in the target image) is justified.
> </offtopic>
> 
> (If you've already decided to put the backup progress into the target
> image, you may find that offtopic block not very helpful; however, it
> may help you get an idea of when I consider some information to be
> beneficial to a qcow2 file.)
> 

<Offtopic>
There's no code written specifically for this yet, so we're still OK. It
could reasonably go into either the source or destination:

Destination: "Here are the sectors I still need" is reasonable, as then
you can easily go and fetch the missing bits you need.

However, if the source was modified after the bitmap was written to the
destination, we have no way of knowing if anything has been updated
since we started the backup.

If we store it in the source: "Here's what's changed [and what we never
copied] since we started that backup," we can even make offline writes
to the source after we stop the backup, then resume it later and still
have it work out.

In the destination model, we point to the source for where to get the
rest of the data. In the source model, we point to the destination of
our unfinished backup.

One definitely feels more independently useful: "Here's how to get the
rest of my data" vs "I was being used for an operation you may or may
not care about.", though the less meaningful one might actually be
easier and more flexible to maintain from QEMU's standpoint.

Offtopic indeed.

</Offtopic>

The only real point is that there *might* be different semantics in the
future, but we may also be able to get away with overloading the
existing type entirely.

The reserved type enum will work out OK for this.

>> Fam may have more opinions as he's been working on this area of thought
>> recently.
>>
>>>> +             19:    granularity_bits
>>>> +                    Granularity bits. Valid values are: 0 - 31.
>>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>>> +# there are no reasons of increasing it.
>>>
>>> Good (implicit) question. I can't imagine any reason for wanting to have
>>> a coarser granularity than 2 GB, but I do think there may be a need in
>>> the future for some people.
>>>
>>> Once again, I think we should discriminate between what is generally a
>>> useful limitation and what is simply due to qemu not supporting anything
>>> else right now.
>>>
>>> Thus, I think it would be better to increase the range to 0 - 63 and
>>> make a note that qemu only supports values up to 31 right now.
>>>
>>
>> I suppose that won't hurt anything.
>>
>> (I look forward to the future where the hard drives are so big and the
>> network bandwidth so bountiful that 2GB granularity is seen as "too
>> fine-grained!")
> 
> 640k... ;-)
> 
> [...]
> 

"I look forward to the future" wasn't sarcasm as much as it was an
honest statement. :)

>>>> +
>>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>>> +                    unique among all dirty bitmap names within the Dirty
>>>> +                    bitmaps extension.
>>>> +
>>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>>> +                    to the next multiple of 8.
>>>
>>> What I'd like here is variable additional information based on the
>>> bitmap type. For some types, this may be absolutely necessary; for dirty
>>> tracking bitmaps it depends on what we do about the reference point thing.
>>>
>>> The reference point thing is the following: As mentioned at the top, I'd
>>> like there to be some kind of description of what the clean state was.
>>> As far as I know, this is generally a backup in the form of a file. In
>>> that case, we could put that filename here.
>>>
>>
>> We may also have exported that backup to an NBD server and we're not
>> sure (on the local end) where that data is anymore, though.
>>
>> For local utility usage, when a reference is possible, we might be able
>> to list it as an optional nice thing, but I think requiring it might be
>> difficult.
> 
> You're right, we don't need to require it.
> 
>>> I don't think not having a reference point description is a serious show
>>> stopper. qemu itself does rely on the management layer to know which
>>> bitmap to use when. But I think it would be pretty nice to have it here.
>>>
>>
>> I'm not opposed to listing some "nice" information when available.
>>
>> last_backup /path/to/incremental.5.qcow2
>> base_backup /path/to/reference.qcow2
>>
> 
> I don't think we need the base_backup since you can get that by walking
> through the backing chain of the reference point (the backup target),
> but it probably won't hurt if you can make a good general semantic
> connection to the dirty bitmap.
> 
> Max
> 

Yes, probably not. How would you suggest this be implemented, also? Does
the bitmap object need to begin tracking a last-known backup
filename/identifier? I remembered you didn't seem too happy about
keeping filenames in memory for QEMU, but we currently don't track this
information at all. Answer has so far been "It's up to management!"

--js

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 20:05 ` Max Reitz
  2015-12-14 20:45   ` John Snow
@ 2015-12-15  4:18   ` Fam Zheng
  2015-12-15 10:04     ` Vladimir Sementsov-Ogievskiy
  2015-12-15 16:40     ` John Snow
  2015-12-15  9:58   ` Kevin Wolf
  2 siblings, 2 replies; 14+ messages in thread
From: Fam Zheng @ 2015-12-15  4:18 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: kwolf, qemu-devel, Max Reitz, stefanha, den, jsnow

On Mon, 12/14 21:05, Max Reitz wrote:
> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
> > The new feature for qcow2: storing dirty bitmaps.
> > 
> > Only dirty bitmaps relative to this qcow2 image should be stored in it.
> > 
> > Strings started from +# are RFC-strings, not to be commited of course.
> > 
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > ---
> > 
> >  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 150 insertions(+), 1 deletion(-)
> 
> Overall: Looks better to me. Good enough for me to ACK it, but I still
> have some issues with it.
> 
> Let's evaluate the main point of critique I had: I really want this not
> to be qemu-specific but potentially useful to all programs.
> 
> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
> by describing how to obtain the bit offset of a certain byte guest
> offset. So it's not an opaque binary data dump anymore.
> 
> (Why only "pretty good"? I find the description to be a bit too
> "implicit", I think a separate section describing the bitmap structure
> would be better.)
> 
> Good: The bitmap actually describes the qcow2 file.
> 
> Not so good: While now any program knows how to read the bitmap and that
> it does refer to this qcow2 file, it's interpretation is not so easy
> still. Generally, a dirty bitmap has some reference point, that is the
> state of the disk when the bitmap was cleared or created. For instance,
> for incremental backups, whenever you create a backup based on a dirty
> bitmap, the dirty bitmap is cleared and the backup target is then said
> reference point.
> I think it would be nice to put that reference point (i.e. the name of
> an image file that contains the clean image) into the dirty bitmap
> header, if possible.
> 
> 
> (Note: I won't comment on orthography, because I feel like that is
> something a native speaker should do. O:-))
> 
> > diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> > index 121dfc8..3c89580 100644
> > --- a/docs/specs/qcow2.txt
> > +++ b/docs/specs/qcow2.txt
> > @@ -103,7 +103,17 @@ in the description of a field.
> >                      write to an image with unknown auto-clear features if it
> >                      clears the respective bits from this field first.
> >  
> > -                    Bits 0-63:  Reserved (set to 0)
> > +                    Bit 0:      Dirty bitmaps bit.
> > +                                This bit is responsible for Dirty bitmaps
> > +                                extension consistency.
> > +                                If it is set, but there is no Dirty bitmaps
> > +                                extensions, this should be considered as an
> > +                                error.
> > +                                If it is not set, but there is a Dirty bitmaps
> > +                                extension, its data should be considered as
> > +                                inconsistent.
> > +
> > +                    Bits 1-63:  Reserved (set to 0)
> >  
> >           96 -  99:  refcount_order
> >                      Describes the width of a reference count block entry (width
> > @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
> >                          0x00000000 - End of the header extension area
> >                          0xE2792ACA - Backing file format name
> >                          0x6803f857 - Feature name table
> > +                        0x23852875 - Dirty bitmaps
> >                          other      - Unknown header extension, can be safely
> >                                       ignored
> >  
> > @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
> >                      terminated if it has full length)
> >  
> >  
> > +== Dirty bitmaps ==
> > +
> > +Dirty bitmaps is an optional header extension. It provides an ability to store
> > +dirty bitmaps in a qcow2 image. The data of this extension should be considered
> > +as consistent only if corresponding auto-clear feature bit is set (see
> > +autoclear_features above).
> > +The fields of Dirty bitmaps extension are:
> > +
> > +          0 -  3:  nb_dirty_bitmaps
> > +                   The number of dirty bitmaps contained in the image. Valid
> > +                   values: 1 - 65535.
> 
> Again, I don't see a reason for why we should impose a strict upper
> limit here. I'd prefer "Note that qemu currently only supports up to
> 65535 dirty bitmaps per image."
> 
> > +# Let's be strict, the feature should be deleted with deleting last bitmap.

Do you mean unsetting the auto-clear feature bit? Yes, I think that makes sense.

> > +
> > +          4 -  7:  dirty_bitmap_directory_size
> > +                   Size of the Dirty Bitmap Directory in bytes. It should be
> > +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
> > +                   headers.
> 
> No, it "should" not be equal, it *must* be equal. But I think you can
> just omit that last sentence, that would be just as fine.
> 
> > +# This field is necessary to effectively read Dirty Bitmap Directory, because
> > +# it's entries (which are dirty bitmap headers) may have different lengths.
> > +
> > +          8 - 15:  dirty_bitmap_directory_offset
> > +                   Offset into the image file at which the Dirty Bitmap
> > +                   Directory starts. Must be aligned to a cluster boundary.
> > +
> > +
> >  == Host cluster management ==
> >  
> >  qcow2 manages the allocation of host clusters by maintaining a reference count
> > @@ -360,3 +396,116 @@ Snapshot table entry:
> >  
> >          variable:   Padding to round up the snapshot table entry size to the
> >                      next multiple of 8.
> > +
> > +
> > +== Dirty bitmaps ==
> > +
> > +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
> > +are relating to the virtual disk, stored in this image.
> > +
> > +=== Dirty Bitmap Directory ===
> > +
> > +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
> > +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
> > +starting offset and length are given by the header extension fields
> > +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
> > +the bitmap directory have variable length, depending on the length of the
> > +bitmap name. These entries are also called dirty bitmap headers.
> > +
> > +Dirty Bitmap Directory Entry:
> > +
> > +    Byte 0 -  7:    dirty_bitmap_table_offset
> > +                    Offset into the image file at which the Dirty Bitmap Table
> > +                    (described below) for the bitmap starts. Must be aligned to
> > +                    a cluster boundary.
> > +
> > +         8 - 11:    dirty_bitmap_table_size
> > +                    Number of entries in the Dirty Bitmap Table of the bitmap.
> > +
> > +        12 - 15:    flags
> > +                    Bit
> > +                      0: in_use
> > +                         The bitmap was not saved correctly and may be
> > +                         inconsistent.
> > +
> > +                      1: auto
> > +                         The bitmap should be autoloaded as block dirty bitmap
> > +                         and tracking should be started. Type of the bitmap
> > +                         should be 'Dirty Tracking Bitmap'.
> 
> I find the wording a bit too qemu-specific. How about this:
> 
> This bitmap is the default dirty bitmap for the virtual disk represented
> by this qcow2 image. It should track all write accesses immediately
> after the image has been opened.
> 
> And I find the "should" in "Type of the bitmap should be..." a bit too
> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
> 
> > +
> > +                    Bits 2 - 31 are reserved and must be 0.
> > +
> > +        16 - 17:    name_size
> > +                    Size of the bitmap name. Valid values: 1 - 1023.
> > +
> > +             18:    type
> > +                    This field describes the sort of the bitmap.
> > +                    Values:
> > +                      0: Dirty Tracking Bitmap
> 
> If we allow different kinds of bitmaps, it should not be called "dirty
> bitmap" everywhere anymore.
> 
> > +
> > +                    Values 1 - 255 are reserved.
> > +# This is mostly for error checking and information in qemu-img info output.
> > +# The other types may be, for example, "Backup Bitmap" - to make it possible
> > +# stop backup job on vm stop and resume it later. The another one is "Sector
> > +# Alloction Bitmap" (Fam, John, please comment).
> 
> I'm waiting for their comments because that sounds like "refcount table
> with refcount_bits=1" to me. :-)

Allocation information for qcow2 can be obtained from L1/L2/refcount tables, so
maybe it's not worth a type here.  I am not quite sure about the "backup
bitmap" type either, because it is QEMU specific; alternatively we can add an
"enabled" flag to each dirty bitmap, so that the "drive-backup" bitmap can be
saved as a disabled dirty tracking bitmap.

But the idea of having the type field makes sense to me in general.

> 
> > +             19:    granularity_bits
> > +                    Granularity bits. Valid values are: 0 - 31.
> > +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
> > +# there are no reasons of increasing it.
> 
> Good (implicit) question. I can't imagine any reason for wanting to have
> a coarser granularity than 2 GB, but I do think there may be a need in
> the future for some people.
> 
> Once again, I think we should discriminate between what is generally a
> useful limitation and what is simply due to qemu not supporting anything
> else right now.
> 
> Thus, I think it would be better to increase the range to 0 - 63 and
> make a note that qemu only supports values up to 31 right now.
> 
> > +
> > +                    Granularity is calculated as
> > +                        granularity = 1 << granularity_bits
> > +
> > +                    Granularity of the bitmap is how many bytes of the image
> > +                    accounts for one bit of the bitmap.
> > +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
> > +# here, not sector-granularity.
> 
> I like that. But do note that qcow2 does align everything at least to
> 512 bytes, so having used sector granularity wouldn't have been too bad.

I would then suggest limiting the valid values to 9 - 63, at least as a note
for QEMU support.

> 
> > +
> > +        variable:   The name of the bitmap (not null terminated). Should be
> > +                    unique among all dirty bitmap names within the Dirty
> > +                    bitmaps extension.
> > +
> > +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
> > +                    to the next multiple of 8.
> 
> What I'd like here is variable additional information based on the
> bitmap type. For some types, this may be absolutely necessary; for dirty
> tracking bitmaps it depends on what we do about the reference point thing.
> 
> The reference point thing is the following: As mentioned at the top, I'd
> like there to be some kind of description of what the clean state was.
> As far as I know, this is generally a backup in the form of a file. In
> that case, we could put that filename here.
> 
> I don't think not having a reference point description is a serious show
> stopper. qemu itself does rely on the management layer to know which
> bitmap to use when. But I think it would be pretty nice to have it here.
> 
> > +
> > +=== Dirty Bitmap Table ===
> > +
> > +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
> > +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
> > +It is called Dirty Bitmap Table.
> > +
> > +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> > +Directory Entry) and may use multiple clusters, however it must be contiguous
> > +in the image file.
> > +
> > +Given an offset (in bytes) into the bitmap, the offset into the image file can
> > +be obtained as follows:
> > +
> > +    byte_offset =
> > +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
> > +
> > +Taking into account the granularity of the bitmap, an offset in bits into the
> > +image file, corresponding to byte number byte_nr of the image can be calculated
> > +like this:
> > +
> > +    bit_offset =
> > +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
> > +
> > +Note: the last formula is only for understanding the things, it is unlikely for
> > +it to be useful in a program.
> 
> I think this note is superfluous. All the pseudo-code in this file is
> only that, pseudo-code. ;-)
> 
> Apart from that, I think this last formula should be in its own section
> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
> bitmap. Putting this term there should basically suffice.
> 
> I was about to say I'd like it to define the bit order, too (i.e. "bit
> offset 0 is the LSb"), but, well, it just uses the bit order used
> everywhere in qcow2.
> 
> > +
> > +Dirty Bitmap Table entry:
> > +
> > +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
> > +                    If bits 9 - 55 are zero:
> > +                      0: Cluster should be read as all zeros.
> > +                      1: Cluster should be read as all ones.
> > +
> > +         1 -  8:    Reserved and must be zero.
> > +
> > +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
> > +                    cluster boundary. If the offset is 0, the cluster is
> > +                    unallocated, see bit 0 description.
> > +
> > +        56 - 63:    Reserved, must be 0.
> > 
> 
> Looks good.
> 

Apart from all above, is it worth documenting what will happen with all the
existing dirty bitmaps when internal snapshots are created/removed/switched?

Fam

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 22:26       ` John Snow
@ 2015-12-15  4:34         ` Fam Zheng
  2015-12-15 14:10         ` Max Reitz
  1 sibling, 0 replies; 14+ messages in thread
From: Fam Zheng @ 2015-12-15  4:34 UTC (permalink / raw)
  To: John Snow
  Cc: kwolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz,
	stefanha, den

On Mon, 12/14 17:26, John Snow wrote:
> 
> If the auto bit is on, it's a must.
> 
> Let's try again:
> 
> "1: auto
>     This bitmap is of the Dirty Tracking Type and must accurately
>     reflect all writes to the virtual disk image by any application
>     that would write to it. This bitmap should not lose the auto bit
>     except by user intervention."
> 
> Or something like that? The idea is that it will indeed be "autoloaded"
> and made active, but some bitmaps might be stored in various
> frozen/unactive states where this is not desirable.

Yes, I like this idea. It also answers my "enabled" flag question in reply to
Max's comments.

Fam

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 20:05 ` Max Reitz
  2015-12-14 20:45   ` John Snow
  2015-12-15  4:18   ` Fam Zheng
@ 2015-12-15  9:58   ` Kevin Wolf
  2015-12-15 14:03     ` Max Reitz
  2 siblings, 1 reply; 14+ messages in thread
From: Kevin Wolf @ 2015-12-15  9:58 UTC (permalink / raw)
  To: Max Reitz
  Cc: Vladimir Sementsov-Ogievskiy, famz, qemu-devel, stefanha, den, jsnow

[-- Attachment #1: Type: text/plain, Size: 14705 bytes --]

Am 14.12.2015 um 21:05 hat Max Reitz geschrieben:
> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
> > The new feature for qcow2: storing dirty bitmaps.
> > 
> > Only dirty bitmaps relative to this qcow2 image should be stored in it.
> > 
> > Strings started from +# are RFC-strings, not to be commited of course.
> > 
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

First of all, I think we may need some improvemnts on details and wording
here and there, but the format in general looks quite reasonable to me
(at the first sight at least).

> >  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 150 insertions(+), 1 deletion(-)
> 
> Overall: Looks better to me. Good enough for me to ACK it, but I still
> have some issues with it.
> 
> Let's evaluate the main point of critique I had: I really want this not
> to be qemu-specific but potentially useful to all programs.
> 
> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
> by describing how to obtain the bit offset of a certain byte guest
> offset. So it's not an opaque binary data dump anymore.
> 
> (Why only "pretty good"? I find the description to be a bit too
> "implicit", I think a separate section describing the bitmap structure
> would be better.)
> 
> Good: The bitmap actually describes the qcow2 file.
> 
> Not so good: While now any program knows how to read the bitmap and that
> it does refer to this qcow2 file, it's interpretation is not so easy
> still. Generally, a dirty bitmap has some reference point, that is the
> state of the disk when the bitmap was cleared or created. For instance,
> for incremental backups, whenever you create a backup based on a dirty
> bitmap, the dirty bitmap is cleared and the backup target is then said
> reference point.
> I think it would be nice to put that reference point (i.e. the name of
> an image file that contains the clean image) into the dirty bitmap
> header, if possible.

I don't think it's a valid assumption that the reference point has a
file name. Which makes me wonder... How do dirty bitmaps and internal
snapshots play together?

What I guess could be done is storing a creation date if you think this
would be useful.

> (Note: I won't comment on orthography, because I feel like that is
> something a native speaker should do. O:-))
> 
> > diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> > index 121dfc8..3c89580 100644
> > --- a/docs/specs/qcow2.txt
> > +++ b/docs/specs/qcow2.txt
> > @@ -103,7 +103,17 @@ in the description of a field.
> >                      write to an image with unknown auto-clear features if it
> >                      clears the respective bits from this field first.
> >  
> > -                    Bits 0-63:  Reserved (set to 0)
> > +                    Bit 0:      Dirty bitmaps bit.
> > +                                This bit is responsible for Dirty bitmaps
> > +                                extension consistency.
> > +                                If it is set, but there is no Dirty bitmaps
> > +                                extensions, this should be considered as an
> > +                                error.
> > +                                If it is not set, but there is a Dirty bitmaps
> > +                                extension, its data should be considered as
> > +                                inconsistent.
> > +
> > +                    Bits 1-63:  Reserved (set to 0)
> >  
> >           96 -  99:  refcount_order
> >                      Describes the width of a reference count block entry (width
> > @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
> >                          0x00000000 - End of the header extension area
> >                          0xE2792ACA - Backing file format name
> >                          0x6803f857 - Feature name table
> > +                        0x23852875 - Dirty bitmaps
> >                          other      - Unknown header extension, can be safely
> >                                       ignored
> >  
> > @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
> >                      terminated if it has full length)
> >  
> >  
> > +== Dirty bitmaps ==
> > +
> > +Dirty bitmaps is an optional header extension. It provides an ability to store
> > +dirty bitmaps in a qcow2 image. The data of this extension should be considered
> > +as consistent only if corresponding auto-clear feature bit is set (see
> > +autoclear_features above).
> > +The fields of Dirty bitmaps extension are:
> > +
> > +          0 -  3:  nb_dirty_bitmaps
> > +                   The number of dirty bitmaps contained in the image. Valid
> > +                   values: 1 - 65535.
> 
> Again, I don't see a reason for why we should impose a strict upper
> limit here. I'd prefer "Note that qemu currently only supports up to
> 65535 dirty bitmaps per image."
> 
> > +# Let's be strict, the feature should be deleted with deleting last bitmap.
> > +
> > +          4 -  7:  dirty_bitmap_directory_size
> > +                   Size of the Dirty Bitmap Directory in bytes. It should be
> > +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
> > +                   headers.
> 
> No, it "should" not be equal, it *must* be equal. But I think you can
> just omit that last sentence, that would be just as fine.
> 
> > +# This field is necessary to effectively read Dirty Bitmap Directory, because
> > +# it's entries (which are dirty bitmap headers) may have different lengths.
> > +
> > +          8 - 15:  dirty_bitmap_directory_offset
> > +                   Offset into the image file at which the Dirty Bitmap
> > +                   Directory starts. Must be aligned to a cluster boundary.
> > +
> > +
> >  == Host cluster management ==
> >  
> >  qcow2 manages the allocation of host clusters by maintaining a reference count
> > @@ -360,3 +396,116 @@ Snapshot table entry:
> >  
> >          variable:   Padding to round up the snapshot table entry size to the
> >                      next multiple of 8.
> > +
> > +
> > +== Dirty bitmaps ==
> > +
> > +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
> > +are relating to the virtual disk, stored in this image.
> > +
> > +=== Dirty Bitmap Directory ===
> > +
> > +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
> > +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
> > +starting offset and length are given by the header extension fields
> > +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
> > +the bitmap directory have variable length, depending on the length of the
> > +bitmap name. These entries are also called dirty bitmap headers.
> > +
> > +Dirty Bitmap Directory Entry:
> > +
> > +    Byte 0 -  7:    dirty_bitmap_table_offset
> > +                    Offset into the image file at which the Dirty Bitmap Table
> > +                    (described below) for the bitmap starts. Must be aligned to
> > +                    a cluster boundary.
> > +
> > +         8 - 11:    dirty_bitmap_table_size
> > +                    Number of entries in the Dirty Bitmap Table of the bitmap.
> > +
> > +        12 - 15:    flags
> > +                    Bit
> > +                      0: in_use
> > +                         The bitmap was not saved correctly and may be
> > +                         inconsistent.
> > +
> > +                      1: auto
> > +                         The bitmap should be autoloaded as block dirty bitmap
> > +                         and tracking should be started. Type of the bitmap
> > +                         should be 'Dirty Tracking Bitmap'.
> 
> I find the wording a bit too qemu-specific. How about this:
> 
> This bitmap is the default dirty bitmap for the virtual disk represented
> by this qcow2 image. It should track all write accesses immediately
> after the image has been opened.

...unless an internal snapshot is loaded?

> And I find the "should" in "Type of the bitmap should be..." a bit too
> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
> 
> > +
> > +                    Bits 2 - 31 are reserved and must be 0.
> > +
> > +        16 - 17:    name_size
> > +                    Size of the bitmap name. Valid values: 1 - 1023.
> > +
> > +             18:    type
> > +                    This field describes the sort of the bitmap.
> > +                    Values:
> > +                      0: Dirty Tracking Bitmap
> 
> If we allow different kinds of bitmaps, it should not be called "dirty
> bitmap" everywhere anymore.
> 
> > +
> > +                    Values 1 - 255 are reserved.
> > +# This is mostly for error checking and information in qemu-img info output.
> > +# The other types may be, for example, "Backup Bitmap" - to make it possible
> > +# stop backup job on vm stop and resume it later. The another one is "Sector
> > +# Alloction Bitmap" (Fam, John, please comment).
> 
> I'm waiting for their comments because that sounds like "refcount table
> with refcount_bits=1" to me. :-)
> 
> > +             19:    granularity_bits
> > +                    Granularity bits. Valid values are: 0 - 31.
> > +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
> > +# there are no reasons of increasing it.
> 
> Good (implicit) question. I can't imagine any reason for wanting to have
> a coarser granularity than 2 GB, but I do think there may be a need in
> the future for some people.
> 
> Once again, I think we should discriminate between what is generally a
> useful limitation and what is simply due to qemu not supporting anything
> else right now.
> 
> Thus, I think it would be better to increase the range to 0 - 63 and
> make a note that qemu only supports values up to 31 right now.

Why 63 and not 255 (the maximum for this one-byte field)? As soon as we
pick an arbitrary value, we can just take whatever qemu actually
supports. We can always increase the limit later.

> > +
> > +                    Granularity is calculated as
> > +                        granularity = 1 << granularity_bits
> > +
> > +                    Granularity of the bitmap is how many bytes of the image
> > +                    accounts for one bit of the bitmap.
> > +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
> > +# here, not sector-granularity.
> 
> I like that. But do note that qcow2 does align everything at least to
> 512 bytes, so having used sector granularity wouldn't have been too bad.

No 512 byte units in the image format, please. It's an implementation
artifact of qemu and has nothing to do with physical disk sectors. Those
happen to be 4k in most cases nowadays.

> > +
> > +        variable:   The name of the bitmap (not null terminated). Should be
> > +                    unique among all dirty bitmap names within the Dirty
> > +                    bitmaps extension.
> > +
> > +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
> > +                    to the next multiple of 8.
> 
> What I'd like here is variable additional information based on the
> bitmap type. For some types, this may be absolutely necessary; for dirty
> tracking bitmaps it depends on what we do about the reference point thing.

This could turn out useful, yes.

I also think some extensibility mechanism would be good, like extra_data
in internal snapshots.

> The reference point thing is the following: As mentioned at the top, I'd
> like there to be some kind of description of what the clean state was.
> As far as I know, this is generally a backup in the form of a file. In
> that case, we could put that filename here.

Who says that it's a local file? Or that it hasn't been moved since we
created the bitmap?

> I don't think not having a reference point description is a serious show
> stopper. qemu itself does rely on the management layer to know which
> bitmap to use when. But I think it would be pretty nice to have it here.
> 
> > +
> > +=== Dirty Bitmap Table ===
> > +
> > +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
> > +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
> > +It is called Dirty Bitmap Table.
> > +
> > +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> > +Directory Entry) and may use multiple clusters, however it must be contiguous
> > +in the image file.
> > +
> > +Given an offset (in bytes) into the bitmap, the offset into the image file can
> > +be obtained as follows:
> > +
> > +    byte_offset =
> > +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
> > +
> > +Taking into account the granularity of the bitmap, an offset in bits into the
> > +image file, corresponding to byte number byte_nr of the image can be calculated
> > +like this:
> > +
> > +    bit_offset =
> > +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
> > +
> > +Note: the last formula is only for understanding the things, it is unlikely for
> > +it to be useful in a program.
> 
> I think this note is superfluous. All the pseudo-code in this file is
> only that, pseudo-code. ;-)
> 
> Apart from that, I think this last formula should be in its own section
> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
> bitmap. Putting this term there should basically suffice.
> 
> I was about to say I'd like it to define the bit order, too (i.e. "bit
> offset 0 is the LSb"), but, well, it just uses the bit order used
> everywhere in qcow2.
> 
> > +
> > +Dirty Bitmap Table entry:
> > +
> > +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
> > +                    If bits 9 - 55 are zero:
> > +                      0: Cluster should be read as all zeros.
> > +                      1: Cluster should be read as all ones.
> > +
> > +         1 -  8:    Reserved and must be zero.
> > +
> > +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
> > +                    cluster boundary. If the offset is 0, the cluster is
> > +                    unallocated, see bit 0 description.
> > +
> > +        56 - 63:    Reserved, must be 0.

Kevin

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-15  4:18   ` Fam Zheng
@ 2015-12-15 10:04     ` Vladimir Sementsov-Ogievskiy
  2015-12-15 16:40     ` John Snow
  1 sibling, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-12-15 10:04 UTC (permalink / raw)
  To: Fam Zheng; +Cc: kwolf, qemu-devel, Max Reitz, stefanha, den, jsnow

Thanks everyone for comments!

On 15.12.2015 07:18, Fam Zheng wrote:
> On Mon, 12/14 21:05, Max Reitz wrote:
>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing dirty bitmaps.
>>>
>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>
>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>>   docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 150 insertions(+), 1 deletion(-)
>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>> have some issues with it.
>>
>> Let's evaluate the main point of critique I had: I really want this not
>> to be qemu-specific but potentially useful to all programs.
>>
>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>> by describing how to obtain the bit offset of a certain byte guest
>> offset. So it's not an opaque binary data dump anymore.
>>
>> (Why only "pretty good"? I find the description to be a bit too
>> "implicit", I think a separate section describing the bitmap structure
>> would be better.)
>>
>> Good: The bitmap actually describes the qcow2 file.
>>
>> Not so good: While now any program knows how to read the bitmap and that
>> it does refer to this qcow2 file, it's interpretation is not so easy
>> still. Generally, a dirty bitmap has some reference point, that is the
>> state of the disk when the bitmap was cleared or created. For instance,
>> for incremental backups, whenever you create a backup based on a dirty
>> bitmap, the dirty bitmap is cleared and the backup target is then said
>> reference point.
>> I think it would be nice to put that reference point (i.e. the name of
>> an image file that contains the clean image) into the dirty bitmap
>> header, if possible.
>>
>>
>> (Note: I won't comment on orthography, because I feel like that is
>> something a native speaker should do. O:-))
>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..3c89580 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt
>>> @@ -103,7 +103,17 @@ in the description of a field.
>>>                       write to an image with unknown auto-clear features if it
>>>                       clears the respective bits from this field first.
>>>   
>>> -                    Bits 0-63:  Reserved (set to 0)
>>> +                    Bit 0:      Dirty bitmaps bit.
>>> +                                This bit is responsible for Dirty bitmaps
>>> +                                extension consistency.
>>> +                                If it is set, but there is no Dirty bitmaps
>>> +                                extensions, this should be considered as an
>>> +                                error.
>>> +                                If it is not set, but there is a Dirty bitmaps
>>> +                                extension, its data should be considered as
>>> +                                inconsistent.
>>> +
>>> +                    Bits 1-63:  Reserved (set to 0)
>>>   
>>>            96 -  99:  refcount_order
>>>                       Describes the width of a reference count block entry (width
>>> @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
>>>                           0x00000000 - End of the header extension area
>>>                           0xE2792ACA - Backing file format name
>>>                           0x6803f857 - Feature name table
>>> +                        0x23852875 - Dirty bitmaps
>>>                           other      - Unknown header extension, can be safely
>>>                                        ignored
>>>   
>>> @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
>>>                       terminated if it has full length)
>>>   
>>>   
>>> +== Dirty bitmaps ==
>>> +
>>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>>> +dirty bitmaps in a qcow2 image. The data of this extension should be considered
>>> +as consistent only if corresponding auto-clear feature bit is set (see
>>> +autoclear_features above).
>>> +The fields of Dirty bitmaps extension are:
>>> +
>>> +          0 -  3:  nb_dirty_bitmaps
>>> +                   The number of dirty bitmaps contained in the image. Valid
>>> +                   values: 1 - 65535.
>> Again, I don't see a reason for why we should impose a strict upper
>> limit here. I'd prefer "Note that qemu currently only supports up to
>> 65535 dirty bitmaps per image."
>>
>>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
> Do you mean unsetting the auto-clear feature bit? Yes, I think that makes sense.

auto-clear bit should be zeroed, of course

>
>>> +
>>> +          4 -  7:  dirty_bitmap_directory_size
>>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>>> +                   headers.
>> No, it "should" not be equal, it *must* be equal. But I think you can
>> just omit that last sentence, that would be just as fine.
>>
>>> +# This field is necessary to effectively read Dirty Bitmap Directory, because
>>> +# it's entries (which are dirty bitmap headers) may have different lengths.
>>> +
>>> +          8 - 15:  dirty_bitmap_directory_offset
>>> +                   Offset into the image file at which the Dirty Bitmap
>>> +                   Directory starts. Must be aligned to a cluster boundary.
>>> +
>>> +
>>>   == Host cluster management ==
>>>   
>>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>>> @@ -360,3 +396,116 @@ Snapshot table entry:
>>>   
>>>           variable:   Padding to round up the snapshot table entry size to the
>>>                       next multiple of 8.
>>> +
>>> +
>>> +== Dirty bitmaps ==
>>> +
>>> +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
>>> +are relating to the virtual disk, stored in this image.
>>> +
>>> +=== Dirty Bitmap Directory ===
>>> +
>>> +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
>>> +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
>>> +starting offset and length are given by the header extension fields
>>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>>> +the bitmap directory have variable length, depending on the length of the
>>> +bitmap name. These entries are also called dirty bitmap headers.
>>> +
>>> +Dirty Bitmap Directory Entry:
>>> +
>>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>>> +                    Offset into the image file at which the Dirty Bitmap Table
>>> +                    (described below) for the bitmap starts. Must be aligned to
>>> +                    a cluster boundary.
>>> +
>>> +         8 - 11:    dirty_bitmap_table_size
>>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>>> +
>>> +        12 - 15:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap was not saved correctly and may be
>>> +                         inconsistent.
>>> +
>>> +                      1: auto
>>> +                         The bitmap should be autoloaded as block dirty bitmap
>>> +                         and tracking should be started. Type of the bitmap
>>> +                         should be 'Dirty Tracking Bitmap'.
>> I find the wording a bit too qemu-specific. How about this:
>>
>> This bitmap is the default dirty bitmap for the virtual disk represented
>> by this qcow2 image. It should track all write accesses immediately
>> after the image has been opened.
>>
>> And I find the "should" in "Type of the bitmap should be..." a bit too
>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>
>>> +
>>> +                    Bits 2 - 31 are reserved and must be 0.
>>> +
>>> +        16 - 17:    name_size
>>> +                    Size of the bitmap name. Valid values: 1 - 1023.
>>> +
>>> +             18:    type
>>> +                    This field describes the sort of the bitmap.
>>> +                    Values:
>>> +                      0: Dirty Tracking Bitmap
>> If we allow different kinds of bitmaps, it should not be called "dirty
>> bitmap" everywhere anymore.
>>
>>> +
>>> +                    Values 1 - 255 are reserved.
>>> +# This is mostly for error checking and information in qemu-img info output.
>>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>>> +# Alloction Bitmap" (Fam, John, please comment).
>> I'm waiting for their comments because that sounds like "refcount table
>> with refcount_bits=1" to me. :-)
> Allocation information for qcow2 can be obtained from L1/L2/refcount tables, so
> maybe it's not worth a type here.  I am not quite sure about the "backup
> bitmap" type either, because it is QEMU specific; alternatively we can add an
> "enabled" flag to each dirty bitmap, so that the "drive-backup" bitmap can be
> saved as a disabled dirty tracking bitmap.

Anyway, the bitmap is identified by its name, not by this field. And of 
course each mechanism (backup, etc.) should know the name of its bitmap. 
So, this field is just additional information for user and error checking.

>
> But the idea of having the type field makes sense to me in general.
>
>>> +             19:    granularity_bits
>>> +                    Granularity bits. Valid values are: 0 - 31.
>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>> +# there are no reasons of increasing it.
>> Good (implicit) question. I can't imagine any reason for wanting to have
>> a coarser granularity than 2 GB, but I do think there may be a need in
>> the future for some people.
>>
>> Once again, I think we should discriminate between what is generally a
>> useful limitation and what is simply due to qemu not supporting anything
>> else right now.
>>
>> Thus, I think it would be better to increase the range to 0 - 63 and
>> make a note that qemu only supports values up to 31 right now.
>>
>>> +
>>> +                    Granularity is calculated as
>>> +                        granularity = 1 << granularity_bits
>>> +
>>> +                    Granularity of the bitmap is how many bytes of the image
>>> +                    accounts for one bit of the bitmap.
>>> +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
>>> +# here, not sector-granularity.
>> I like that. But do note that qcow2 does align everything at least to
>> 512 bytes, so having used sector granularity wouldn't have been too bad.
> I would then suggest limiting the valid values to 9 - 63, at least as a note
> for QEMU support.
>
>>> +
>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>> +                    unique among all dirty bitmap names within the Dirty
>>> +                    bitmaps extension.
>>> +
>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>> +                    to the next multiple of 8.
>> What I'd like here is variable additional information based on the
>> bitmap type. For some types, this may be absolutely necessary; for dirty
>> tracking bitmaps it depends on what we do about the reference point thing.
>>
>> The reference point thing is the following: As mentioned at the top, I'd
>> like there to be some kind of description of what the clean state was.
>> As far as I know, this is generally a backup in the form of a file. In
>> that case, we could put that filename here.
>>
>> I don't think not having a reference point description is a serious show
>> stopper. qemu itself does rely on the management layer to know which
>> bitmap to use when. But I think it would be pretty nice to have it here.
>>
>>> +
>>> +=== Dirty Bitmap Table ===
>>> +
>>> +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
>>> +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
>>> +It is called Dirty Bitmap Table.
>>> +
>>> +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>>> +in the image file.
>>> +
>>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>>> +be obtained as follows:
>>> +
>>> +    byte_offset =
>>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>>> +
>>> +Taking into account the granularity of the bitmap, an offset in bits into the
>>> +image file, corresponding to byte number byte_nr of the image can be calculated
>>> +like this:
>>> +
>>> +    bit_offset =
>>> +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
>>> +
>>> +Note: the last formula is only for understanding the things, it is unlikely for
>>> +it to be useful in a program.
>> I think this note is superfluous. All the pseudo-code in this file is
>> only that, pseudo-code. ;-)
>>
>> Apart from that, I think this last formula should be in its own section
>> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
>> bitmap. Putting this term there should basically suffice.
>>
>> I was about to say I'd like it to define the bit order, too (i.e. "bit
>> offset 0 is the LSb"), but, well, it just uses the bit order used
>> everywhere in qcow2.
>>
>>> +
>>> +Dirty Bitmap Table entry:
>>> +
>>> +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
>>> +                    If bits 9 - 55 are zero:
>>> +                      0: Cluster should be read as all zeros.
>>> +                      1: Cluster should be read as all ones.
>>> +
>>> +         1 -  8:    Reserved and must be zero.
>>> +
>>> +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
>>> +                    cluster boundary. If the offset is 0, the cluster is
>>> +                    unallocated, see bit 0 description.
>>> +
>>> +        56 - 63:    Reserved, must be 0.
>>>
>> Looks good.
>>
> Apart from all above, is it worth documenting what will happen with all the
> existing dirty bitmaps when internal snapshots are created/removed/switched?

I think, this should not be in qcow2 spec. Actually, on snapshot switch 
the bitmap, which tracks the image for backup purposes becomes invalid 
and should be deleted.

>
> Fam
>



-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-15  9:58   ` Kevin Wolf
@ 2015-12-15 14:03     ` Max Reitz
  0 siblings, 0 replies; 14+ messages in thread
From: Max Reitz @ 2015-12-15 14:03 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Vladimir Sementsov-Ogievskiy, famz, qemu-devel, stefanha, den, jsnow

On 2015-12-15 at 10:58, Kevin Wolf wrote:
> Am 14.12.2015 um 21:05 hat Max Reitz geschrieben:
>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing dirty bitmaps.
>>>
>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>
>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>
> First of all, I think we may need some improvemnts on details and wording
> here and there, but the format in general looks quite reasonable to me
> (at the first sight at least).
>

[...]

>> Not so good: While now any program knows how to read the bitmap and that
>> it does refer to this qcow2 file, it's interpretation is not so easy

*its

>> still. Generally, a dirty bitmap has some reference point, that is the
>> state of the disk when the bitmap was cleared or created. For instance,
>> for incremental backups, whenever you create a backup based on a dirty
>> bitmap, the dirty bitmap is cleared and the backup target is then said
>> reference point.
>> I think it would be nice to put that reference point (i.e. the name of
>> an image file that contains the clean image) into the dirty bitmap
>> header, if possible.
>
> I don't think it's a valid assumption that the reference point has a
> file name.

Right, but you can't really do anything useful with the dirty bitmap if 
you don't know that reference point.

I do realize that it may not always possible and often very cumbersome 
to come up with some filename to put here, but I still consider it very 
valuable information if available.

John suggested there to be a way for the management layer to put in this 
information if available. That may save qemu from the trouble of having 
to figure it out.

Unless you/we can think of a better way of describing the reference 
point (note: we don't necessarily have an existing reference point at 
all, you can just clear the dirty bitmap at your whim), I'd be 
completely fine with just adding a reference point filename field that 
may be empty if that information is unknown, and in practice just always 
leave it empty for now.

>            Which makes me wonder... How do dirty bitmaps and internal
> snapshots play together?
>
> What I guess could be done is storing a creation date if you think this
> would be useful.

Only in that you could look through the local filesystem if perchance 
you find a disk image with the same size as the qcow2 image in question 
and roughly the right creation date...

[...]

>>> +             19:    granularity_bits
>>> +                    Granularity bits. Valid values are: 0 - 31.
>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>> +# there are no reasons of increasing it.
>>
>> Good (implicit) question. I can't imagine any reason for wanting to have
>> a coarser granularity than 2 GB, but I do think there may be a need in
>> the future for some people.
>>
>> Once again, I think we should discriminate between what is generally a
>> useful limitation and what is simply due to qemu not supporting anything
>> else right now.
>>
>> Thus, I think it would be better to increase the range to 0 - 63 and
>> make a note that qemu only supports values up to 31 right now.
>
> Why 63 and not 255 (the maximum for this one-byte field)? As soon as we
> pick an arbitrary value, we can just take whatever qemu actually
> supports. We can always increase the limit later.

Because I'm willing to take a bet that we won't need granularities 
greater than 2^63 even in twenty years of now.

Also, qcow2 simply doesn't support images with more than ~2^64 clusters.

[...]

>>> +
>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>> +                    unique among all dirty bitmap names within the Dirty
>>> +                    bitmaps extension.
>>> +
>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>> +                    to the next multiple of 8.
>>
>> What I'd like here is variable additional information based on the
>> bitmap type. For some types, this may be absolutely necessary; for dirty
>> tracking bitmaps it depends on what we do about the reference point thing.
>
> This could turn out useful, yes.
>
> I also think some extensibility mechanism would be good, like extra_data
> in internal snapshots.
>
>> The reference point thing is the following: As mentioned at the top, I'd
>> like there to be some kind of description of what the clean state was.
>> As far as I know, this is generally a backup in the form of a file. In
>> that case, we could put that filename here.
>
> Who says that it's a local file? Or that it hasn't been moved since we
> created the bitmap?

The former: Nobody, but if it is, it's valuable information. (I won't 
suggest json: filenames, because that's not very useful information to 
any program outside of qemu.)

The latter: Good point. But the same applies to backing filenames (and 
for the case of incremental backups, which is probably mainly what we 
want to use this feature for, the reference point filename just is the 
backing filename of the backup to be created).

Max

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 22:26       ` John Snow
  2015-12-15  4:34         ` Fam Zheng
@ 2015-12-15 14:10         ` Max Reitz
  1 sibling, 0 replies; 14+ messages in thread
From: Max Reitz @ 2015-12-15 14:10 UTC (permalink / raw)
  To: John Snow, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, famz, stefanha

On 2015-12-14 at 23:26, John Snow wrote:
> On 12/14/2015 04:44 PM, Max Reitz wrote:
>> On 14.12.2015 21:45, John Snow wrote:
>>> On 12/14/2015 03:05 PM, Max Reitz wrote:
>>>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>>>> The new feature for qcow2: storing dirty bitmaps.
>>>>>
>>>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>>>
>>>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>>>
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> ---
>>>>>
>>>>>   docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>>   1 file changed, 150 insertions(+), 1 deletion(-)

[...]

> If the auto bit is on, it's a must.
>
> Let's try again:
>
> "1: auto
>      This bitmap is of the Dirty Tracking Type and must accurately
>      reflect all writes to the virtual disk image by any application
>      that would write to it. This bitmap should not lose the auto bit
>      except by user intervention."
>
> Or something like that? The idea is that it will indeed be "autoloaded"
> and made active, but some bitmaps might be stored in various
> frozen/unactive states where this is not desirable.

Sounds good to me.

[...]

>> (If you've already decided to put the backup progress into the target
>> image, you may find that offtopic block not very helpful; however, it
>> may help you get an idea of when I consider some information to be
>> beneficial to a qcow2 file.)
>>
>
> <Offtopic>
> There's no code written specifically for this yet, so we're still OK. It
> could reasonably go into either the source or destination:
>
> Destination: "Here are the sectors I still need" is reasonable, as then
> you can easily go and fetch the missing bits you need.
>
> However, if the source was modified after the bitmap was written to the
> destination, we have no way of knowing if anything has been updated
> since we started the backup.
>
> If we store it in the source: "Here's what's changed [and what we never
> copied] since we started that backup," we can even make offline writes
> to the source after we stop the backup, then resume it later and still
> have it work out.
>
> In the destination model, we point to the source for where to get the
> rest of the data. In the source model, we point to the destination of
> our unfinished backup.
>
> One definitely feels more independently useful: "Here's how to get the
> rest of my data" vs "I was being used for an operation you may or may
> not care about.", though the less meaningful one might actually be
> easier and more flexible to maintain from QEMU's standpoint.

...OK, let's worry about that later. :-)

> Offtopic indeed.
>
> </Offtopic>
>
> The only real point is that there *might* be different semantics in the
> future, but we may also be able to get away with overloading the
> existing type entirely.
>
> The reserved type enum will work out OK for this.

Yep.

[...]

>>> I'm not opposed to listing some "nice" information when available.
>>>
>>> last_backup /path/to/incremental.5.qcow2
>>> base_backup /path/to/reference.qcow2
>>>
>>
>> I don't think we need the base_backup since you can get that by walking
>> through the backing chain of the reference point (the backup target),
>> but it probably won't hurt if you can make a good general semantic
>> connection to the dirty bitmap.
>>
>> Max
>>
>
> Yes, probably not. How would you suggest this be implemented, also? Does
> the bitmap object need to begin tracking a last-known backup
> filename/identifier? I remembered you didn't seem too happy about
> keeping filenames in memory for QEMU, but we currently don't track this
> information at all. Answer has so far been "It's up to management!"

Maybe we can keep it up to management. I think you suggested that we may 
give the management layer a way of writing the reference point 
information into the dirty bitmap information. That would work, then.

Other than that... Well, whenever we have a transaction which clears a 
dirty bitmap, we'd have to analyse the other operations within that 
transaction, and if there's some blockdev-backup or drive-backup job 
there, we'd have to take the target image's filename and keep it in the 
bitmap structure, probably, yes. Ugly indeed.

The question is whether management tools are actually keen on 
implementing this feature for us... We'll see.

Max

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-15  4:18   ` Fam Zheng
  2015-12-15 10:04     ` Vladimir Sementsov-Ogievskiy
@ 2015-12-15 16:40     ` John Snow
  2015-12-21 12:20       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 14+ messages in thread
From: John Snow @ 2015-12-15 16:40 UTC (permalink / raw)
  To: Fam Zheng, Vladimir Sementsov-Ogievskiy
  Cc: kwolf, qemu-devel, Max Reitz, stefanha, den



On 12/14/2015 11:18 PM, Fam Zheng wrote:
> On Mon, 12/14 21:05, Max Reitz wrote:
>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing dirty bitmaps.
>>>
>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>
>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>>  docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  1 file changed, 150 insertions(+), 1 deletion(-)
>>
>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>> have some issues with it.
>>
>> Let's evaluate the main point of critique I had: I really want this not
>> to be qemu-specific but potentially useful to all programs.
>>
>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>> by describing how to obtain the bit offset of a certain byte guest
>> offset. So it's not an opaque binary data dump anymore.
>>
>> (Why only "pretty good"? I find the description to be a bit too
>> "implicit", I think a separate section describing the bitmap structure
>> would be better.)
>>
>> Good: The bitmap actually describes the qcow2 file.
>>
>> Not so good: While now any program knows how to read the bitmap and that
>> it does refer to this qcow2 file, it's interpretation is not so easy
>> still. Generally, a dirty bitmap has some reference point, that is the
>> state of the disk when the bitmap was cleared or created. For instance,
>> for incremental backups, whenever you create a backup based on a dirty
>> bitmap, the dirty bitmap is cleared and the backup target is then said
>> reference point.
>> I think it would be nice to put that reference point (i.e. the name of
>> an image file that contains the clean image) into the dirty bitmap
>> header, if possible.
>>
>>
>> (Note: I won't comment on orthography, because I feel like that is
>> something a native speaker should do. O:-))
>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..3c89580 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt
>>> @@ -103,7 +103,17 @@ in the description of a field.
>>>                      write to an image with unknown auto-clear features if it
>>>                      clears the respective bits from this field first.
>>>  
>>> -                    Bits 0-63:  Reserved (set to 0)
>>> +                    Bit 0:      Dirty bitmaps bit.
>>> +                                This bit is responsible for Dirty bitmaps
>>> +                                extension consistency.
>>> +                                If it is set, but there is no Dirty bitmaps
>>> +                                extensions, this should be considered as an
>>> +                                error.
>>> +                                If it is not set, but there is a Dirty bitmaps
>>> +                                extension, its data should be considered as
>>> +                                inconsistent.
>>> +
>>> +                    Bits 1-63:  Reserved (set to 0)
>>>  
>>>           96 -  99:  refcount_order
>>>                      Describes the width of a reference count block entry (width
>>> @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
>>>                          0x00000000 - End of the header extension area
>>>                          0xE2792ACA - Backing file format name
>>>                          0x6803f857 - Feature name table
>>> +                        0x23852875 - Dirty bitmaps
>>>                          other      - Unknown header extension, can be safely
>>>                                       ignored
>>>  
>>> @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
>>>                      terminated if it has full length)
>>>  
>>>  
>>> +== Dirty bitmaps ==
>>> +
>>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>>> +dirty bitmaps in a qcow2 image. The data of this extension should be considered
>>> +as consistent only if corresponding auto-clear feature bit is set (see
>>> +autoclear_features above).
>>> +The fields of Dirty bitmaps extension are:
>>> +
>>> +          0 -  3:  nb_dirty_bitmaps
>>> +                   The number of dirty bitmaps contained in the image. Valid
>>> +                   values: 1 - 65535.
>>
>> Again, I don't see a reason for why we should impose a strict upper
>> limit here. I'd prefer "Note that qemu currently only supports up to
>> 65535 dirty bitmaps per image."
>>
>>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
> 
> Do you mean unsetting the auto-clear feature bit? Yes, I think that makes sense.
> 

I assumed he meant the entire bitmap header. If there's no bitmaps,
there's no reason to store the extension anymore.

>>> +
>>> +          4 -  7:  dirty_bitmap_directory_size
>>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>>> +                   headers.
>>
>> No, it "should" not be equal, it *must* be equal. But I think you can
>> just omit that last sentence, that would be just as fine.
>>
>>> +# This field is necessary to effectively read Dirty Bitmap Directory, because
>>> +# it's entries (which are dirty bitmap headers) may have different lengths.
>>> +
>>> +          8 - 15:  dirty_bitmap_directory_offset
>>> +                   Offset into the image file at which the Dirty Bitmap
>>> +                   Directory starts. Must be aligned to a cluster boundary.
>>> +
>>> +
>>>  == Host cluster management ==
>>>  
>>>  qcow2 manages the allocation of host clusters by maintaining a reference count
>>> @@ -360,3 +396,116 @@ Snapshot table entry:
>>>  
>>>          variable:   Padding to round up the snapshot table entry size to the
>>>                      next multiple of 8.
>>> +
>>> +
>>> +== Dirty bitmaps ==
>>> +
>>> +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
>>> +are relating to the virtual disk, stored in this image.
>>> +
>>> +=== Dirty Bitmap Directory ===
>>> +
>>> +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
>>> +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
>>> +starting offset and length are given by the header extension fields
>>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>>> +the bitmap directory have variable length, depending on the length of the
>>> +bitmap name. These entries are also called dirty bitmap headers.
>>> +
>>> +Dirty Bitmap Directory Entry:
>>> +
>>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>>> +                    Offset into the image file at which the Dirty Bitmap Table
>>> +                    (described below) for the bitmap starts. Must be aligned to
>>> +                    a cluster boundary.
>>> +
>>> +         8 - 11:    dirty_bitmap_table_size
>>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>>> +
>>> +        12 - 15:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap was not saved correctly and may be
>>> +                         inconsistent.
>>> +
>>> +                      1: auto
>>> +                         The bitmap should be autoloaded as block dirty bitmap
>>> +                         and tracking should be started. Type of the bitmap
>>> +                         should be 'Dirty Tracking Bitmap'.
>>
>> I find the wording a bit too qemu-specific. How about this:
>>
>> This bitmap is the default dirty bitmap for the virtual disk represented
>> by this qcow2 image. It should track all write accesses immediately
>> after the image has been opened.
>>
>> And I find the "should" in "Type of the bitmap should be..." a bit too
>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>
>>> +
>>> +                    Bits 2 - 31 are reserved and must be 0.
>>> +
>>> +        16 - 17:    name_size
>>> +                    Size of the bitmap name. Valid values: 1 - 1023.
>>> +
>>> +             18:    type
>>> +                    This field describes the sort of the bitmap.
>>> +                    Values:
>>> +                      0: Dirty Tracking Bitmap
>>
>> If we allow different kinds of bitmaps, it should not be called "dirty
>> bitmap" everywhere anymore.
>>
>>> +
>>> +                    Values 1 - 255 are reserved.
>>> +# This is mostly for error checking and information in qemu-img info output.
>>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>>> +# Alloction Bitmap" (Fam, John, please comment).
>>
>> I'm waiting for their comments because that sounds like "refcount table
>> with refcount_bits=1" to me. :-)
> 
> Allocation information for qcow2 can be obtained from L1/L2/refcount tables, so
> maybe it's not worth a type here.  I am not quite sure about the "backup
> bitmap" type either, because it is QEMU specific; alternatively we can add an
> "enabled" flag to each dirty bitmap, so that the "drive-backup" bitmap can be
> saved as a disabled dirty tracking bitmap.
> 
> But the idea of having the type field makes sense to me in general.
> 

Right -- our plans for other types are still nebulous, but this gives us
some wiggle room.

>>
>>> +             19:    granularity_bits
>>> +                    Granularity bits. Valid values are: 0 - 31.
>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>> +# there are no reasons of increasing it.
>>
>> Good (implicit) question. I can't imagine any reason for wanting to have
>> a coarser granularity than 2 GB, but I do think there may be a need in
>> the future for some people.
>>
>> Once again, I think we should discriminate between what is generally a
>> useful limitation and what is simply due to qemu not supporting anything
>> else right now.
>>
>> Thus, I think it would be better to increase the range to 0 - 63 and
>> make a note that qemu only supports values up to 31 right now.
>>
>>> +
>>> +                    Granularity is calculated as
>>> +                        granularity = 1 << granularity_bits
>>> +
>>> +                    Granularity of the bitmap is how many bytes of the image
>>> +                    accounts for one bit of the bitmap.
>>> +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
>>> +# here, not sector-granularity.
>>
>> I like that. But do note that qcow2 does align everything at least to
>> 512 bytes, so having used sector granularity wouldn't have been too bad.
> 
> I would then suggest limiting the valid values to 9 - 63, at least as a note
> for QEMU support.
> 

>From qcow2.txt:

'20 - 23:   cluster_bits
            Number of bits that are used for addressing an offset
            within a cluster (1 << cluster_bits is the cluster size).
            Must not be less than 9 (i.e. 512 byte clusters).'

If we wanted to make the argument for it, we actually could feasibly
introduce 9 as a minimum, arguing that there's simply no use for a
bitmap that tracks granularities any smaller than the qcow2 can even
support.

We could also simply make the case that the bitmap granularity can never
be any smaller than the qcow2's current sector size, but that would
require some small adjustments in QEMU currently to reject bitmaps
configured to be smaller than such.

It's also fine as-is, as having a smaller granularity won't actually
break anything, it's just redundantly verbose and wasteful.

>>
>>> +
>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>> +                    unique among all dirty bitmap names within the Dirty
>>> +                    bitmaps extension.
>>> +
>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>> +                    to the next multiple of 8.
>>
>> What I'd like here is variable additional information based on the
>> bitmap type. For some types, this may be absolutely necessary; for dirty
>> tracking bitmaps it depends on what we do about the reference point thing.
>>
>> The reference point thing is the following: As mentioned at the top, I'd
>> like there to be some kind of description of what the clean state was.
>> As far as I know, this is generally a backup in the form of a file. In
>> that case, we could put that filename here.
>>
>> I don't think not having a reference point description is a serious show
>> stopper. qemu itself does rely on the management layer to know which
>> bitmap to use when. But I think it would be pretty nice to have it here.
>>
>>> +
>>> +=== Dirty Bitmap Table ===
>>> +
>>> +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
>>> +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
>>> +It is called Dirty Bitmap Table.
>>> +
>>> +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>>> +in the image file.
>>> +
>>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>>> +be obtained as follows:
>>> +
>>> +    byte_offset =
>>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>>> +
>>> +Taking into account the granularity of the bitmap, an offset in bits into the
>>> +image file, corresponding to byte number byte_nr of the image can be calculated
>>> +like this:
>>> +
>>> +    bit_offset =
>>> +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
>>> +
>>> +Note: the last formula is only for understanding the things, it is unlikely for
>>> +it to be useful in a program.
>>
>> I think this note is superfluous. All the pseudo-code in this file is
>> only that, pseudo-code. ;-)
>>
>> Apart from that, I think this last formula should be in its own section
>> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
>> bitmap. Putting this term there should basically suffice.
>>
>> I was about to say I'd like it to define the bit order, too (i.e. "bit
>> offset 0 is the LSb"), but, well, it just uses the bit order used
>> everywhere in qcow2.
>>
>>> +
>>> +Dirty Bitmap Table entry:
>>> +
>>> +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
>>> +                    If bits 9 - 55 are zero:
>>> +                      0: Cluster should be read as all zeros.
>>> +                      1: Cluster should be read as all ones.
>>> +
>>> +         1 -  8:    Reserved and must be zero.
>>> +
>>> +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
>>> +                    cluster boundary. If the offset is 0, the cluster is
>>> +                    unallocated, see bit 0 description.
>>> +
>>> +        56 - 63:    Reserved, must be 0.
>>>
>>
>> Looks good.
>>
> 
> Apart from all above, is it worth documenting what will happen with all the
> existing dirty bitmaps when internal snapshots are created/removed/switched?
> 
> Fam
> 

The bitmaps need to be marked appropriately dirty for any cluster(s)
that are changed as a result of the operation.

The degenerate case if it's too hard to tell is to just mark the entire
bitmap dirty. Not sure if deleting any bitmaps is the right answer,
since management apps might expect to see them ...

Even though a drive can be "rolled back" via a snapshot, there's not
really an equivalent operation for a backup chain, and I don't even
think it would be possible to implement.

Unless the .qcow2 had a special knowledge of "This bitmap was last
synced when we made a snapshot" QEMU wouldn't even be able to notify the
management application that it should consider pointing to [some
incremental] as the now most current one. Incrementals are very
external-state dependent features.

So even if we decided to keep backup copies of bitmaps, they wouldn't
necessarily even be of much use to anyone, I think. Heaven help you if
you delete a snapshot not created near any particular incremental.

So the degenerate case of "You deleted a snapshot, that was an
unfortunate choice, your bitmaps have been marked dirty" seems
appropriate, even though the incremental backup chain now has a weird
"old2 -> old1 -> deleted -> old1.1" style timeflow to it now.

This is worth hashing out and including a blurb, I think.

--js

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-15 16:40     ` John Snow
@ 2015-12-21 12:20       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-12-21 12:20 UTC (permalink / raw)
  To: John Snow, Fam Zheng; +Cc: kwolf, qemu-devel, Max Reitz, stefanha, den

On 15.12.2015 19:40, John Snow wrote:
>
> On 12/14/2015 11:18 PM, Fam Zheng wrote:
>> On Mon, 12/14 21:05, Max Reitz wrote:
>>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>>> The new feature for qcow2: storing dirty bitmaps.
>>>>
>>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>>
>>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>>
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>
>>>>   docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>   1 file changed, 150 insertions(+), 1 deletion(-)
>>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>>> have some issues with it.
>>>
>>> Let's evaluate the main point of critique I had: I really want this not
>>> to be qemu-specific but potentially useful to all programs.
>>>
>>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>>> by describing how to obtain the bit offset of a certain byte guest
>>> offset. So it's not an opaque binary data dump anymore.
>>>
>>> (Why only "pretty good"? I find the description to be a bit too
>>> "implicit", I think a separate section describing the bitmap structure
>>> would be better.)
>>>
>>> Good: The bitmap actually describes the qcow2 file.
>>>
>>> Not so good: While now any program knows how to read the bitmap and that
>>> it does refer to this qcow2 file, it's interpretation is not so easy
>>> still. Generally, a dirty bitmap has some reference point, that is the
>>> state of the disk when the bitmap was cleared or created. For instance,
>>> for incremental backups, whenever you create a backup based on a dirty
>>> bitmap, the dirty bitmap is cleared and the backup target is then said
>>> reference point.
>>> I think it would be nice to put that reference point (i.e. the name of
>>> an image file that contains the clean image) into the dirty bitmap
>>> header, if possible.
>>>
>>>
>>> (Note: I won't comment on orthography, because I feel like that is
>>> something a native speaker should do. O:-))
>>>
>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>> index 121dfc8..3c89580 100644
>>>> --- a/docs/specs/qcow2.txt
>>>> +++ b/docs/specs/qcow2.txt
>>>> @@ -103,7 +103,17 @@ in the description of a field.
>>>>                       write to an image with unknown auto-clear features if it
>>>>                       clears the respective bits from this field first.
>>>>   
>>>> -                    Bits 0-63:  Reserved (set to 0)
>>>> +                    Bit 0:      Dirty bitmaps bit.
>>>> +                                This bit is responsible for Dirty bitmaps
>>>> +                                extension consistency.
>>>> +                                If it is set, but there is no Dirty bitmaps
>>>> +                                extensions, this should be considered as an
>>>> +                                error.
>>>> +                                If it is not set, but there is a Dirty bitmaps
>>>> +                                extension, its data should be considered as
>>>> +                                inconsistent.
>>>> +
>>>> +                    Bits 1-63:  Reserved (set to 0)
>>>>   
>>>>            96 -  99:  refcount_order
>>>>                       Describes the width of a reference count block entry (width
>>>> @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
>>>>                           0x00000000 - End of the header extension area
>>>>                           0xE2792ACA - Backing file format name
>>>>                           0x6803f857 - Feature name table
>>>> +                        0x23852875 - Dirty bitmaps
>>>>                           other      - Unknown header extension, can be safely
>>>>                                        ignored
>>>>   
>>>> @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
>>>>                       terminated if it has full length)
>>>>   
>>>>   
>>>> +== Dirty bitmaps ==
>>>> +
>>>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>>>> +dirty bitmaps in a qcow2 image. The data of this extension should be considered
>>>> +as consistent only if corresponding auto-clear feature bit is set (see
>>>> +autoclear_features above).
>>>> +The fields of Dirty bitmaps extension are:
>>>> +
>>>> +          0 -  3:  nb_dirty_bitmaps
>>>> +                   The number of dirty bitmaps contained in the image. Valid
>>>> +                   values: 1 - 65535.
>>> Again, I don't see a reason for why we should impose a strict upper
>>> limit here. I'd prefer "Note that qemu currently only supports up to
>>> 65535 dirty bitmaps per image."
>>>
>>>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
>> Do you mean unsetting the auto-clear feature bit? Yes, I think that makes sense.
>>
> I assumed he meant the entire bitmap header. If there's no bitmaps,
> there's no reason to store the extension anymore.
>
>>>> +
>>>> +          4 -  7:  dirty_bitmap_directory_size
>>>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>>>> +                   headers.
>>> No, it "should" not be equal, it *must* be equal. But I think you can
>>> just omit that last sentence, that would be just as fine.
>>>
>>>> +# This field is necessary to effectively read Dirty Bitmap Directory, because
>>>> +# it's entries (which are dirty bitmap headers) may have different lengths.
>>>> +
>>>> +          8 - 15:  dirty_bitmap_directory_offset
>>>> +                   Offset into the image file at which the Dirty Bitmap
>>>> +                   Directory starts. Must be aligned to a cluster boundary.
>>>> +
>>>> +
>>>>   == Host cluster management ==
>>>>   
>>>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>>>> @@ -360,3 +396,116 @@ Snapshot table entry:
>>>>   
>>>>           variable:   Padding to round up the snapshot table entry size to the
>>>>                       next multiple of 8.
>>>> +
>>>> +
>>>> +== Dirty bitmaps ==
>>>> +
>>>> +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
>>>> +are relating to the virtual disk, stored in this image.
>>>> +
>>>> +=== Dirty Bitmap Directory ===
>>>> +
>>>> +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
>>>> +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
>>>> +starting offset and length are given by the header extension fields
>>>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>>>> +the bitmap directory have variable length, depending on the length of the
>>>> +bitmap name. These entries are also called dirty bitmap headers.
>>>> +
>>>> +Dirty Bitmap Directory Entry:
>>>> +
>>>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>>>> +                    Offset into the image file at which the Dirty Bitmap Table
>>>> +                    (described below) for the bitmap starts. Must be aligned to
>>>> +                    a cluster boundary.
>>>> +
>>>> +         8 - 11:    dirty_bitmap_table_size
>>>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>>>> +
>>>> +        12 - 15:    flags
>>>> +                    Bit
>>>> +                      0: in_use
>>>> +                         The bitmap was not saved correctly and may be
>>>> +                         inconsistent.
>>>> +
>>>> +                      1: auto
>>>> +                         The bitmap should be autoloaded as block dirty bitmap
>>>> +                         and tracking should be started. Type of the bitmap
>>>> +                         should be 'Dirty Tracking Bitmap'.
>>> I find the wording a bit too qemu-specific. How about this:
>>>
>>> This bitmap is the default dirty bitmap for the virtual disk represented
>>> by this qcow2 image. It should track all write accesses immediately
>>> after the image has been opened.
>>>
>>> And I find the "should" in "Type of the bitmap should be..." a bit too
>>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>>
>>>> +
>>>> +                    Bits 2 - 31 are reserved and must be 0.
>>>> +
>>>> +        16 - 17:    name_size
>>>> +                    Size of the bitmap name. Valid values: 1 - 1023.
>>>> +
>>>> +             18:    type
>>>> +                    This field describes the sort of the bitmap.
>>>> +                    Values:
>>>> +                      0: Dirty Tracking Bitmap
>>> If we allow different kinds of bitmaps, it should not be called "dirty
>>> bitmap" everywhere anymore.
>>>
>>>> +
>>>> +                    Values 1 - 255 are reserved.
>>>> +# This is mostly for error checking and information in qemu-img info output.
>>>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>>>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>>>> +# Alloction Bitmap" (Fam, John, please comment).
>>> I'm waiting for their comments because that sounds like "refcount table
>>> with refcount_bits=1" to me. :-)
>> Allocation information for qcow2 can be obtained from L1/L2/refcount tables, so
>> maybe it's not worth a type here.  I am not quite sure about the "backup
>> bitmap" type either, because it is QEMU specific; alternatively we can add an
>> "enabled" flag to each dirty bitmap, so that the "drive-backup" bitmap can be
>> saved as a disabled dirty tracking bitmap.
>>
>> But the idea of having the type field makes sense to me in general.
>>
> Right -- our plans for other types are still nebulous, but this gives us
> some wiggle room.
>
>>>> +             19:    granularity_bits
>>>> +                    Granularity bits. Valid values are: 0 - 31.
>>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>>> +# there are no reasons of increasing it.
>>> Good (implicit) question. I can't imagine any reason for wanting to have
>>> a coarser granularity than 2 GB, but I do think there may be a need in
>>> the future for some people.
>>>
>>> Once again, I think we should discriminate between what is generally a
>>> useful limitation and what is simply due to qemu not supporting anything
>>> else right now.
>>>
>>> Thus, I think it would be better to increase the range to 0 - 63 and
>>> make a note that qemu only supports values up to 31 right now.
>>>
>>>> +
>>>> +                    Granularity is calculated as
>>>> +                        granularity = 1 << granularity_bits
>>>> +
>>>> +                    Granularity of the bitmap is how many bytes of the image
>>>> +                    accounts for one bit of the bitmap.
>>>> +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
>>>> +# here, not sector-granularity.
>>> I like that. But do note that qcow2 does align everything at least to
>>> 512 bytes, so having used sector granularity wouldn't have been too bad.
>> I would then suggest limiting the valid values to 9 - 63, at least as a note
>> for QEMU support.
>>
>  From qcow2.txt:
>
> '20 - 23:   cluster_bits
>              Number of bits that are used for addressing an offset
>              within a cluster (1 << cluster_bits is the cluster size).
>              Must not be less than 9 (i.e. 512 byte clusters).'
>
> If we wanted to make the argument for it, we actually could feasibly
> introduce 9 as a minimum, arguing that there's simply no use for a
> bitmap that tracks granularities any smaller than the qcow2 can even
> support.
>
> We could also simply make the case that the bitmap granularity can never
> be any smaller than the qcow2's current sector size, but that would
> require some small adjustments in QEMU currently to reject bitmaps
> configured to be smaller than such.
>
> It's also fine as-is, as having a smaller granularity won't actually
> break anything, it's just redundantly verbose and wasteful.
>
>>>> +
>>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>>> +                    unique among all dirty bitmap names within the Dirty
>>>> +                    bitmaps extension.
>>>> +
>>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>>> +                    to the next multiple of 8.
>>> What I'd like here is variable additional information based on the
>>> bitmap type. For some types, this may be absolutely necessary; for dirty
>>> tracking bitmaps it depends on what we do about the reference point thing.
>>>
>>> The reference point thing is the following: As mentioned at the top, I'd
>>> like there to be some kind of description of what the clean state was.
>>> As far as I know, this is generally a backup in the form of a file. In
>>> that case, we could put that filename here.
>>>
>>> I don't think not having a reference point description is a serious show
>>> stopper. qemu itself does rely on the management layer to know which
>>> bitmap to use when. But I think it would be pretty nice to have it here.
>>>
>>>> +
>>>> +=== Dirty Bitmap Table ===
>>>> +
>>>> +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
>>>> +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
>>>> +It is called Dirty Bitmap Table.
>>>> +
>>>> +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>>>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>>>> +in the image file.
>>>> +
>>>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>>>> +be obtained as follows:
>>>> +
>>>> +    byte_offset =
>>>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>>>> +
>>>> +Taking into account the granularity of the bitmap, an offset in bits into the
>>>> +image file, corresponding to byte number byte_nr of the image can be calculated
>>>> +like this:
>>>> +
>>>> +    bit_offset =
>>>> +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
>>>> +
>>>> +Note: the last formula is only for understanding the things, it is unlikely for
>>>> +it to be useful in a program.
>>> I think this note is superfluous. All the pseudo-code in this file is
>>> only that, pseudo-code. ;-)
>>>
>>> Apart from that, I think this last formula should be in its own section
>>> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
>>> bitmap. Putting this term there should basically suffice.
>>>
>>> I was about to say I'd like it to define the bit order, too (i.e. "bit
>>> offset 0 is the LSb"), but, well, it just uses the bit order used
>>> everywhere in qcow2.
>>>
>>>> +
>>>> +Dirty Bitmap Table entry:
>>>> +
>>>> +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
>>>> +                    If bits 9 - 55 are zero:
>>>> +                      0: Cluster should be read as all zeros.
>>>> +                      1: Cluster should be read as all ones.
>>>> +
>>>> +         1 -  8:    Reserved and must be zero.
>>>> +
>>>> +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
>>>> +                    cluster boundary. If the offset is 0, the cluster is
>>>> +                    unallocated, see bit 0 description.
>>>> +
>>>> +        56 - 63:    Reserved, must be 0.
>>>>
>>> Looks good.
>>>
>> Apart from all above, is it worth documenting what will happen with all the
>> existing dirty bitmaps when internal snapshots are created/removed/switched?
>>
>> Fam
>>
> The bitmaps need to be marked appropriately dirty for any cluster(s)
> that are changed as a result of the operation.
>
> The degenerate case if it's too hard to tell is to just mark the entire
> bitmap dirty. Not sure if deleting any bitmaps is the right answer,
> since management apps might expect to see them ...

So, what about:
1:  auto
     The bitmap must reflect all changes of the virtual disk by any 
application that would write to this qcow2 file (including writes, 
snapshot switching, etc.). The type of this bitmap must be 'Dirty 
Tracking Bitmap'.

>
> Even though a drive can be "rolled back" via a snapshot, there's not
> really an equivalent operation for a backup chain, and I don't even
> think it would be possible to implement.
>
> Unless the .qcow2 had a special knowledge of "This bitmap was last
> synced when we made a snapshot" QEMU wouldn't even be able to notify the
> management application that it should consider pointing to [some
> incremental] as the now most current one. Incrementals are very
> external-state dependent features.
>
> So even if we decided to keep backup copies of bitmaps, they wouldn't
> necessarily even be of much use to anyone, I think. Heaven help you if
> you delete a snapshot not created near any particular incremental.
>
> So the degenerate case of "You deleted a snapshot, that was an
> unfortunate choice, your bitmaps have been marked dirty" seems
> appropriate, even though the incremental backup chain now has a weird
> "old2 -> old1 -> deleted -> old1.1" style timeflow to it now.
>
> This is worth hashing out and including a blurb, I think.
>
> --js


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
  2015-12-14 20:45   ` John Snow
  2015-12-14 21:44     ` Max Reitz
@ 2015-12-21 13:41     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 14+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-12-21 13:41 UTC (permalink / raw)
  To: John Snow, Max Reitz, qemu-devel; +Cc: kwolf, den, famz, stefanha

On 14.12.2015 23:45, John Snow wrote:
>
> On 12/14/2015 03:05 PM, Max Reitz wrote:
>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>> The new feature for qcow2: storing dirty bitmaps.
>>>
>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>
>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>
>>>   docs/specs/qcow2.txt | 151 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 150 insertions(+), 1 deletion(-)
>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>> have some issues with it.
>>
>> Let's evaluate the main point of critique I had: I really want this not
>> to be qemu-specific but potentially useful to all programs.
>>
>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>> by describing how to obtain the bit offset of a certain byte guest
>> offset. So it's not an opaque binary data dump anymore.
>>
>> (Why only "pretty good"? I find the description to be a bit too
>> "implicit", I think a separate section describing the bitmap structure
>> would be better.)
>>
>> Good: The bitmap actually describes the qcow2 file.
>>
>> Not so good: While now any program knows how to read the bitmap and that
>> it does refer to this qcow2 file, it's interpretation is not so easy
>> still. Generally, a dirty bitmap has some reference point, that is the
>> state of the disk when the bitmap was cleared or created. For instance,
>> for incremental backups, whenever you create a backup based on a dirty
>> bitmap, the dirty bitmap is cleared and the backup target is then said
>> reference point.
>> I think it would be nice to put that reference point (i.e. the name of
>> an image file that contains the clean image) into the dirty bitmap
>> header, if possible.
>>
> This starts to get a little spooky, because QEMU doesn't necessarily
> know where (or what) the reference is. QEMU doesn't even know where the
> last incremental is.
>
> It might be hard to store something meaningful here.
>
> I suppose the management application could do it itself if it wants to.
>
>> (Note: I won't comment on orthography, because I feel like that is
>> something a native speaker should do. O:-))
>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..3c89580 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt
>>> @@ -103,7 +103,17 @@ in the description of a field.
>>>                       write to an image with unknown auto-clear features if it
>>>                       clears the respective bits from this field first.
>>>   
>>> -                    Bits 0-63:  Reserved (set to 0)
>>> +                    Bit 0:      Dirty bitmaps bit.
>>> +                                This bit is responsible for Dirty bitmaps
>>> +                                extension consistency.
>>> +                                If it is set, but there is no Dirty bitmaps
>>> +                                extensions, this should be considered as an
>>> +                                error.
>>> +                                If it is not set, but there is a Dirty bitmaps
>>> +                                extension, its data should be considered as
>>> +                                inconsistent.
>>> +
>>> +                    Bits 1-63:  Reserved (set to 0)
>>>   
>>>            96 -  99:  refcount_order
>>>                       Describes the width of a reference count block entry (width
>>> @@ -123,6 +133,7 @@ be stored. Each extension has a structure like the following:
>>>                           0x00000000 - End of the header extension area
>>>                           0xE2792ACA - Backing file format name
>>>                           0x6803f857 - Feature name table
>>> +                        0x23852875 - Dirty bitmaps
>>>                           other      - Unknown header extension, can be safely
>>>                                        ignored
>>>   
>>> @@ -166,6 +177,31 @@ the header extension data. Each entry look like this:
>>>                       terminated if it has full length)
>>>   
>>>   
>>> +== Dirty bitmaps ==
>>> +
>>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>>> +dirty bitmaps in a qcow2 image. The data of this extension should be considered
>>> +as consistent only if corresponding auto-clear feature bit is set (see
>>> +autoclear_features above).
>>> +The fields of Dirty bitmaps extension are:
>>> +
>>> +          0 -  3:  nb_dirty_bitmaps
>>> +                   The number of dirty bitmaps contained in the image. Valid
>>> +                   values: 1 - 65535.
>> Again, I don't see a reason for why we should impose a strict upper
>> limit here. I'd prefer "Note that qemu currently only supports up to
>> 65535 dirty bitmaps per image."
>>
> After discussing this with Eric, I agree.
>
> It's hard to present justification for arbitrary limits if our only
> concern is "That's just too many."
>
> Let's list the pragmatic limit we intend to impose in QEMU, but allow
> the format to use the fill width of this field.
>
>>> +# Let's be strict, the feature should be deleted with deleting last bitmap.
>>> +
>>> +          4 -  7:  dirty_bitmap_directory_size
>>> +                   Size of the Dirty Bitmap Directory in bytes. It should be
>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty bitmap
>>> +                   headers.
>> No, it "should" not be equal, it *must* be equal. But I think you can
>> just omit that last sentence, that would be just as fine.
>>
> It's informative, though. Just another clarifying detail that the bitmap
> directory is the collection of all the dirty bitmap headers.
>
> Replacing "should" with "must" is sufficient.
>
>>> +# This field is necessary to effectively read Dirty Bitmap Directory, because
>>> +# it's entries (which are dirty bitmap headers) may have different lengths.
>>> +
>>> +          8 - 15:  dirty_bitmap_directory_offset
>>> +                   Offset into the image file at which the Dirty Bitmap
>>> +                   Directory starts. Must be aligned to a cluster boundary.
>>> +
>>> +
>>>   == Host cluster management ==
>>>   
>>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>>> @@ -360,3 +396,116 @@ Snapshot table entry:
>>>   
>>>           variable:   Padding to round up the snapshot table entry size to the
>>>                       next multiple of 8.
>>> +
>>> +
>>> +== Dirty bitmaps ==
>>> +
>>> +The feature supports storing dirty bitmaps in a qcow2 image. All dirty bitmaps
>>> +are relating to the virtual disk, stored in this image.
>>> +
>>> +=== Dirty Bitmap Directory ===
>>> +
>>> +Each dirty bitmap saved in the image is described in a Dirty Bitmap Directory
>>> +entry. Dirty Bitmap Directory is a contiguous area in the image file, whose
>>> +starting offset and length are given by the header extension fields
>>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>>> +the bitmap directory have variable length, depending on the length of the
>>> +bitmap name. These entries are also called dirty bitmap headers.
>>> +
>>> +Dirty Bitmap Directory Entry:
>>> +
>>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>>> +                    Offset into the image file at which the Dirty Bitmap Table
>>> +                    (described below) for the bitmap starts. Must be aligned to
>>> +                    a cluster boundary.
>>> +
>>> +         8 - 11:    dirty_bitmap_table_size
>>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>>> +
>>> +        12 - 15:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap was not saved correctly and may be
>>> +                         inconsistent.
>>> +
>>> +                      1: auto
>>> +                         The bitmap should be autoloaded as block dirty bitmap
>>> +                         and tracking should be started. Type of the bitmap
>>> +                         should be 'Dirty Tracking Bitmap'.
>> I find the wording a bit too qemu-specific. How about this:
>>
>> This bitmap is the default dirty bitmap for the virtual disk represented
>> by this qcow2 image. It should track all write accesses immediately
>> after the image has been opened.
>>
> Closer; though we can certainly have more than one, so "a" may not be
> appropriate.
>
> "This bitmap is a dirty tracking bitmap for the virtual disk represented
> by this qcow2 image."
>
> And to avoid "should" again:
>
> "All writes to this file must also be represented in this bitmap."
>
>> And I find the "should" in "Type of the bitmap should be..." a bit too
>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>
> Sure.
>
>>> +
>>> +                    Bits 2 - 31 are reserved and must be 0.
>>> +
>>> +        16 - 17:    name_size
>>> +                    Size of the bitmap name. Valid values: 1 - 1023.
>>> +
>>> +             18:    type
>>> +                    This field describes the sort of the bitmap.
>>> +                    Values:
>>> +                      0: Dirty Tracking Bitmap
>> If we allow different kinds of bitmaps, it should not be called "dirty
>> bitmap" everywhere anymore.
>>
> Agreed.

And, how do we call them? Just "bitmaps"? Not very informative.. 
However, there are no other bitmaps in qcow2, so, I'll try call them 
"bitmaps" in the next version, let's look at it..

>
>>> +
>>> +                    Values 1 - 255 are reserved.
>>> +# This is mostly for error checking and information in qemu-img info output.
>>> +# The other types may be, for example, "Backup Bitmap" - to make it possible
>>> +# stop backup job on vm stop and resume it later. The another one is "Sector
>>> +# Alloction Bitmap" (Fam, John, please comment).
>> I'm waiting for their comments because that sounds like "refcount table
>> with refcount_bits=1" to me. :-)
>>
> The idea is that we may allow for bitmaps to store other kinds of
> information, like "allocated in this layer." That information is not
> necessarily useful to qcow2, but it might be for other image formats. If
> we ever do add such subtypes, we can always add a new reserved entry:
>
> # 1: Reserved - Invalid for qcow2
> # 2: Backup Progress Bitmap ...
> # 3-255: Reserved
>
> Backup progress bitmaps may indeed be useful and sane information to
> store in a qcow2, though.
>
> Fam may have more opinions as he's been working on this area of thought
> recently.
>
>>> +             19:    granularity_bits
>>> +                    Granularity bits. Valid values are: 0 - 31.
>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>> +# there are no reasons of increasing it.
>> Good (implicit) question. I can't imagine any reason for wanting to have
>> a coarser granularity than 2 GB, but I do think there may be a need in
>> the future for some people.
>>
>> Once again, I think we should discriminate between what is generally a
>> useful limitation and what is simply due to qemu not supporting anything
>> else right now.
>>
>> Thus, I think it would be better to increase the range to 0 - 63 and
>> make a note that qemu only supports values up to 31 right now.
>>
> I suppose that won't hurt anything.
>
> (I look forward to the future where the hard drives are so big and the
> network bandwidth so bountiful that 2GB granularity is seen as "too
> fine-grained!")
>
>>> +
>>> +                    Granularity is calculated as
>>> +                        granularity = 1 << granularity_bits
>>> +
>>> +                    Granularity of the bitmap is how many bytes of the image
>>> +                    accounts for one bit of the bitmap.
>>> +# To be closer to qcow2 and its reality, I've decided to use byte-granularity
>>> +# here, not sector-granularity.
>> I like that. But do note that qcow2 does align everything at least to
>> 512 bytes, so having used sector granularity wouldn't have been too bad.
>>
>>> +
>>> +        variable:   The name of the bitmap (not null terminated). Should be
>>> +                    unique among all dirty bitmap names within the Dirty
>>> +                    bitmaps extension.
>>> +
>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size
>>> +                    to the next multiple of 8.
>> What I'd like here is variable additional information based on the
>> bitmap type. For some types, this may be absolutely necessary; for dirty
>> tracking bitmaps it depends on what we do about the reference point thing.
>>
>> The reference point thing is the following: As mentioned at the top, I'd
>> like there to be some kind of description of what the clean state was.
>> As far as I know, this is generally a backup in the form of a file. In
>> that case, we could put that filename here.
>>
> We may also have exported that backup to an NBD server and we're not
> sure (on the local end) where that data is anymore, though.
>
> For local utility usage, when a reference is possible, we might be able
> to list it as an optional nice thing, but I think requiring it might be
> difficult.
>
>> I don't think not having a reference point description is a serious show
>> stopper. qemu itself does rely on the management layer to know which
>> bitmap to use when. But I think it would be pretty nice to have it here.
>>
> I'm not opposed to listing some "nice" information when available.
>
> last_backup /path/to/incremental.5.qcow2
> base_backup /path/to/reference.qcow2
>
>>> +
>>> +=== Dirty Bitmap Table ===
>>> +
>>> +Dirty bitmaps are stored using a one-level (not two-level like refcounts and
>>> +guest clusters mapping) structure for the mapping of bitmaps to host clusters.
>>> +It is called Dirty Bitmap Table.
>>> +
>>> +Each Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>>> +in the image file.
>>> +
>>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>>> +be obtained as follows:
>>> +
>>> +    byte_offset =
>>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>>> +
>>> +Taking into account the granularity of the bitmap, an offset in bits into the
>>> +image file, corresponding to byte number byte_nr of the image can be calculated
>>> +like this:
>>> +
>>> +    bit_offset =
>>> +        byte_offset(byte_nr / granularity / 8) * 8 + (byte_nr / granularity) % 8
>>> +
>>> +Note: the last formula is only for understanding the things, it is unlikely for
>>> +it to be useful in a program.
>> I think this note is superfluous. All the pseudo-code in this file is
>> only that, pseudo-code. ;-)
>>
>> Apart from that, I think this last formula should be in its own section
>> ("Dirty Bitmaps" or simply "Bitmaps") which describes the structure of a
>> bitmap. Putting this term there should basically suffice.
>>
>> I was about to say I'd like it to define the bit order, too (i.e. "bit
>> offset 0 is the LSb"), but, well, it just uses the bit order used
>> everywhere in qcow2.
>>
>>> +
>>> +Dirty Bitmap Table entry:
>>> +
>>> +    Bit       0:    Reserved and should be zero if bits 9 - 55 are non-zero.
>>> +                    If bits 9 - 55 are zero:
>>> +                      0: Cluster should be read as all zeros.
>>> +                      1: Cluster should be read as all ones.
>>> +
>>> +         1 -  8:    Reserved and must be zero.
>>> +
>>> +         9 - 55:    Bits 9 - 55 of host cluster offset. Must be aligned to a
>>> +                    cluster boundary. If the offset is 0, the cluster is
>>> +                    unallocated, see bit 0 description.
>>> +
>>> +        56 - 63:    Reserved, must be 0.
>>>
>> Looks good.
>>
>> Max
>>
> Thank you, Max & Vladimir.
>
> --js


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-12-21 13:42 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-14 17:43 [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
2015-12-14 20:05 ` Max Reitz
2015-12-14 20:45   ` John Snow
2015-12-14 21:44     ` Max Reitz
2015-12-14 22:26       ` John Snow
2015-12-15  4:34         ` Fam Zheng
2015-12-15 14:10         ` Max Reitz
2015-12-21 13:41     ` Vladimir Sementsov-Ogievskiy
2015-12-15  4:18   ` Fam Zheng
2015-12-15 10:04     ` Vladimir Sementsov-Ogievskiy
2015-12-15 16:40     ` John Snow
2015-12-21 12:20       ` Vladimir Sementsov-Ogievskiy
2015-12-15  9:58   ` Kevin Wolf
2015-12-15 14:03     ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.