From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39184) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aMAWF-0003ap-Na for qemu-devel@nongnu.org; Thu, 21 Jan 2016 03:23:08 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aMAWB-0001yD-K8 for qemu-devel@nongnu.org; Thu, 21 Jan 2016 03:23:03 -0500 Received: from mx2.parallels.com ([199.115.105.18]:49609) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aMAWB-0001wz-Am for qemu-devel@nongnu.org; Thu, 21 Jan 2016 03:22:59 -0500 Message-ID: <56A09559.7040805@virtuozzo.com> Date: Thu, 21 Jan 2016 11:22:49 +0300 From: Vladimir Sementsov-Ogievskiy MIME-Version: 1.0 References: <1452517517-3953-1-git-send-email-vsementsov@virtuozzo.com> <20160119174849.GH4579@noname.redhat.com> <569F7EE2.6080402@virtuozzo.com> <569FFA90.9040401@redhat.com> In-Reply-To: <569FFA90.9040401@redhat.com> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v7] spec: add qcow2 bitmaps extension specification List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: John Snow , Kevin Wolf Cc: famz@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org On 21.01.2016 00:22, John Snow wrote: > > On 01/20/2016 07:34 AM, Vladimir Sementsov-Ogievskiy wrote: >> On 19.01.2016 20:48, Kevin Wolf wrote: >>> Am 11.01.2016 um 14:05 hat Vladimir Sementsov-Ogievskiy geschrieben: >>>> The new feature for qcow2: storing bitmaps. >>>> >>>> This patch adds new header extension to qcow2 - Bitmaps Extension. It >>>> provides an ability to store virtual disk related bitmaps in a qcow2 >>>> image. For now there is only one type of such bitmaps: Dirty Tracking >>>> Bitmap, which just tracks virtual disk changes from some moment. >>>> >>>> Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file, >>>> should be stored in this qcow2 file. The size of each bitmap >>>> (considering its granularity) is equal to virtual disk size. >>>> >>>> Signed-off-by: Vladimir Sementsov-Ogievskiy >>>> --- >>>> >>>> v7: >>>> >>>> - Rewordings, grammar. >>>> Max, Eric, John, thank you very much. >>>> >>>> - add last paragraph: remaining bits in bitmap data clusters must be >>>> zero. >>>> >>>> - s/Bitmap Directory/bitmap directory/ and other names like this at >>>> the request of Max. >>>> >>>> v6: >>>> >>>> - reword bitmap_directory_size description >>>> - bitmap type: make 0 reserved >>>> - extra_data_size: resize to 4bytes >>>> Also, I've marked this field as "must be zero". We can always change >>>> it, if we decide allowing managing app to specify any extra data, by >>>> defining some magic value as a top of user extra data.. So, for now >>>> non zeor extra_data_size should be considered as an error. >>>> - swap name and extra_data to give good alignment to extra_data. >>>> >>>> >>>> v5: >>>> >>>> - 'Dirty bitmaps' renamed to 'Bitmaps', as we may have several types of >>>> bitmaps. >>>> - rewordings >>>> - move upper bounds to "Notes about Qemu limits" >>>> - s/should/must somewhere. (but not everywhere) >>>> - move name_size field closer to name itself in bitmap header >>>> - add extra data area to bitmap header >>>> - move bitmap data description to separate section >>>> >>>> docs/specs/qcow2.txt | 172 >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++- >>>> 1 file changed, 171 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt >>>> index 121dfc8..997239d 100644 >>>> --- a/docs/specs/qcow2.txt >>>> +++ b/docs/specs/qcow2.txt >>>> @@ -103,7 +103,18 @@ in the description of a field. >>>> write to an image with unknown auto-clear >>>> features if it >>>> clears the respective bits from this field first. >>>> - Bits 0-63: Reserved (set to 0) >>>> + Bit 0: Bitmaps extension bit >>>> + This bit indicates consistency for >>>> the bitmaps >>>> + extension data. >>>> + >>>> + It is an error if this bit is set >>>> without the >>>> + bitmaps extension present. >>>> + >>>> + If the bitmaps extension is present >>>> but this >>>> + bit is unset, the bitmaps extension >>>> data is >>>> + inconsistent. >>> It may as well be consistent, but we don't know. >>> >>> Perhaps something like "must be considered inconsistent" or "is >>> potentially inconsistent". >>> >>>> + >>>> + Bits 1-63: Reserved (set to 0) >>>> 96 - 99: refcount_order >>>> Describes the width of a reference count block >>>> entry (width >>>> @@ -123,6 +134,7 @@ be stored. Each extension has a structure like >>>> the following: >>>> 0x00000000 - End of the header extension area >>>> 0xE2792ACA - Backing file format name >>>> 0x6803f857 - Feature name table >>>> + 0x23852875 - Bitmaps extension >>>> other - Unknown header extension, can >>>> be safely >>>> ignored >>>> @@ -166,6 +178,34 @@ the header extension data. Each entry look >>>> like this: >>>> terminated if it has full length) >>>> +== Bitmaps extension == >>>> + >>>> +The bitmaps extension is an optional header extension. It provides >>>> the ability >>>> +to store bitmaps related to a virtual disk. For now, there is only >>>> one bitmap >>>> +type: the dirty tracking bitmap, which tracks virtual disk changes >>>> from some >>>> +point in time. >>> I have one major problem with this patch, and it starts here. >>> >>> The spec talks about dirty tracking bitmaps all the way, but it never >>> really defines what a dirty tracking bitmap even contains. It has a few >>> hints here and there, but they aren't consistent. >>> >>> Here's the first hint: They track "virtual disk changes", which implies >>> they track guest clusters rather than host clusters. >>> >>>> +The data of the extension should be considered consistent only if the >>>> +corresponding auto-clear feature bit is set, see autoclear_features >>>> above. >>>> + >>>> +The fields of the bitmaps extension are: >>>> + >>>> + 0 - 3: nb_bitmaps >>>> + The number of bitmaps contained in the image. >>>> Must be >>>> + greater than or equal to 1. >>>> + >>>> + Note: Qemu currently only supports up to 65535 >>>> bitmaps per >>>> + image. >>>> + >>>> + 4 - 7: bitmap_directory_size >>>> + Size of the bitmap directory in bytes. It is the >>>> cumulative >>>> + size of all (nb_bitmaps) bitmap headers. >>>> + >>>> + 8 - 15: bitmap_directory_offset >>>> + Offset into the image file at which the bitmap >>>> directory >>>> + starts. Must be aligned to a cluster boundary. >>>> + >>>> + >>>> == Host cluster management == >>>> qcow2 manages the allocation of host clusters by maintaining a >>>> reference count >>>> @@ -360,3 +400,133 @@ Snapshot table entry: >>>> variable: Padding to round up the snapshot table entry >>>> size to the >>>> next multiple of 8. >>>> + >>>> + >>>> +== Bitmaps == >>>> + >>>> +As mentioned above, the bitmaps extension provides the ability to >>>> store bitmaps >>>> +related a virtual disk. This section describes how these bitmaps are >>>> stored. >>> s/related/related to/ >>> >>>> +Note: all bitmaps are related to the virtual disk stored in this image. >>>> + >>>> +=== Bitmap directory === >>>> + >>>> +Each bitmap saved in the image is described in a bitmap directory >>>> entry. The >>>> +bitmap directory is a contiguous area in the image file, whose >>>> starting offset >>>> +and length are given by the header extension fields >>>> bitmap_directory_offset and >>>> +bitmap_directory_size. The entries of the bitmap directory have >>>> variable >>>> +length, depending on the length of the bitmap name and extra data. >>>> These >>>> +entries are also called bitmap headers. >>>> + >>>> +Structure of a bitmap directory entry: >>>> + >>>> + Byte 0 - 7: bitmap_table_offset >>>> + Offset into the image file at which the bitmap >>>> table >>>> + (described below) for the bitmap starts. Must be >>>> aligned to >>>> + a cluster boundary. >>>> + >>>> + 8 - 11: bitmap_table_size >>>> + Number of entries in the bitmap table of the >>>> bitmap. >>>> + >>>> + 12 - 15: flags >>>> + Bit >>>> + 0: in_use >>>> + The bitmap was not saved correctly and may be >>>> + inconsistent. >>>> + >>>> + 1: auto >>>> + The bitmap must reflect all changes of the >>>> virtual >>>> + disk by any application that would write to >>>> this qcow2 >>>> + file (including writes, snapshot switching, >>>> etc.). The >>>> + type of this bitmap must be 'dirty tracking >>>> bitmap'. >>> This suggests that we can represent snapshot switching in a dirty >>> tracking bitmap. Which is almost impossible if we track guest clusters, >>> except if loading a snapshot should mean that all bits in the bitmap are >>> set. However, that feels a bit useless and dirty tracking across >>> snapshots doesn't seem to make a whole lot of sense anyway. >>> >>> Maybe what would make more sense is that a bitmap is tied to a specific >>> snapshot or bitmaps are included in a snapshot, so that you can revert >>> to the dirty status as it was when the snapshot was taken. >>> >>> Or, if you really meant, that the tracking words on a host cluster >>> level, what clusters would be included in the bitmap? Would only the >>> virtual disk be included? VM state? Any metadata? >>> >>> It seems we definitely need a new section on dirty tracking bitmaps that >>> describes what the bitmap actually means and how it's supposed to work >>> with snapshots. I guess we could also talk about how it works with other >>> changes to the image like resizing. >> Bitmaps track guest clusters. >> Backup is not related to snapshots, so, I think set all bits in the >> dirty bitmap on snapshot switch is ok. Of course we can't just ignore >> snapshot switch, as it changes how user (and backup) sees the virtual >> disk. Deleting the bitmap on snapshot switch is not good option too I >> think. >> >> Bitmap size is defined to be equal to virtual disk size. So on resize, >> the one who resize must resize the bitmap too, to maintain accordance of >> the image file and the spec. >> >> I'll think about such section, ok. >> >>>> + Bits 2 - 31 are reserved and must be 0. >>>> + >>>> + 16: type >>>> + This field describes the sort of the bitmap. >>>> + Values: >>>> + 1: Dirty tracking bitmap >>>> + >>>> + Values 0, 2 - 255 are reserved. >>>> + >>>> + 17: granularity_bits >>>> + Granularity bits. Valid values: 0 - 63. >>>> + >>>> + Note: Qemu currently doesn't support >>>> granularity_bits >>>> + greater than 31. >>>> + >>>> + Granularity is calculated as >>>> + granularity = 1 << granularity_bits >>>> + >>>> + A bitmap's granularity is how many bytes of the >>>> image >>>> + accounts for one bit of the bitmap. >>>> + >>>> + 18 - 19: name_size >>>> + Size of the bitmap name. Must be non-zero. >>>> + >>>> + Note: Qemu currently doesn't support values >>>> greater than >>>> + 1023. >>>> + >>>> + 20 - 23: extra_data_size >>>> + Size of type-specific extra data. >>>> + >>>> + For now, as no extra data is defined, >>>> extra_data_size is >>>> + reserved and must be zero. >>>> + >>>> + variable: Type-specific extra data for the bitmap. >>> We talked about this in the other subthread. >>> >>>> + variable: The name of the bitmap (not null terminated). >>>> Must be >>>> + unique among all bitmap names within the bitmaps >>>> extension. >>>> + >>>> + variable: Padding to round up the bitmap directory entry >>>> size to the >>>> + next multiple of 8. >>>> + >>>> +=== Bitmap table === >>>> + >>>> +Bitmaps are stored using a one-level structure (as opposed to two-level >>>> +structure like for refcounts and guest clusters mapping) for the >>>> mapping of >>>> +bitmap data to host clusters. This structure is called the bitmap >>>> table. >>>> + >>>> +Each bitmap table has a variable size (stored in the bitmap >>>> directory Entry) >>>> +and may use multiple clusters, however, it must be contiguous in the >>>> image >>>> +file. >>>> + >>>> +Structure of a bitmap table entry: >>>> + >>>> + Bit 0: Reserved and must be zero if bits 9 - 55 are >>>> non-zero. >>>> + If bits 9 - 55 are zero: >>>> + 0: Cluster should be read as all zeros. >>>> + 1: Cluster should be read as all ones. >>>> + >>>> + 1 - 8: Reserved and must be zero. >>>> + >>>> + 9 - 55: Bits 9 - 55 of the host cluster offset. Must be >>>> aligned to >>>> + a cluster boundary. If the offset is 0, the >>>> cluster is >>>> + unallocated; in that case, bit 0 determines how >>>> this >>>> + cluster should be treated when read from. >>>> + >>>> + 56 - 63: Reserved and must be zero. >>>> + >>>> +=== Bitmap data === >>>> + >>>> +As noted above, bitmap data is stored in separate clusters, >>>> described by the >>>> +bitmap table. Given an offset (in bytes) into the bitmap data, the >>>> offset into >>>> +the image file can be obtained as follows: >>>> + >>>> + image_offset = >>>> + bitmap_table[bitmap_data_offset / cluster_size] + >>>> + (bitmap_data_offset % cluster_size) >>>> + >>>> +This offset is not defined if bits 9 - 55 of bitmap table entry are >>>> zero (see >>>> +above). >>>> + >>>> +Given an offset byte_nr into the virtual disk and the bitmap's >>>> granularity, the >>>> +bit offset into the bitmap can be calculated like this: >>>> + >>>> + bit_offset = >>>> + image_offset(byte_nr / granularity / 8) * 8 + >>>> + (byte_nr / granularity) % 8 >>>> + >>>> +If the size of the bitmap data is not a multiply of cluster size >>>> then the last >>>> +cluster of the bitmap data contains some unused tail bits. These >>>> bits must be >>>> +zero. >>> In which order are the bits stored in the bitmap? >> What do you mean? > He means BE or LE. You can intuit it from the formula, but it's not > explicitly stated, I think byte ordering? But it make sense when we deal with integers, which occupies more then 1 byte. Here is plain byte sequence. Bitmap. Like a string.. What should I write here additionally? >>> Kevin >> -- Best regards, Vladimir * now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.