qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
@ 2015-09-05 16:43 Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity() Vladimir Sementsov-Ogievskiy
                   ` (25 more replies)
  0 siblings, 26 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

v3:

Hi all! After a long delay I am starting a new thread about persistent
dirty bitmaps. Not all ideas/requests from the v2 thread are done, but I
need to start somewhere. There are also a lot of code changes to be
considered before everything is finished, and the v2 thread has become
too large to handle.

Here I'll summarize what was changed. All ideas/comments from the v2
thread that were not done I'll add as replies to this cover letter. If I
miss something, please add it.

01-02: new patches, necessary preparation

03: docs
- add constraints
- document auto clear bit
- renames
    L1 table -> Dirty Bitmap Table
    Dirty Bitmaps Table -> Dirty Bitmap Directory
- remove Standard cluster with its zero bit

struct changes:

extension header:
  add 32bit dirty_bitmap_directory_size
  it is needed to read the whole Dirty Bitmap Directory.
  nb_dirty_bitmaps becomes redundant (it may be calculated by looking
  through Dirty Bitmap Directory).
  RFC: is it bad? should I drop nb_dirty_bitmaps?

directory entry:
- bitmap size is renamed to 'number of virtual bits'
- flags added. (self/auto/read_only are not implemented in code for now)
- granularity is now represented by granularity_bits
 

04: structs and consts
Structures changed according to the spec.
QCowDirtyBitmap now only contains name and offset. The offset is in
bytes from the start of the Dirty Bitmap Directory, so the bitmap header
can be accessed as:
cached version: s->dirty_bitmap_directory + offset
in image file: s->dirty_bitmap_directory_offset + offset
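
The two access paths above can be sketched as follows. This is only an
illustration: DirState is a hypothetical stand-in for the three
BDRVQcowState fields involved, and the helper names are invented for the
example rather than taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the BDRVQcowState fields used here. */
typedef struct DirState {
    uint64_t dirty_bitmap_directory_offset; /* directory start in the image */
    size_t dirty_bitmap_directory_size;     /* directory length in bytes */
    uint8_t *dirty_bitmap_directory;        /* cached copy of the directory */
} DirState;

/* Cached entry header for a QCowDirtyBitmap offset, or NULL if the offset
 * does not fall inside the cached directory. */
static uint8_t *dir_cached_entry(const DirState *s, uint64_t offset)
{
    assert(s != NULL);
    if (s->dirty_bitmap_directory == NULL ||
        offset >= s->dirty_bitmap_directory_size) {
        return NULL;
    }
    return s->dirty_bitmap_directory + offset;
}

/* Position of the same entry header in the image file. */
static uint64_t dir_entry_file_offset(const DirState *s, uint64_t offset)
{
    assert(s != NULL);
    return s->dirty_bitmap_directory_offset + offset;
}
```

Keeping only the offset in QCowDirtyBitmap means both views of an entry
stay derivable from one number, at the cost of a range check on every
lookup.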

== dirty-bitmaps feature ==

initial patch [PATCH 2/8] qcow2: add dirty-bitmaps feature
is split into 5 parts. The first, with the license notice and
constants, is in 04. Then, there are:
05: read Dirty Bitmap Directory
06: load dirty bitmap
07: store dirty bitmap
08: add dirty bitmap extension

The code is significantly changed and reorganized. In response to your
comments:
- fixes about flushes
- g_try_... functions are used. But I left the old g_malloc/g_new for
things with constant or strictly limited size (dirty bitmap header
with its name, where name size < 1024; one cluster; bitmap name (< 1024))
- input validation added.
- dirty bitmap directory is read/written as a whole, not in parts.
- load/store through bdrv_dirty_bitmap_[de]serialize_part
- 'delete dirty bitmap' function is removed. This function is not used
in v3. Maybe this functionality should appear in qemu-img, or with the
qmp series.
- qcow2_write_dirty_bitmaps function is removed. This functionality is
rewritten: now the Bitmap Directory is rewritten from
s->dirty_bitmap_directory (directory_write()), and not generated from
s->dirty_bitmaps

09: load check
- new functionality, just check existence of dirty bitmap by name

10: store block dirty bitmap
- bdrv_store_dirty_bitmap simplified, as bitmap is simply passed to
qcow2_dirty_bitmap_store

11: load block dirty bitmap
- bdrv_load_dirty_bitmap simplified, as bitmap is created by
qcow2_dirty_bitmap_load. Also, granularity parameter is removed

patch [[PATCH 5/8] qcow2: add qcow2_dirty_bitmap_delete_all] is
removed. This function is not used in v3. Maybe this functionality
should appear in qemu-img, or with the qmp series.

12: autoclear bit
- if it is not set, but bitmap extension is found, the extension is just
skipped and a warning is printed. Dirty bitmaps are not cleared from the
image as in v2
- handling in qcow2-dirty-bitmap.c is changed, the bit is just set when
adding an entry to the Dirty Bitmap Directory in the image.

13: cmd line
create=on|off flag added. Now, the bitmap is not auto-created.
By default the flag is off.
If on, then a new bitmap will be created in the image; if a bitmap with
the same name already exists, an error will be generated.
If off, then the bitmap will be loaded from the image; if there is no
such bitmap, an error will be generated.
If create=off and granularity is specified, then the granularity of the
loaded bitmap will be checked, and if it does not match, an error will
be generated.

also, s/drive/node

also, dirty_bitmap_func():
    - add errp parameter and handle this in main()
    - open image with BDRV_O_RDWR for loading bitmaps
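
The create=on|off rules for patch 13 can be sketched as a small decision
function. This is an illustration of the rules as stated in this cover
letter, not the code from the patch: the enum, the function name, and the
convention that a requested granularity of 0 means "not specified" are
all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of processing one -dirty-bitmap option. */
typedef enum BitmapOpenResult {
    BITMAP_CREATE,          /* create=on, name is free */
    BITMAP_LOAD,            /* create=off, bitmap found and acceptable */
    BITMAP_ERR_EXISTS,      /* create=on, but the name is already taken */
    BITMAP_ERR_NOT_FOUND,   /* create=off, but there is no such bitmap */
    BITMAP_ERR_GRANULARITY  /* create=off, granularity does not match */
} BitmapOpenResult;

/* requested_granularity == 0 is assumed to mean "not specified". */
static BitmapOpenResult bitmap_open_decision(bool create, bool exists,
                                             uint32_t requested_granularity,
                                             uint32_t stored_granularity)
{
    /* A bitmap stored in the image always has a non-zero granularity. */
    assert(!exists || stored_granularity != 0);

    if (create) {
        return exists ? BITMAP_ERR_EXISTS : BITMAP_CREATE;
    }
    if (!exists) {
        return BITMAP_ERR_NOT_FOUND;
    }
    if (requested_granularity != 0 &&
        requested_granularity != stored_granularity) {
        return BITMAP_ERR_GRANULARITY;
    }
    return BITMAP_LOAD;
}
```

For example, create=off with granularity=4096 against a stored bitmap of
granularity 65536 would yield BITMAP_ERR_GRANULARITY under these
assumptions.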

14-16: new patches

17: test
- add three test cases.


v2:
 - rebase on my 'Dirty bitmaps migration' series
 - remove 'print dirty bitmap', 'query-dirty-bitmap' and use md5 for
   testing like with dirty bitmaps migration
 - autoclean features

v1:

The bitmaps are saved into qcow2 file format. It provides both
'internal' and 'external' dirty bitmaps feature:
 - for qcow2 drives we can store bitmaps in the same file
 - for other formats we can store bitmaps in the separate qcow2 file

QCow2 header is extended by fields 'nb_dirty_bitmaps' and
'dirty_bitmaps_offset' like with snapshots.

Proposed command line syntax is the following:

-dirty-bitmap [option1=val1][,option2=val2]...
    Available options are:
    name         The name for the bitmap (necessary).

    file         The file to load the bitmap from.

    file_id      When specified with the 'file' option, this file will
                 be available through this id for other -dirty-bitmap
                 options. When specified without the 'file' option, it
                 is a reference to a 'file' specified with another
                 -dirty-bitmap option, and that file will be used to
                 load the bitmap from.

    drive        The drive to bind the bitmap to. It should be specified
                 as 'id' suboption of one of -drive options. If neither
                 'file' nor 'file_id' is specified, then the bitmap
                 will be loaded from that drive (internal dirty bitmap).

    granularity  The granularity for the bitmap. Not necessary, the
                 default value may be used.

    enabled      on|off. Default is 'on'. Disabled bitmaps do not
                 change regardless of writes to the corresponding drive.

Examples:

qemu -drive file=a.qcow2,id=disk -dirty-bitmap name=b,drive=disk
qemu -drive file=a.raw,id=disk \
     -dirty-bitmap name=b,drive=disk,file=b.qcow2,enabled=off

Vladimir Sementsov-Ogievskiy (8):
  spec: add qcow2-dirty-bitmaps specification
  qcow2: add dirty-bitmaps feature
  block: store persistent dirty bitmaps
  block: add bdrv_load_dirty_bitmap
  qcow2: add qcow2_dirty_bitmap_delete_all
  qcow2: add autoclear bit for dirty bitmaps
  qemu: command line option for dirty bitmaps
  iotests: test internal persistent dirty bitmap

 block.c                       |  82 +++++++
 block/Makefile.objs           |   2 +-
 block/qcow2-dirty-bitmap.c    | 537 ++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c                 |  69 +++++-
 block/qcow2.h                 |  61 +++++
 blockdev.c                    |  38 +++
 docs/specs/qcow2.txt          |  66 ++++++
 include/block/block.h         |   9 +
 include/block/block_int.h     |  10 +
 include/sysemu/blockdev.h     |   1 +
 include/sysemu/sysemu.h       |   1 +
 qemu-options.hx               |  37 +++
 tests/qemu-iotests/118        |  83 +++++++
 tests/qemu-iotests/118.out    |   5 +
 tests/qemu-iotests/group      |   1 +
 tests/qemu-iotests/iotests.py |   6 +
 vl.c                          | 100 ++++++++
 17 files changed, 1105 insertions(+), 3 deletions(-)
 create mode 100644 block/qcow2-dirty-bitmap.c
 create mode 100755 tests/qemu-iotests/118
 create mode 100644 tests/qemu-iotests/118.out

-- 
1.9.1


* [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-15 15:36   ` Eric Blake
  2015-10-05 22:47   ` John Snow
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size() Vladimir Sementsov-Ogievskiy
                   ` (24 subsequent siblings)
  25 siblings, 2 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c               | 2 +-
 include/block/block.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 4f7fc0d..6d14f5b 100644
--- a/block.c
+++ b/block.c
@@ -3591,7 +3591,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
     return granularity;
 }
 
-uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
+uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
 {
     return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
diff --git a/include/block/block.h b/include/block/block.h
index edc1510..fb7d410 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -495,7 +495,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
-uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
-- 
2.1.4


* [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-15 15:37   ` Eric Blake
  2015-10-05 22:48   ` John Snow
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
                   ` (23 subsequent siblings)
  25 siblings, 2 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c               | 5 +++++
 include/block/block.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/block.c b/block.c
index 6d14f5b..8c39d0a 100644
--- a/block.c
+++ b/block.c
@@ -3632,6 +3632,11 @@ const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
     return bitmap->name;
 }
 
+int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
+{
+    return bitmap->size;
+}
+
 uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap,
                                      uint64_t count)
 {
diff --git a/include/block/block.h b/include/block/block.h
index fb7d410..8166640 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -510,6 +510,7 @@ void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
 
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
+int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
 uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap,
                                      uint64_t count);
 void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
-- 
2.1.4


* [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity() Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:33   ` Vladimir Sementsov-Ogievskiy
                     ` (2 more replies)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts Vladimir Sementsov-Ogievskiy
                   ` (22 subsequent siblings)
  25 siblings, 3 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Persistent dirty bitmaps will be saved into qcow2 files. They may be used
as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
other drives (there may be a qcow2 file with zero disk size but with
several dirty bitmaps for other drives).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 126 insertions(+), 1 deletion(-)

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 121dfc8..5fc0365 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -103,7 +103,13 @@ in the description of a field.
                     write to an image with unknown auto-clear features if it
                     clears the respective bits from this field first.
 
-                    Bits 0-63:  Reserved (set to 0)
+                    Bit 0:      Dirty bitmaps bit. If this bit is set then
+                                there is a _consistent_ Dirty bitmaps extension
+                                in the image. If it is not set, but there is a
+                                Dirty bitmaps extension, its data should be
+                                considered as inconsistent.
+
+                    Bits 1-63:  Reserved (set to 0)
 
          96 -  99:  refcount_order
                     Describes the width of a reference count block entry (width
@@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
                         0x00000000 - End of the header extension area
                         0xE2792ACA - Backing file format name
                         0x6803f857 - Feature name table
+                        0x23852875 - Dirty bitmaps
                         other      - Unknown header extension, can be safely
                                      ignored
 
@@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
                     terminated if it has full length)
 
 
+== Dirty bitmaps ==
+
+Dirty bitmaps is an optional header extension. It provides an ability to store
+dirty bitmaps in a qcow2 image. The fields are:
+
+          0 -  3:  nb_dirty_bitmaps
+                   The number of dirty bitmaps contained in the image. Valid
+                   values: 0 - 65535.
+
+          4 -  7:  dirty_bitmap_directory_size
+                   Size of the Dirty Bitmap Directory in bytes. Valid values:
+                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
+
+          8 - 15:  dirty_bitmap_directory_offset
+                   Offset into the image file at which the Dirty Bitmap
+                   Directory starts. Must be aligned to a cluster boundary.
+
+
 == Host cluster management ==
 
 qcow2 manages the allocation of host clusters by maintaining a reference count
@@ -360,3 +385,103 @@ Snapshot table entry:
 
         variable:   Padding to round up the snapshot table entry size to the
                     next multiple of 8.
+
+
+== Dirty bitmaps ==
+
+The feature supports storing dirty bitmaps in a qcow2 image.
+
+=== Cluster mapping ===
+
+Dirty bitmaps are stored using a ONE-level structure for the mapping of
+bitmaps to host clusters. It is called Dirty Bitmap Table.
+
+The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
+Directory Entry) and may use multiple clusters, however it must be contiguous
+in the image file.
+
+Given an offset (in bytes) into the bitmap, the offset into the image file can
+be obtained as follows:
+
+    byte_offset =
+        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
+
+Taking into account the granularity of the bitmap, an offset in bits into the
+image file can be obtained like this:
+
+    bit_offset =
+        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
+
+Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
+"physical" bit with number (bit_nr / granularity).
+
+Dirty Bitmap Table entry:
+
+    Bit  0 -  8:    Reserved
+
+         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
+                    cluster boundary. If the offset is 0, the cluster is
+                    unallocated, and should be read as all zeros.
+
+        56 - 63:    Reserved
+
+=== Dirty Bitmap Directory ===
+
+Each dirty bitmap, saved in the image is described in the Dirty Bitmap
+Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
+whose starting offset and length are given by the header extension fields
+dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
+the bitmap directory have variable length, depending on the length of the
+bitmap name.
+
+Dirty Bitmap Directory Entry:
+
+    Byte 0 -  7:    dirty_bitmap_table_offset
+                    Offset into the image file at which the Dirty Bitmap Table
+                    for the bitmap starts. Must be aligned to a cluster
+                    boundary.
+
+         8 - 15:    nb_virtual_bits
+                    Number of "virtual" bits in the bitmap. Number of
+                    "physical" bits would be:
+                    (nb_virtual_bits + granularity - 1) / granularity
+
+        16 - 19:    dirty_bitmap_table_size
+                    Number of entries in the Dirty Bitmap Table of the bitmap.
+                    Valid values: 0 - 0x8000000.
+                    Also, (dirty_bitmap_table_size * cluster_size) should not
+                    be greater than 0x20000000 (512 MB)
+
+        20 - 23:    granularity_bits
+                    Granularity bits. Valid values are: 0 - 63.
+
+                    Granularity is calculated as
+                        granularity = 1 << granularity_bits
+
+                    Granularity of the bitmap is how many "virtual" bits
+                    accounts for one "physical" bit.
+
+        24 - 27:    flags
+                    Bit
+                      0: in_use
+                         The bitmap is in use and may be inconsistent.
+
+                      1: self
+                         The bitmap is a dirty bitmap for the containing image.
+
+                      2: auto
+                         The bitmap should be autoloaded as block dirty bitmap.
+                         Only available if bit 1 (self) is set.
+
+                      3: read_only
+                         The bitmap should not be rewritten.
+
+                    Bits 4 - 31 are reserved.
+
+        28 - 29:    name_size
+                    Size of the bitmap name. Valid values: 0 - 1023.
+
+        variable:   The name of the bitmap (not null terminated).
+
+        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
+                    the next multiple of 8.
-- 
2.1.4
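
The cluster-mapping formulas from the "Cluster mapping" section above can
be sketched in C as follows. This is a sketch under stated assumptions,
not code from the series: the function names are invented, the table is
assumed to be already converted to host byte order, and an unallocated
cluster is reported as host offset 0 (which cannot collide with a real
bitmap cluster, since offset 0 holds the image header).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 9-55 of a Dirty Bitmap Table entry hold the host cluster offset. */
#define TABLE_ENTRY_OFFSET_MASK 0x00fffffffffffe00ULL

/* byte_offset = table[offset / cluster_size] + (offset % cluster_size).
 * Returns false if 'offset' lies beyond the table; *host_offset is 0 for
 * an unallocated cluster, which reads as all zeros. */
static bool bitmap_byte_offset(const uint64_t *table, uint32_t table_size,
                               uint64_t cluster_size, uint64_t offset,
                               uint64_t *host_offset)
{
    uint64_t index, cluster;

    assert(cluster_size != 0);
    index = offset / cluster_size;
    if (index >= table_size) {
        return false;
    }
    cluster = table[index] & TABLE_ENTRY_OFFSET_MASK;
    *host_offset = cluster ? cluster + offset % cluster_size : 0;
    return true;
}

/* bit_offset for "virtual" bit bit_nr, with granularity given as a shift. */
static bool bitmap_bit_offset(const uint64_t *table, uint32_t table_size,
                              uint64_t cluster_size,
                              uint32_t granularity_bits, uint64_t bit_nr,
                              uint64_t *host_bit_offset)
{
    uint64_t phys_bit, host_byte;

    assert(granularity_bits <= 63);
    phys_bit = bit_nr >> granularity_bits;  /* bit_nr / granularity */
    if (!bitmap_byte_offset(table, table_size, cluster_size,
                            phys_bit / 8, &host_byte)) {
        return false;
    }
    *host_bit_offset = host_byte ? host_byte * 8 + phys_bit % 8 : 0;
    return true;
}
```

With 64 KiB clusters, granularity_bits = 16 and a first table entry of
0x50000, virtual bit 9 * 65536 + 3 maps to physical bit 9, that is, bit 1
of the byte at host offset 0x50001.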


* [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-10-06 20:12   ` John Snow
  2015-10-06 20:16   ` John Snow
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory Vladimir Sementsov-Ogievskiy
                   ` (21 subsequent siblings)
  25 siblings, 2 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Add data structures and constraints according to docs/specs/qcow2.txt

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/Makefile.objs        |  2 +-
 block/qcow2-dirty-bitmap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h              | 28 ++++++++++++++++++++++++++++
 3 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100644 block/qcow2-dirty-bitmap.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 58ef2ef..c6e1f4b 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,5 @@
 block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-dirty-bitmap.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
new file mode 100644
index 0000000..fd4e0ef
--- /dev/null
+++ b/block/qcow2-dirty-bitmap.c
@@ -0,0 +1,42 @@
+/*
+ * Dirty bitmaps for the QCOW version 2 format
+ *
+ * Copyright (c) 2014-2015 Vladimir Sementsov-Ogievskiy
+ *
+ * This file is derived from qcow2-snapshot.c, original copyright:
+ * Copyright (c) 2004-2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+/* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
+ * constants. Please do not use this _internal_ abbreviation for other needs
+ * and/or outside of this file. */
+
+/* Dirty Bitmap Directory Entry constraints */
+#define DBM_MAX_TABLE_SIZE 0x8000000
+#define DBM_MAX_PHYS_SIZE 0x20000000 /* 512 mb */
+#define DBM_MAX_GRANULARITY_BITS 63
+#define DBM_MAX_NAME_SIZE 1023
+
+/* Dirty Bitmap Directory Entry flags */
+#define DBM_RESERVED_FLAGS 0xffffffff
+
+/* bits [0, 8] U [56, 63] are reserved */
+#define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
diff --git a/block/qcow2.h b/block/qcow2.h
index 72e1328..a2a5d4a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -52,6 +52,10 @@
  * space for snapshot names and IDs */
 #define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
 
+/* Dirty Bitmap Header Extension constraints */
+#define QCOW_MAX_DIRTY_BITMAPS 65536
+#define QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE (1024 * QCOW_MAX_DIRTY_BITMAPS)
+
 /* indicate that the refcount of the referenced cluster is exactly one. */
 #define QCOW_OFLAG_COPIED     (1ULL << 63)
 /* indicate that the cluster is compressed (they never have the copied flag) */
@@ -141,6 +145,19 @@ typedef struct QEMU_PACKED QCowSnapshotHeader {
     /* name follows  */
 } QCowSnapshotHeader;
 
+typedef struct QEMU_PACKED QCowDirtyBitmapHeader {
+    /* header is 8 byte aligned */
+    uint64_t dirty_bitmap_table_offset;
+    uint64_t nb_virtual_bits;
+
+    uint32_t dirty_bitmap_table_size;
+    uint32_t granularity_bits;
+
+    uint32_t flags;
+    uint16_t name_size;
+    /* name follows  */
+} QCowDirtyBitmapHeader;
+
 typedef struct QEMU_PACKED QCowSnapshotExtraData {
     uint64_t vm_state_size_large;
     uint64_t disk_size;
@@ -159,6 +176,11 @@ typedef struct QCowSnapshot {
     uint64_t vm_clock_nsec;
 } QCowSnapshot;
 
+typedef struct QCowDirtyBitmap {
+    uint64_t offset;
+    char *name;
+} QCowDirtyBitmap;
+
 struct Qcow2Cache;
 typedef struct Qcow2Cache Qcow2Cache;
 
@@ -221,6 +243,12 @@ typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array,
 typedef void Qcow2SetRefcountFunc(void *refcount_array,
                                   uint64_t index, uint64_t value);
 
+typedef struct Qcow2DirtyBitmapHeaderExt {
+    uint32_t nb_dirty_bitmaps;
+    uint32_t dirty_bitmap_directory_size;
+    uint64_t dirty_bitmap_directory_offset;
+} QEMU_PACKED Qcow2DirtyBitmapHeaderExt;
+
 typedef struct BDRVQcowState {
     int cluster_bits;
     int cluster_size;
-- 
2.1.4


* [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-10-06 21:27   ` John Snow
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load() Vladimir Sementsov-Ogievskiy
                   ` (20 subsequent siblings)
  25 siblings, 1 reply; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Adds qcow2_read_dirty_bitmaps, reading Dirty Bitmap Directory as
specified in docs/specs/qcow2.txt

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.h              |  10 +++
 2 files changed, 165 insertions(+)

diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index fd4e0ef..1260d1d 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -25,6 +25,9 @@
  * THE SOFTWARE.
  */
 
+#include "block/block_int.h"
+#include "block/qcow2.h"
+
 /* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
  * constants. Please do not use this _internal_ abbreviation for other needs
  * and/or outside of this file. */
@@ -40,3 +43,155 @@
 
 /* bits [0, 8] U [56, 63] are reserved */
 #define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
+
+void qcow2_free_dirty_bitmaps(BlockDriverState *bs)
+{
+    BDRVQcowState *s = bs->opaque;
+    int i;
+
+    for (i = 0; i < s->nb_dirty_bitmaps; i++) {
+        g_free(s->dirty_bitmaps[i].name);
+    }
+    g_free(s->dirty_bitmaps);
+    s->dirty_bitmaps = NULL;
+    s->nb_dirty_bitmaps = 0;
+
+    g_free(s->dirty_bitmap_directory);
+    s->dirty_bitmap_directory = NULL;
+}
+
+static void bitmap_header_to_cpu(QCowDirtyBitmapHeader *h)
+{
+    be64_to_cpus(&h->dirty_bitmap_table_offset);
+    be64_to_cpus(&h->nb_virtual_bits);
+    be32_to_cpus(&h->dirty_bitmap_table_size);
+    be32_to_cpus(&h->granularity_bits);
+    be32_to_cpus(&h->flags);
+    be16_to_cpus(&h->name_size);
+}
+
+static int calc_dir_entry_size(size_t name_size)
+{
+    return align_offset(sizeof(QCowDirtyBitmapHeader) + name_size, 8);
+}
+
+static int dir_entry_size(QCowDirtyBitmapHeader *h)
+{
+    return calc_dir_entry_size(h->name_size);
+}
+
+static int check_constraints(int cluster_size,
+                             QCowDirtyBitmapHeader *h)
+{
+    uint64_t phys_bitmap_bytes =
+        (uint64_t)h->dirty_bitmap_table_size * cluster_size;
+    uint64_t max_virtual_bits = (phys_bitmap_bytes * 8) << h->granularity_bits;
+
+    int fail =
+            (h->dirty_bitmap_table_offset % cluster_size) ||
+            (h->dirty_bitmap_table_size > DBM_MAX_TABLE_SIZE) ||
+            (phys_bitmap_bytes > DBM_MAX_PHYS_SIZE) ||
+            (h->nb_virtual_bits > max_virtual_bits) ||
+            (h->granularity_bits > DBM_MAX_GRANULARITY_BITS) ||
+            (h->flags & DBM_RESERVED_FLAGS) ||
+            (h->name_size > DBM_MAX_NAME_SIZE);
+
+    return fail ? -EINVAL : 0;
+}
+
+static int directory_read(BlockDriverState *bs)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+    uint8_t *entry, *end;
+
+    if (s->dirty_bitmap_directory != NULL) {
+        /* already read */
+        return -EEXIST;
+    }
+
+    s->dirty_bitmap_directory = g_try_malloc0(s->dirty_bitmap_directory_size);
+    if (s->dirty_bitmap_directory == NULL) {
+        return -ENOMEM;
+    }
+
+    ret = bdrv_pread(bs->file,
+                     s->dirty_bitmap_directory_offset,
+                     s->dirty_bitmap_directory,
+                     s->dirty_bitmap_directory_size);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    entry = s->dirty_bitmap_directory;
+    end = s->dirty_bitmap_directory + s->dirty_bitmap_directory_size;
+    while (entry < end) {
+        QCowDirtyBitmapHeader *h = (QCowDirtyBitmapHeader *)entry;
+        bitmap_header_to_cpu(h);
+
+        ret = check_constraints(s->cluster_size, h);
+        if (ret < 0) {
+            goto fail;
+        }
+
+        entry += dir_entry_size(h);
+    }
+
+    return 0;
+
+fail:
+    g_free(s->dirty_bitmap_directory);
+    s->dirty_bitmap_directory = NULL;
+
+    return ret;
+}
+
+int qcow2_read_dirty_bitmaps(BlockDriverState *bs)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+    size_t offset;
+    QCowDirtyBitmap *bm, *end;
+
+    if (s->dirty_bitmap_directory != NULL || s->dirty_bitmaps != NULL) {
+        /* already read */
+        return -EEXIST;
+    }
+
+    if (s->nb_dirty_bitmaps == 0) {
+        /* No bitmaps - nothing to do */
+        return 0;
+    }
+
+    ret = directory_read(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    s->dirty_bitmaps = g_try_new0(QCowDirtyBitmap, s->nb_dirty_bitmaps);
+    if (s->dirty_bitmaps == NULL) {
+        ret = -ENOMEM;
+        goto out;
+    }
+
+    offset = 0;
+    end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
+    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
+        QCowDirtyBitmapHeader *h =
+                (QCowDirtyBitmapHeader *)(s->dirty_bitmap_directory + offset);
+
+        bm->offset = offset;
+        bm->name = g_malloc(h->name_size + 1);
+        memcpy(bm->name, h + 1, h->name_size);
+        bm->name[h->name_size] = '\0';
+
+        offset += dir_entry_size(h);
+    }
+    ret = 0;
+
+out:
+    if (ret < 0) {
+        qcow2_free_dirty_bitmaps(bs);
+    }
+    return ret;
+}
diff --git a/block/qcow2.h b/block/qcow2.h
index a2a5d4a..5016fa1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -288,6 +288,12 @@ typedef struct BDRVQcowState {
     unsigned int nb_snapshots;
     QCowSnapshot *snapshots;
 
+    uint64_t dirty_bitmap_directory_offset;
+    size_t dirty_bitmap_directory_size;
+    uint8_t *dirty_bitmap_directory;
+    unsigned int nb_dirty_bitmaps;
+    QCowDirtyBitmap *dirty_bitmaps;
+
     int flags;
     int qcow_version;
     bool use_lazy_refcounts;
@@ -598,6 +604,10 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
 void qcow2_free_snapshots(BlockDriverState *bs);
 int qcow2_read_snapshots(BlockDriverState *bs);
 
+/* qcow2-dirty-bitmap.c functions */
+void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
+int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
+
 /* qcow2-cache.c functions */
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
 int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
-- 
2.1.4


* [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-10-06 23:01   ` John Snow
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 07/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_store() Vladimir Sementsov-Ogievskiy
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

This function loads a block dirty bitmap from a qcow2 image.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c              |   2 +
 block/qcow2.h              |   5 ++
 include/block/block_int.h  |   5 ++
 4 files changed, 167 insertions(+)
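
To make the arithmetic behind dirty_sectors_in_cluster() and the table
size check in load_bitmap() concrete, here is a minimal standalone
sketch. It is not taken from the patch: the helper names and SECTOR_BITS
are hypothetical stand-ins, and it assumes 512-byte sectors and a plain
one-bit-per-granule serialization.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9 /* assumption: 512-byte sectors, like BDRV_SECTOR_BITS */

/* Image sectors covered by one cluster of bitmap data:
 * (sectors per bit) * (bits per cluster). */
static uint64_t sectors_per_bitmap_cluster(uint32_t cluster_size,
                                           uint32_t granularity)
{
    assert(granularity >= (1u << SECTOR_BITS));
    uint64_t sectors_per_bit = granularity >> SECTOR_BITS;
    return sectors_per_bit * ((uint64_t)cluster_size << 3);
}

/* Dirty Bitmap Table entries needed for an image of nb_sectors sectors
 * (hypothetical helper; rounds up to whole clusters of bitmap data). */
static uint64_t bitmap_table_entries(uint64_t nb_sectors,
                                     uint32_t cluster_size,
                                     uint32_t granularity)
{
    uint64_t dsc = sectors_per_bitmap_cluster(cluster_size, granularity);
    return (nb_sectors + dsc - 1) / dsc;
}
```

With 64 KiB clusters and 64 KiB granularity, one cluster of bitmap data
covers 2^26 sectors (32 GiB), so a 1 TiB image needs 32 table entries.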

diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index 1260d1d..ea50137 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -99,6 +99,13 @@ static int check_constraints(int cluster_size,
     return fail ? -EINVAL : 0;
 }
 
+static QCowDirtyBitmapHeader *bitmap_header(BDRVQcowState *s,
+                                            QCowDirtyBitmap *bitmap)
+{
+    return (QCowDirtyBitmapHeader *)
+           (s->dirty_bitmap_directory + bitmap->offset);
+}
+
 static int directory_read(BlockDriverState *bs)
 {
     int ret;
@@ -195,3 +202,151 @@ out:
     }
     return ret;
 }
+
+static QCowDirtyBitmap *find_dirty_bitmap_by_name(BlockDriverState *bs,
+                                                  const char *name)
+{
+    BDRVQcowState *s = bs->opaque;
+    QCowDirtyBitmap *bm, *end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
+
+    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
+        if (strcmp(bm->name, name) == 0) {
+            return bm;
+        }
+    }
+
+    return NULL;
+}
+
+/* dirty_sectors_in_cluster() returns the number of image sectors that are
+ * covered by one cluster of bitmap data */
+static uint64_t dirty_sectors_in_cluster(const BDRVQcowState *s,
+                                         const BdrvDirtyBitmap *bitmap)
+{
+    uint32_t sector_granularity =
+            bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
+
+    return (uint64_t)sector_granularity * (s->cluster_size << 3);
+}
+
+/* load_bitmap()
+ * load dirty bitmap from Dirty Bitmap Table
+ * Dirty Bitmap Table entries are assumed to be in big endian format */
+static int load_bitmap(BlockDriverState *bs,
+                       const uint64_t *dirty_bitmap_table,
+                       uint32_t dirty_bitmap_table_size,
+                       BdrvDirtyBitmap *bitmap)
+{
+    int ret = 0;
+    BDRVQcowState *s = bs->opaque;
+    uint64_t sector, dsc;
+    uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
+    int cl_size = s->cluster_size;
+    uint8_t *buf = NULL;
+    uint32_t i, tab_size =
+            size_to_clusters(s, bdrv_dirty_bitmap_data_size(bitmap, bm_size));
+
+    if (tab_size > dirty_bitmap_table_size) {
+        return -EINVAL;
+    }
+
+    bdrv_clear_dirty_bitmap(bitmap);
+
+    buf = g_malloc0(cl_size);
+    dsc = dirty_sectors_in_cluster(s, bitmap);
+    for (i = 0, sector = 0; i < tab_size; ++i, sector += dsc) {
+        uint64_t end = MIN(bm_size, sector + dsc);
+        uint64_t offset = be64_to_cpu(dirty_bitmap_table[i]);
+
+        if (offset & DBM_TABLE_ENTRY_RESERVED_MASK) {
+            ret = -EINVAL;
+            goto finish;
+        }
+
+        /* zero offset means cluster unallocated */
+        if (offset) {
+            ret = bdrv_pread(bs->file, offset, buf, cl_size);
+            if (ret < 0) {
+                goto finish;
+            }
+            bdrv_dirty_bitmap_deserialize_part(bitmap, buf, sector, end);
+        }
+    }
+    ret = 0;
+
+    bdrv_dirty_bitmap_deserialize_finish(bitmap);
+
+finish:
+    g_free(buf);
+
+    return ret;
+}
+
+BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
+                                          BlockDriverState *bs_file,
+                                          const char *name,
+                                          Error **errp)
+{
+    BDRVQcowState *s = bs_file->opaque;
+    int ret;
+    QCowDirtyBitmap *bm;
+    QCowDirtyBitmapHeader *bmh;
+    uint64_t *dirty_bitmap_table = NULL;
+    uint32_t granularity;
+    uint64_t size = bdrv_nb_sectors(bs_for);
+    BdrvDirtyBitmap *bitmap = NULL;
+
+    bm = find_dirty_bitmap_by_name(bs_file, name);
+    if (bm == NULL) {
+        error_setg(errp, "Could not find bitmap '%s' in the node '%s'", name,
+                   bdrv_get_device_or_node_name(bs_file));
+        return NULL;
+    }
+    bmh = bitmap_header(s, bm);
+
+    if (size != bmh->nb_virtual_bits) {
+        error_setg(errp,
+                   "Bitmap '%s' in the node '%s' has size = %" PRIu64
+                   " when requested size (for node %s) = %" PRIu64,
+                   name, bdrv_get_device_or_node_name(bs_file),
+                   bmh->nb_virtual_bits,
+                   bdrv_get_device_or_node_name(bs_for), size);
+        return NULL;
+    }
+
+
+    dirty_bitmap_table = g_try_malloc(bmh->dirty_bitmap_table_size * sizeof(uint64_t));
+    if (dirty_bitmap_table == NULL) {
+        error_setg_errno(errp, ENOMEM, "Could not allocate Dirty Bitmap Table");
+        return NULL;
+    }
+
+    ret = bdrv_pread(bs_file->file, bmh->dirty_bitmap_table_offset, dirty_bitmap_table,
+                     bmh->dirty_bitmap_table_size * sizeof(uint64_t));
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not read Dirty Bitmap Table from image");
+        goto finish;
+    }
+
+    granularity = BDRV_SECTOR_SIZE << bmh->granularity_bits;
+    bitmap = bdrv_create_dirty_bitmap(bs_for, granularity, name, errp);
+    if (bitmap == NULL) {
+        /* errp has already been set by bdrv_create_dirty_bitmap() */
+        goto finish;
+    }
+
+    ret = load_bitmap(bs_file, dirty_bitmap_table, bmh->dirty_bitmap_table_size, bitmap);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not read bitmap from image");
+        goto finish;
+    }
+
+finish:
+    if (ret < 0 && bitmap != NULL) {
+        bdrv_release_dirty_bitmap(bs_for, bitmap);
+        bitmap = NULL;
+    }
+    g_free(dirty_bitmap_table);
+
+    return bitmap;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 76c331b..58ebdd3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2965,6 +2965,8 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_get_info          = qcow2_get_info,
     .bdrv_get_specific_info = qcow2_get_specific_info,
 
+    .bdrv_dirty_bitmap_load = qcow2_dirty_bitmap_load,
+
     .bdrv_save_vmstate    = qcow2_save_vmstate,
     .bdrv_load_vmstate    = qcow2_load_vmstate,
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 5016fa1..51d1907 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -608,6 +608,11 @@ int qcow2_read_snapshots(BlockDriverState *bs);
 void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
 int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
 
+BdrvDirtyBitmap *qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
+                                         BlockDriverState *bs_file,
+                                         const char *name,
+                                         Error **errp);
+
 /* qcow2-cache.c functions */
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
 int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 14ad4c3..f982adc 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -204,6 +204,11 @@ struct BlockDriver {
     int (*bdrv_get_info)(BlockDriverState *bs, BlockDriverInfo *bdi);
     ImageInfoSpecific *(*bdrv_get_specific_info)(BlockDriverState *bs);
 
+    BdrvDirtyBitmap *(*bdrv_dirty_bitmap_load)(BlockDriverState *bs_for,
+                                               BlockDriverState *bs_file,
+                                               const char *name,
+                                               Error **errp);
+
     int (*bdrv_save_vmstate)(BlockDriverState *bs, QEMUIOVector *qiov,
                              int64_t pos);
     int (*bdrv_load_vmstate)(BlockDriverState *bs, uint8_t *buf,
-- 
2.1.4


* [Qemu-devel] [PATCH 07/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_store()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (5 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 08/17] qcow2: add dirty bitmaps extension Vladimir Sementsov-Ogievskiy
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

This function stores a block dirty bitmap to qcow2. If a bitmap with
the same name, size and granularity already exists, it is overwritten.
If a bitmap with the same name exists but its granularity or size does
not match, an error is generated.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-dirty-bitmap.c | 418 +++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c              |   1 +
 block/qcow2.h              |   2 +
 include/block/block_int.h  |   2 +
 4 files changed, 423 insertions(+)
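
The allocation policy in store_bitmap() (all-zero clusters of bitmap
data get a zero table entry, everything else gets a cluster) can be
sketched in isolation as below. This is a simplified model, not the
patch's code: it ignores reuse of already allocated clusters, and
plan_table(), chunk_is_zero() and next_free are hypothetical stand-ins
for the real buffer_is_zero()/qcow2_alloc_clusters() based logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for QEMU's buffer_is_zero(). */
static int chunk_is_zero(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i]) {
            return 0;
        }
    }
    return 1;
}

/* For each cluster-sized chunk of serialized bitmap data, set table[i] to 0
 * if the chunk is all zeroes, otherwise to a fresh offset handed out
 * sequentially starting at next_free (hypothetical allocator).
 * Returns the number of clusters that would be allocated. */
static size_t plan_table(const uint8_t *data, size_t data_len,
                         size_t cluster_size, uint64_t *table,
                         uint64_t next_free)
{
    size_t allocated = 0;
    size_t nb_chunks = (data_len + cluster_size - 1) / cluster_size;

    for (size_t i = 0; i < nb_chunks; i++) {
        size_t start = i * cluster_size;
        size_t len = data_len - start < cluster_size ? data_len - start
                                                     : cluster_size;

        if (chunk_is_zero(data + start, len)) {
            table[i] = 0;           /* unallocated: reads back as zeroes */
        } else {
            table[i] = next_free;   /* would be written to this cluster */
            next_free += cluster_size;
            allocated++;
        }
    }
    return allocated;
}
```

A mostly clean bitmap therefore costs little more than its table: only
the clusters that contain at least one set bit occupy space in the image.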

diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index ea50137..39f54e4 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -70,6 +70,16 @@ static void bitmap_header_to_cpu(QCowDirtyBitmapHeader *h)
     be16_to_cpus(&h->name_size);
 }
 
+static void bitmap_header_to_be(QCowDirtyBitmapHeader *h)
+{
+    cpu_to_be64s(&h->dirty_bitmap_table_offset);
+    cpu_to_be64s(&h->nb_virtual_bits);
+    cpu_to_be32s(&h->dirty_bitmap_table_size);
+    cpu_to_be32s(&h->granularity_bits);
+    cpu_to_be32s(&h->flags);
+    cpu_to_be16s(&h->name_size);
+}
+
 static int calc_dir_entry_size(size_t name_size)
 {
     return align_offset(sizeof(QCowDirtyBitmapHeader) + name_size, 8);
@@ -80,6 +90,17 @@ static int dir_entry_size(QCowDirtyBitmapHeader *h)
     return calc_dir_entry_size(h->name_size);
 }
 
+static void directory_to_be(uint8_t *dir, size_t size)
+{
+    uint8_t *end = dir + size;
+    while (dir < end) {
+        QCowDirtyBitmapHeader *h = (QCowDirtyBitmapHeader *)dir;
+        dir += dir_entry_size(h);
+
+        bitmap_header_to_be(h);
+    }
+}
+
 static int check_constraints(int cluster_size,
                              QCowDirtyBitmapHeader *h)
 {
@@ -350,3 +371,400 @@ finish:
 
     return bitmap;
 }
+
+static int update_header_sync(BlockDriverState *bs) {
+    int ret;
+
+    ret = qcow2_update_header(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_flush(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return 0;
+}
+
+/* write Dirty Bitmap Directory from the state to newly allocated clusters */
+static int64_t directory_write(BlockDriverState *bs, const uint8_t *dir, size_t size)
+{
+    int ret = 0;
+    uint8_t *dir_be = NULL;
+    int64_t dir_offset = 0;
+
+    dir_be = g_try_malloc(size);
+    if (dir_be == NULL) {
+        return -ENOMEM;
+    }
+    memcpy(dir_be, dir, size);
+    directory_to_be(dir_be, size);
+
+    /* Allocate space for the new Dirty Bitmap Directory */
+    dir_offset = qcow2_alloc_clusters(bs, size);
+    if (dir_offset < 0) {
+        ret = dir_offset;
+        goto out;
+    }
+
+    /* The Dirty Bitmap Directory position has not yet been updated, so these
+     * clusters must indeed be completely free */
+    ret = qcow2_pre_write_overlap_check(bs, 0, dir_offset, size);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = bdrv_pwrite(bs->file, dir_offset, dir_be, size);
+    if (ret < 0) {
+        goto out;
+    }
+
+out:
+    g_free(dir_be);
+
+    if (ret < 0) {
+        if (dir_offset > 0) {
+            qcow2_free_clusters(bs, dir_offset, size, QCOW2_DISCARD_ALWAYS);
+        }
+
+        return ret;
+    }
+
+    return dir_offset;
+}
+
+static int directory_push_entry(BlockDriverState *bs, QCowDirtyBitmapHeader *header)
+{
+    BDRVQcowState *s = bs->opaque;
+    int ret;
+    int entry_size = dir_entry_size(header);
+    int64_t new_offset = 0, old_offset = 0;
+    uint64_t new_size = s->dirty_bitmap_directory_size + entry_size, old_size = 0;
+    void *p;
+
+    if (new_size > QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE) {
+        return -EINVAL;
+    }
+
+    ret = check_constraints(s->cluster_size, header);
+    if (ret < 0) {
+        return -EINVAL;
+    }
+
+    old_offset = s->dirty_bitmap_directory_offset;
+    old_size = s->dirty_bitmap_directory_size;
+
+    uint8_t *new_dir = g_try_malloc(new_size);
+    if (new_dir == NULL) {
+        return -ENOMEM;
+    }
+    memcpy(new_dir, s->dirty_bitmap_directory, s->dirty_bitmap_directory_size);
+    memcpy(new_dir + s->dirty_bitmap_directory_size, header, entry_size);
+
+    new_offset = directory_write(bs, new_dir, new_size);
+    if (new_offset < 0) {
+        ret = new_offset;
+        goto fail;
+    }
+
+    ret = bdrv_flush(bs);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    s->dirty_bitmap_directory_offset = new_offset;
+    s->dirty_bitmap_directory_size = new_size;
+
+    ret = update_header_sync(bs);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    if (old_size) {
+        qcow2_free_clusters(bs, old_offset, old_size, QCOW2_DISCARD_ALWAYS);
+    }
+
+    g_free(s->dirty_bitmap_directory);
+    s->dirty_bitmap_directory = new_dir;
+
+    return 0;
+
+fail:
+    g_free(new_dir);
+    if (new_offset > 0) {
+        qcow2_free_clusters(bs, new_offset, new_size, QCOW2_DISCARD_ALWAYS);
+        s->dirty_bitmap_directory_offset = old_offset;
+        s->dirty_bitmap_directory_size = old_size;
+    }
+
+    p = g_try_realloc(s->dirty_bitmap_directory, s->dirty_bitmap_directory_size);
+    if (p != NULL) {
+        s->dirty_bitmap_directory = p;
+    }
+
+    return ret;
+}
+
+/* store_bitmap()
+ * Update the Dirty Bitmap Table by storing the bitmap data it points to.
+ * Dirty Bitmap Table entries are assumed to be in big-endian format.
+ * On error, the resulting Dirty Bitmap Table is still valid for clearing, but
+ * the bitmap it describes may be invalid. */
+static int store_bitmap(BlockDriverState *bs,
+                        uint64_t *dirty_bitmap_table,
+                        uint32_t dirty_bitmap_table_size,
+                        const BdrvDirtyBitmap *bitmap)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+    uint64_t sector, dsc;
+    uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
+    int cl_size = s->cluster_size;
+    uint8_t *buf = NULL;
+    uint32_t i, tab_size =
+            size_to_clusters(s, bdrv_dirty_bitmap_data_size(bitmap, bm_size));
+
+    if (tab_size > dirty_bitmap_table_size) {
+        return -EINVAL;
+    }
+
+    buf = g_malloc0(cl_size);
+    dsc = dirty_sectors_in_cluster(s, bitmap);
+    for (i = 0, sector = 0; i < tab_size; ++i, sector += dsc) {
+        uint64_t addr = be64_to_cpu(dirty_bitmap_table[i]) & ~511;
+        uint64_t end = MIN(bm_size, sector + dsc);
+        uint64_t write_size = bdrv_dirty_bitmap_data_size(bitmap, end - sector);
+
+        bdrv_dirty_bitmap_serialize_part(bitmap, buf, sector, end);
+
+        if (buffer_is_zero(buf, write_size)) {
+            if (addr) {
+                qcow2_free_clusters(bs, addr, cl_size, QCOW2_DISCARD_ALWAYS);
+            }
+            dirty_bitmap_table[i] = 0;
+        } else {
+            if (!addr) {
+                addr = qcow2_alloc_clusters(bs, cl_size);
+                dirty_bitmap_table[i] = cpu_to_be64(addr);
+            }
+
+            ret = bdrv_pwrite(bs->file, addr, buf, write_size);
+            if (ret < 0) {
+                goto finish;
+            }
+        }
+    }
+    ret = 0;
+
+finish:
+    g_free(buf);
+
+    return ret;
+}
+
+static int64_t alloc_zeroed_clusters(BlockDriverState *bs, uint64_t size)
+{
+    int ret = 0;
+    void *buf = NULL;
+    int64_t offset = qcow2_alloc_clusters(bs, size);
+    if (offset < 0) {
+        return offset;
+    }
+
+    buf = g_try_malloc0(size);
+    if (buf == NULL) {
+        ret = -ENOMEM;
+        goto out;
+    }
+
+    ret = qcow2_pre_write_overlap_check(bs, 0, offset, size);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = bdrv_pwrite(bs->file, offset, buf, size);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = bdrv_flush(bs);
+    if (ret < 0) {
+        goto out;
+    }
+
+out:
+    g_free(buf);
+
+    if (ret < 0) {
+        qcow2_free_clusters(bs, offset, size, QCOW2_DISCARD_ALWAYS);
+        return ret;
+    }
+
+    return offset;
+}
+
+static int directory_push(BlockDriverState *bs, const char *name,
+                              uint64_t size, int granularity)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+    int sector_granularity = granularity >> BDRV_SECTOR_BITS;
+    size_t name_size = strlen(name);
+    size_t entry_size = calc_dir_entry_size(name_size);
+    QCowDirtyBitmapHeader *entry = g_malloc0(entry_size);
+    int64_t table_offset = 0;
+
+    entry->nb_virtual_bits = size;
+    entry->granularity_bits = ctz32(granularity >> BDRV_SECTOR_BITS);
+    entry->name_size = name_size;
+    memcpy(entry + 1, name, name_size);
+
+    entry->dirty_bitmap_table_size =
+            size_to_clusters(s, (((size - 1) / sector_granularity) >> 3) + 1);
+    table_offset = alloc_zeroed_clusters(bs, entry->dirty_bitmap_table_size *
+                                  sizeof(uint64_t));
+    if (table_offset < 0) {
+        ret = table_offset;
+        goto out;
+    }
+    entry->dirty_bitmap_table_offset = table_offset;
+
+    ret = directory_push_entry(bs, entry);
+    if (ret < 0) {
+        goto out;
+    }
+
+out:
+    g_free(entry);
+    if (ret < 0 && table_offset > 0) {
+        qcow2_free_clusters(bs, table_offset, entry->dirty_bitmap_table_size *
+                            sizeof(uint64_t), QCOW2_DISCARD_ALWAYS);
+    }
+
+    return ret;
+}
+
+static int dirty_bitmaps_push(BDRVQcowState *s, const char *name, uint32_t offset)
+{
+    QCowDirtyBitmap *bm;
+    QCowDirtyBitmap *p;
+
+    /* grow the in-memory bitmap array by one element */
+    p = g_try_renew(QCowDirtyBitmap, s->dirty_bitmaps, s->nb_dirty_bitmaps + 1);
+    if (p == NULL) {
+        return -ENOMEM;
+    }
+    s->dirty_bitmaps = p;
+    s->nb_dirty_bitmaps++;
+
+    bm = s->dirty_bitmaps + s->nb_dirty_bitmaps - 1;
+    bm->name = g_strdup(name);
+    bm->offset = offset;
+
+
+    return 0;
+}
+
+static void dirty_bitmaps_pop(BDRVQcowState *s)
+{
+    QCowDirtyBitmap *p;
+
+    if (s->nb_dirty_bitmaps == 0) {
+        return;
+    }
+
+    p = g_try_renew(QCowDirtyBitmap, s->dirty_bitmaps, s->nb_dirty_bitmaps - 1);
+    if (p != NULL) {
+        s->dirty_bitmaps = p;
+    }
+
+    s->nb_dirty_bitmaps--;
+}
+
+/* create a new, empty dirty bitmap entry named @name in the image */
+static int qcow2_dirty_bitmap_create(BlockDriverState *bs, const char *name,
+                              uint64_t size, int granularity)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+
+    if (s->nb_dirty_bitmaps >= QCOW_MAX_DIRTY_BITMAPS) {
+        return -EFBIG;
+    }
+
+    /* Check that the name is unique */
+    if (find_dirty_bitmap_by_name(bs, name) != NULL) {
+        return -EEXIST;
+    }
+
+    ret = dirty_bitmaps_push(s, name, s->dirty_bitmap_directory_size);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = directory_push(bs, name, size, granularity);
+    if (ret < 0) {
+        /* roll back the in-memory entry added above */
+        dirty_bitmaps_pop(s);
+        return ret;
+    }
+
+    return 0;
+}
+
+int qcow2_dirty_bitmap_store(BlockDriverState *bs, const BdrvDirtyBitmap *bitmap)
+{
+    BDRVQcowState *s = bs->opaque;
+    int ret = 0;
+    uint64_t *dirty_bitmap_table;
+    QCowDirtyBitmap *bm;
+    QCowDirtyBitmapHeader *bmh;
+    const char *name = bdrv_dirty_bitmap_name(bitmap);
+    uint64_t size = bdrv_dirty_bitmap_size(bitmap);
+    int granularity = bdrv_dirty_bitmap_granularity(bitmap);
+
+    /* find/create dirty bitmap */
+    bm = find_dirty_bitmap_by_name(bs, name);
+    if (bm == NULL) {
+        ret = qcow2_dirty_bitmap_create(bs, name, size, granularity);
+        if (ret < 0) {
+            return ret;
+        }
+        bm = s->dirty_bitmaps + s->nb_dirty_bitmaps - 1;
+        bmh = bitmap_header(s, bm);
+    } else {
+        bmh = bitmap_header(s, bm);
+
+        if (size != bmh->nb_virtual_bits ||
+            granularity != (BDRV_SECTOR_SIZE << bmh->granularity_bits)) {
+            return -EEXIST;
+        }
+    }
+
+    dirty_bitmap_table = g_try_new(uint64_t, bmh->dirty_bitmap_table_size);
+    if (dirty_bitmap_table == NULL) {
+        return -ENOMEM;
+    }
+    ret = bdrv_pread(bs->file, bmh->dirty_bitmap_table_offset, dirty_bitmap_table,
+                     bmh->dirty_bitmap_table_size * sizeof(uint64_t));
+    if (ret < 0) {
+        goto finish;
+    }
+
+    ret = store_bitmap(bs, dirty_bitmap_table, bmh->dirty_bitmap_table_size, bitmap);
+    if (ret < 0) {
+        goto finish;
+    }
+
+    ret = bdrv_pwrite(bs->file, bmh->dirty_bitmap_table_offset, dirty_bitmap_table,
+                      bmh->dirty_bitmap_table_size * sizeof(uint64_t));
+    if (ret < 0) {
+        goto finish;
+    }
+
+finish:
+    g_free(dirty_bitmap_table);
+    return ret;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 58ebdd3..e56683a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2966,6 +2966,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_get_specific_info = qcow2_get_specific_info,
 
     .bdrv_dirty_bitmap_load = qcow2_dirty_bitmap_load,
+    .bdrv_dirty_bitmap_store = qcow2_dirty_bitmap_store,
 
     .bdrv_save_vmstate    = qcow2_save_vmstate,
     .bdrv_load_vmstate    = qcow2_load_vmstate,
diff --git a/block/qcow2.h b/block/qcow2.h
index 51d1907..ec42cec 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -612,6 +612,8 @@ BdrvDirtyBitmap *qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
                                          BlockDriverState *bs_file,
                                          const char *name,
                                          Error **errp);
+int qcow2_dirty_bitmap_store(BlockDriverState *bs,
+                             const BdrvDirtyBitmap *bitmap);
 
 /* qcow2-cache.c functions */
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f982adc..c66621e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -208,6 +208,8 @@ struct BlockDriver {
                                                BlockDriverState *bs_file,
                                                const char *name,
                                                Error **errp);
+    int (*bdrv_dirty_bitmap_store)(BlockDriverState *bs,
+                                   const BdrvDirtyBitmap *bitmap);
 
     int (*bdrv_save_vmstate)(BlockDriverState *bs, QEMUIOVector *qiov,
                              int64_t pos);
-- 
2.1.4


* [Qemu-devel] [PATCH 08/17] qcow2: add dirty bitmaps extension
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (6 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 07/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_store() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 09/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load_check() Vladimir Sementsov-Ogievskiy
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Add dirty bitmap extension as specified in docs/specs/qcow2.txt.

Load bitmap headers on open. Handle close and update_header.

Handle resize: for now, just block resize if there are dirty bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
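
As an illustration of what the extension parser has to validate, here is
a minimal standalone sketch that decodes the three big-endian fields and
applies the same kind of checks. The on-disk layout (u32 count, u32
directory size, u64 directory offset) is an assumption based on the order
of the byte swaps in the patch, and both limits are hypothetical values,
not the real QCOW_MAX_* constants.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define HYP_MAX_BITMAPS        65535u     /* hypothetical limit */
#define HYP_MAX_DIRECTORY_SIZE (1u << 20) /* hypothetical limit */

typedef struct {
    uint32_t nb_bitmaps;
    uint32_t directory_size;
    uint64_t directory_offset;
} BitmapExt;

static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t load_be64(const uint8_t *p)
{
    return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}

/* Decode and validate the extension payload. cluster_size must be a power
 * of two. Returns 0 on success and -EINVAL on malformed input. */
static int parse_bitmap_ext(const uint8_t *raw, size_t len,
                            uint32_t cluster_size, BitmapExt *out)
{
    if (len < 16) {
        return -EINVAL; /* payload too short for the assumed layout */
    }
    out->nb_bitmaps = load_be32(raw);
    out->directory_size = load_be32(raw + 4);
    out->directory_offset = load_be64(raw + 8);

    if (out->nb_bitmaps > HYP_MAX_BITMAPS) {
        return -EINVAL;
    }
    if (out->directory_offset & (cluster_size - 1)) {
        return -EINVAL; /* directory must start on a cluster boundary */
    }
    if (out->directory_size > HYP_MAX_DIRECTORY_SIZE) {
        return -EINVAL;
    }
    return 0;
}
```

Checking the payload length before copying matters in a sketch like this
because the length comes from the image file and cannot be trusted.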

diff --git a/block/qcow2.c b/block/qcow2.c
index e56683a..de638e7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -60,6 +60,7 @@ typedef struct {
 #define  QCOW2_EXT_MAGIC_END 0
 #define  QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
 #define  QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
+#define  QCOW2_EXT_MAGIC_DIRTY_BITMAPS 0x23852875
 
 static int qcow2_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
@@ -89,6 +90,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
     QCowExtension ext;
     uint64_t offset;
     int ret;
+    Qcow2DirtyBitmapHeaderExt dirty_bitmaps_ext;
 
 #ifdef DEBUG_EXT
     printf("qcow2_read_extensions: start=%ld end=%ld\n", start_offset, end_offset);
@@ -159,6 +161,57 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             }
             break;
 
+        case QCOW2_EXT_MAGIC_DIRTY_BITMAPS:
+            ret = bdrv_pread(bs->file, offset, &dirty_bitmaps_ext, ext.len);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "ERROR: dirty_bitmaps_ext: "
+                                 "Could not read ext header");
+                return ret;
+            }
+
+            be32_to_cpus(&dirty_bitmaps_ext.nb_dirty_bitmaps);
+            be32_to_cpus(&dirty_bitmaps_ext.dirty_bitmap_directory_size);
+            be64_to_cpus(&dirty_bitmaps_ext.dirty_bitmap_directory_offset);
+
+            if (dirty_bitmaps_ext.nb_dirty_bitmaps > QCOW_MAX_DIRTY_BITMAPS) {
+                error_setg(errp, "ERROR: dirty_bitmaps_ext: "
+                                 "too many dirty bitmaps");
+                return -EINVAL;
+            }
+
+            if (dirty_bitmaps_ext.dirty_bitmap_directory_offset %
+                    s->cluster_size) {
+                error_setg(errp, "ERROR: dirty_bitmaps_ext: "
+                                 "wrong dirty bitmap directory offset");
+                return -EINVAL;
+            }
+
+            if (dirty_bitmaps_ext.dirty_bitmap_directory_size >
+                QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE) {
+                error_setg(errp, "ERROR: dirty_bitmaps_ext: "
+                                 "too large dirty bitmap directory");
+                return -EINVAL;
+            }
+
+            s->nb_dirty_bitmaps = dirty_bitmaps_ext.nb_dirty_bitmaps;
+            s->dirty_bitmap_directory_offset =
+                    dirty_bitmaps_ext.dirty_bitmap_directory_offset;
+            s->dirty_bitmap_directory_size =
+                    dirty_bitmaps_ext.dirty_bitmap_directory_size;
+
+            ret = qcow2_read_dirty_bitmaps(bs);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "Could not read dirty bitmaps");
+                return ret;
+            }
+
+#ifdef DEBUG_EXT
+            printf("Qcow2: Got dirty bitmaps extension:"
+                   " offset=%" PRIu64 " nb_bitmaps=%" PRIu32 "\n",
+                   s->dirty_bitmap_directory_offset, s->nb_dirty_bitmaps);
+#endif
+            break;
+
         default:
             /* unknown magic - save it in case we need to rewrite the header */
             {
@@ -1009,6 +1062,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
     g_free(s->unknown_header_fields);
     cleanup_unknown_header_ext(bs);
     qcow2_free_snapshots(bs);
+    qcow2_free_dirty_bitmaps(bs);
     qcow2_refcount_close(bs);
     qemu_vfree(s->l1_table);
     /* else pre-write overlap checks in cache_destroy may crash */
@@ -1487,6 +1541,7 @@ static void qcow2_close(BlockDriverState *bs)
     qemu_vfree(s->cluster_data);
     qcow2_refcount_close(bs);
     qcow2_free_snapshots(bs);
+    qcow2_free_dirty_bitmaps(bs);
 }
 
 static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
@@ -1679,6 +1734,24 @@ int qcow2_update_header(BlockDriverState *bs)
     buf += ret;
     buflen -= ret;
 
+    if (s->nb_dirty_bitmaps > 0) {
+        Qcow2DirtyBitmapHeaderExt dirty_bitmaps_header = {
+            .nb_dirty_bitmaps = cpu_to_be32(s->nb_dirty_bitmaps),
+            .dirty_bitmap_directory_size =
+                    cpu_to_be32(s->dirty_bitmap_directory_size),
+            .dirty_bitmap_directory_offset =
+                    cpu_to_be64(s->dirty_bitmap_directory_offset)
+        };
+        ret = header_ext_add(buf, QCOW2_EXT_MAGIC_DIRTY_BITMAPS,
+                             &dirty_bitmaps_header, sizeof(dirty_bitmaps_header),
+                             buflen);
+        if (ret < 0) {
+            goto fail;
+        }
+        buf += ret;
+        buflen -= ret;
+    }
+
     /* Keep unknown header extensions */
     QLIST_FOREACH(uext, &s->unknown_header_ext, next) {
         ret = header_ext_add(buf, uext->magic, uext->data, uext->len, buflen);
@@ -2188,6 +2261,12 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
         return -ENOTSUP;
     }
 
+    /* cannot proceed if image has dirty_bitmaps */
+    if (s->nb_dirty_bitmaps) {
+        error_report("Can't resize an image which has dirty bitmaps");
+        return -ENOTSUP;
+    }
+
     /* shrinking is currently not supported */
     if (offset < bs->total_sectors * 512) {
         error_report("qcow2 doesn't support shrinking images yet");
-- 
2.1.4


* [Qemu-devel] [PATCH 09/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load_check()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (7 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 08/17] qcow2: add dirty bitmaps extension Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 10/17] block: store persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

This function checks whether a bitmap exists without loading it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c                    | 15 +++++++++++++++
 block/qcow2-dirty-bitmap.c |  5 +++++
 block/qcow2.c              |  1 +
 block/qcow2.h              |  1 +
 include/block/block.h      |  2 ++
 include/block/block_int.h  |  2 ++
 6 files changed, 26 insertions(+)

diff --git a/block.c b/block.c
index 8c39d0a..69d7c30 100644
--- a/block.c
+++ b/block.c
@@ -3347,6 +3347,21 @@ void bdrv_release_meta_bitmap(BdrvDirtyBitmap *bitmap)
     }
 }
 
+bool bdrv_load_check_dirty_bitmap(BlockDriverState *file, const char *name)
+{
+    BlockDriver *drv = file->drv;
+    if (!drv) {
+        return false;
+    }
+    if (drv->bdrv_dirty_bitmap_load_check) {
+        return drv->bdrv_dirty_bitmap_load_check(file, name);
+    }
+    if (file->file)  {
+        return bdrv_load_check_dirty_bitmap(file->file, name);
+    }
+    return false;
+}
+
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
                                           uint32_t granularity,
                                           const char *name,
diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index 39f54e4..2687edc 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -303,6 +303,11 @@ finish:
     return ret;
 }
 
+bool qcow2_dirty_bitmap_load_check(BlockDriverState *file, const char *name)
+{
+    return find_dirty_bitmap_by_name(file, name) != NULL;
+}
+
 BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
                                           BlockDriverState *bs_file,
                                           const char *name,
diff --git a/block/qcow2.c b/block/qcow2.c
index de638e7..36f404c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3045,6 +3045,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_get_specific_info = qcow2_get_specific_info,
 
     .bdrv_dirty_bitmap_load = qcow2_dirty_bitmap_load,
+    .bdrv_dirty_bitmap_load_check = qcow2_dirty_bitmap_load_check,
     .bdrv_dirty_bitmap_store = qcow2_dirty_bitmap_store,
 
     .bdrv_save_vmstate    = qcow2_save_vmstate,
diff --git a/block/qcow2.h b/block/qcow2.h
index ec42cec..05f59e0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -608,6 +608,7 @@ int qcow2_read_snapshots(BlockDriverState *bs);
 void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
 int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
 
+bool qcow2_dirty_bitmap_load_check(BlockDriverState *file, const char *name);
 BdrvDirtyBitmap *qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
                                          BlockDriverState *bs_file,
                                          const char *name,
diff --git a/include/block/block.h b/include/block/block.h
index 8166640..5e95e51 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -531,6 +531,8 @@ HBitmap *bdrv_create_meta_bitmap(BdrvDirtyBitmap *bitmap,
                                  uint64_t granularity);
 void bdrv_release_meta_bitmap(BdrvDirtyBitmap *bitmap);
 
+bool bdrv_load_check_dirty_bitmap(BlockDriverState *file, const char *name);
+
 void bdrv_enable_copy_on_read(BlockDriverState *bs);
 void bdrv_disable_copy_on_read(BlockDriverState *bs);
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index c66621e..2c35a73 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -204,6 +204,8 @@ struct BlockDriver {
     int (*bdrv_get_info)(BlockDriverState *bs, BlockDriverInfo *bdi);
     ImageInfoSpecific *(*bdrv_get_specific_info)(BlockDriverState *bs);
 
+    bool (*bdrv_dirty_bitmap_load_check)(BlockDriverState *file,
+                                         const char *name);
     BdrvDirtyBitmap *(*bdrv_dirty_bitmap_load)(BlockDriverState *bs_for,
                                                BlockDriverState *bs_file,
                                                const char *name,
-- 
2.1.4


* [Qemu-devel] [PATCH 10/17] block: store persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (8 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 09/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load_check() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 11/17] block: add bdrv_load_dirty_bitmap() Vladimir Sementsov-Ogievskiy
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Persistent dirty bitmaps are those bitmaps for which the new field
BdrvDirtyBitmap.file is not NULL. All persistent dirty bitmaps owned by
a BlockDriverState are saved in the corresponding bdrv_close().
BdrvDirtyBitmap.file is the BlockDriverState in which the bitmap is to
be saved. It can be set only once, via bdrv_dirty_bitmap_set_file().
bdrv_ref()/bdrv_unref() are used on BdrvDirtyBitmap.file to make sure
that files are closed and resources are freed.
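
The ownership rule can be sketched in isolation like this. DemoFile,
DemoBitmap, demo_set_file() and demo_close() are hypothetical names used
only for illustration; a plain counter stands in for bdrv_ref()/bdrv_unref().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a refcounted file and a bitmap pointing at it. */
typedef struct DemoFile { int refcnt; } DemoFile;
typedef struct DemoBitmap { DemoFile *file; int stored; } DemoBitmap;

/* The target file may be set only once; setting it takes a reference. */
static void demo_set_file(DemoBitmap *bm, DemoFile *file)
{
    assert(bm->file == NULL);
    bm->file = file;
    if (file != NULL) {
        file->refcnt++;
    }
}

/* On close: store every persistent bitmap, then drop its reference. */
static void demo_close(DemoBitmap *bitmaps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        DemoBitmap *bm = &bitmaps[i];
        if (bm->file != NULL) {
            bm->stored = 1;       /* stands in for the store call */
            bm->file->refcnt--;   /* stands in for the unref call */
            bm->file = NULL;
        }
    }
}
```

Bitmaps whose file is NULL are not persistent and are left untouched by
the close path.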

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c               | 33 +++++++++++++++++++++++++++++++++
 include/block/block.h |  3 +++
 2 files changed, 36 insertions(+)

diff --git a/block.c b/block.c
index 69d7c30..492abb4 100644
--- a/block.c
+++ b/block.c
@@ -73,6 +73,8 @@ struct BdrvDirtyBitmap {
     BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
     char *name;                 /* Optional non-empty unique ID */
     int64_t size;               /* Size of the bitmap (Number of sectors) */
+    BlockDriverState *file;     /* File where bitmap is loaded from (and should
+                                   be saved to) */
     bool disabled;              /* Bitmap is read-only */
     QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
@@ -1897,6 +1899,7 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 void bdrv_close(BlockDriverState *bs)
 {
     BdrvAioNotifier *ban, *ban_next;
+    BdrvDirtyBitmap *bm, *bm_next;
 
     if (bs->job) {
         block_job_cancel_sync(bs->job);
@@ -1906,6 +1909,15 @@ void bdrv_close(BlockDriverState *bs)
     bdrv_drain(bs); /* in case flush left pending I/O */
     notifier_list_notify(&bs->close_notifiers, bs);
 
+    /* save and release persistent dirty bitmaps */
+    QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, bm_next) {
+        if (bm->file) {
+            bdrv_store_dirty_bitmap(bm);
+            bdrv_unref(bm->file);
+            bdrv_release_dirty_bitmap(bs, bm);
+        }
+    }
+
     if (bs->drv) {
         BdrvChild *child, *next;
 
@@ -3362,6 +3374,18 @@ bool bdrv_load_check_dirty_bitmap(BlockDriverState *file, const char *name)
     return false;
 }
 
+int bdrv_store_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+    BlockDriverState *bs = bitmap->file;
+
+    if (bs == NULL || bs->drv == NULL ||
+            bs->drv->bdrv_dirty_bitmap_store == NULL) {
+        return -EINVAL;
+    }
+
+    return bs->drv->bdrv_dirty_bitmap_store(bs, bitmap);
+}
+
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
                                           uint32_t granularity,
                                           const char *name,
@@ -3532,6 +3556,15 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
     }
 }
 
+void bdrv_dirty_bitmap_set_file(BdrvDirtyBitmap *bitmap, BlockDriverState *file)
+{
+    assert(bitmap->file == NULL);
+    bitmap->file = file;
+    if (file != NULL) {
+        bdrv_ref(file);
+    }
+}
+
 void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 {
     assert(!bdrv_dirty_bitmap_frozen(bitmap));
diff --git a/include/block/block.h b/include/block/block.h
index 5e95e51..4b9b898 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -491,6 +491,8 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
                                         const char *name);
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+void bdrv_dirty_bitmap_set_file(BdrvDirtyBitmap *bitmap,
+                                BlockDriverState *file);
 void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
@@ -532,6 +534,7 @@ HBitmap *bdrv_create_meta_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_release_meta_bitmap(BdrvDirtyBitmap *bitmap);
 
 bool bdrv_load_check_dirty_bitmap(BlockDriverState *file, const char *name);
+int bdrv_store_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 
 void bdrv_enable_copy_on_read(BlockDriverState *bs);
 void bdrv_disable_copy_on_read(BlockDriverState *bs);
-- 
2.1.4


* [Qemu-devel] [PATCH 11/17] block: add bdrv_load_dirty_bitmap()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (9 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 10/17] block: store persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 12/17] qcow2-dirty-bitmap: add autoclear bit Vladimir Sementsov-Ogievskiy
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

The function loads a dirty bitmap from a file, using the underlying
driver function.

Note: the function doesn't change BdrvDirtyBitmap.file field. This field
is only used by bdrv_store_dirty_bitmap() function and is ONLY written
by bdrv_dirty_bitmap_set_file() function.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c               | 18 ++++++++++++++++++
 include/block/block.h |  4 ++++
 2 files changed, 22 insertions(+)

diff --git a/block.c b/block.c
index 492abb4..9148977 100644
--- a/block.c
+++ b/block.c
@@ -3374,6 +3374,24 @@ bool bdrv_load_check_dirty_bitmap(BlockDriverState *file, const char *name)
     return false;
 }
 
+BdrvDirtyBitmap *bdrv_load_dirty_bitmap(BlockDriverState *bs_for,
+                                        BlockDriverState *bs_file,
+                                        const char *name,
+                                        Error **errp)
+{
+    BlockDriver *drv = bs_file->drv;
+    if (!drv) {
+        return NULL;
+    }
+    if (drv->bdrv_dirty_bitmap_load) {
+        return drv->bdrv_dirty_bitmap_load(bs_for, bs_file, name, errp);
+    }
+    if (bs_file->file)  {
+        return bdrv_load_dirty_bitmap(bs_for, bs_file->file, name, errp);
+    }
+    return NULL;
+}
+
 int bdrv_store_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 {
     BlockDriverState *bs = bitmap->file;
diff --git a/include/block/block.h b/include/block/block.h
index 4b9b898..f587a03 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -534,6 +534,10 @@ HBitmap *bdrv_create_meta_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_release_meta_bitmap(BdrvDirtyBitmap *bitmap);
 
 bool bdrv_load_check_dirty_bitmap(BlockDriverState *file, const char *name);
+BdrvDirtyBitmap *bdrv_load_dirty_bitmap(BlockDriverState *bs_for,
+                                        BlockDriverState *bs_file,
+                                        const char *name,
+                                        Error **errp);
 int bdrv_store_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 
 void bdrv_enable_copy_on_read(BlockDriverState *bs);
-- 
2.1.4


* [Qemu-devel] [PATCH 12/17] qcow2-dirty-bitmap: add autoclear bit
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (10 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 11/17] block: add bdrv_load_dirty_bitmap() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 13/17] qemu: command line option for dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Add an autoclear bit to handle the case where the image has been
rewritten by an old qemu version.

If the autoclear bit is not set but the Dirty bitmaps extension is
found, the extension is not loaded and a warning is printed.
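
A minimal sketch of the mechanism, assuming bit 0 is the dirty-bitmaps
bit as in this patch. The DEMO_* constants and demo_* helpers are
hypothetical names for illustration, not the qcow2 code itself.

```c
#include <stdint.h>

/* Hypothetical constants mirroring the layout introduced by the patch. */
enum {
    DEMO_AUTOCLEAR_DIRTY_BITMAPS = 1 << 0,
    DEMO_AUTOCLEAR_MASK          = DEMO_AUTOCLEAR_DIRTY_BITMAPS,
};

/* A writer that knows the bit keeps it and clears only unknown bits. */
static uint64_t demo_clear_unknown(uint64_t features)
{
    return features & DEMO_AUTOCLEAR_MASK;
}

/* A writer that knows no autoclear bits at all clears every one of them. */
static uint64_t demo_old_writer_open(uint64_t features)
{
    (void)features;
    return 0;
}

/* The extension is trusted only while the bit is still set. */
static int demo_bitmaps_trusted(uint64_t features)
{
    return (features & DEMO_AUTOCLEAR_DIRTY_BITMAPS) != 0;
}
```

Once an old writer has opened the image read-write, the bit is gone, so a
newer reader can tell that the stored bitmaps may be stale.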

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-dirty-bitmap.c |  4 ++++
 block/qcow2.c              | 12 ++++++++++--
 block/qcow2.h              |  9 +++++++++
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index 2687edc..e96ee18 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -448,6 +448,7 @@ static int directory_push_entry(BlockDriverState *bs, QCowDirtyBitmapHeader *hea
     int64_t new_offset = 0, old_offset = 0;
     uint64_t new_size = s->dirty_bitmap_directory_size + entry_size, old_size = 0;
     void *p;
+    uint64_t old_autocl;
 
     if (new_size > QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE) {
         return -EINVAL;
@@ -460,6 +461,7 @@ static int directory_push_entry(BlockDriverState *bs, QCowDirtyBitmapHeader *hea
 
     old_offset = s->dirty_bitmap_directory_offset;
     old_size = s->dirty_bitmap_directory_size;
+    old_autocl = s->autoclear_features;
 
     uint8_t *new_dir = g_try_malloc(new_size);
     if (new_dir == NULL) {
@@ -481,6 +483,7 @@ static int directory_push_entry(BlockDriverState *bs, QCowDirtyBitmapHeader *hea
 
     s->dirty_bitmap_directory_offset = new_offset;
     s->dirty_bitmap_directory_size = new_size;
+    s->autoclear_features |= QCOW2_AUTOCLEAR_DIRTY_BITMAPS;
 
     ret = update_header_sync(bs);
     if (ret < 0) {
@@ -502,6 +505,7 @@ fail:
         qcow2_free_clusters(bs, new_offset, new_size, QCOW2_DISCARD_ALWAYS);
         s->dirty_bitmap_directory_offset = old_offset;
         s->dirty_bitmap_directory_size = old_size;
+        s->autoclear_features = old_autocl;
     }
 
     p = g_try_realloc(s->dirty_bitmap_directory, s->dirty_bitmap_directory_size);
diff --git a/block/qcow2.c b/block/qcow2.c
index 36f404c..8791081 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -162,6 +162,13 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             break;
 
         case QCOW2_EXT_MAGIC_DIRTY_BITMAPS:
+            if (!(s->autoclear_features & QCOW2_AUTOCLEAR_DIRTY_BITMAPS)) {
+                fprintf(stderr,
+                        "WARNING: dirty_bitmaps_ext: autoclear flag is not "
+                        "set, all bitmaps will be considered as inconsistent\n");
+                break;
+            }
+
             ret = bdrv_pread(bs->file, offset, &dirty_bitmaps_ext, ext.len);
             if (ret < 0) {
                 error_setg_errno(errp, -ret, "ERROR: dirty_bitmaps_ext: "
@@ -961,8 +968,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     /* Clear unknown autoclear feature bits */
-    if (!bs->read_only && !(flags & BDRV_O_INCOMING) && s->autoclear_features) {
-        s->autoclear_features = 0;
+    if (!bs->read_only && !(flags & BDRV_O_INCOMING) &&
+        (s->autoclear_features & ~QCOW2_AUTOCLEAR_MASK)) {
+        s->autoclear_features &= QCOW2_AUTOCLEAR_MASK;
         ret = qcow2_update_header(bs);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Could not update qcow2 header");
diff --git a/block/qcow2.h b/block/qcow2.h
index 05f59e0..e699d39 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -216,6 +216,15 @@ enum {
     QCOW2_COMPAT_FEAT_MASK            = QCOW2_COMPAT_LAZY_REFCOUNTS,
 };
 
+/* Autoclear feature bits */
+enum {
+    QCOW2_AUTOCLEAR_DIRTY_BITMAPS_BITNR = 0,
+    QCOW2_AUTOCLEAR_DIRTY_BITMAPS       =
+        1 << QCOW2_AUTOCLEAR_DIRTY_BITMAPS_BITNR,
+
+    QCOW2_AUTOCLEAR_MASK                = QCOW2_AUTOCLEAR_DIRTY_BITMAPS,
+};
+
 enum qcow2_discard_type {
     QCOW2_DISCARD_NEVER = 0,
     QCOW2_DISCARD_ALWAYS,
-- 
2.1.4


* [Qemu-devel] [PATCH 13/17] qemu: command line option for dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (11 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 12/17] qcow2-dirty-bitmap: add autoclear bit Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 14/17] qcow2-dirty-bitmap: add IN_USE flag Vladimir Sementsov-Ogievskiy
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

The patch adds the following command line option:

-dirty-bitmap [option1=val1][,option2=val2]...

Available options are:

name
The name of the bitmap.
Should be unique per 'file'/'node' and per 'for_node'.

file
A separate qcow2 file from which the bitmap 'name' is loaded.

file_id
When specified together with the 'file' option, this 'file' becomes
available under this 'file_id' to other -dirty-bitmap options.
When specified without the 'file' option, it is a reference to a 'file'
specified with another -dirty-bitmap option (or -drive), and the bitmap
is loaded from that file.

node
The node to bind the bitmap to.
It should be specified as 'id' suboption of one of '-node' options.
If neither 'file' nor 'file_id' is specified, then the bitmap will be
loaded from that node (internal dirty bitmap).

granularity
Granularity (in bytes) for a newly created dirty bitmap.
If the bitmap already exists in the specified 'file'/'file_id'/device,
its granularity is not changed but only checked (an error is
generated if this check fails).

enabled
on|off
Enabled flag for the bitmap.
By default the bitmap will be enabled.

create
on|off
Off by default.
If on, a new bitmap is created in the image; if a bitmap with the
same name already exists, an error is generated.
If off, the bitmap is loaded from the image; if there is no such
bitmap, an error is generated.
If create=off and granularity is specified, the granularity of the
loaded bitmap is checked and an error is generated if it does not match.
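
The create/load rules above can be sketched as a small decision function.
DemoResult and demo_resolve() are hypothetical names used only to
illustrate the described semantics; a granularity of 0 means "not
specified".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome codes for the option handling described above. */
typedef enum {
    DEMO_OK,
    DEMO_ERR_EXISTS,       /* create=on, but the bitmap is already there */
    DEMO_ERR_MISSING,      /* create=off, but there is no such bitmap */
    DEMO_ERR_GRANULARITY,  /* create=off and the granularity differs */
} DemoResult;

static DemoResult demo_resolve(bool create, bool exists,
                               uint64_t requested, uint64_t stored)
{
    if (create) {
        return exists ? DEMO_ERR_EXISTS : DEMO_OK;
    }
    if (!exists) {
        return DEMO_ERR_MISSING;
    }
    if (requested != 0 && requested != stored) {
        return DEMO_ERR_GRANULARITY;
    }
    return DEMO_OK;
}
```

With create=off the granularity option acts purely as a check; it never
changes the stored bitmap.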

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 blockdev.c                |  44 +++++++++++++++++++
 include/sysemu/blockdev.h |   1 +
 include/sysemu/sysemu.h   |   1 +
 qemu-options.hx           |  45 +++++++++++++++++++
 vl.c                      | 110 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 201 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 62a4586..644ebde 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -179,6 +179,11 @@ QemuOpts *drive_def(const char *optstr)
     return qemu_opts_parse_noisily(qemu_find_opts("drive"), optstr, false);
 }
 
+QemuOpts *dirty_bitmap_def(const char *optstr)
+{
+    return qemu_opts_parse_noisily(qemu_find_opts("dirty-bitmap"), optstr, false);
+}
+
 QemuOpts *drive_add(BlockInterfaceType type, int index, const char *file,
                     const char *optstr)
 {
@@ -3119,6 +3124,45 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
     return head;
 }
 
+QemuOptsList qemu_dirty_bitmap_opts = {
+    .name = "dirty-bitmap",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_dirty_bitmap_opts.head),
+    .desc = {
+        {
+            .name = "name",
+            .type = QEMU_OPT_STRING,
+            .help = "Name of the dirty bitmap",
+        },{
+            .name = "file",
+            .type = QEMU_OPT_STRING,
+            .help = "file name to load the bitmap from",
+        },{
+            .name = "file_id",
+            .type = QEMU_OPT_STRING,
+            .help = "node name to load the bitmap from (or to set id"
+                    " for file, opened by previous option)",
+        },{
+            .name = "node",
+            .type = QEMU_OPT_STRING,
+            .help = "node name to bind the bitmap to",
+        },{
+            .name = "granularity",
+            .type = QEMU_OPT_NUMBER,
+            .help = "granularity",
+        },{
+            .name = "enabled",
+            .type = QEMU_OPT_BOOL,
+            .help = "enabled flag (default is 'on')",
+        },{
+            .name = "create",
+            .type = QEMU_OPT_BOOL,
+            .help = "create flag (default is 'off'), "
+                    "if on, new dirty bitmap will be created, "
+                    "else the existing one will be loaded"
+        }
+    }
+};
+
 QemuOptsList qemu_common_drive_opts = {
     .name = "drive",
     .head = QTAILQ_HEAD_INITIALIZER(qemu_common_drive_opts.head),
diff --git a/include/sysemu/blockdev.h b/include/sysemu/blockdev.h
index 3104150..6078310 100644
--- a/include/sysemu/blockdev.h
+++ b/include/sysemu/blockdev.h
@@ -57,6 +57,7 @@ int drive_get_max_devs(BlockInterfaceType type);
 DriveInfo *drive_get_next(BlockInterfaceType type);
 
 QemuOpts *drive_def(const char *optstr);
+QemuOpts *dirty_bitmap_def(const char *optstr);
 QemuOpts *drive_add(BlockInterfaceType type, int index, const char *file,
                     const char *optstr);
 DriveInfo *drive_new(QemuOpts *arg, BlockInterfaceType block_default_type);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 44570d1..6ef240b 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -207,6 +207,7 @@ bool usb_enabled(void);
 
 extern QemuOptsList qemu_legacy_drive_opts;
 extern QemuOptsList qemu_common_drive_opts;
+extern QemuOptsList qemu_dirty_bitmap_opts;
 extern QemuOptsList qemu_drive_opts;
 extern QemuOptsList qemu_chardev_opts;
 extern QemuOptsList qemu_device_opts;
diff --git a/qemu-options.hx b/qemu-options.hx
index 77f5853..40aea22 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -620,6 +620,51 @@ qemu-system-i386 -hda a -hdb b
 @end example
 ETEXI
 
+DEF("dirty-bitmap", HAS_ARG, QEMU_OPTION_dirty_bitmap,
+    "-dirty-bitmap name=name[,file=file][,file_id=file_id][,node=id]\n"
+    "              [,granularity=granularity][,enabled=on|off][,create=on|off]\n",
+    QEMU_ARCH_ALL)
+STEXI
+@item -dirty-bitmap @var{option}[,@var{option}[,@var{option}[,...]]]
+@findex -dirty-bitmap
+
+Define a dirty-bitmap. Valid options are:
+
+@table @option
+@item name=@var{name}
+The name of the bitmap. Should be unique per @var{file}/@var{node} and per
+@var{for_node}.
+@item file=@var{file}
+A separate qcow2 file from which the bitmap @var{name} is loaded.
+@item file_id=@var{file_id}
+When specified together with @var{file}, makes that @var{file} available
+under this @var{file_id} to other @option{-dirty-bitmap} options.
+When specified without @var{file}, it refers to a @var{file} opened by
+another @option{-dirty-bitmap} option, and the bitmap is loaded from that
+file.
+@item node=@var{node}
+The node to bind the bitmap to. It should be specified as @var{id} suboption
+of one of @option{-node} options.
+If neither @var{file} nor @var{file_id} is specified, then the bitmap will be
+loaded from that node (internal dirty bitmap).
+@item granularity=@var{granularity}
+Granularity (in bytes) for a newly created dirty bitmap. If the bitmap already
+exists in the specified @var{file}/@var{file_id}/@var{device}, its granularity
+is not changed but only checked (an error is generated if this check
+fails).
+@item enabled=@var{enabled}
+Enabled flag for the bitmap. By default the bitmap will be enabled.
+@item create=@var{create}
+Off by default.
+If on, a new bitmap is created in the image; if a bitmap with the same
+name already exists, an error is generated.
+If off, the bitmap is loaded from the image; if there is no such bitmap, an
+error is generated.
+If create=off and granularity is specified, the granularity of the loaded
+bitmap is checked and an error is generated if it does not match.
+@end table
+ETEXI
+
 DEF("mtdblock", HAS_ARG, QEMU_OPTION_mtdblock,
     "-mtdblock file  use 'file' as on-board Flash memory image\n",
     QEMU_ARCH_ALL)
diff --git a/vl.c b/vl.c
index 7d745b5..5acb1f5 100644
--- a/vl.c
+++ b/vl.c
@@ -1130,6 +1130,102 @@ static int cleanup_add_fd(void *opaque, QemuOpts *opts, Error **errp)
 #define MTD_OPTS ""
 #define SD_OPTS ""
 
+static int dirty_bitmap_func(void *opaque, QemuOpts *opts, Error **errp)
+{
+    BlockDriverState *file_bs = NULL, *for_bs = NULL;
+    BdrvDirtyBitmap *bitmap = NULL;
+
+    const char *name = qemu_opt_get(opts, "name");
+    const char *node = qemu_opt_get(opts, "node");
+    const char *file = qemu_opt_get(opts, "file");
+    const char *file_id = qemu_opt_get(opts, "file_id");
+
+    uint64_t granularity = qemu_opt_get_number(opts, "granularity", 0);
+    bool enabled = qemu_opt_get_bool(opts, "enabled", true);
+    bool create = qemu_opt_get_bool(opts, "create", false);
+
+    if (name == NULL) {
+        error_setg(errp, "'name' option is necessary");
+        goto fail;
+    }
+
+    if (node == NULL) {
+        error_setg(errp, "'node' option is necessary");
+        goto fail;
+    }
+
+    for_bs = bdrv_lookup_bs(node, node, errp);
+    if (for_bs == NULL) {
+        goto fail;
+    }
+
+    if (file != NULL) {
+        QDict *options = NULL;
+        if (file_id != NULL) {
+            options = qdict_new();
+            qdict_put(options, "node-name", qstring_from_str(file_id));
+        }
+
+        bdrv_open(&file_bs, file, NULL, options, BDRV_O_RDWR, NULL, errp);
+        if (options) {
+            QDECREF(options);
+        }
+        if (file_bs == NULL) {
+            goto fail;
+        }
+    } else if (file_id != NULL) {
+        file_bs = bdrv_find_node(file_id);
+        if (file_bs == NULL) {
+            error_setg(errp, "node '%s' is not found", file_id);
+            goto fail;
+        }
+    } else {
+        file_bs = for_bs;
+    }
+
+    if (create) {
+        if (bdrv_load_check_dirty_bitmap(file_bs, name)) {
+            error_setg(errp, "bitmap '%s' already exists", name);
+            goto fail;
+        }
+
+        if (granularity == 0) {
+            granularity = bdrv_get_default_bitmap_granularity(for_bs);
+        }
+
+        bitmap = bdrv_create_dirty_bitmap(for_bs, granularity, name, errp);
+        if (bitmap == NULL) {
+            goto fail;
+        }
+    } else {
+        bitmap = bdrv_load_dirty_bitmap(for_bs, file_bs, name, errp);
+        if (bitmap == NULL) {
+            goto fail;
+        }
+
+        if (granularity != 0 &&
+            granularity != bdrv_dirty_bitmap_granularity(bitmap)) {
+            bdrv_release_dirty_bitmap(for_bs, bitmap);
+            error_setg(errp, "bitmap '%s': granularity mismatch", name);
+            goto fail;
+        }
+    }
+
+    bdrv_dirty_bitmap_set_file(bitmap, file_bs);
+
+    if (!enabled) {
+        bdrv_disable_dirty_bitmap(bitmap);
+    }
+
+    return 0;
+
+fail:
+    if (file_bs != NULL) {
+        bdrv_close(file_bs);
+    }
+    return -1;
+}
+
 static int drive_init_func(void *opaque, QemuOpts *opts, Error **errp)
 {
     BlockInterfaceType *block_default_type = opaque;
@@ -2991,6 +3087,7 @@ int main(int argc, char **argv, char **envp)
     module_call_init(MODULE_INIT_QOM);
 
     qemu_add_opts(&qemu_drive_opts);
+    qemu_add_opts(&qemu_dirty_bitmap_opts);
     qemu_add_drive_opts(&qemu_legacy_drive_opts);
     qemu_add_drive_opts(&qemu_common_drive_opts);
     qemu_add_drive_opts(&qemu_drive_opts);
@@ -3125,6 +3222,11 @@ int main(int argc, char **argv, char **envp)
                     exit(1);
                 }
                 break;
+            case QEMU_OPTION_dirty_bitmap:
+                if (dirty_bitmap_def(optarg) == NULL) {
+                    exit(1);
+                }
+                break;
             case QEMU_OPTION_set:
                 if (qemu_set_option(optarg) != 0)
                     exit(1);
@@ -4458,6 +4560,14 @@ int main(int argc, char **argv, char **envp)
 
     parse_numa_opts(machine_class);
 
+    if (qemu_opts_foreach(qemu_find_opts("dirty-bitmap"), dirty_bitmap_func,
+                          NULL, &err)) {
+        if (err != NULL) {
+            error_report_err(err);
+        }
+        exit(1);
+    }
+
     if (qemu_opts_foreach(qemu_find_opts("mon"),
                           mon_init_func, NULL, NULL)) {
         exit(1);
-- 
2.1.4


* [Qemu-devel] [PATCH 14/17] qcow2-dirty-bitmap: add IN_USE flag
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (12 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 13/17] qemu: command line option for dirty bitmaps Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 15/17] qcow2-dirty-bitmaps: handle store reqursion Vladimir Sementsov-Ogievskiy
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

This flag is set on bitmap load and cleared on store. If it is already
set when loading the bitmap, the bitmap should not be loaded (it is in
use by another drive, or it is inconsistent because it was not
successfully saved).
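
The flag lifecycle can be sketched as follows. DemoHeader, demo_load()
and demo_store() are hypothetical names; the real code also writes the
header to disk and flushes after each change, which is omitted here.

```c
#include <stdint.h>

#define DEMO_FLAG_IN_USE 1u

/* Hypothetical stand-in for a bitmap directory entry header. */
typedef struct DemoHeader { uint32_t flags; } DemoHeader;

/* Load refuses a bitmap that is already marked, otherwise marks it. */
static int demo_load(DemoHeader *h)
{
    if (h->flags & DEMO_FLAG_IN_USE) {
        return -1;   /* in use elsewhere, or not cleanly saved */
    }
    h->flags |= DEMO_FLAG_IN_USE;
    return 0;
}

/* A successful store clears the mark again. */
static void demo_store(DemoHeader *h)
{
    h->flags &= ~DEMO_FLAG_IN_USE;
}
```

If the process dies between load and store, the flag stays set on disk,
which is how a later open can tell that the bitmap content is stale.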

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-dirty-bitmap.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index e96ee18..3d3c624 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -39,7 +39,8 @@
 #define DBM_MAX_NAME_SIZE 1023
 
 /* Dirty Bitmap Directory Enrty flags */
-#define DBM_RESERVED_FLAGS 0xffffffff
+#define DBM_RESERVED_FLAGS 0xfffffffe
+#define DBM_FLAG_IN_USE 1
 
 /* bits [0, 8] U [56, 63] are reserved */
 #define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
@@ -127,6 +128,29 @@ static QCowDirtyBitmapHeader *bitmap_header(BDRVQcowState *s,
            (s->dirty_bitmap_directory + bitmap->offset);
 }
 
+static int update_bitmap_header_sync(BlockDriverState *bs, QCowDirtyBitmap *bitmap)
+{
+    int ret;
+    BDRVQcowState *s = bs->opaque;
+    QCowDirtyBitmapHeader *h = bitmap_header(s, bitmap);
+
+    bitmap_header_to_be(h);
+    ret = bdrv_pwrite(bs->file,
+                      s->dirty_bitmap_directory_offset + bitmap->offset,
+                      h, dir_entry_size(h));
+    bitmap_header_to_cpu(h);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_flush(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return 0;
+}
+
 static int directory_read(BlockDriverState *bs)
 {
     int ret;
@@ -330,6 +354,11 @@ BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
     }
     bmh = bitmap_header(s, bm);
 
+    if (bmh->flags & DBM_FLAG_IN_USE) {
+        error_setg(errp, "Bitmap '%s' is in use", name);
+        return NULL;
+    }
+
     if (size != bmh->nb_virtual_bits) {
         error_setg(errp,
                    "Bitmap '%s' in the node '%s' has size = %" PRIu64
@@ -367,6 +396,13 @@ BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
         goto finish;
     }
 
+    bmh->flags |= DBM_FLAG_IN_USE;
+    ret = update_bitmap_header_sync(bs_file, bm);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not set in_use in bitmap header");
+        goto finish;
+    }
+
 finish:
     if (*errp != NULL) {
         bdrv_release_dirty_bitmap(bs_for, bitmap);
@@ -773,6 +809,12 @@ int qcow2_dirty_bitmap_store(BlockDriverState *bs, const BdrvDirtyBitmap *bitmap
         goto finish;
     }
 
+    bmh->flags &= ~DBM_FLAG_IN_USE;
+    ret = update_bitmap_header_sync(bs, bm);
+    if (ret < 0) {
+        goto finish;
+    }
+
 finish:
     g_free(dirty_bitmap_table);
     return ret;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH 15/17] qcow2-dirty-bitmaps: handle store reqursion
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (13 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 14/17] qcow2-dirty-bitmap: add IN_USE flag Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 16/17] iotests: add VM.test_launcn() Vladimir Sementsov-Ogievskiy
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

If a persistent dirty bitmap bm tracks bs->file and is stored in bs, then
saving this bitmap to the image dirties bits in bm itself. This is
strange behaviour and should be forbidden.

RFC:
Should we check cases like
bs_for == bs_file->file->file, or bs_for->file == bs_file, or
bs_for->file == bs_file->file->file, etc?

The most general check would be

if bs_for == bs_file - it's ok
else
  if bs_for[->file...] == bs_file[->file...] - it's bad
  else - it's ok

So there are two 'ok' cases: bs_for and bs_file are the same, or they
are absolutely unrelated.
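
The general check above can be sketched as follows. This is only an
illustration in Python, not code from this patch: Node, file_chain() and
store_allowed() are hypothetical names, and Node models nothing of
BlockDriverState except the ->file link. Note that the patch itself checks
only the bs_for == bs_file->file case.

```python
class Node:
    """Hypothetical stand-in for BlockDriverState: only the ->file link."""
    def __init__(self, name, file=None):
        self.name = name
        self.file = file


def file_chain(node):
    """Return [node, node.file, node.file.file, ...]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.file
    return chain


def store_allowed(bs_for, bs_file):
    """True if a bitmap tracking bs_for may be stored in bs_file."""
    if bs_for is bs_file:
        return True  # same node: ok
    nodes_for = {id(n) for n in file_chain(bs_for)}
    # Any node shared between the two ->file chains means they are related.
    related = any(id(n) in nodes_for for n in file_chain(bs_file))
    return not related


raw = Node("raw-file")
qcow2 = Node("qcow2", file=raw)
other = Node("other-qcow2", file=Node("other-file"))

print(store_allowed(qcow2, qcow2))  # same node
print(store_allowed(raw, qcow2))    # bs_for == bs_file->file
print(store_allowed(qcow2, other))  # unrelated chains
```

Walking both chains costs O(depth) per check, which should be negligible
because ->file chains are short.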

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c                    | 12 ++++++++++++
 block/qcow2-dirty-bitmap.c | 12 ++++++++++++
 include/block/block.h      |  1 +
 3 files changed, 25 insertions(+)

diff --git a/block.c b/block.c
index 9148977..df95bf9 100644
--- a/block.c
+++ b/block.c
@@ -3574,6 +3574,18 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
     }
 }
 
+bool bdrv_has_dirty_bitmap(BlockDriverState *bs, const BdrvDirtyBitmap *bitmap)
+{
+    BdrvDirtyBitmap *bm, *next;
+    QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
+        if (bm == bitmap) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
 void bdrv_dirty_bitmap_set_file(BdrvDirtyBitmap *bitmap, BlockDriverState *file)
 {
     assert(bitmap->file == NULL);
diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
index 3d3c624..d38f15f 100644
--- a/block/qcow2-dirty-bitmap.c
+++ b/block/qcow2-dirty-bitmap.c
@@ -346,6 +346,13 @@ BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
     uint64_t size = bdrv_nb_sectors(bs_for);
     BdrvDirtyBitmap *bitmap = NULL;
 
+    /* recursive storing is not allowed */
+    if (bs_for == bs_file->file) {
+        error_setg(errp, "Bitmap store recursion detected for bitmap '%s'",
+                   name);
+        return NULL;
+    }
+
     bm = find_dirty_bitmap_by_name(bs_file, name);
     if (bm == NULL) {
         error_setg(errp, "Could not find bitmap '%s' in the node '%s'", name,
@@ -770,6 +777,11 @@ int qcow2_dirty_bitmap_store(BlockDriverState *bs, const BdrvDirtyBitmap *bitmap
     uint64_t size = bdrv_dirty_bitmap_size(bitmap);
     int granularity = bdrv_dirty_bitmap_granularity(bitmap);
 
+    /* recursive storing is not allowed */
+    if (bdrv_has_dirty_bitmap(bs->file, bitmap)) {
+        return -EINVAL;
+    }
+
     /* find/create dirty bitmap */
     bm = find_dirty_bitmap_by_name(bs, name);
     if (bm == NULL) {
diff --git a/include/block/block.h b/include/block/block.h
index f587a03..67a7f0c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -491,6 +491,7 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
                                         const char *name);
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+bool bdrv_has_dirty_bitmap(BlockDriverState *bs, const BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_set_file(BdrvDirtyBitmap *bitmap,
                                 BlockDriverState *file);
 void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH 16/17] iotests: add VM.test_launcn()
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (14 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 15/17] qcow2-dirty-bitmaps: handle store reqursion Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 17/17] iotests: test internal persistent dirty bitmap Vladimir Sementsov-Ogievskiy
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Test whether the VM can launch, and print its output on failure. This
function is needed for testing error cases.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/iotests.py | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 85c5a99..7fb330a 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -200,6 +200,25 @@ class VM(object):
             os.remove(self._monitor_path)
             raise
 
+    def test_launch(self):
+        '''Launch the VM, an error is expected'''
+        try:
+            self.launch()
+        except:
+            self._popen.wait()
+            print "Test launch failed: %d" % self._popen.returncode
+            print "--- qemu output ---"
+            for line in open(self._qemu_log_path):
+                #filter qtest comments
+                if not "] OPENED" in line:
+                    print line
+            print "--- end qemu output ---"
+            return False
+
+        print "Tast launch successed!"
+        self.shutdown()
+        return True
+
     def shutdown(self):
         '''Terminate the VM and clean up'''
         if not self._popen is None:
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH 17/17] iotests: test internal persistent dirty bitmap
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (15 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 16/17] iotests: add VM.test_launcn() Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:43 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:48 ` [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, stefanha, pbonzini, den, jsnow

Add simple test cases for testing persistent dirty bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/118        | 112 ++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/118.out    |  21 ++++++++
 tests/qemu-iotests/group      |   1 +
 tests/qemu-iotests/iotests.py |   6 +++
 4 files changed, 140 insertions(+)
 create mode 100755 tests/qemu-iotests/118
 create mode 100644 tests/qemu-iotests/118.out

diff --git a/tests/qemu-iotests/118 b/tests/qemu-iotests/118
new file mode 100755
index 0000000..55932d4
--- /dev/null
+++ b/tests/qemu-iotests/118
@@ -0,0 +1,112 @@
+#!/usr/bin/env python
+#
+# Tests for persistent dirty bitmaps.
+#
+# Copyright: Vladimir Sementsov-Ogievskiy 2015
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+import time
+from iotests import qemu_img
+
+disk = os.path.join(iotests.test_dir, 'disk')
+
+size   = 0x40000000 # 1G
+sector_size = 512
+granularity = 0x10000
+regions1 = [
+    { 'start': 0,          'count': 0x100000 },
+    { 'start': 0x200000,   'count': 0x100000 }
+    ]
+regions2 = [
+    { 'start': 0x10000000, 'count': 0x20000  },
+    { 'start': 0x39990000, 'count': 0x10000  }
+    ]
+
+class TestPersistentDirtyBitmap(iotests.QMPTestCase):
+
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, disk, str(size))
+
+    def mkVm(self, create_bitmap):
+        vm = iotests.VM().add_drive(disk)
+        vm.add_dirty_bitmap('bitmap', 'drive0', create_bitmap)
+        return vm
+
+    def tearDown(self):
+        os.remove(disk)
+
+    def getMd5(self):
+        result = self.vm.qmp('query-block');
+        return result['return'][0]['dirty-bitmaps'][0]['md5']
+
+    def checkBitmap(self, md5):
+        result = self.vm.qmp('query-block');
+        self.assert_qmp(result, 'return[0]/dirty-bitmaps[0]/md5', md5);
+
+    def writeRegions(self, regions):
+        for r in regions:
+          self.vm.hmp_qemu_io('drive0',
+                                'write %d %d' % (r['start'], r['count']))
+
+    def test_persistent(self):
+        self.vm = self.mkVm(True)
+        self.vm.launch()
+
+        self.writeRegions(regions1)
+        md5 = self.getMd5()
+
+        self.vm.shutdown()
+        self.vm = self.mkVm(False)
+        self.vm.launch()
+
+        self.checkBitmap(md5)
+        self.writeRegions(regions2)
+        md5 = self.getMd5()
+
+        self.vm.shutdown()
+        self.vm.launch()
+
+        self.checkBitmap(md5)
+
+        self.vm.shutdown()
+
+    def test_not_exist(self):
+        vm = self.mkVm(False)
+        vm.test_launch()
+
+    def test_already_exists(self):
+        vm = self.mkVm(True)
+        vm.test_launch()
+        vm.test_launch()
+
+    def test_in_use(self):
+        vm = self.mkVm(True)
+        vm.launch()
+        vm.shutdown()
+
+        vm1 = self.mkVm(False)
+        vm1.launch()
+
+        vm2 = self.mkVm(False)
+        vm2.test_launch()
+
+        vm1.shutdown()
+
+
+if __name__ == '__main__':
+    iotests.main()
diff --git a/tests/qemu-iotests/118.out b/tests/qemu-iotests/118.out
new file mode 100644
index 0000000..f54a52b
--- /dev/null
+++ b/tests/qemu-iotests/118.out
@@ -0,0 +1,21 @@
+....
+----------------------------------------------------------------------
+Ran 4 tests
+
+OK
+Tast launch successed!
+Test launch failed: 1
+--- qemu output ---
+qemu-system-x86_64: -dirty-bitmap name=bitmap,node=drive0,create=on: bitmap 'bitmap' already exists
+
+--- end qemu output ---
+Test launch failed: 1
+--- qemu output ---
+qemu-system-x86_64: -dirty-bitmap name=bitmap,node=drive0,create=off: Bitmap 'bitmap' is in use
+
+--- end qemu output ---
+Test launch failed: 1
+--- qemu output ---
+qemu-system-x86_64: -dirty-bitmap name=bitmap,node=drive0,create=off: Could not find bitmap 'bitmap' in the node 'drive0'
+
+--- end qemu output ---
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 3a86575..923fe83 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -122,6 +122,7 @@
 115 rw auto
 116 rw auto quick
 117 rw auto quick
+118 rw auto quick
 119 rw auto quick
 120 rw auto quick
 121 rw auto
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 7fb330a..afc0e04 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -140,6 +140,12 @@ class VM(object):
         self._args.append(desc)
         return self
 
+    def add_dirty_bitmap(self, name, node, create=False):
+        '''Add dirty bitmap parameter to VM cmd'''
+        self._args.append('-dirty-bitmap')
+        self._args.append('name=%s,node=%s,create=%s' % (name, node, 'on' if create else 'off'))
+        return self
+
     def pause_drive(self, drive, event=None):
         '''Pause drive r/w operations'''
         if not event:
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (16 preceding siblings ...)
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 17/17] iotests: test internal persistent dirty bitmap Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:48 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:51 ` Vladimir Sementsov-Ogievskiy
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature:

Allow qcow2 images that contain only dirty bitmaps, without an l1_table and
other stuff, to keep the overhead minimal.

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (17 preceding siblings ...)
  2015-09-05 16:48 ` [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:51 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:53 ` Vladimir Sementsov-Ogievskiy
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add tests.

Including cases:
  - no writes
  - empty bitmaps
  - empty files
  - different cmd opts (file, file_id)
  - several bitmaps

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (18 preceding siblings ...)
  2015-09-05 16:51 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:53 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 16:57 ` Vladimir Sementsov-Ogievskiy
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature:

Support resizing a drive that has dirty bitmaps (stored in the drive itself
or in other disks).

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (19 preceding siblings ...)
  2015-09-05 16:53 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 16:57 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:03 ` Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 16:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

qemu-img:

Print information about bitmaps.
Add the ability to remove inconsistent bitmaps (in_use=true or the
corresponding autoclear bit is 0).
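
A minimal sketch of that consistency rule, in Python for brevity.
DBM_FLAG_IN_USE is the value from patch 14; bitmap_is_inconsistent() and
removable() are hypothetical helpers, and the (name, flags) pairs are an
assumed in-memory form of the directory entries, not a real qemu-img
interface.

```python
DBM_FLAG_IN_USE = 1  # directory entry flag added in patch 14


def bitmap_is_inconsistent(entry_flags, autoclear_bit_set):
    """A bitmap is inconsistent if it is in use or the autoclear bit is 0."""
    return bool(entry_flags & DBM_FLAG_IN_USE) or not autoclear_bit_set


def removable(bitmaps, autoclear_bit_set):
    """Names of bitmaps that a cleanup command could offer to remove."""
    return [name for name, flags in bitmaps
            if bitmap_is_inconsistent(flags, autoclear_bit_set)]


entries = [("a", 0), ("b", DBM_FLAG_IN_USE)]
print(removable(entries, autoclear_bit_set=True))
print(removable(entries, autoclear_bit_set=False))
```

With the autoclear bit cleared, every bitmap in the extension is treated as
inconsistent, regardless of its own flags.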

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (20 preceding siblings ...)
  2015-09-05 16:57 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 17:03 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:09 ` Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 17:03 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature:

Do not stall on loading/storing large bitmaps:

- background loading (the bitmap starts empty and is then merged with the
  loaded parts)
- meta bitmaps for storing (store only the changed parts of the bitmap)
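
Both ideas can be sketched on plain lists of bits. This is only an
illustration in Python: merge_loaded_part() and chunks_to_store() are
hypothetical names, and the fixed chunk size is an assumption, not something
this series defines.

```python
def merge_loaded_part(live, loaded, start):
    """OR a chunk loaded in the background into the live bitmap, in place."""
    for i, bit in enumerate(loaded):
        live[start + i] |= bit


def chunks_to_store(meta, chunk_bits, bitmap):
    """Return (offset, chunk) only for chunks the meta bitmap marks changed."""
    result = []
    for index, changed in enumerate(meta):
        if changed:
            offset = index * chunk_bits
            result.append((offset, bitmap[offset:offset + chunk_bits]))
    return result


# The bitmap starts empty; the guest dirties bit 1 before loading finishes.
live = [0, 1, 0, 0]
merge_loaded_part(live, [1, 0], start=0)
merge_loaded_part(live, [0, 1], start=2)
print(live)

# Only the second chunk changed since the last store.
print(chunks_to_store([False, True], chunk_bits=2, bitmap=live))
```

Merging with OR means that bits dirtied while loading is still in progress
are never lost.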

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (21 preceding siblings ...)
  2015-09-05 17:03 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 17:09 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:16 ` Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 17:09 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature?

Bitmap auto loading.
The spec (patch 03) already has flags for it, but it is not implemented yet.

Is it really needed? Would it be useful for libvirt, for example?

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (22 preceding siblings ...)
  2015-09-05 17:09 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 17:16 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:25 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:30 ` Vladimir Sementsov-Ogievskiy
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 17:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature:

Periodic flushing of the bitmaps

The idea:

The disk is written often.
The bitmap is updated less often.
The previous HBitmap level is updated even less often.

To avoid storing all bitmap levels in the file, save only the number of the
largest consistent level in the image file:

flush bitmap: consistent_level = HBITMAP_MAX_LEVEL

change bitmap level X: if consistent_level > X then
    consistent_level = X - 1 (and flush consistent_level to file)

Then, after a failure, we can restore the bitmap from the last consistent level:

gran = 1 << (level_bits * (HBITMAP_MAX_LEVEL - consistent_level))
bitmap[i] = bitmap[i - i % gran] OR bitmap[i - i % gran + 1] OR ...
            OR bitmap[i - i % gran + (gran - 1)]


To make this scheme independent of HBitmap, it may be better to number
levels from 0 (0 is the largest level), and to save level_bits in the image
file too.
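
The restore formula above, written out on a plain list of bits. This is a
sketch in Python, not HBitmap code: restore_from_consistent_level() is a
hypothetical name, and level_bits and max_level are passed in as parameters
with small illustrative values instead of the real HBitmap constants.

```python
def restore_from_consistent_level(bitmap, level_bits, max_level,
                                  consistent_level):
    """Coarsen the stored bitmap to the granularity of the consistent level."""
    gran = 1 << (level_bits * (max_level - consistent_level))
    restored = []
    for start in range(0, len(bitmap), gran):
        group = bitmap[start:start + gran]
        # Every bit becomes the OR of all bits in its gran-aligned group.
        value = 1 if any(group) else 0
        restored.extend([value] * len(group))
    return restored


stored = [0, 1, 0, 0, 1, 0, 0, 0]
# gran = 1 << (1 * (3 - 2)) = 2: bits are ORed in pairs.
print(restore_from_consistent_level(stored, 1, 3, 2))
# consistent_level == max_level gives gran = 1: nothing changes.
print(restore_from_consistent_level(stored, 1, 3, 3))
```

A lower consistent_level gives a larger gran, so more clean bits are
reported dirty after recovery.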

-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (23 preceding siblings ...)
  2015-09-05 17:16 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 17:25 ` Vladimir Sementsov-Ogievskiy
  2015-09-05 17:30 ` Vladimir Sementsov-Ogievskiy
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature:

Migration of BdrvDirtyBitmap.file field.

Bitmap persistence can be migrated only if the bitmap's storage disk has an
id and the destination qemu has a disk with the same id.


Stefan Hajnoczi wrote:
> Code might be necessary to ensure that:
>
> 1. The source host does not store the bitmap after successful live
>     migration handover.  (It could overwrite new data with old data!)
>
> 2. The destination host does not discard an "in_use" bitmap when it
>     opens the qcow2 file before migration handover.
>


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps
  2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
                   ` (24 preceding siblings ...)
  2015-09-05 17:25 ` Vladimir Sementsov-Ogievskiy
@ 2015-09-05 17:30 ` Vladimir Sementsov-Ogievskiy
  25 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 17:30 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

Add feature:

read-only mode.

For now, bitmaps are loaded only from writable bs's, because in_use has to
be set in the image.

Should we automatically load bitmaps in read-only mode if their storage is
read-only? Hmm. I think we should also check that bs_for (the disk the
bitmap is loaded for) is read-only too. Or maybe there should be an
explicit cmd option for read-only loading.

Also, a read-only flag may be needed in the Dirty Bitmap Directory Entry.
Is it?


On 05.09.2015 19:43, Vladimir Sementsov-Ogievskiy wrote:
> v3:
>
> Hi all! After long delay I am starting a new thread about persistent
> dirty bitmaps. Not all ideas/requests are done from v2 thread, but I
> need to start somewhere. Also, there are a lot of code changes to be
> considered before everything is finished. Also, v2 thread is too large
> and it is hard to handle it.
>
> Here I'll summarize, what was changed. All ideas/comments, that was not
> done from v2 thread I'll add as replays to this cover letter. If I miss
> something, please add.
>
> 01-02: new patches, necessary preparation
>
> 03: docs
> - add constraints
> - document auto clear bit
> - renames
>      L1 table -> Dirty Bitmap Table
>      Dirty Bitmaps Table -> Dirty Bitmap Directory
> - remove Standard cluster with its zero bit,
>
> struct changes:
>
> extension header:
>    add 32bit dirty_bitmap_directory_size
>    it is needed to read the whole Dirty Bitmap Directory.
>    nb_dirty_bitmaps becomes excessive (it may be calculated by looking
>    through Dirty Bitmap Directory).
>    RFC: is it bad? should I drop nb_dirty_bitmaps?
>
> directory entry:
> - bitmap size is renamed to 'number of virtual bits'
> - flags added. (self/aut/read_only are not reliazed in code for now)
> - granularity is now represented by granularity_bits
>   
>
> 04: structs and consts
> Structures changed, accordingly to spec.
> QCowDirtyBitmap now only contains name and offset. Offset is offset in
> bytes in Dirty Bitmap Directory, so, bitmap header can be accessed:
> cached version: s->dirty_bitmap_directory + offset
> in image file: s->dirty_bitmap_directory_offset + offset
>
> == dirty-bitmaps feature ==
>
> initial patch [PATCH 2/8] qcow2: add dirty-bitmaps feature
> is splitted out to 5 parts. The first with license notice and
> constants is in 04. Then, there are:
> 05: read Dirty Bitmap Directory
> 06: load dirty bitmap
> 07: store dirty bitmap
> 08: add dirty bitmap extension
>
> The code is significantly changed and reorganized. For your comments:
> - fixes about flushes
> - g_try_... functions are used. But I left the old g_malloc/g_new for
> things with constant or strictly limited size (dirty bitmap header,
> with its name, name size < 1024, one cluster, bitmap name (< 1024))
> - input validation added.
> - dirty bitmap directory is read/written wholly, not by parts.
> - load/store through bdrv_dirty_bitmap_[de]serialize_part
> - 'delete dirty bitmap' function is removed. This function is not used
> in v3. May be this functionality should appear in qemu_img, or with qmp
> series.
> - qcow2_write_dirty_bitamps function is removed. This functionality is
> rewritten, now Bitmap Directory is rewritten from
> s->dirty_bitmap_directory, (directory_write()), and not generated from
> s->dirty_bitmaps
>
> 09: load check
> - new functionality, just check existance of dirty bitmap by name
>
> 10: store block dirty bitmap
> - bdrv_store_dirty_bitmap simplified, as bitmap is simply passed to
> qcow2_dirty_bitmap_store
>
> 11: load block dirty bitmap
> - bdrv_load_dirty_bitmap simplified, as bitmap is created by
> qcow2_dirty_bitmap_load. Also, granularity parameter is removed
>
> patch [[PATCH 5/8] qcow2: add qcow2_dirty_bitmap_delete_all] is
> removed. This function is not used in v3. May be this functionality
> should appear in qemu_img, or with qmp series.
>
> 12: autoclear bit
> - if it is not set, but bitmap extension is found, it just skipped and
> warning printed. Dirty bitmaps are not cleread from the image as in v2
> - handling in qcow2-dirty-bitmap.c is changed, the bit is just set when
> adding entry to Dirty Bitmap Directory in the image.
>
> 13: cmd line
> create=on|off flag added. Now, the bitmap is not auto-created.
> By default flag is off.
> If on, then new bitmap will be created in the image, if the bitmap with
> same name is already exists an error will be generated.
> If off, then the bitmap will be loaded from the image, if there is no
> one an error will be generated.
> If create=off and granularity is specified then granularity will be
> checked for loaded bitmap and if not match an error will be generated.
>
> also, s/drive/node
>
> also, dirty_bitmap_func():
>      - add errp parameter and handle this in main()
>      - open image with BDRV_O_RDWR for loading bitmaps
>
> 14-16: new patches
>
> 17: test
> - add three test cases.
>
>
> v2:
>   - rebase on my 'Dirty bitmaps migration' series
>   - remove 'print dirty bitmap', 'query-dirty-bitmap' and use md5 for
>     testing like with dirty bitmaps migration
>   - autoclean features
>
> v1:
>
> The bitmaps are saved into qcow2 file format. It provides both
> 'internal' and 'external' dirty bitmaps feature:
>   - for qcow2 drives we can store bitmaps in the same file
>   - for other formats we can store bitmaps in the separate qcow2 file
>
> QCow2 header is extended by fields 'nb_dirty_bitmaps' and
> 'dirty_bitmaps_offset' like with snapshots.
>
> Proposed command line syntax is the following:
>
> -dirty-bitmap [option1=val1][,option2=val2]...
>      Available options are:
>      name         The name for the bitmap (necessary).
>
>      file         The file to load the bitmap from.
>
>      file_id      When specified with 'file' option, then this file will
>                   be available through this id for other -dirty-bitmap
>                   options when specified without 'file' option, then it
>                   is a reference to 'file', specified with another
>                   -dirty-bitmap option, and it will be used to load the
>                   bitmap from.
>
>      drive        The drive to bind the bitmap to. It should be specified
>                   as 'id' suboption of one of -drive options. If nor
>                   'file' neither 'file_id' are specified, then the bitmap
>                   will be loaded from that drive (internal dirty bitmap).
>
>      granularity  The granularity for the bitmap. Not necessary, the
>                   default value may be used.
>
>      enabled      on|off. Default is 'on'. Disabled bitmaps are not
>                   changing regardless of writes to corresponding drive.
>
> Examples:
>
> qemu -drive file=a.qcow2,id=disk -dirty-bitmap name=b,drive=disk
> qemu -drive file=a.raw,id=disk \
>       -dirty-bitmap name=b,drive=disk,file=b.qcow2,enabled=off
>
> Vladimir Sementsov-Ogievskiy (8):
>    spec: add qcow2-dirty-bitmaps specification
>    qcow2: add dirty-bitmaps feature
>    block: store persistent dirty bitmaps
>    block: add bdrv_load_dirty_bitmap
>    qcow2: add qcow2_dirty_bitmap_delete_all
>    qcow2: add autoclear bit for dirty bitmaps
>    qemu: command line option for dirty bitmaps
>    iotests: test internal persistent dirty bitmap
>
>   block.c                       |  82 +++++++
>   block/Makefile.objs           |   2 +-
>   block/qcow2-dirty-bitmap.c    | 537 ++++++++++++++++++++++++++++++++++++++++++
>   block/qcow2.c                 |  69 +++++-
>   block/qcow2.h                 |  61 +++++
>   blockdev.c                    |  38 +++
>   docs/specs/qcow2.txt          |  66 ++++++
>   include/block/block.h         |   9 +
>   include/block/block_int.h     |  10 +
>   include/sysemu/blockdev.h     |   1 +
>   include/sysemu/sysemu.h       |   1 +
>   qemu-options.hx               |  37 +++
>   tests/qemu-iotests/118        |  83 +++++++
>   tests/qemu-iotests/118.out    |   5 +
>   tests/qemu-iotests/group      |   1 +
>   tests/qemu-iotests/iotests.py |   6 +
>   vl.c                          | 100 ++++++++
>   17 files changed, 1105 insertions(+), 3 deletions(-)
>   create mode 100644 block/qcow2-dirty-bitmap.c
>   create mode 100755 tests/qemu-iotests/118
>   create mode 100644 tests/qemu-iotests/118.out
>


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
@ 2015-09-05 17:33   ` Vladimir Sementsov-Ogievskiy
  2015-10-06 20:22     ` John Snow
  2015-09-15 16:24   ` Eric Blake
  2015-10-07 16:47   ` Max Reitz
  2 siblings, 1 reply; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-05 17:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha, pbonzini, den, jsnow

On 05.09.2015 19:43, Vladimir Sementsov-Ogievskiy wrote:
> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
> other drives (there may be qcow2 file with zero disk size but with
> several dirty bitmaps for other drives).
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 126 insertions(+), 1 deletion(-)
>
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 121dfc8..5fc0365 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,13 @@ in the description of a field.
>                       write to an image with unknown auto-clear features if it
>                       clears the respective bits from this field first.
>   
> -                    Bits 0-63:  Reserved (set to 0)
> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
> +                                there is a _consistent_ Dirty bitmaps extension
> +                                in the image. If it is not set, but there is a
> +                                Dirty bitmaps extension, its data should be
> +                                considered as inconsistent.
> +
> +                    Bits 1-63:  Reserved (set to 0)
>   
>            96 -  99:  refcount_order
>                       Describes the width of a reference count block entry (width
> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
>                           0x00000000 - End of the header extension area
>                           0xE2792ACA - Backing file format name
>                           0x6803f857 - Feature name table
> +                        0x23852875 - Dirty bitmaps
>                           other      - Unknown header extension, can be safely
>                                        ignored
>   
> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>                       terminated if it has full length)
>   
>   
> +== Dirty bitmaps ==
> +
> +Dirty bitmaps is an optional header extension. It provides an ability to store
> +dirty bitmaps in a qcow2 image. The fields are:
> +
> +          0 -  3:  nb_dirty_bitmaps
> +                   The number of dirty bitmaps contained in the image. Valid
> +                   values: 0 - 65535.
> +
> +          4 -  7:  dirty_bitmap_directory_size
> +                   Size of the Dirty Bitmap Directory in bytes. Valid values:
> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
> +
> +          8 - 15:  dirty_bitmap_directory_offset
> +                   Offset into the image file at which the Dirty Bitmap
> +                   Directory starts. Must be aligned to a cluster boundary.
> +
> +
>   == Host cluster management ==
>   
>   qcow2 manages the allocation of host clusters by maintaining a reference count
> @@ -360,3 +385,103 @@ Snapshot table entry:
>   
>           variable:   Padding to round up the snapshot table entry size to the
>                       next multiple of 8.
> +
> +
> +== Dirty bitmaps ==
> +
> +The feature supports storing dirty bitmaps in a qcow2 image.
> +
> +=== Cluster mapping ===
> +
> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
> +bitmaps to host clusters. It is called Dirty Bitmap Table.
> +
> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> +Directory Entry) and may use multiple clusters, however it must be contiguous
> +in the image file.
> +
> +Given an offset (in bytes) into the bitmap, the offset into the image file can
> +be obtained as follows:
> +
> +    byte_offset =
> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
> +
> +Taking into accout the granularity of the bitmap, an offset in bits into the
> +image file can be obtained like this:
> +
> +    bit_offset =
> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
> +
> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
> +"physical" bit with number (bit_nr / granularity).
> +
> +Dirty Bitmap Table entry:
> +
> +    Bit  0 -  8:    Reserved
> +
> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
> +                    cluster boundary. If the offset is 0, the cluster is
> +                    unallocated, and should be read as all zeros.
> +
> +        56 - 63:    Reserved
> +
> +=== Dirty Bitmap Directory ===
> +
> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
> +whose starting offset and length are given by the header extension fields
> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
> +the bitmap directory have variable length, depending on the length of the
> +bitmap name.
> +
> +Dirty Bitmap Directory Entry:
> +
> +    Byte 0 -  7:    dirty_bitmap_table_offset
> +                    Offset into the image file at which the Dirty Bitmap Table
> +                    for the bitmap starts. Must be aligned to a cluster
> +                    boundary.
> +
> +         8 - 15:    nb_virtual_bits
> +                    Number of "virtual" bits in the bitmap. Number of
> +                    "physical" bits would be:
> +                    (nb_virtual_bits + granularity - 1) / granularity
> +
> +        16 - 19:    dirty_bitmap_table_size
> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
> +                    Valid values: 0 - 0x8000000.
> +                    Also, (dirty_bitmap_table_size * cluster_size) should not
> +                    be greater than 0x20000000 (512 MB)
> +
> +        20 - 23:    granularity_bits
> +                    Granularity bits. Valid values are: 0 - 63.
> +
> +                    Granularity is calculated as
> +                        granularity = 1 << granularity_bits
> +
> +                    Granularity of the bitmap is how many "virtual" bits
> +                    accounts for one "physical" bit.
> +
> +        24 - 27:    flags
> +                    Bit
> +                      0: in_use
> +                         The bitmap is in use and may be inconsistent.
> +
> +                      1: self
> +                         The bitmap is a dirty bitmap for the containing image.
> +
> +                      2: auto
> +                         The bitmap should be autoloaded as block dirty bitmap.
> +                         Only available if bit 1 (self) is set.
> +
> +                      3: read_only
> +                         The bitmap should not be rewritten.
> +
> +                    Bits 4 - 31 are reserved.

Is this appropriate as a field reserved for future extension? Or do we need
an additional one? Do we need a scheme like the one used for snapshots
(something like a field 'additional_area_size', with an additional area of
this size after the name)?

> +
> +        28 - 29:    name_size
> +                    Size of the bitmap name. Valid values: 0 - 1023.
> +
> +        variable:   The name of the bitmap (not null terminated).
> +
> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
> +                    the next multiple of 8.


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

* Re: [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity()
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity() Vladimir Sementsov-Ogievskiy
@ 2015-09-15 15:36   ` Eric Blake
  2015-10-05 22:47   ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: Eric Blake @ 2015-09-15 15:36 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, jsnow, stefanha, pbonzini

On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---

The commit message summary mentions a one-line "what" (good), but is
lacking a body that says "why" (that is, what was broken that needed
fixing?)

>  block.c               | 2 +-
>  include/block/block.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 

It looks like all the patch does is make the signature const-correct,
since querying the granularity does not modify the bitmap.  So the
commit message should say that.

With an improved commit message,
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size()
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size() Vladimir Sementsov-Ogievskiy
@ 2015-09-15 15:37   ` Eric Blake
  2015-10-05 22:48   ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: Eric Blake @ 2015-09-15 15:37 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, jsnow, stefanha, pbonzini

On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c               | 5 +++++
>  include/block/block.h | 1 +
>  2 files changed, 6 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
  2015-09-05 17:33   ` Vladimir Sementsov-Ogievskiy
@ 2015-09-15 16:24   ` Eric Blake
  2015-09-16  8:52     ` Vladimir Sementsov-Ogievskiy
  2015-10-06  0:09     ` John Snow
  2015-10-07 16:47   ` Max Reitz
  2 siblings, 2 replies; 53+ messages in thread
From: Eric Blake @ 2015-09-15 16:24 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, jsnow, stefanha, pbonzini

On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
> other drives (there may be qcow2 file with zero disk size but with
> several dirty bitmaps for other drives).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 126 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 121dfc8..5fc0365 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,13 @@ in the description of a field.
>                      write to an image with unknown auto-clear features if it
>                      clears the respective bits from this field first.
>  
> -                    Bits 0-63:  Reserved (set to 0)
> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
> +                                there is a _consistent_ Dirty bitmaps extension
> +                                in the image. If it is not set, but there is a
> +                                Dirty bitmaps extension, its data should be
> +                                considered as inconsistent.

Thanks for documenting this. I don't know that we use underscore for
_emphasis_ anywhere else in the file, but I don't have any better
suggestions.  Should you also require that it is an error if this bit is
set but no Dirty bitmap extension header is present?

> +
> +                    Bits 1-63:  Reserved (set to 0)
>  
>           96 -  99:  refcount_order
>                      Describes the width of a reference count block entry (width
> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0x23852875 - Dirty bitmaps
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Dirty bitmaps ==
> +
> +Dirty bitmaps is an optional header extension. It provides an ability to store
> +dirty bitmaps in a qcow2 image. The fields are:

Might not hurt to remind the reader about the auto-clear feature bit
mentioned earlier controlling whether this extension can be trusted as
consistent.

> +
> +          0 -  3:  nb_dirty_bitmaps
> +                   The number of dirty bitmaps contained in the image. Valid
> +                   values: 0 - 65535.
> +
> +          4 -  7:  dirty_bitmap_directory_size
> +                   Size of the Dirty Bitmap Directory in bytes. Valid values:
> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).

Is it always going to be 1024 * nb_dirty_bitmaps? If so, why do we need
a redundant field?  If not, then this wording needs help; from the rest
of this text, it looks like you want "at most 1024 * nb_dirty_bitmaps".
 Also, while Dirty Bitmap Directory entries are variable length (and
thus a variable maximum), they do have a minimum size (so the minimum
value for dirty_bitmap_directory_size must be larger than 0 unless
nb_dirty_bitmaps is 0, in which case why would we have this header
extension)

> +
> +          8 - 15:  dirty_bitmap_directory_offset
> +                   Offset into the image file at which the Dirty Bitmap
> +                   Directory starts. Must be aligned to a cluster boundary.
> +
> +
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> @@ -360,3 +385,103 @@ Snapshot table entry:
>  
>          variable:   Padding to round up the snapshot table entry size to the
>                      next multiple of 8.
> +
> +
> +== Dirty bitmaps ==
> +
> +The feature supports storing dirty bitmaps in a qcow2 image.
> +
> +=== Cluster mapping ===
> +
> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
> +bitmaps to host clusters. It is called Dirty Bitmap Table.

s/ONE/one/ (I didn't see the reason for the emphasis)

> +
> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap

s/The/Each/

> +Directory Entry) and may use multiple clusters, however it must be contiguous
> +in the image file.
> +
> +Given an offset (in bytes) into the bitmap, the offset into the image file can
> +be obtained as follows:
> +
> +    byte_offset =
> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
> +
> +Taking into accout the granularity of the bitmap, an offset in bits into the

s/accout/account/

> +image file can be obtained like this:
> +
> +    bit_offset =
> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
> +
> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
> +"physical" bit with number (bit_nr / granularity).

I got a bit lost on this sentence. Maybe an example would help?  Is the
idea that every image has a certain number of clusters, one "virtual"
bit per cluster, and then the bitmap compresses multiple clusters into
one "physical" bit according to a compression ratio determined by the
bitmap granularity?  That is, if I have an image with 64k clusters but
128k bitmap granularity, then each physical bit of the bitmap covers 2
clusters as being dirty?
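
One possible reading of the two quoted formulas, as a minimal sketch in C.
It assumes the Dirty Bitmap Table is already in memory, in host byte order,
with the reserved bits masked off. The names physical_bit, nb_physical_bits,
locate_bit and BitLocation are hypothetical and are not taken from the
series:

```c
#include <assert.h>
#include <stdint.h>

/* Index of the stored ("physical") bit that covers a "virtual" bit,
 * using the draft's granularity = 1 << granularity_bits. */
static uint64_t physical_bit(uint64_t bit_nr, uint32_t granularity_bits)
{
    assert(granularity_bits < 64);
    return bit_nr >> granularity_bits;
}

/* Same value as (nb_virtual_bits + granularity - 1) / granularity,
 * written so that the addition cannot overflow. */
static uint64_t nb_physical_bits(uint64_t nb_virtual_bits,
                                 uint32_t granularity_bits)
{
    assert(granularity_bits < 64);
    uint64_t granularity = UINT64_C(1) << granularity_bits;
    return (nb_virtual_bits >> granularity_bits) +
           ((nb_virtual_bits & (granularity - 1)) != 0);
}

/* Where a physical bit lives in the image file. */
typedef struct {
    uint64_t host_byte;     /* byte offset into the image file */
    unsigned bit_in_byte;   /* 0..7 */
} BitLocation;

/* table[i] is assumed to hold the host offset of bitmap cluster i. */
static BitLocation locate_bit(const uint64_t *table, uint64_t cluster_size,
                              uint64_t bit_nr, uint32_t granularity_bits)
{
    assert(cluster_size != 0);
    uint64_t phys = physical_bit(bit_nr, granularity_bits);
    uint64_t offset = phys / 8;     /* byte offset into the bitmap data */
    BitLocation loc = {
        .host_byte = table[offset / cluster_size] + offset % cluster_size,
        .bit_in_byte = (unsigned)(phys % 8),
    };
    return loc;
}
```

With granularity_bits = 1, virtual bits 0 and 1 both map to physical bit 0,
which is the "one physical bit covers two virtual bits" case asked about
above.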

> +
> +Dirty Bitmap Table entry:
> +
> +    Bit  0 -  8:    Reserved

s/Reserved/Reserved, must be 0/

> +
> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
> +                    cluster boundary. If the offset is 0, the cluster is
> +                    unallocated, and should be read as all zeros.
> +
> +        56 - 63:    Reserved

and again (specifying the user must write 0 for now leaves the door open
for extension)
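
To illustrate why "must be 0" matters, here is a rough C sketch of a reader
that rejects entries with reserved bits set. The macro and function names
are made up for this example, and the entry is assumed to have been
converted to host byte order already:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names, not taken from the patch series. */
#define DBT_OFFSET_MASK   UINT64_C(0x00fffffffffffe00)  /* bits 9-55 */
#define DBT_RESERVED_MASK (~DBT_OFFSET_MASK)            /* bits 0-8, 56-63 */

/* Returns false if the entry uses reserved bits or is not cluster aligned.
 * On success *host_offset is the cluster offset; 0 means "unallocated,
 * read as all zeros". */
static bool dbt_entry_decode(uint64_t entry, uint64_t cluster_size,
                             uint64_t *host_offset)
{
    /* cluster_size must be a non-zero power of two */
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);

    if (entry & DBT_RESERVED_MASK) {
        return false;   /* written by a newer format revision, or corrupt */
    }
    uint64_t offset = entry & DBT_OFFSET_MASK;
    if (offset & (cluster_size - 1)) {
        return false;   /* not aligned to a cluster boundary */
    }
    *host_offset = offset;
    return true;
}
```

A reader written this way fails cleanly on images that later give the
reserved bits a meaning, instead of misinterpreting them as part of the
offset.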

> +
> +=== Dirty Bitmap Directory ===
> +
> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap

s/bitmap,/bitmap/
s/in the Dirty/in a Dirty/

> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
> +whose starting offset and length are given by the header extension fields
> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
> +the bitmap directory have variable length, depending on the length of the
> +bitmap name.
> +
> +Dirty Bitmap Directory Entry:
> +
> +    Byte 0 -  7:    dirty_bitmap_table_offset
> +                    Offset into the image file at which the Dirty Bitmap Table
> +                    for the bitmap starts. Must be aligned to a cluster
> +                    boundary.
> +
> +         8 - 15:    nb_virtual_bits
> +                    Number of "virtual" bits in the bitmap. Number of
> +                    "physical" bits would be:
> +                    (nb_virtual_bits + granularity - 1) / granularity
> +
> +        16 - 19:    dirty_bitmap_table_size
> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
> +                    Valid values: 0 - 0x8000000.
> +                    Also, (dirty_bitmap_table_size * cluster_size) should not
> +                    be greater than 0x20000000 (512 MB)
> +
> +        20 - 23:    granularity_bits
> +                    Granularity bits. Valid values are: 0 - 63.
> +
> +                    Granularity is calculated as
> +                        granularity = 1 << granularity_bits

63 seems like a rather high limit.  Even 32 (1 bit covering 4 billion
clusters) is huge.

> +
> +                    Granularity of the bitmap is how many "virtual" bits
> +                    accounts for one "physical" bit.
> +
> +        24 - 27:    flags
> +                    Bit
> +                      0: in_use
> +                         The bitmap is in use and may be inconsistent.
> +
> +                      1: self
> +                         The bitmap is a dirty bitmap for the containing image.
> +
> +                      2: auto
> +                         The bitmap should be autoloaded as block dirty bitmap.
> +                         Only available if bit 1 (self) is set.
> +
> +                      3: read_only
> +                         The bitmap should not be rewritten.
> +
> +                    Bits 4 - 31 are reserved.
> +
> +        28 - 29:    name_size
> +                    Size of the bitmap name. Valid values: 0 - 1023.

This limit is inconsistent with the limit above that a directory entry
will be at most 1024 bytes; since you have already burned 30 bytes on
essential information.  Is 0 allowed (a bitmap with "" as its name), or
must the name_size be at least 1?  Should you document constraints that
each bitmap name should be unique within the file (that is, no two
bitmaps in the directory have the same name)?
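
The arithmetic behind the inconsistency, as a small C sketch. The helper
name and the constant are hypothetical; the 30-byte fixed part is simply
bytes 0 - 29 of the entry layout quoted above:

```c
#include <assert.h>
#include <stdint.h>

/* Fixed part of a directory entry: bytes 0-29 in the draft layout. */
#define DBD_ENTRY_FIXED_SIZE 30u

/* On-disk size of one entry: fixed part plus name, rounded up to the
 * next multiple of 8 as the draft requires. */
static uint32_t dbd_entry_size(uint32_t name_size)
{
    assert(name_size <= 0xffff);    /* name_size is a 2-byte field */
    return (DBD_ENTRY_FIXED_SIZE + name_size + 7u) & ~7u;
}
```

With name_size = 1023 this gives 1056 bytes, which is over the 1024 bytes
per entry implied by the directory size limit. The longest name that still
fits in 1024 bytes is 994 bytes.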

> +
> +        variable:   The name of the bitmap (not null terminated).
> +
> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
> +                    the next multiple of 8.
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-15 16:24   ` Eric Blake
@ 2015-09-16  8:52     ` Vladimir Sementsov-Ogievskiy
  2015-10-06  0:09     ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2015-09-16  8:52 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: kwolf, den, jsnow, stefanha, pbonzini

On 15.09.2015 19:24, Eric Blake wrote:
> On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>                       write to an image with unknown auto-clear features if it
>>                       clears the respective bits from this field first.
>>   
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
>> +                                there is a _consistent_ Dirty bitmaps extension
>> +                                in the image. If it is not set, but there is a
>> +                                Dirty bitmaps extension, its data should be
>> +                                considered as inconsistent.
> Thanks for documenting this. I don't know that we use underscore for
> _emphasis_ anywhere else in the file, but I don't have any better
> suggestions.  Should you also require that it is an error if this bit is
> set but no Dirty bitmap extension header is present?

There should be an error if there are command-line flags which try to load
the bitmap. If bitmap loading is not requested, wouldn't a warning be enough?

>
>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>   
>>            96 -  99:  refcount_order
>>                       Describes the width of a reference count block entry (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
>>                           0x00000000 - End of the header extension area
>>                           0xE2792ACA - Backing file format name
>>                           0x6803f857 - Feature name table
>> +                        0x23852875 - Dirty bitmaps
>>                           other      - Unknown header extension, can be safely
>>                                        ignored
>>   
>> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>>                       terminated if it has full length)
>>   
>>   
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>> +dirty bitmaps in a qcow2 image. The fields are:
> Might not hurt to remind the reader about the auto-clear feature bit
> mentioned earlier controlling whether this extension can be trusted as
> consistent.

agree, will add.

>
>> +
>> +          0 -  3:  nb_dirty_bitmaps
>> +                   The number of dirty bitmaps contained in the image. Valid
>> +                   values: 0 - 65535.
>> +
>> +          4 -  7:  dirty_bitmap_directory_size
>> +                   Size of the Dirty Bitmap Directory in bytes. Valid values:
>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
> Is it always going to be 1024 * nb_dirty_bitmaps? If so, why do we need
> a redundant field?  If not, then this wording needs help; from the rest
> of this text, it looks like you want "at most 1024 * nb_dirty_bitmaps".
>   Also, while Dirty Bitmap Directory entries are variable length (and
> thus a variable maximum), they do have a minimum size (so the minimum
> value for dirty_bitmap_directory_size must be larger than 0 unless
> nb_dirty_bitmaps is 0, in which case why would we have this header
> extension)

Yes, strange mistake.
Actually, it should be
<= (round_up_to_8byte_boundary(sizeof(dirty bitmap header) + 
max_dirty_bitmap_name)) * nb_dirty_bitmaps
and
 > (round_up_to_8byte_boundary(sizeof(dirty bitmap header) + 
min_dirty_bitmap_name)) * nb_dirty_bitmaps

So, which is better to leave there: these formulas, or their values at the
max and min of nb_dirty_bitmaps?
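
For reference, a rough C sketch of the check these formulas describe. All
names are hypothetical. It assumes a 30-byte fixed entry header (bytes
0 - 29 of the directory entry) and treats both bounds as inclusive, which
is an assumption, since the lower bound above is written as strict:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DBD_ENTRY_FIXED_SIZE 30u    /* assumed: bytes 0-29 of an entry */

static uint64_t round_up_to_8(uint64_t n)
{
    return (n + 7) & ~UINT64_C(7);
}

/* True if dir_size lies between nb_bitmaps entries with the shortest
 * allowed name and nb_bitmaps entries with the longest allowed name. */
static bool directory_size_plausible(uint64_t dir_size, uint32_t nb_bitmaps,
                                     uint32_t min_name, uint32_t max_name)
{
    assert(min_name <= max_name);
    uint64_t lo = round_up_to_8(DBD_ENTRY_FIXED_SIZE + min_name) * nb_bitmaps;
    uint64_t hi = round_up_to_8(DBD_ENTRY_FIXED_SIZE + max_name) * nb_bitmaps;
    return dir_size >= lo && dir_size <= hi;
}
```

For two bitmaps with names of 1 to 1023 bytes, this accepts directory sizes
from 64 to 2112 bytes.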

>
>> +
>> +          8 - 15:  dirty_bitmap_directory_offset
>> +                   Offset into the image file at which the Dirty Bitmap
>> +                   Directory starts. Must be aligned to a cluster boundary.
>> +
>> +
>>   == Host cluster management ==
>>   
>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>   
>>           variable:   Padding to round up the snapshot table entry size to the
>>                       next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
> s/ONE/one/ (I didn't see the reason for the emphasis)
>
>> +
>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> s/The/Each/
>
>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>> +be obtained as follows:
>> +
>> +    byte_offset =
>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>> +
>> +Taking into accout the granularity of the bitmap, an offset in bits into the
> s/accout/account/
>
>> +image file can be obtained like this:
>> +
>> +    bit_offset =
>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
>> +
>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
>> +"physical" bit with number (bit_nr / granularity).
> I got a bit lost on this sentence. Maybe an example would help?  Is the
> idea that every image has a certain number of clusters, one "virtual"
> bit per cluster, and then the bitmap compresses multiple clusters into
> one "physical" bit according to a compression ratio determined by the
> bitmap granularity?  That is, if I have an image with 64k clusters but
> 128k bitmap granularity, then each physical bit of the bitmap covers 2
> clusters as being dirty?

You are right, except that the units are sectors, not clusters. For bdrv
dirty bitmaps, virtual bits correspond to sectors. But here I'm trying to
abstract away from bdrv dirty bitmaps and to store any bitmap with a
granularity. It may be better not to call it dirty, but just a bitmap.
But I'm not sure: JustABitmap is too general.
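
A small C sketch of the sector-based reading, for the bdrv case only. The
function names are made up, and the 512-byte sector size (the value of
BDRV_SECTOR_SIZE) is restated locally as an assumption:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512     /* assumed: one virtual bit per 512-byte sector */

/* Bytes of guest data covered by one physical bit. */
static uint64_t bytes_per_physical_bit(uint32_t granularity_bits)
{
    assert(granularity_bits < 55);  /* keep 512 << bits within 64 bits */
    return (uint64_t)SECTOR_SIZE << granularity_bits;
}

/* Physical bit that covers a given guest byte offset. */
static uint64_t physical_bit_for_byte(uint64_t byte_offset,
                                      uint32_t granularity_bits)
{
    assert(granularity_bits < 55);
    return (byte_offset / SECTOR_SIZE) >> granularity_bits;
}
```

Under these assumptions, the 128k granularity from the example above
corresponds to granularity_bits = 8, that is 256 sectors per physical bit.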

>
>> +
>> +Dirty Bitmap Table entry:
>> +
>> +    Bit  0 -  8:    Reserved
> s/Reserved/Reserved, must be 0/
>
>> +
>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
>> +                    cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated, and should be read as all zeros.
>> +
>> +        56 - 63:    Reserved
> and again (specifying the user must write 0 for now leaves the door open
> for extension)
>
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
> s/bitmap,/bitmap/
> s/in the Dirty/in a Dirty/
>
>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
>> +whose starting offset and length are given by the header extension fields
>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>> +the bitmap directory have variable length, depending on the length of the
>> +bitmap name.
>> +
>> +Dirty Bitmap Directory Entry:
>> +
>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>> +                    Offset into the image file at which the Dirty Bitmap Table
>> +                    for the bitmap starts. Must be aligned to a cluster
>> +                    boundary.
>> +
>> +         8 - 15:    nb_virtual_bits
>> +                    Number of "virtual" bits in the bitmap. Number of
>> +                    "physical" bits would be:
>> +                    (nb_virtual_bits + granularity - 1) / granularity
>> +
>> +        16 - 19:    dirty_bitmap_table_size
>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>> +                    Valid values: 0 - 0x8000000.
>> +                    Also, (dirty_bitmap_table_size * cluster_size) should not
>> +                    be greater than 0x20000000 (512 MB)
>> +
>> +        20 - 23:    granularity_bits
>> +                    Granularity bits. Valid values are: 0 - 63.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
> 63 seems like a rather high limit.  Even 32 (1 bit covering 4 billion
> clusters) is huge.
>
>> +
>> +                    Granularity of the bitmap is how many "virtual" bits
>> +                    accounts for one "physical" bit.
>> +
>> +        24 - 27:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap is in use and may be inconsistent.
>> +
>> +                      1: self
>> +                         The bitmap is a dirty bitmap for the containing image.
>> +
>> +                      2: auto
>> +                         The bitmap should be autoloaded as block dirty bitmap.
>> +                         Only available if bit 1 (self) is set.
>> +
>> +                      3: read_only
>> +                         The bitmap should not be rewritten.
>> +
>> +                    Bits 4 - 31 are reserved.
>> +
>> +        28 - 29:    name_size
>> +                    Size of the bitmap name. Valid values: 0 - 1023.
> This limit is inconsistent with the limit above that a directory entry
> will be at most 1024 bytes; since you have already burned 30 bytes on
> essential information.  Is 0 allowed (a bitmap with "" as its name), or
> must the name_size be at least 1?  Should you document constraints that
> each bitmap name should be unique within the file (that is, no two
> bitmaps in the directory have the same name)?

Ok. And it is a real question: should we allow zero-length names for 
bitmaps?

>
>> +
>> +        variable:   The name of the bitmap (not null terminated).
>> +
>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
>> +                    the next multiple of 8.
>>


-- 
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.

* Re: [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity()
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity() Vladimir Sementsov-Ogievskiy
  2015-09-15 15:36   ` Eric Blake
@ 2015-10-05 22:47   ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: John Snow @ 2015-10-05 22:47 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c               | 2 +-
>  include/block/block.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 4f7fc0d..6d14f5b 100644
> --- a/block.c
> +++ b/block.c
> @@ -3591,7 +3591,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
>      return granularity;
>  }
>  
> -uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
> +uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
>  {
>      return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
>  }
> diff --git a/include/block/block.h b/include/block/block.h
> index edc1510..fb7d410 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -495,7 +495,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
>  void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
>  BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
>  uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
> -uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
> +uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
>  bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
>  bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
>  DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
> 

As with Eric's review, with a commit message added:

Reviewed-by: John Snow <jsnow@redhat.com>


* Re: [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size()
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size() Vladimir Sementsov-Ogievskiy
  2015-09-15 15:37   ` Eric Blake
@ 2015-10-05 22:48   ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: John Snow @ 2015-10-05 22:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c               | 5 +++++
>  include/block/block.h | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 6d14f5b..8c39d0a 100644
> --- a/block.c
> +++ b/block.c
> @@ -3632,6 +3632,11 @@ const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
>      return bitmap->name;
>  }
>  
> +int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
> +{
> +    return bitmap->size;
> +}
> +
>  uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap,
>                                       uint64_t count)
>  {
> diff --git a/include/block/block.h b/include/block/block.h
> index fb7d410..8166640 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -510,6 +510,7 @@ void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t offset);
>  int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
>  
>  const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
> +int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
>  uint64_t bdrv_dirty_bitmap_data_size(const BdrvDirtyBitmap *bitmap,
>                                       uint64_t count);
>  void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
> 

Reviewed-by: John Snow <jsnow@redhat.com>


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-15 16:24   ` Eric Blake
  2015-09-16  8:52     ` Vladimir Sementsov-Ogievskiy
@ 2015-10-06  0:09     ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: John Snow @ 2015-10-06  0:09 UTC (permalink / raw)
  To: Eric Blake, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, den, Qemu-block, stefanha, pbonzini



On 09/15/2015 12:24 PM, Eric Blake wrote:
> On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>                      write to an image with unknown auto-clear features if it
>>                      clears the respective bits from this field first.
>>  
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
>> +                                there is a _consistent_ Dirty bitmaps extension
>> +                                in the image. If it is not set, but there is a
>> +                                Dirty bitmaps extension, its data should be
>> +                                considered as inconsistent.
> 
> Thanks for documenting this. I don't know that we use underscore for
> _emphasis_ anywhere else in the file, but I don't have any better
> suggestions.  Should you also require that it is an error if this bit is
> set but no Dirty bitmap extension header is present?
> 

An error, but one that can be safely corrected by any fsck-style
utility: clear the bit.
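
A minimal sketch of that repair, assuming the checker has already scanned
the header extensions. The helper name and the has_dbm_ext parameter are
hypothetical; only the bit position (bit 0 of the auto-clear features)
comes from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the auto-clear feature field, as proposed by this patch. */
#define AUTOCLEAR_DIRTY_BITMAPS (1ULL << 0)

/*
 * Hypothetical fsck-style helper: returns the auto-clear feature value a
 * checker would write back. has_dbm_ext says whether a Dirty bitmaps
 * header extension (type 0x23852875) was found in the image.
 */
static uint64_t repair_autoclear(uint64_t autoclear_features, bool has_dbm_ext)
{
    if ((autoclear_features & AUTOCLEAR_DIRTY_BITMAPS) && !has_dbm_ext) {
        /* The bit promises a consistent extension that is absent: clear it. */
        autoclear_features &= ~AUTOCLEAR_DIRTY_BITMAPS;
    }
    return autoclear_features;
}
```

Other auto-clear bits are left untouched, so the repair stays safe even
if more features are defined later.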

>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>  
>>           96 -  99:  refcount_order
>>                      Describes the width of a reference count block entry (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
>>                          0x00000000 - End of the header extension area
>>                          0xE2792ACA - Backing file format name
>>                          0x6803f857 - Feature name table
>> +                        0x23852875 - Dirty bitmaps
>>                          other      - Unknown header extension, can be safely
>>                                       ignored
>>  
>> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>>                      terminated if it has full length)
>>  
>>  
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>> +dirty bitmaps in a qcow2 image. The fields are:
> 
> Might not hurt to remind the reader about the auto-clear feature bit
> mentioned earlier controlling whether this extension can be trusted as
> consistent.
> 
>> +
>> +          0 -  3:  nb_dirty_bitmaps
>> +                   The number of dirty bitmaps contained in the image. Valid
>> +                   values: 0 - 65535.
>> +
>> +          4 -  7:  dirty_bitmap_directory_size
>> +                   Size of the Dirty Bitmap Directory in bytes. Valid values:
>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
> 
> Is it always going to be 1024 * nb_dirty_bitmaps? If so, why do we need
> a redundant field?  If not, then this wording needs help; from the rest
> of this text, it looks like you want "at most 1024 * nb_dirty_bitmaps".
>  Also, while Dirty Bitmap Directory entries are variable length (and
> thus a variable maximum), they do have a minimum size (so the minimum
> value for dirty_bitmap_directory_size must be larger than 0 unless
> nb_dirty_bitmaps is 0, in which case why would we have this header
> extension)
> 

Agree.

>> +
>> +          8 - 15:  dirty_bitmap_directory_offset
>> +                   Offset into the image file at which the Dirty Bitmap
>> +                   Directory starts. Must be aligned to a cluster boundary.
>> +
>> +
>>  == Host cluster management ==
>>  
>>  qcow2 manages the allocation of host clusters by maintaining a reference count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>  
>>          variable:   Padding to round up the snapshot table entry size to the
>>                      next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
> 
> s/ONE/one/ (I didn't see the reason for the emphasis)
> 

Emphasis is likely because that's not how the cluster allocation
mechanism works in qcow2 otherwise. We're essentially storing data
straight into what would otherwise be the L1 table.

It's worth clarifying, in my opinion.

>> +
>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> 
> s/The/Each/
> 

Yes -- and adding to this, I would ask that the Dirty Bitmap
Directory be documented before the Dirty Bitmap "Table". That way the
spec reads very naturally from header, to directory, to data.

>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>> +be obtained as follows:
>> +
>> +    byte_offset =
>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>> +
>> +Taking into accout the granularity of the bitmap, an offset in bits into the
> 
> s/accout/account/
> 
>> +image file can be obtained like this:
>> +
>> +    bit_offset =
>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
>> +
>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
>> +"physical" bit with number (bit_nr / granularity).
> 
> I got a bit lost on this sentence. Maybe an example would help?  Is the
> idea that every image has a certain number of clusters, one "virtual"
> bit per cluster, and then the bitmap compresses multiple clusters into
> one "physical" bit according to a compression ratio determined by the
> bitmap granularity?  That is, if I have an image with 64k clusters but
> 128k bitmap granularity, then each physical bit of the bitmap covers 2
> clusters as being dirty?
> 

Yes -- implementation-wise, we create a bitmap with "n" "virtual bits",
mapping one "virtual bit" to one "sector". (The sector here is 512
bytes, i.e. BDRV_SECTOR_BITS = 9.)

If a granularity of, say, g=0 is chosen (which would have been a
user-specified granularity of 512 bytes), then the number of "virtual
bits" equals the number of stored/physical/actual bits. Otherwise, each
physical bit represents a dirty flag for 2^g *sectors*.

However, in the qcow2 world, this is useless, because qcow2 cares about
clusters -- so knowing about a single dirty sector is not necessarily
useful to us at this level. We want to know, generally, about dirty
*clusters*.

The addition of new dirty bitmaps via QMP tries to match the current
cluster size (which itself is a 64K default) and failing that, will use
64K (which we hope is a sane guess as a default.)

So in the case of a cluster size of 64K and the granularity expertly
chosen to be 7, each physical bit represents 2^7 = 128 "virtual" bits
(which are secretly sectors). The nice property here is that one
physical bit winds up mapping pretty directly to one cluster.

Clear as mud?
:)
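
To make the arithmetic concrete, here is a rough sketch of the g = 7
case. The function names are mine, not QEMU's; the rounding-up formula
is the one the spec patch gives for nb_virtual_bits:

```c
#include <stdint.h>

#define SECTOR_BITS 9 /* mirrors BDRV_SECTOR_BITS: 512-byte sectors */

/* Bytes of guest data covered by one stored ("physical") bit. */
static uint64_t bytes_per_physical_bit(unsigned g)
{
    return 1ULL << (SECTOR_BITS + g);
}

/* Stored bits needed to cover nb_virtual_bits sectors, rounding up. */
static uint64_t nb_physical_bits(uint64_t nb_virtual_bits, unsigned g)
{
    uint64_t granularity = 1ULL << g;

    return (nb_virtual_bits + granularity - 1) / granularity;
}
```

With g = 7, bytes_per_physical_bit() gives 65536, i.e. one 64K cluster
per stored bit. A 1 GiB disk has 2097152 sectors ("virtual" bits) and
needs 16384 stored bits.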

>> +
>> +Dirty Bitmap Table entry:
>> +
>> +    Bit  0 -  8:    Reserved
> 
> s/Reserved/Reserved, must be 0/
> 
>> +
>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
>> +                    cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated, and should be read as all zeros.
>> +
>> +        56 - 63:    Reserved
> 
> and again (specifying the user must write 0 for now leaves the door open
> for extension)
> 
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
> 
> s/bitmap,/bitmap/
> s/in the Dirty/in a Dirty/
> 
>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
>> +whose starting offset and length are given by the header extension fields
>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>> +the bitmap directory have variable length, depending on the length of the
>> +bitmap name.
>> +
>> +Dirty Bitmap Directory Entry:
>> +
>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>> +                    Offset into the image file at which the Dirty Bitmap Table
>> +                    for the bitmap starts. Must be aligned to a cluster
>> +                    boundary.
>> +
>> +         8 - 15:    nb_virtual_bits
>> +                    Number of "virtual" bits in the bitmap. Number of
>> +                    "physical" bits would be:
>> +                    (nb_virtual_bits + granularity - 1) / granularity
>> +
>> +        16 - 19:    dirty_bitmap_table_size
>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>> +                    Valid values: 0 - 0x8000000.
>> +                    Also, (dirty_bitmap_table_size * cluster_size) should not
>> +                    be greater than 0x20000000 (512 MB)
>> +
>> +        20 - 23:    granularity_bits
>> +                    Granularity bits. Valid values are: 0 - 63.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
> 
> 63 seems like a rather high limit.  Even 32 (1 bit covering 4 billion
> clusters) is huge.
> 

Sectors, but yes. g=32 would mean 1 bit covers 2TiB with
BDRV_SECTOR_BITS := 9.

bdrv_create_dirty_bitmap allows granularity to be specified in bytes as
a uint32_t. `g` or granularity /bits/ is ultimately calculated as:
g = log_2(sectors_per_bit = (granularity_bytes / (sector_size = 512)))

Since bytes is capped to 2^32-1 and we ensure that it is a power of 2,
the largest we can specify is 2^31, which would produce for us a `g` of 22.
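
As a sketch of that calculation (the helper name is made up, and this is
not the actual QEMU code path; it only assumes a power-of-two byte
granularity of at least one sector):

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9 /* mirrors BDRV_SECTOR_BITS */

/* g = log2(granularity_bytes / 512) for a power-of-two byte granularity. */
static unsigned granularity_bits_from_bytes(uint32_t granularity_bytes)
{
    uint32_t sectors_per_bit;
    unsigned g = 0;

    /* Preconditions: at least one sector, and a power of two. */
    assert(granularity_bytes >= (1u << SECTOR_BITS));
    assert((granularity_bytes & (granularity_bytes - 1)) == 0);

    sectors_per_bit = granularity_bytes >> SECTOR_BITS;
    while (sectors_per_bit > 1) {
        sectors_per_bit >>= 1;
        g++;
    }
    return g;
}
```

The largest uint32_t power of two, 0x80000000, yields 22, so a limit of
63 is far beyond anything this interface can produce.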

>> +
>> +                    Granularity of the bitmap is how many "virtual" bits
>> +                    accounts for one "physical" bit.
>> +
>> +        24 - 27:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap is in use and may be inconsistent.
>> +
>> +                      1: self
>> +                         The bitmap is a dirty bitmap for the containing image.
>> +
>> +                      2: auto
>> +                         The bitmap should be autoloaded as block dirty bitmap.
>> +                         Only available if bit 1 (self) is set.
>> +
>> +                      3: read_only
>> +                         The bitmap should not be rewritten.
>> +
>> +                    Bits 4 - 31 are reserved.
>> +
>> +        28 - 29:    name_size
>> +                    Size of the bitmap name. Valid values: 0 - 1023.
> 
> This limit is inconsistent with the limit above that a directory entry
> will be at most 1024 bytes; since you have already burned 30 bytes on
> essential information.  Is 0 allowed (a bitmap with "" as its name), or
> must the name_size be at least 1?  Should you document constraints that
> each bitmap name should be unique within the file (that is, no two
> bitmaps in the directory have the same name)?
> 

On documenting the namespace: yes please!

On zero-length bitmap names: I am not sure what use we have for them --
if more than one empty-name bitmap is stored, they will become ambiguous
and difficult to retrieve meaningfully.

I am of the opinion that each bitmap should have some unique identifier
attached to it. To this end, allowing an empty string for *one* bitmap
is not a problem, but may confuse future developers into thinking it's
okay to have "un-named" bitmaps. It isn't.

I'm for prohibiting the empty bitmap name.
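
A loader could enforce both rules with something like the sketch below.
It assumes the names were already copied out of the directory into
NUL-terminated strings (on disk they are not null terminated), and the
function name is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if every name is non-empty and no two names are equal. */
static bool bitmap_names_valid(const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (names[i][0] == '\0') {
            return false; /* empty names are prohibited */
        }
        for (size_t j = 0; j < i; j++) {
            if (strcmp(names[i], names[j]) == 0) {
                return false; /* duplicate name */
            }
        }
    }
    return true;
}
```

The quadratic scan is only for illustration; with up to 65535 bitmaps a
real implementation would probably want a hash table.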

>> +
>> +        variable:   The name of the bitmap (not null terminated).
>> +
>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
>> +                    the next multiple of 8.
>>
> 

I will revisit this patch after reviewing the rest of the series for
further comments, if any.

Thanks, and sorry it took me so long to get to reviewing this again.

--js


* Re: [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts Vladimir Sementsov-Ogievskiy
@ 2015-10-06 20:12   ` John Snow
  2015-10-06 20:16   ` John Snow
  1 sibling, 0 replies; 53+ messages in thread
From: John Snow @ 2015-10-06 20:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Add data structures and constraints accordingly to docs/specs/qcow2.txt
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/Makefile.objs        |  2 +-
>  block/qcow2-dirty-bitmap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.h              | 28 ++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+), 1 deletion(-)
>  create mode 100644 block/qcow2-dirty-bitmap.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index 58ef2ef..c6e1f4b 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -1,5 +1,5 @@
>  block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
> -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
> +block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-dirty-bitmap.o
>  block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>  block-obj-y += qed-check.o
>  block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
> new file mode 100644
> index 0000000..fd4e0ef
> --- /dev/null
> +++ b/block/qcow2-dirty-bitmap.c
> @@ -0,0 +1,42 @@
> +/*
> + * Dirty bitmaps for the QCOW version 2 format
> + *
> + * Copyright (c) 2014-2015 Vladimir Sementsov-Ogievskiy
> + *
> + * This file is derived from qcow2-snapshot.c, original copyright:
> + * Copyright (c) 2004-2006 Fabrice Bellard
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +/* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
> + * constants. Please do not use this _internal_ abbreviation for other needs
> + * and/or outside of this file. */
> +
> +/* Dirty Bitmap Directory Enrty constraints */

Entry

> +#define DBM_MAX_TABLE_SIZE 0x8000000
> +#define DBM_MAX_PHYS_SIZE 0x20000000 /* 512 mb */
> +#define DBM_MAX_GRANULARITY_BITS 63
> +#define DBM_MAX_NAME_SIZE 1023
> +

MAX_TABLE_SIZE matches the spec, OK.
MAX_PHYS_SIZE is documented as:

"Also, (dirty_bitmap_table_size * cluster_size) should not be greater
than 0x20000000 (512 MB)"

I might use stricter wording:

"(dirty_bitmap_table_size * cluster_size) must not exceed 0x20000000
(512 MB)"
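
Read as a hard limit, the check would look roughly like this. The
function is a hypothetical sketch reusing the two constants from the
patch, not code from the series:

```c
#include <stdbool.h>
#include <stdint.h>

#define DBM_MAX_TABLE_SIZE 0x8000000
#define DBM_MAX_PHYS_SIZE  0x20000000 /* 512 MB */

/* Both limits must hold; the product is computed in 64 bits so that it
 * cannot wrap around. */
static bool dbm_table_size_valid(uint32_t table_size, uint32_t cluster_size)
{
    if (table_size > DBM_MAX_TABLE_SIZE) {
        return false;
    }
    return (uint64_t)table_size * cluster_size <= DBM_MAX_PHYS_SIZE;
}
```

With 64K clusters this allows at most 8192 table entries.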

The granularity limit matches the spec, but 63 is high, as noted in the
review of patch 03. Eric has also pointed out that NAME_SIZE might not
be entirely accurate.

> +/* Dirty Bitmap Directory Enrty flags */

Entry

> +#define DBM_RESERVED_FLAGS 0xffffffff

What does this one correlate to? The table entry mask appears to be
defined below (0xff000000000001ff), but the Dirty Bitmap Directory flags
define only the lower four bits. What's *this* mask for?

I would expect to see 0xfffffff0 somewhere.
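
One way to keep such a mask from drifting is to derive it from the
defined flags. The macro names below are illustrative only; the bit
assignments and the "auto requires self" rule are taken from the spec
patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as listed in the spec patch; names are hypothetical. */
#define DBM_FLAG_IN_USE    (1u << 0)
#define DBM_FLAG_SELF      (1u << 1)
#define DBM_FLAG_AUTO      (1u << 2)
#define DBM_FLAG_READ_ONLY (1u << 3)

#define DBM_DEFINED_FLAGS \
    (DBM_FLAG_IN_USE | DBM_FLAG_SELF | DBM_FLAG_AUTO | DBM_FLAG_READ_ONLY)

/* Everything not defined above is reserved: 0xfffffff0. */
#define DBM_RESERVED_FLAGS_SKETCH (~DBM_DEFINED_FLAGS)

static bool dbm_flags_valid(uint32_t flags)
{
    if (flags & DBM_RESERVED_FLAGS_SKETCH) {
        return false; /* a reserved bit is set */
    }
    if ((flags & DBM_FLAG_AUTO) && !(flags & DBM_FLAG_SELF)) {
        return false; /* spec: auto is only available if self is set */
    }
    return true;
}
```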

> +
> +/* bits [0, 8] U [56, 63] are reserved */
> +#define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 72e1328..a2a5d4a 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -52,6 +52,10 @@
>   * space for snapshot names and IDs */
>  #define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
>  
> +/* Dirty Bitmap Header Extension constraints */
> +#define QCOW_MAX_DIRTY_BITMAPS 65536
> +#define QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE (1024 * QCOW_MAX_DIRTY_BITMAPS)
> +

Matches spec.

>  /* indicate that the refcount of the referenced cluster is exactly one. */
>  #define QCOW_OFLAG_COPIED     (1ULL << 63)
>  /* indicate that the cluster is compressed (they never have the copied flag) */
> @@ -141,6 +145,19 @@ typedef struct QEMU_PACKED QCowSnapshotHeader {
>      /* name follows  */
>  } QCowSnapshotHeader;
>  
> +typedef struct QEMU_PACKED QCowDirtyBitmapHeader {
> +    /* header is 8 byte aligned */
> +    uint64_t dirty_bitmap_table_offset;
> +    uint64_t nb_virtual_bits;
> +
> +    uint32_t dirty_bitmap_table_size;
> +    uint32_t granularity_bits;
> +
> +    uint32_t flags;
> +    uint16_t name_size;
> +    /* name follows  */

Is it against our style to put zero-length arrays in structures? It's a
habit I got into, but maybe it's not appreciated in QEMU.
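
For reference, the standard C99 spelling of that habit is a flexible
array member ("char name[];") rather than the GNU zero-length array
("char name[0];"). A sketch, with a made-up struct name and a GCC/Clang
attribute standing in for QEMU_PACKED; whether this fits QEMU style is
exactly the open question:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the directory entry layout from this patch. */
typedef struct __attribute__((packed)) DirtyBitmapHeaderSketch {
    uint64_t dirty_bitmap_table_offset;
    uint64_t nb_virtual_bits;
    uint32_t dirty_bitmap_table_size;
    uint32_t granularity_bits;
    uint32_t flags;
    uint16_t name_size;
    char name[]; /* flexible array member; not null terminated on disk */
} DirtyBitmapHeaderSketch;

/* On-disk entry size: fixed part plus name, rounded up to a multiple of 8. */
static size_t dbm_entry_size(uint16_t name_size)
{
    size_t raw = offsetof(DirtyBitmapHeaderSketch, name) + name_size;

    return (raw + 7) & ~(size_t)7;
}
```

The fixed part is 30 bytes, so a 1023-byte name gives a 1056-byte entry,
which is the mismatch with the 1024-byte-per-entry assumption that Eric
pointed out on patch 03.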

> +} QCowDirtyBitmapHeader;
> +

This is called the "header" here -- but it's the header of a Dirty
Bitmap Directory entry. I was confused for a second, no big deal.

Matches spec, though.

>  typedef struct QEMU_PACKED QCowSnapshotExtraData {
>      uint64_t vm_state_size_large;
>      uint64_t disk_size;
> @@ -159,6 +176,11 @@ typedef struct QCowSnapshot {
>      uint64_t vm_clock_nsec;
>  } QCowSnapshot;
>  
> +typedef struct QCowDirtyBitmap {
> +    uint64_t offset;
> +    char *name;
> +} QCowDirtyBitmap;
> +
>  struct Qcow2Cache;
>  typedef struct Qcow2Cache Qcow2Cache;
>  
> @@ -221,6 +243,12 @@ typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array,
>  typedef void Qcow2SetRefcountFunc(void *refcount_array,
>                                    uint64_t index, uint64_t value);
>  
> +typedef struct Qcow2DirtyBitmapHeaderExt {
> +    uint32_t nb_dirty_bitmaps;
> +    uint32_t dirty_bitmap_directory_size;
> +    uint64_t dirty_bitmap_directory_offset;
> +} QEMU_PACKED Qcow2DirtyBitmapHeaderExt;
> +

Matches spec.

>  typedef struct BDRVQcowState {
>      int cluster_bits;
>      int cluster_size;
> 

Needs a trivial rebase due to missing context in qcow2.h, but otherwise
it appears OK.


* Re: [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts Vladimir Sementsov-Ogievskiy
  2015-10-06 20:12   ` John Snow
@ 2015-10-06 20:16   ` John Snow
  2016-02-16 17:04     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 53+ messages in thread
From: John Snow @ 2015-10-06 20:16 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Add data structures and constraints accordingly to docs/specs/qcow2.txt
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/Makefile.objs        |  2 +-
>  block/qcow2-dirty-bitmap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.h              | 28 ++++++++++++++++++++++++++++
>  3 files changed, 71 insertions(+), 1 deletion(-)
>  create mode 100644 block/qcow2-dirty-bitmap.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index 58ef2ef..c6e1f4b 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -1,5 +1,5 @@
>  block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
> -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
> +block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-dirty-bitmap.o
>  block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>  block-obj-y += qed-check.o
>  block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
> new file mode 100644
> index 0000000..fd4e0ef
> --- /dev/null
> +++ b/block/qcow2-dirty-bitmap.c
> @@ -0,0 +1,42 @@
> +/*
> + * Dirty bitmaps for the QCOW version 2 format
> + *
> + * Copyright (c) 2014-2015 Vladimir Sementsov-Ogievskiy
> + *
> + * This file is derived from qcow2-snapshot.c, original copyright:
> + * Copyright (c) 2004-2006 Fabrice Bellard
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +/* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
> + * constants. Please do not use this _internal_ abbreviation for other needs
> + * and/or outside of this file. */
> +
> +/* Dirty Bitmap Directory Enrty constraints */
> +#define DBM_MAX_TABLE_SIZE 0x8000000
> +#define DBM_MAX_PHYS_SIZE 0x20000000 /* 512 mb */
> +#define DBM_MAX_GRANULARITY_BITS 63
> +#define DBM_MAX_NAME_SIZE 1023
> +
> +/* Dirty Bitmap Directory Enrty flags */
> +#define DBM_RESERVED_FLAGS 0xffffffff
> +
> +/* bits [0, 8] U [56, 63] are reserved */
> +#define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 72e1328..a2a5d4a 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -52,6 +52,10 @@
>   * space for snapshot names and IDs */
>  #define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
>  
> +/* Dirty Bitmap Header Extension constraints */
> +#define QCOW_MAX_DIRTY_BITMAPS 65536
> +#define QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE (1024 * QCOW_MAX_DIRTY_BITMAPS)
> +
>  /* indicate that the refcount of the referenced cluster is exactly one. */
>  #define QCOW_OFLAG_COPIED     (1ULL << 63)
>  /* indicate that the cluster is compressed (they never have the copied flag) */
> @@ -141,6 +145,19 @@ typedef struct QEMU_PACKED QCowSnapshotHeader {
>      /* name follows  */
>  } QCowSnapshotHeader;
>  
> +typedef struct QEMU_PACKED QCowDirtyBitmapHeader {
> +    /* header is 8 byte aligned */
> +    uint64_t dirty_bitmap_table_offset;
> +    uint64_t nb_virtual_bits;
> +
> +    uint32_t dirty_bitmap_table_size;
> +    uint32_t granularity_bits;
> +
> +    uint32_t flags;

Looks like DBM_RESERVED_FLAGS was meant to apply to this field, but I
think the mask is incorrect and should be 0xfffffff0.

> +    uint16_t name_size;
> +    /* name follows  */
> +} QCowDirtyBitmapHeader;
> +
>  typedef struct QEMU_PACKED QCowSnapshotExtraData {
>      uint64_t vm_state_size_large;
>      uint64_t disk_size;
> @@ -159,6 +176,11 @@ typedef struct QCowSnapshot {
>      uint64_t vm_clock_nsec;
>  } QCowSnapshot;
>  
> +typedef struct QCowDirtyBitmap {
> +    uint64_t offset;
> +    char *name;
> +} QCowDirtyBitmap;
> +
>  struct Qcow2Cache;
>  typedef struct Qcow2Cache Qcow2Cache;
>  
> @@ -221,6 +243,12 @@ typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array,
>  typedef void Qcow2SetRefcountFunc(void *refcount_array,
>                                    uint64_t index, uint64_t value);
>  
> +typedef struct Qcow2DirtyBitmapHeaderExt {
> +    uint32_t nb_dirty_bitmaps;
> +    uint32_t dirty_bitmap_directory_size;
> +    uint64_t dirty_bitmap_directory_offset;
> +} QEMU_PACKED Qcow2DirtyBitmapHeaderExt;
> +
>  typedef struct BDRVQcowState {
>      int cluster_bits;
>      int cluster_size;
> 


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-05 17:33   ` Vladimir Sementsov-Ogievskiy
@ 2015-10-06 20:22     ` John Snow
  2015-10-06 20:33       ` Eric Blake
  0 siblings, 1 reply; 53+ messages in thread
From: John Snow @ 2015-10-06 20:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 01:33 PM, Vladimir Sementsov-Ogievskiy wrote:
> On 05.09.2015 19:43, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   docs/specs/qcow2.txt | 127
>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>                       write to an image with unknown auto-clear
>> features if it
>>                       clears the respective bits from this field first.
>>   -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Dirty bitmaps bit. If this bit is set
>> then
>> +                                there is a _consistent_ Dirty bitmaps
>> extension
>> +                                in the image. If it is not set, but
>> there is a
>> +                                Dirty bitmaps extension, its data
>> should be
>> +                                considered as inconsistent.
>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>              96 -  99:  refcount_order
>>                       Describes the width of a reference count block
>> entry (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the
>> following:
>>                           0x00000000 - End of the header extension area
>>                           0xE2792ACA - Backing file format name
>>                           0x6803f857 - Feature name table
>> +                        0x23852875 - Dirty bitmaps
>>                           other      - Unknown header extension, can
>> be safely
>>                                        ignored
>>   @@ -166,6 +173,24 @@ the header extension data. Each entry look like
>> this:
>>                       terminated if it has full length)
>>     +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability
>> to store
>> +dirty bitmaps in a qcow2 image. The fields are:
>> +
>> +          0 -  3:  nb_dirty_bitmaps
>> +                   The number of dirty bitmaps contained in the
>> image. Valid
>> +                   values: 0 - 65535.
>> +
>> +          4 -  7:  dirty_bitmap_directory_size
>> +                   Size of the Dirty Bitmap Directory in bytes. Valid
>> values:
>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
>> +
>> +          8 - 15:  dirty_bitmap_directory_offset
>> +                   Offset into the image file at which the Dirty Bitmap
>> +                   Directory starts. Must be aligned to a cluster
>> boundary.
>> +
>> +
>>   == Host cluster management ==
>>     qcow2 manages the allocation of host clusters by maintaining a
>> reference count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>             variable:   Padding to round up the snapshot table entry
>> size to the
>>                       next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
>> +
>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>> +Directory Entry) and may use multiple clusters, however it must be
>> contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image
>> file can
>> +be obtained as follows:
>> +
>> +    byte_offset =
>> +        dirty_bitmap_table[offset / cluster_size] + (offset %
>> cluster_size)
>> +
>> +Taking into accout the granularity of the bitmap, an offset in bits
>> into the
>> +image file can be obtained like this:
>> +
>> +    bit_offset =
>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr /
>> granularity) % 8
>> +
>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is
>> covered by
>> +"physical" bit with number (bit_nr / granularity).
>> +
>> +Dirty Bitmap Table entry:
>> +
>> +    Bit  0 -  8:    Reserved
>> +
>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned
>> to a
>> +                    cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated, and should be read as all zeros.
>> +
>> +        56 - 63:    Reserved
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the
>> image file,
>> +whose starting offset and length are given by the header extension
>> fields
>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The
>> entries of
>> +the bitmap directory have variable length, depending on the length of
>> the
>> +bitmap name.
>> +
>> +Dirty Bitmap Directory Entry:
>> +
>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>> +                    Offset into the image file at which the Dirty
>> Bitmap Table
>> +                    for the bitmap starts. Must be aligned to a cluster
>> +                    boundary.
>> +
>> +         8 - 15:    nb_virtual_bits
>> +                    Number of "virtual" bits in the bitmap. Number of
>> +                    "physical" bits would be:
>> +                    (nb_virtual_bits + granularity - 1) / granularity
>> +
>> +        16 - 19:    dirty_bitmap_table_size
>> +                    Number of entries in the Dirty Bitmap Table of
>> the bitmap.
>> +                    Valid values: 0 - 0x8000000.
>> +                    Also, (dirty_bitmap_table_size * cluster_size)
>> should not
>> +                    be greater than 0x20000000 (512 MB)
>> +
>> +        20 - 23:    granularity_bits
>> +                    Granularity bits. Valid values are: 0 - 63.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
>> +
>> +                    Granularity of the bitmap is how many "virtual" bits
>> +                    accounts for one "physical" bit.
>> +
>> +        24 - 27:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap is in use and may be inconsistent.
>> +
>> +                      1: self
>> +                         The bitmap is a dirty bitmap for the
>> containing image.
>> +
>> +                      2: auto
>> +                         The bitmap should be autoloaded as block
>> dirty bitmap.
>> +                         Only available if bit 1 (self) is set.
>> +
>> +                      3: read_only
>> +                         The bitmap should not be rewritten.
>> +
>> +                    Bits 4 - 31 are reserved.
> 
> Is this appropriate as a field reserved for future extension? Or do we
> need an additional one? Do we need a scheme like with snapshots?
> (something like a field 'additional_area_size', and an additional offset
> of this size after the name)
> 

I think it would remain appropriate as long as we have a version header
for the bitmap extension as a whole.

e.g. "Bits 4 - 31 are reserved in qcow2.bitmap.v1 ..."

If a program that only knows about v1 opens a v2 file and finds it
conforms to spec (does not use the new reserved bits), then it can
continue along happily.

If a reserved bit is set, it's an error and the v1 program must not
alter the image in case it ruins the consistency of the file.

The usual suspects (Kevin, Markus, Stefan and Eric) may have better
suggestions for how to handle future compatibility by drawing upon their
experience with qcow2.


>> +
>> +        28 - 29:    name_size
>> +                    Size of the bitmap name. Valid values: 0 - 1023.
>> +
>> +        variable:   The name of the bitmap (not null terminated).
>> +
>> +        variable:   Padding to round up the Dirty Bitmap Directory
>> Entry size to
>> +                    the next multiple of 8.
> 
> 

-- 
--js


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-06 20:22     ` John Snow
@ 2015-10-06 20:33       ` Eric Blake
  0 siblings, 0 replies; 53+ messages in thread
From: Eric Blake @ 2015-10-06 20:33 UTC (permalink / raw)
  To: John Snow, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, pbonzini, stefanha, den


On 10/06/2015 02:22 PM, John Snow wrote:


>>> +Dirty Bitmap Directory Entry:
>>> +

>>> +
>>> +        24 - 27:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap is in use and may be inconsistent.
>>> +
>>> +                      1: self
>>> +                         The bitmap is a dirty bitmap for the
>>> containing image.
>>> +
>>> +                      2: auto
>>> +                         The bitmap should be autoloaded as block
>>> dirty bitmap.
>>> +                         Only available if bit 1 (self) is set.
>>> +
>>> +                      3: read_only
>>> +                         The bitmap should not be rewritten.
>>> +
>>> +                    Bits 4 - 31 are reserved.
>>
>> Is this appropriate as a field reserved for future extension? Or do we
>> need an additional one? Do we need a scheme like with snapshots?
>> (something like a field 'additional_area_size', and an additional offset
>> of this size after the name)
>>
> 
> I think it would remain appropriate as long as we have a version header
> for the bitmap extension as a whole.

Simply requiring that the bits must be 0 is good enough for now.

> 
> e.g. "Bits 4 - 31 are reserved in qcow2.bitmap.v1 ..."
> 
> If a program that only knows about v1 opens a v2 file and finds it
> conforms to spec (does not use the new reserved bits), then it can
> continue along happily.

I don't know that you even have to mention versions of the header, so
much as the blanket statement that any set bit not described by this
version of the spec is either a data corruption or evidence of a newer
version of the spec having edited the file in the meantime.  It works as
long as you require conforming clients to set reserved bits to 0.

> 
> If a reserved bit is set, it's an error and the v1 program must not
> alter the image in case it ruins the consistency of the file.
> 
> The usual suspects (Kevin, Markus, Stefan and Eric) may have better
> suggestions for how to handle future compatibility by drawing upon their
> experience with qcow2.

No, I think we're just fine. If any future spec version requires
additional space, then it can use one of the bits 4-31 as a flag to call
out that space, and older clients will handily refuse to operate on the
file without having to know that the bit meant that the header occupied
more space in the newer version of the spec.
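
To make the rule concrete, here is a rough sketch of the check a
conforming client could apply to the flags field. The bit positions are
taken from the spec draft quoted above; the DBM_FLAG_* names, the helper
name and the choice of error codes are my own assumptions and not part of
the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Flag bits 0-3 from the spec draft; the macro names are hypothetical. */
#define DBM_FLAG_IN_USE     (1u << 0)
#define DBM_FLAG_SELF       (1u << 1)
#define DBM_FLAG_AUTO       (1u << 2)
#define DBM_FLAG_READ_ONLY  (1u << 3)
#define DBM_KNOWN_FLAGS     (DBM_FLAG_IN_USE | DBM_FLAG_SELF | \
                             DBM_FLAG_AUTO | DBM_FLAG_READ_ONLY)

/*
 * Returns 0 if this client may operate on the entry.
 * Any set bit outside DBM_KNOWN_FLAGS means corruption or a newer spec,
 * so the client refuses rather than guessing (-ENOTSUP).
 * "auto" without "self" contradicts the draft (-EINVAL).
 */
static int dbm_check_flags(uint32_t flags)
{
    if (flags & ~DBM_KNOWN_FLAGS) {
        return -ENOTSUP;
    }
    if ((flags & DBM_FLAG_AUTO) && !(flags & DBM_FLAG_SELF)) {
        return -EINVAL;
    }
    return 0;
}
```

A writer following the same rule would always store zero in bits 4-31.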

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




* Re: [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory Vladimir Sementsov-Ogievskiy
@ 2015-10-06 21:27   ` John Snow
  2016-02-16 18:51     ` Vladimir Sementsov-Ogievskiy
  2016-02-17 15:03     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 53+ messages in thread
From: John Snow @ 2015-10-06 21:27 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> Adds qcow2_read_dirty_bitmaps, reading Dirty Bitmap Directory as
> specified in docs/specs/qcow2.txt
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.h              |  10 +++
>  2 files changed, 165 insertions(+)
> 
> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
> index fd4e0ef..1260d1d 100644
> --- a/block/qcow2-dirty-bitmap.c
> +++ b/block/qcow2-dirty-bitmap.c
> @@ -25,6 +25,9 @@
>   * THE SOFTWARE.
>   */
>  
> +#include "block/block_int.h"
> +#include "block/qcow2.h"
> +
>  /* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
>   * constants. Please do not use this _internal_ abbreviation for other needs
>   * and/or outside of this file. */
> @@ -40,3 +43,155 @@
>  
>  /* bits [0, 8] U [56, 63] are reserved */
>  #define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
> +
> +void qcow2_free_dirty_bitmaps(BlockDriverState *bs)
> +{
> +    BDRVQcowState *s = bs->opaque;

This is BDRVQcow2State now, here and everywhere else in this patch.

> +    int i;
> +
> +    for (i = 0; i < s->nb_dirty_bitmaps; i++) {
> +        g_free(s->dirty_bitmaps[i].name);
> +    }
> +    g_free(s->dirty_bitmaps);
> +    s->dirty_bitmaps = NULL;
> +    s->nb_dirty_bitmaps = 0;
> +
> +    g_free(s->dirty_bitmap_directory);
> +    s->dirty_bitmap_directory = NULL;
> +}
> +
> +static void bitmap_header_to_cpu(QCowDirtyBitmapHeader *h)
> +{
> +    be64_to_cpus(&h->dirty_bitmap_table_offset);
> +    be64_to_cpus(&h->nb_virtual_bits);
> +    be32_to_cpus(&h->dirty_bitmap_table_size);
> +    be32_to_cpus(&h->granularity_bits);
> +    be32_to_cpus(&h->flags);
> +    be16_to_cpus(&h->name_size);

I realize you probably got these functions by example from the other
qcow2 files, but what exactly is cpu*s* here? What does the *s* stand for?

I guess it refers to the in-place swapping variants that the Linux
kernel defines?

hmm, just a curiosity on my part ...

the function looks correct, anyway. :)
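
To illustrate the distinction I'm guessing at, here is a stand-alone
sketch of a value-returning conversion next to an in-place one. The
sketch_* names are made up so they don't clash with the real helpers, and
this is only how I would expect such helpers to behave, not a copy of the
actual implementation.

```c
#include <stdint.h>
#include <string.h>

/* Value-returning variant: interpret the stored bytes as big endian and
 * return the value in host byte order. */
static uint32_t sketch_be32_to_cpu(uint32_t be)
{
    uint8_t b[4];

    memcpy(b, &be, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* In-place variant: same conversion, but the result is written back
 * through the pointer, which is what bitmap_header_to_cpu() relies on. */
static void sketch_be32_to_cpus(uint32_t *p)
{
    *p = sketch_be32_to_cpu(*p);
}
```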

> +}
> +
> +static int calc_dir_entry_size(size_t name_size)
> +{
> +    return align_offset(sizeof(QCowDirtyBitmapHeader) + name_size, 8);

Matches spec.
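
For reference, this is the arithmetic I checked it against. It assumes
the fixed part of a directory entry is the 30 bytes listed in the spec
draft (bytes 0 - 29) and that QCowDirtyBitmapHeader is packed to exactly
that size; the names below are made up for the example.

```c
#include <stddef.h>

/* Fixed part of a Dirty Bitmap Directory Entry: bytes 0 - 29 in the
 * spec draft. Assumed to equal sizeof(QCowDirtyBitmapHeader). */
#define SKETCH_DBM_HEADER_SIZE 30

/* Entry size: header plus name, rounded up to the next multiple of 8. */
static size_t sketch_dir_entry_size(size_t name_size)
{
    size_t raw = SKETCH_DBM_HEADER_SIZE + name_size;

    return (raw + 7) & ~(size_t)7;
}
```

So an empty name and a 2-byte name both give a 32-byte entry, and a
3-byte name pushes the entry to 40 bytes.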

> +}
> +
> +static int dir_entry_size(QCowDirtyBitmapHeader *h)
> +{
> +    return calc_dir_entry_size(h->name_size);

OK.

> +}
> +
> +static int check_constraints(int cluster_size,
> +                             QCowDirtyBitmapHeader *h)
> +{
> +    uint64_t phys_bitmap_bytes =
> +        (uint64_t)h->dirty_bitmap_table_size * cluster_size;
> +    uint64_t max_virtual_bits = (phys_bitmap_bytes * 8) << h->granularity_bits;
> +
> +    int fail =
> +            (h->dirty_bitmap_table_offset % cluster_size) ||
> +            (h->dirty_bitmap_table_size > DBM_MAX_TABLE_SIZE) ||
> +            (phys_bitmap_bytes > DBM_MAX_PHYS_SIZE) ||
> +            (h->nb_virtual_bits > max_virtual_bits) ||
> +            (h->granularity_bits > DBM_MAX_GRANULARITY_BITS) ||
> +            (h->flags & DBM_RESERVED_FLAGS) ||
> +            (h->name_size > DBM_MAX_NAME_SIZE);
> +

Function is a little dense, but appears to be correct -- apart from the
DBM_RESERVED_FLAGS issue I mentioned earlier.

> +    return fail ? -EINVAL : 0;
> +}
> +
> +static int directory_read(BlockDriverState *bs)
> +{
> +    int ret;
> +    BDRVQcowState *s = bs->opaque;
> +    uint8_t *entry, *end;
> +
> +    if (s->dirty_bitmap_directory != NULL) {
> +        /* already read */
> +        return -EEXIST;
> +    }
> +
> +    s->dirty_bitmap_directory = g_try_malloc0(s->dirty_bitmap_directory_size);
> +    if (s->dirty_bitmap_directory == NULL) {
> +        return -ENOMEM;
> +    }
> +

I assume we're trying here in case the directory size is garbage, as a
method of preventing garbage from crashing our program. Since
dirty_bitmap_directory_size was in theory already read in (by a function
checked in later in this series), did we not validate that input value?

> +    ret = bdrv_pread(bs->file,
> +                     s->dirty_bitmap_directory_offset,
> +                     s->dirty_bitmap_directory,
> +                     s->dirty_bitmap_directory_size);
> +    if (ret < 0) {
> +        goto fail;
> +    }
> +

Alright, so we read the entire directory into memory... which can be as
large as 64K * 1024, or 64MiB. A non-trivial size.

> +    entry = s->dirty_bitmap_directory;
> +    end = s->dirty_bitmap_directory + s->dirty_bitmap_directory_size;
> +    while (entry < end) {
> +        QCowDirtyBitmapHeader *h = (QCowDirtyBitmapHeader *)entry;
> +        bitmap_header_to_cpu(h);
> +

OK, so we're interpreting the values in-place in memory, but leaving
them in the table.

> +        ret = check_constraints(s->cluster_size, h);
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +
> +        entry += dir_entry_size(h);
> +    }
> +
> +    return 0;
> +
> +fail:
> +    g_free(s->dirty_bitmap_directory);
> +    s->dirty_bitmap_directory = NULL;
> +
> +    return ret;
> +}
> +
> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs)
> +{
> +    int ret;
> +    BDRVQcowState *s = bs->opaque;
> +    size_t offset;
> +    QCowDirtyBitmap *bm, *end;
> +
> +    if (s->dirty_bitmap_directory != NULL || s->dirty_bitmaps != NULL) {
> +        /* already read */
> +        return -EEXIST;
> +    }
> +
> +    if (s->nb_dirty_bitmaps == 0) {
> +        /* No bitmaps - nothing to do */
> +        return 0;
> +    }
> +

OK, so this assumes that the extension header has been read, but that
code comes later in this series.

> +    ret = directory_read(bs);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +

At the end of this call we have interpreted the header into a CPU native
format, but not performed any processing on it whatsoever.

> +    s->dirty_bitmaps = g_try_new0(QCowDirtyBitmap, s->nb_dirty_bitmaps);
> +    if (s->dirty_bitmaps == NULL) {
> +        ret = -ENOMEM;
> +        goto out;
> +    }
> +

I think we could actually allocate this block of memory sooner (we
already have read and validated nb_dirty_bitmaps) and then during the
initial read, after validation, we can just fill the QCowDirtyBitmap
structures as we go.

If we keep "int n" as we parse bitmaps in the header, we can just unwind
on failure with:

for (i = n; i >= 0; i--) {
   bm = &s->dirty_bitmaps[i];
   g_free(bm->name);
}
g_free(s->dirty_bitmaps);

Then we don't have to re-crawl through the structure looking for names,
getting sizes again, etc. It should be a little faster.
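
Roughly what I have in mind, as a stand-alone sketch rather than a patch.
It uses plain malloc/free instead of the glib helpers, reads only
name_size (big endian, bytes 28 - 29 per the spec draft) from each entry,
and all sketch_*/SK_* names are invented for the example. Here n counts
the entries already filled, so the unwind loop frees indexes below n.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SK_HEADER_SIZE   30   /* fixed part of a directory entry */
#define SK_NAME_SIZE_OFF 28   /* offset of the 16-bit name_size field */

typedef struct SketchBitmap {
    size_t offset;   /* offset of the entry inside the directory */
    char *name;      /* NUL-terminated copy of the bitmap name */
} SketchBitmap;

/* Parse nb entries from dir[0..dir_size) in a single pass. On failure,
 * everything allocated so far is released and -errno is returned. */
static int sketch_parse_directory(const uint8_t *dir, size_t dir_size,
                                  unsigned nb, SketchBitmap **out)
{
    SketchBitmap *bms = calloc(nb ? nb : 1, sizeof(*bms));
    size_t offset = 0;
    unsigned n = 0;   /* number of entries filled so far */
    int ret;

    if (bms == NULL) {
        return -ENOMEM;
    }

    for (n = 0; n < nb; n++) {
        size_t name_size, entry_size;

        if (dir_size - offset < SK_HEADER_SIZE) {
            ret = -EINVAL;
            goto fail;
        }
        name_size = ((size_t)dir[offset + SK_NAME_SIZE_OFF] << 8) |
                    dir[offset + SK_NAME_SIZE_OFF + 1];
        entry_size = (SK_HEADER_SIZE + name_size + 7) & ~(size_t)7;
        if (entry_size > dir_size - offset) {
            ret = -EINVAL;
            goto fail;
        }

        bms[n].name = malloc(name_size + 1);
        if (bms[n].name == NULL) {
            ret = -ENOMEM;
            goto fail;
        }
        memcpy(bms[n].name, dir + offset + SK_HEADER_SIZE, name_size);
        bms[n].name[name_size] = '\0';
        bms[n].offset = offset;
        offset += entry_size;
    }

    *out = bms;
    return 0;

fail:
    while (n > 0) {
        free(bms[--n].name);
    }
    free(bms);
    return ret;
}
```

The real code would also run check_constraints() on each entry inside
the loop, before copying the name.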

> +    offset = 0;
> +    end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
> +    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
> +        QCowDirtyBitmapHeader *h =
> +                (QCowDirtyBitmapHeader *)(s->dirty_bitmap_directory + offset);
> +
> +        bm->offset = offset;
> +        bm->name = g_malloc(h->name_size + 1);
> +        memcpy(bm->name, h + 1, h->name_size);
> +        bm->name[h->name_size] = '\0';

You can replace the last three lines if you want with just:

bm->name = g_strndup((const char *)(h + 1), h->name_size);

> +
> +        offset += dir_entry_size(h);
> +    }
> +    ret = 0;
> +
> +out:
> +    if (ret < 0) {
> +        qcow2_free_dirty_bitmaps(bs);
> +    }
> +    return ret;
> +}
> diff --git a/block/qcow2.h b/block/qcow2.h
> index a2a5d4a..5016fa1 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -288,6 +288,12 @@ typedef struct BDRVQcowState {
>      unsigned int nb_snapshots;
>      QCowSnapshot *snapshots;
>  
> +    uint64_t dirty_bitmap_directory_offset;
> +    size_t dirty_bitmap_directory_size;

I guess these two are from the extension header.

> +    uint8_t *dirty_bitmap_directory;
> +    unsigned int nb_dirty_bitmaps;

This one is also from the extension header. Pointing out only for review
purposes that these values are set "elsewhere" in future patches.

> +    QCowDirtyBitmap *dirty_bitmaps;
> +
>      int flags;
>      int qcow_version;
>      bool use_lazy_refcounts;
> @@ -598,6 +604,10 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
>  void qcow2_free_snapshots(BlockDriverState *bs);
>  int qcow2_read_snapshots(BlockDriverState *bs);
>  
> +/* qcow2-dirty-bitmap.c functions */
> +void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
> +
>  /* qcow2-cache.c functions */
>  Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
>  int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
> 

Patch order is a little strange in that we expect to have parsed the
header already, but nothing criminal if this was just the easiest way to
do it. I'll defer to your judgment.


* Re: [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load()
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load() Vladimir Sementsov-Ogievskiy
@ 2015-10-06 23:01   ` John Snow
  2015-10-07 17:05     ` Eric Blake
  2016-02-16 19:04     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 2 replies; 53+ messages in thread
From: John Snow @ 2015-10-06 23:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den



On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
> This function loads block dirty bitmap from qcow2.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.c              |   2 +
>  block/qcow2.h              |   5 ++
>  include/block/block_int.h  |   5 ++
>  4 files changed, 167 insertions(+)
> 
> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
> index 1260d1d..ea50137 100644
> --- a/block/qcow2-dirty-bitmap.c
> +++ b/block/qcow2-dirty-bitmap.c
> @@ -99,6 +99,13 @@ static int check_constraints(int cluster_size,
>      return fail ? -EINVAL : 0;
>  }
>  
> +static QCowDirtyBitmapHeader *bitmap_header(BDRVQcowState *s,
> +                                            QCowDirtyBitmap *bitmap)
> +{

BDRVQcow2State here and everywhere below, again.

> +    return (QCowDirtyBitmapHeader *)
> +           (s->dirty_bitmap_directory + bitmap->offset);
> +}
> +
>  static int directory_read(BlockDriverState *bs)
>  {
>      int ret;
> @@ -195,3 +202,151 @@ out:
>      }
>      return ret;
>  }
> +
> +static QCowDirtyBitmap *find_dirty_bitmap_by_name(BlockDriverState *bs,
> +                                                  const char *name)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    QCowDirtyBitmap *bm, *end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
> +
> +    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
> +        if (strcmp(bm->name, name) == 0) {
> +            return bm;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +

Whoops. This says to me we really need to prohibit bitmaps with the same
name from being stored in the same file, and mention this in the spec,
and test for it on load.

Perhaps we can create a hash-table and fail verification on open if
there's a collision. We can then use that hash-table here for
find_dirty_bitmap_by_name to speed up lookup since we already went
through the trouble of loading it.

That might help in large cases where we approach 64K bitmaps, and it
should not be much of a performance hit for casual use.
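
A minimal sketch of the duplicate check, assuming the names have already
been copied out as NUL-terminated strings. In QEMU I would expect a
GHashTable to do this job; the version below is plain C with a throwaway
open-addressing set so that it stands on its own, and every name in it is
invented for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a, used here only as a simple string hash. */
static uint32_t sketch_hash(const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;
    }
    return h;
}

/* Returns 0 if all names[0..n) are distinct, -EEXIST on the first
 * duplicate, -ENOMEM if the temporary table cannot be allocated. */
static int sketch_check_unique_names(const char *const *names, size_t n)
{
    size_t cap = 1, i;
    const char **slots;
    bool dup = false;

    while (cap < 2 * n + 1) {   /* power of two, at most half full */
        cap <<= 1;
    }
    slots = calloc(cap, sizeof(*slots));
    if (slots == NULL) {
        return -ENOMEM;
    }

    for (i = 0; i < n && !dup; i++) {
        size_t pos = sketch_hash(names[i]) & (cap - 1);

        /* Linear probing: walk until a free slot or an equal name. */
        while (slots[pos] != NULL) {
            if (strcmp(slots[pos], names[i]) == 0) {
                dup = true;
                break;
            }
            pos = (pos + 1) & (cap - 1);
        }
        if (!dup) {
            slots[pos] = names[i];
        }
    }

    free(slots);
    return dup ? -EEXIST : 0;
}
```

If the table were kept around instead of freed, find_dirty_bitmap_by_name
could reuse it for lookups.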

> +/* dirty sectors in cluster is a number of sectors in the image, corresponding
> + * to one cluster of bitmap data */
> +static uint64_t dirty_sectors_in_cluster(const BDRVQcowState *s,
> +                                         const BdrvDirtyBitmap *bitmap)
> +{
> +    uint32_t sector_granularity =
> +            bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
> +
> +    return (uint64_t)sector_granularity * (s->cluster_size << 3);
> +}
> +
> +/* load_bitmap()
> + * load dirty bitmap from Dirty Bitmap Table
> + * Dirty Bitmap Table entries are assumed to be in big endian format */
> +static int load_bitmap(BlockDriverState *bs,
> +                       const uint64_t *dirty_bitmap_table,
> +                       uint32_t dirty_bitmap_table_size,
> +                       BdrvDirtyBitmap *bitmap)
> +{
> +    int ret = 0;
> +    BDRVQcowState *s = bs->opaque;
> +    uint64_t sector, dsc;
> +    uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);

I found some of this hard to unwind, bear with me:

bm_size is, AKA, the number of sectors that the bitmap tracks ...

> +    int cl_size = s->cluster_size;
> +    uint8_t *buf = NULL;
> +    uint32_t i, tab_size =
> +            size_to_clusters(s, bdrv_dirty_bitmap_data_size(bitmap, bm_size));
> +

bdrv_dirty_bitmap_data_size(bitmap, COUNT) calculates for us how many
bytes the lowest level of the hbitmap actually takes.

Then size_to_clusters tells us how many clusters we need to store that,
and therefore should map back to be the same as the predicted value,
dirty_bitmap_table_size.

> +    if (tab_size > dirty_bitmap_table_size) {
> +        return -EINVAL;
> +    }
> +

I assume this is not == because the real table size might have padding
or other such things, but if the calculated tab size is bigger than the
actual then we have a problem.

But I think that you've passed in "dirty_bitmap_table_size" as the total
byte count of the table, but "tab_size" is computed here as the number
of entries. I think you should multiply tab_size by sizeof(uint64_t) and
test if they're equal.
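
For comparison, this is how I would derive the expected number of table
entries straight from the spec draft: virtual bits to physical bits,
physical bits to bytes, bytes to clusters. It deliberately ignores
whatever rounding the hbitmap does internally, so treat it as a sanity
check on units and not as a description of the existing helpers. All
names are invented for the example.

```c
#include <stdint.h>

/* Round-up division for unsigned 64-bit values; b must be non-zero. */
static uint64_t sketch_div_round_up(uint64_t a, uint64_t b)
{
    return a / b + (a % b != 0);
}

/* Number of Dirty Bitmap Table entries (one per cluster of bitmap data)
 * needed for nb_virtual_bits. granularity_bits must be below 64. */
static uint64_t sketch_table_entries(uint64_t nb_virtual_bits,
                                     uint32_t granularity_bits,
                                     uint32_t cluster_size)
{
    uint64_t granularity = UINT64_C(1) << granularity_bits;
    uint64_t phys_bits = sketch_div_round_up(nb_virtual_bits, granularity);
    uint64_t phys_bytes = sketch_div_round_up(phys_bits, 8);

    return sketch_div_round_up(phys_bytes, cluster_size);
}
```

For example, a 1 TiB disk tracked in 512-byte sectors has 2^31 virtual
bits; at granularity_bits = 7 that is 2^24 physical bits, i.e. 2 MiB of
bitmap data, which needs 32 entries with 64 KiB clusters. The result is a
count of entries, so it has to be compared against a count and not
against a byte size.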

> +    bdrv_clear_dirty_bitmap(bitmap);
> +

Clear takes the aio_context for the associated BDS and then releases it...

> +    buf = g_malloc0(cl_size);
> +    dsc = dirty_sectors_in_cluster(s, bitmap);
> +    for (i = 0, sector = 0; i < tab_size; ++i, sector += dsc) {
> +        uint64_t end = MIN(bm_size, sector + dsc);
> +        uint64_t offset = be64_to_cpu(dirty_bitmap_table[i]);
> +
> +        if (offset & DBM_TABLE_ENTRY_RESERVED_MASK) {
> +            ret = -EINVAL;
> +            goto finish;
> +        }
> +
> +        /* zero offset means cluster unallocated */
> +        if (offset) {
> +            ret = bdrv_pread(bs->file, offset, buf, cl_size);
> +            if (ret < 0) {
> +                goto finish;
> +            }
> +            bdrv_dirty_bitmap_deserialize_part(bitmap, buf, sector, end);

...but at this point, I believe we're editing this bitmap without its
associated lock, which might be a problem when we go to add QMP commands
later.

> +        }
> +    }
> +    ret = 0;
> +
> +    bdrv_dirty_bitmap_deserialize_finish(bitmap);
> +
> +finish:
> +    g_free(buf);
> +
> +    return ret;
> +}
> +
> +BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
> +                                          BlockDriverState *bs_file,
> +                                          const char *name,
> +                                          Error **errp)
> +{
> +    BDRVQcowState *s = bs_file->opaque;
> +    int ret;
> +    QCowDirtyBitmap *bm;
> +    QCowDirtyBitmapHeader *bmh;
> +    uint64_t *dirty_bitmap_table = NULL;
> +    uint32_t granularity;
> +    uint64_t size = bdrv_nb_sectors(bs_for);
> +    BdrvDirtyBitmap *bitmap = NULL;
> +
> +    bm = find_dirty_bitmap_by_name(bs_file, name);
> +    if (bm == NULL) {
> +        error_setg(errp, "Could not find bitmap '%s' in the node '%s'", name,
> +                   bdrv_get_device_or_node_name(bs_file));
> +        return NULL;
> +    }
> +    bmh = bitmap_header(s, bm);
> +
> +    if (size != bmh->nb_virtual_bits) {
> +        error_setg(errp,
> +                   "Bitmap '%s' in the node '%s' has size = %" PRIu64
> +                   "when requested size (for node %s) = %" PRIu64,
> +                   name, bdrv_get_device_or_node_name(bs_file),
> +                   bmh->nb_virtual_bits,
> +                   bdrv_get_device_or_node_name(bs_for), size);
> +        return NULL;
> +    }
> +
> +
> +    dirty_bitmap_table = g_try_malloc(bmh->dirty_bitmap_table_size * sizeof(uint64_t));
> +    if (dirty_bitmap_table == NULL) {
> +        error_setg_errno(errp, -ENOMEM, "Could not allocate Dirty Bitmap Table");
> +        return NULL;
> +    }
> +
> +    ret = bdrv_pread(bs_file->file, bmh->dirty_bitmap_table_offset, dirty_bitmap_table,
> +                     bmh->dirty_bitmap_table_size * sizeof(uint64_t));
> +    if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Could not read dirty_bitmap_table table from image");
> +        goto finish;
> +    }
> +
> +    granularity = BDRV_SECTOR_SIZE << bmh->granularity_bits;
> +    bitmap = bdrv_create_dirty_bitmap(bs_for, granularity, name, errp);
> +    if (bitmap == NULL) {
> +        error_setg_errno(errp, -ENOMEM, "Could not create dirty bitmap");

Why -ENOMEM? Creation can fail for a number of reasons ... since we've
been given an errp parameter in this function, and we can trust
bdrv_create_dirty_bitmap to have set it, we can just return NULL here
and the caller can check errp to see what went wrong.

> +        goto finish;
> +    }
> +

Do we need to mark this bitmap as temporarily unusable until we complete
the load? I guess not in the context of bdrv_open at boot time ...

> +    ret = load_bitmap(bs_file, dirty_bitmap_table, bmh->dirty_bitmap_table_size, bitmap);
> +    if (ret < 0) {
> +        error_setg_errno(errp, -ret, "Could not read bitmap from image");
> +        goto finish;
> +    }
> +
> +finish:
> +    if (*errp != NULL) {
> +        bdrv_release_dirty_bitmap(bs_for, bitmap);
> +        bitmap = NULL;
> +    }
> +    g_free(dirty_bitmap_table);


I think we're not supposed to be reaching into errp to check its
implementation detail like this ... the usual paradigm I see is just
"goto fail" or similar statements instead of checking for
error-or-success in a shared return block.

finish:
  g_free(dirty_bitmap_table);
  return bitmap;
fail:
  g_free(dirty_bitmap_table);
  bdrv_release_dirty_bitmap(bs_for, bitmap);
  return NULL;

> +
> +    return bitmap;
> +}
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 76c331b..58ebdd3 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2965,6 +2965,8 @@ BlockDriver bdrv_qcow2 = {
>      .bdrv_get_info          = qcow2_get_info,
>      .bdrv_get_specific_info = qcow2_get_specific_info,
>  
> +    .bdrv_dirty_bitmap_load = qcow2_dirty_bitmap_load,
> +
>      .bdrv_save_vmstate    = qcow2_save_vmstate,
>      .bdrv_load_vmstate    = qcow2_load_vmstate,
>  
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 5016fa1..51d1907 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -608,6 +608,11 @@ int qcow2_read_snapshots(BlockDriverState *bs);
>  void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
>  int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
>  
> +BdrvDirtyBitmap *qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
> +                                         BlockDriverState *bs_file,
> +                                         const char *name,
> +                                         Error **errp);
> +
>  /* qcow2-cache.c functions */
>  Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
>  int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 14ad4c3..f982adc 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -204,6 +204,11 @@ struct BlockDriver {
>      int (*bdrv_get_info)(BlockDriverState *bs, BlockDriverInfo *bdi);
>      ImageInfoSpecific *(*bdrv_get_specific_info)(BlockDriverState *bs);
>  
> +    BdrvDirtyBitmap *(*bdrv_dirty_bitmap_load)(BlockDriverState *bs_for,
> +                                               BlockDriverState *bs_file,
> +                                               const char *name,
> +                                               Error **errp);
> +
>      int (*bdrv_save_vmstate)(BlockDriverState *bs, QEMUIOVector *qiov,
>                               int64_t pos);
>      int (*bdrv_load_vmstate)(BlockDriverState *bs, uint8_t *buf,
> 

Looking good, thanks!
--js


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-09-05 16:43 ` [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
  2015-09-05 17:33   ` Vladimir Sementsov-Ogievskiy
  2015-09-15 16:24   ` Eric Blake
@ 2015-10-07 16:47   ` Max Reitz
  2015-10-07 19:05     ` Denis V. Lunev
  2 siblings, 1 reply; 53+ messages in thread
From: Max Reitz @ 2015-10-07 16:47 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-devel, Qemu-block
  Cc: kwolf, pbonzini, jsnow, stefanha, den


On 05.09.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
> other drives (there may be qcow2 file with zero disk size but with
> several dirty bitmaps for other drives).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 126 insertions(+), 1 deletion(-)

Overall: I'm strongly against putting dirty bitmaps into qcow2 files, at
least as envisioned by this series.


If you don't feel like reading why, and you'd rather read what I'd do if
you really, really want to put them into qcow2 files, skip ahead until
the "RANT OVER" line.


The first indication of why that is the case is that this patch does not
add any explanation to the qcow2 specification of what these dirty
bitmaps are. Therefore, they are basically just binary data that is given
a name and dumped into a qcow2 file as if it were a tar file.

One could argue that this is qemu and we know what dirty bitmaps are.
But qcow2.txt is located in docs/specs/, not just in docs/. It is not an
explanation, but a *specification*, and as such it should explain
everything related to qcow2.

As a side note, we already have a binary data dump in qcow2 files, and
that is the VM state. This is bad enough and if it had been up to
me, it would have never been there. That's because it's something only
qemu can make use of, and not even different versions of qemu are
compatible there, so it was (in my opinion) a pretty bad idea to put it
into qcow2.

So what this specification is definitely lacking is an explanation of
how any independent program (i.e. *not qemu*) is to interpret the dirty
bitmaps. I do believe this is possible, as opposed to the VM state. The
VM state, nobody can do anything with it, it's even difficult for qemu
itself sometimes.

So let's imagine this specification contained an explanation of what
dirty bitmaps are and what they mean. Actually, now that I think about
it, I cannot really imagine it, because I'm lacking that explanation.
What do they mean? As far as I can see from the series, they actually
don't mean anything. It's just a dump of data into a qcow2 file, and it
can be any bitmap, be it associated with the file itself or not.

This is further pointed to by your feature proposal "Allow qcow2 images
without l1_table and other staff but only with dirty bitmaps with
minimum overhead". There is a file format for exactly that, and it's
called tar (yes, you are missing some metadata, but just add a JSON
description file to the archive and you're done).

By the way: I heard John briefly touch this in his talk at KVM Forum
when he explained that this would make qcow2 files something like better
tar files, and I didn't like the idea back then either. I was hoping
that it would actually be different, and was waiting for some
discussion to appear, but I didn't notice this series, because it
doesn't have "qcow2" in the cover letter's subject (and I wasn't CC'd,
but I don't really see why I should have been, as I'm not mentioned in
the MAINTAINERS file (what a lucky man I am!)). I only just noticed
today when I saw a lone reply from John on qemu-block to a patch with a
"qcow2:" prefix.

So, what you are apparently planning to do is to dump dirty bitmaps into
any available qcow2 file. If the image you are operating on is a qcow2
file, great! If it isn't, you create some empty qcow2 file and dump the
bitmaps there.

Then, I'm asking myself why you don't use tar files in the second case,
and then, why you don't use tar files in the first case. I do remember
John saying that there was a discussion about it, but I don't know about
it, so I don't know why you dropped that idea in favor of making qcow2
files tar archives. The only reason I can think of off the top of my
head is that we have infrastructure for reading qcow2 files, but not for
tar files. However, this series is like just appending a tar file to a
qcow2 file, and then implementing a reader for tar archives inside of
the qcow2 driver, so it doesn't seem to be much simpler in practice.

In any case, if my assumptions so far are more or less correct, no
outside program can do anything with the dirty bitmaps contained in the
qcow2 file, because they are just binary data which does not necessarily
have any connection to the qcow2 file itself. Not even qemu can make
sense of them, it appears, it needs the user or the management tool to
do so.

I am strongly against putting binary data into a qcow2 file which does
not have any visible connection to the file's contents.

Obviously, it is possible that there is some connection which I am just
not seeing, though.



--- RANT OVER ---

Okay, that was enough destructive criticism, now to get some
constructive arguments and ideas.

So, there are two points I don't like: First, it's binary data which
isn't explained in the qcow2 specification. This can easily be fixed.

Second, there is no obvious connection between the qcow2 file and a
dirty bitmap. I'd drop the idea of "If you use anything else than qcow2,
we create an empty qcow2 file and put the dirty bitmaps there". Please
don't do that. If you are using something else and want this feature,
that's your problem. If you need features, you use qcow2. That's it. If
you really want to support it for other file formats, put the data into
tar archives and not into qcow2 files.

For comparison, this is like using a qcow2 file for implementing backing
files for raw images. The cluster offsets in L2 tables would then point
to offsets in the raw image (and the host offset would have to match the
guest offset), and by looking at which L2 table entries are unused, one
could deduce which sectors are to be read from the backing file. We
don't support that either, because you should just use qcow2 if you want
backing files.

Next we need to know for every dirty bitmap what the reference disk is.
Since generally that reference disk is stored in some image file
somewhere, I'd add a filename for each of the dirty bitmaps which is the
base file in respect to which these clusters are considered dirty.

As a measure of how well you have associated a dirty bitmap
with a qcow2 file, imagine the following scenario: You are writing a
program independent of qemu, and that program is to make use of the
dirty bitmaps for incremental backups.

With my proposal above, it would open the qcow2 file and pick some
bitmap based on name, base image, user choice or maybe some property of
the bitmap itself (e.g. lowest dirty bit count). Then, it would create a
new overlay file (the backup image), let's say a qcow2 file, and use the
base image filename of the selected dirty bitmap as the filename of the
backing file for the backup image. Then, it would copy all dirty
clusters from the original qcow2 file to the backup image, and that's it.
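
The copy step of such a tool can be sketched as follows. This is only a minimal illustration under stated assumptions, not what qemu or this series does: the two images are stood in for by plain memory buffers, one bitmap bit is assumed to cover `granularity` bytes of the disk, bits are assumed to be stored least-significant-bit first within each byte, and the names bitmap_test() and copy_dirty() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: test one bit of an in-memory bitmap.
 * LSB-first bit order within a byte is an assumption of this sketch. */
static int bitmap_test(const uint8_t *bitmap, uint64_t bit)
{
    return (bitmap[bit / 8] >> (bit % 8)) & 1;
}

/* Copy every chunk marked dirty from src (the original image) to dst
 * (the backup overlay). Both are plain buffers of disk_size bytes here.
 * One bitmap bit is assumed to cover `granularity` bytes.
 * Returns the number of chunks copied. */
static uint64_t copy_dirty(const uint8_t *bitmap, uint64_t granularity,
                           const uint8_t *src, uint8_t *dst,
                           uint64_t disk_size)
{
    uint64_t nb_bits, bit, copied = 0;

    assert(granularity > 0);
    nb_bits = (disk_size + granularity - 1) / granularity;

    for (bit = 0; bit < nb_bits; bit++) {
        uint64_t off, len;

        if (!bitmap_test(bitmap, bit)) {
            continue;   /* clean: the backing file already has this data */
        }
        off = bit * granularity;
        len = disk_size - off < granularity ? disk_size - off : granularity;
        memcpy(dst + off, src + off, len);
        copied++;
    }
    return copied;
}
```

Everything that is not copied is left to be read from the backing file, which is why the tool needs to know what the base image is.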

Right now, with this patch alone, the tool has no idea what the base
image is, and some bitmaps may not even be related to the very qcow2
file they are in at all.


With that fixed, I could be moved to accept the concept of dirty bitmaps
in qcow2 files grudgingly. Maybe happily, if you give me a good reason
why we should not put them into tar files.


And I have some other comments in regards to the specification:

> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 121dfc8..5fc0365 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -103,7 +103,13 @@ in the description of a field.
>                      write to an image with unknown auto-clear features if it
>                      clears the respective bits from this field first.
>  
> -                    Bits 0-63:  Reserved (set to 0)
> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
> +                                there is a _consistent_ Dirty bitmaps extension
> +                                in the image. If it is not set, but there is a
> +                                Dirty bitmaps extension, its data should be
> +                                considered as inconsistent.
> +
> +                    Bits 1-63:  Reserved (set to 0)
>  
>           96 -  99:  refcount_order
>                      Describes the width of a reference count block entry (width
> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0x23852875 - Dirty bitmaps
>                          other      - Unknown header extension, can be safely
>                                       ignored
>  
> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>  
>  
> +== Dirty bitmaps ==
> +
> +Dirty bitmaps is an optional header extension. It provides an ability to store
> +dirty bitmaps in a qcow2 image. The fields are:
> +
> +          0 -  3:  nb_dirty_bitmaps
> +                   The number of dirty bitmaps contained in the image. Valid
> +                   values: 0 - 65535.

Why? Because that's what qemu supports? That's not a real reason. If so,
you may make a note of that (see the cluster_bits documentation), or
just omit it; for years, qemu only supported refcount_order = 4, but the
specification did not make a note of that. It was just a limitation of
qemu, but not of the format.

> +
> +          4 -  7:  dirty_bitmap_directory_size
> +                   Size of the Dirty Bitmap Directory in bytes. Valid values:
> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).

Same here.

> +
> +          8 - 15:  dirty_bitmap_directory_offset
> +                   Offset into the image file at which the Dirty Bitmap
> +                   Directory starts. Must be aligned to a cluster boundary.
> +
> +
>  == Host cluster management ==
>  
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> @@ -360,3 +385,103 @@ Snapshot table entry:
>  
>          variable:   Padding to round up the snapshot table entry size to the
>                      next multiple of 8.
> +
> +
> +== Dirty bitmaps ==
> +
> +The feature supports storing dirty bitmaps in a qcow2 image.

I think I've made my point clear enough in the huge wall of text above,
but I'll just repeat it once more: This should explain what dirty
bitmaps are and how they are to be interpreted.

> +
> +=== Cluster mapping ===
> +
> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
> +bitmaps to host clusters. It is called Dirty Bitmap Table.
> +
> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> +Directory Entry) and may use multiple clusters, however it must be contiguous
> +in the image file.
> +
> +Given an offset (in bytes) into the bitmap, the offset into the image file can
> +be obtained as follows:
> +
> +    byte_offset =
> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
> +
> +Taking into accout the granularity of the bitmap, an offset in bits into the
> +image file can be obtained like this:
> +
> +    bit_offset =
> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
> +
> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
> +"physical" bit with number (bit_nr / granularity).
> +
> +Dirty Bitmap Table entry:
> +
> +    Bit  0 -  8:    Reserved
> +
> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
> +                    cluster boundary. If the offset is 0, the cluster is
> +                    unallocated, and should be read as all zeros.
> +
> +        56 - 63:    Reserved
> +
> +=== Dirty Bitmap Directory ===
> +
> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
> +whose starting offset and length are given by the header extension fields
> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
> +the bitmap directory have variable length, depending on the length of the
> +bitmap name.
> +
> +Dirty Bitmap Directory Entry:
> +
> +    Byte 0 -  7:    dirty_bitmap_table_offset
> +                    Offset into the image file at which the Dirty Bitmap Table
> +                    for the bitmap starts. Must be aligned to a cluster
> +                    boundary.
> +
> +         8 - 15:    nb_virtual_bits
> +                    Number of "virtual" bits in the bitmap. Number of
> +                    "physical" bits would be:
> +                    (nb_virtual_bits + granularity - 1) / granularity
> +
> +        16 - 19:    dirty_bitmap_table_size
> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
> +                    Valid values: 0 - 0x8000000.
> +                    Also, (dirty_bitmap_table_size * cluster_size) should not
> +                    be greater than 0x20000000 (512 MB)

Again, is this a qemu limitation or is there another reason? Also, you
should decide between the two limitations. The second one automatically
limits the valid values to 0 - 1048576 at most (with 512-byte clusters).

> +
> +        20 - 23:    granularity_bits
> +                    Granularity bits. Valid values are: 0 - 63.
> +
> +                    Granularity is calculated as
> +                        granularity = 1 << granularity_bits
> +
> +                    Granularity of the bitmap is how many "virtual" bits
> +                    accounts for one "physical" bit.
> +
> +        24 - 27:    flags
> +                    Bit
> +                      0: in_use
> +                         The bitmap is in use and may be inconsistent.

What does "in use" mean? You are not supposed to use a qcow2 file which
is in use by qemu anyway.

> +
> +                      1: self
> +                         The bitmap is a dirty bitmap for the containing image.

As I said, I don't see why we should support this ever being not set, so
I am very much in favor of dropping this.

> +
> +                      2: auto
> +                         The bitmap should be autoloaded as block dirty bitmap.
> +                         Only available if bit 1 (self) is set.

The phrasing is too qemu-specific. Remember that this is not an
explanation for how qemu is to interpret qcow2 files, but a
*specification* of qcow2 files for *any* tool.

So if I understand the intention behind this flag, a more general
expression would be "The default bitmap". Then it is qemu's decision to
always auto-load this default bitmap.

> +
> +                      3: read_only
> +                         The bitmap should not be rewritten.
> +
> +                    Bits 4 - 31 are reserved.
> +
> +        28 - 29:    name_size
> +                    Size of the bitmap name. Valid values: 0 - 1023.
> +
> +        variable:   The name of the bitmap (not null terminated).
> +
> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
> +                    the next multiple of 8.
> 

The interesting thing is that I have written a huge wall of text above
and all my comments (except for "just put it into tar") can be addressed
relatively easily. Just add documentation for what dirty bitmaps are, and
a "variable: base_filename" field here, and that would be it.

But there is a reason why I'm keeping the wall of text there: I feel
like while these are very minor changes, they are fundamental design
differences. Without these changes, you just add a binary data dump
extension to qcow2, which is of no use to anyone but qemu (and not even
qemu alone, it needs the user or a management tool to tell it what to do
with it, unless the @auto flag is set).

With these changes, it suddenly actually becomes an integral part of the
qcow2 file which can be interpreted and used in a meaningful way by
tools other than qemu itself.
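
To make the "any tool" point a bit more concrete, here is a rough sketch of how a program other than qemu might parse one Dirty Bitmap Directory Entry using the layout quoted from this patch. It assumes big-endian fields like the rest of qcow2; the struct, enum and function names are hypothetical, and the proposed base_filename field is left out because its encoding has not been defined anywhere yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size part of a directory entry, per the layout in this patch.
 * All identifiers here are made up for the sketch. */
typedef struct {
    uint64_t dirty_bitmap_table_offset;  /* bytes  0 -  7 */
    uint64_t nb_virtual_bits;            /* bytes  8 - 15 */
    uint32_t dirty_bitmap_table_size;    /* bytes 16 - 19 */
    uint32_t granularity_bits;           /* bytes 20 - 23 */
    uint32_t flags;                      /* bytes 24 - 27 */
    uint16_t name_size;                  /* bytes 28 - 29 */
} DirEntryHeader;

enum {
    DIR_ENTRY_FIXED_SIZE = 30,
    DIR_FLAG_IN_USE      = 1 << 0,
    DIR_FLAG_SELF        = 1 << 1,
    DIR_FLAG_AUTO        = 1 << 2,
    DIR_FLAG_READ_ONLY   = 1 << 3
};

/* Read an n-byte big-endian unsigned integer. */
static uint64_t read_be(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Parse the entry at buf. Returns the total entry size (fixed part,
 * name, and padding to a multiple of 8), or 0 if buf is too short. */
static size_t parse_dir_entry(const uint8_t *buf, size_t len,
                              DirEntryHeader *h)
{
    size_t total;

    assert(buf != NULL && h != NULL);
    if (len < DIR_ENTRY_FIXED_SIZE) {
        return 0;
    }
    h->dirty_bitmap_table_offset = read_be(buf, 8);
    h->nb_virtual_bits           = read_be(buf + 8, 8);
    h->dirty_bitmap_table_size   = (uint32_t)read_be(buf + 16, 4);
    h->granularity_bits          = (uint32_t)read_be(buf + 20, 4);
    h->flags                     = (uint32_t)read_be(buf + 24, 4);
    h->name_size                 = (uint16_t)read_be(buf + 28, 2);

    total = ((size_t)DIR_ENTRY_FIXED_SIZE + h->name_size + 7) & ~(size_t)7;
    return total <= len ? total : 0;
}
```

The name itself starts at byte 30 of the entry, and the returned size is what a reader would add to its position to reach the next entry in the directory.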

Max



* Re: [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load()
  2015-10-06 23:01   ` John Snow
@ 2015-10-07 17:05     ` Eric Blake
  2016-02-16 19:04     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 53+ messages in thread
From: Eric Blake @ 2015-10-07 17:05 UTC (permalink / raw)
  To: John Snow, Vladimir Sementsov-Ogievskiy, qemu-devel
  Cc: kwolf, pbonzini, stefanha, den


On 10/06/2015 05:01 PM, John Snow wrote:

>> +    ret = load_bitmap(bs_file, dirty_bitmap_table, bmh->dirty_bitmap_table_size, bitmap);
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "Could not read bitmap from image");
>> +        goto finish;
>> +    }
>> +
>> +finish:
>> +    if (*errp != NULL) {
>> +        bdrv_release_dirty_bitmap(bs_for, bitmap);
>> +        bitmap = NULL;
>> +    }
>> +    g_free(dirty_bitmap_table);
> 
> 
> I think we're not supposed to be reaching into errp to check its
> implementation detail like this ... the usual paradigm I see is just
> "goto fail" or similar statements instead of checking for
> error-or-success in a shared return block.
> 


If you have to make a decision based on whether an error was detected,
then you MUST pass a local error (as your caller may have passed NULL
because they don't care if you fail, even though you care if your helper
fails).  As in:

Error *err = NULL;
...
if (ret < 0) {
    error_setg_errno(&err, -ret, "Could not read...");
}
finish:
if (err) {
    bdrv_release_dirty_bitmap(...);
    error_propagate(errp, err);
    bitmap = NULL;
}

In short, any code that does *errp is a potential NULL dereference.  The
comments in error.h help explain the paradigms.
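
For anyone who wants to try the pattern outside the tree, here is a self-contained sketch. The Error type and the mock_error_set()/mock_error_propagate() helpers are simplified stand-ins written for this example, not QEMU's real error API (which allocates and frees Error objects), and load_with_cleanup() is a hypothetical function that only mirrors the shape of the code under review.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's Error; NOT the real implementation. */
typedef struct Error {
    const char *msg;
} Error;

/* Mock of an error setter: does nothing if the caller passed NULL. */
static void mock_error_set(Error **errp, Error *storage, const char *msg)
{
    assert(storage != NULL);
    if (errp) {
        storage->msg = msg;
        *errp = storage;
    }
}

/* Mock of error propagation: hands err to the caller only if the caller
 * actually wants it. */
static void mock_error_propagate(Error **dst, Error *err)
{
    if (dst && err) {
        *dst = err;
    }
}

static int released;   /* counts how often the cleanup path ran */

/* helper_ret < 0 simulates a failing helper such as load_bitmap(). */
static int load_with_cleanup(int helper_ret, Error **errp)
{
    static Error storage;
    Error *local_err = NULL;

    if (helper_ret < 0) {
        mock_error_set(&local_err, &storage,
                       "Could not read bitmap from image");
    }

    if (local_err) {          /* safe even when errp == NULL */
        released++;           /* stands in for releasing the bitmap */
        mock_error_propagate(errp, local_err);
        return -1;
    }
    return 0;
}
```

Calling load_with_cleanup(-5, NULL) still runs the cleanup path, whereas a check on *errp in the same situation would dereference NULL.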

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-07 16:47   ` Max Reitz
@ 2015-10-07 19:05     ` Denis V. Lunev
  2015-10-08 20:28       ` John Snow
  0 siblings, 1 reply; 53+ messages in thread
From: Denis V. Lunev @ 2015-10-07 19:05 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel, Qemu-block
  Cc: kwolf, pbonzini, jsnow, stefanha

On 10/07/2015 07:47 PM, Max Reitz wrote:
> On 05.09.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   docs/specs/qcow2.txt | 127 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 126 insertions(+), 1 deletion(-)
> Overall: I'm strongly against putting dirty bitmaps into qcow2 files, at
> least not as it is envisioned by this series.
>
>
> If you don't feel like reading why, and you'd rather read what I'd do if
> you really, really want to put them into qcow2, files, skip ahead until
> the "RANT OVER" line.
>
>
> The first indication of why that is the case is that this patch does not
> add any explanation to the qcow2 specification what these dirty bitmaps
> are. Therefore, they are basically just binary data that is given a
> name and dumped into a qcow2 file as if it were a tar file.
>
> One could argue that this is qemu and we know what dirty bitmaps are.
> But qcow2.txt is located in docs/specs/, not just in docs/. It is not an
> explanation, but a *specification*, and as such it should explain
> everything related to qcow2.
>
> As a side note, we already have a binary data dump in qcow2 files, and
> that is the VM state. This is bad enough, and if it had been up to me,
> it would never have been there. That's because it's something only
> qemu can make use of, and not even different versions of qemu are
> compatible there, so it was (in my opinion) a pretty bad idea to put it
> into qcow2.
>
> So what this specification is definitely lacking is an explanation of
> how any independent program (i.e. *not qemu*) is to interpret the dirty
> bitmaps. I do believe this is possible, as opposed to the VM state. The
> VM state, nobody can do anything with it, it's even difficult for qemu
> itself sometimes.
>
> So let's imagine this specification would contain an explanation on what
> dirty bitmaps are and what they mean. Actually, now that I think about
> it, I cannot really imagine it, because I'm lacking that explanation.
> What do they mean? As far as I can see from the series, they actually
> don't mean anything. It's just a dump of data into a qcow2 file, and it
> can be any bitmap, be it associated with the file itself or not.
>
> This is further pointed to by your feature proposal "Allow qcow2 images
> without l1_table and other staff but only with dirty bitmaps with
> minimum overhead". There is a file format for exactly that, and it's
> called tar (yes, you are missing some metadata, but just add a JSON
> description file to the archive and you're done).
>
> By the way: I heard John briefly touch this in his talk at KVM Forum
> when he explained that this would make qcow2 files something like better
> tar files, and I didn't like the idea back then either. I was hoping
> that it would actually be different, and was waiting for some
> discussion to appear, but I didn't notice this series, because it
> doesn't have "qcow2" in the cover letter's subject (and I wasn't CC'd,
> but I don't really see why I should have been, as I'm not mentioned in
> the MAINTAINERS file (what a lucky man I am!)). I only just noticed
> today when I saw a lone reply from John on qemu-block to a patch with a
> "qcow2:" prefix.
>
> So, what you are apparently planning to do is to dump dirty bitmaps into
> any available qcow2 file. If the image you are operating on is a qcow2
> file, great! If it isn't, you create some empty qcow2 file and dump the
> bitmaps there.
>
> Then, I'm asking myself why you don't use tar files in the second case,
> and then, why you don't use tar files in the first case. I do remember
> John saying that there was a discussion about it, but I don't know about
> it, so I don't know why you dropped that idea in favor of making qcow2
> files tar archives. The only reason I can think of off the top of my
> head is that we have infrastructure for reading qcow2 files, but not for
> tar files. However, this series is like just appending a tar file to a
> qcow2 file, and then implementing a reader for tar archives inside of
> the qcow2 driver, so it doesn't seem to be much simpler in practice.
>
> In any case, if my assumptions so far are more or less correct, no
> outside program can do anything with the dirty bitmaps contained in the
> qcow2 file, because they are just binary data which does not necessarily
> have any connection to the qcow2 file itself. Not even qemu can make
> sense of them, it appears, it needs the user or the management tool to
> do so.
>
> I am strongly against putting binary data into a qcow2 file which does
> not have any visible connection to the file's contents.
>
> Obviously, it is possible that there is some connection which I am just
> not seeing, though.
>
>
>
> --- RANT OVER ---
>
> Okay, that was enough destructive criticism, now to get some
> constructive arguments and ideas.
>
> So, there are two points I don't like: First, it's binary data which
> isn't explained in the qcow2 specification. This can easily be fixed.
>
> Second, there is no obvious connection between the qcow2 file and a
> dirty bitmap. I'd drop the idea of "If you use anything else than qcow2,
> we create an empty qcow2 file and put the dirty bitmaps there". Please
> don't do that. If you are using something else and want this feature,
> that's your problem. If you need features, you use qcow2. That's it. If
> you really want to support it for other file formats, put the data into
> tar archives and not into qcow2 files.
>
> For comparison, this is like using a qcow2 file for implementing backing
> files for raw images. The cluster offsets in L2 tables would then point
> to offsets in the raw image (and the host offset would have to match the
> guest offset), and by looking at which L2 table entries are unused, one
> could deduce which sectors are to be read from the backing file. We
> don't support that either, because you should just use qcow2 if you want
> backing files.
>
> Next we need to know for every dirty bitmap what the reference disk is.
> Since generally that reference disk is stored in some image file
> somewhere, I'd add a filename for each of the dirty bitmaps which is the
> base file in respect to which these clusters are considered dirty.
>
> As a measure of how well you have associated a dirty bitmap
> with a qcow2 file, imagine the following scenario: You are writing a
> program independent of qemu, and that program is to make use of the
> dirty bitmaps for incremental backups.
>
> With my proposal above, it would open the qcow2 file and pick some
> bitmap based on name, base image, user choice or maybe some property of
> the bitmap itself (e.g. lowest dirty bit count). Then, it would create a
> new overlay file (the backup image), let's say a qcow2 file, and use the
> base image filename of the selected dirty bitmap as the filename of the
> backing file for the backup image. Then, it would copy all dirty
> clusters from the original qcow2 file to the backup image, and that's it.
>
> Right now, with this patch alone, the tool has no idea what the base
> image is, and some bitmaps may not even be related to the very qcow2
> file they are in at all.
>
>
> With that fixed, I could be moved to accept the concept of dirty bitmaps
> in qcow2 files grudgingly. Maybe happily, if you give me a good reason
> why we should not put them into tar files.
>
>
> And I have some other comments in regards to the specification:
>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>                       write to an image with unknown auto-clear features if it
>>                       clears the respective bits from this field first.
>>   
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
>> +                                there is a _consistent_ Dirty bitmaps extension
>> +                                in the image. If it is not set, but there is a
>> +                                Dirty bitmaps extension, its data should be
>> +                                considered as inconsistent.
>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>   
>>            96 -  99:  refcount_order
>>                       Describes the width of a reference count block entry (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the following:
>>                           0x00000000 - End of the header extension area
>>                           0xE2792ACA - Backing file format name
>>                           0x6803f857 - Feature name table
>> +                        0x23852875 - Dirty bitmaps
>>                           other      - Unknown header extension, can be safely
>>                                        ignored
>>   
>> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>>                       terminated if it has full length)
>>   
>>   
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability to store
>> +dirty bitmaps in a qcow2 image. The fields are:
>> +
>> +          0 -  3:  nb_dirty_bitmaps
>> +                   The number of dirty bitmaps contained in the image. Valid
>> +                   values: 0 - 65535.
> Why? Because that's what qemu supports? That's not a real reason. If so,
> you may make a note of that (see the cluster_bits documentation), or
> just omit it; for years, qemu only supported refcount_order = 4, but the
> specification did not make a note of that. It was just a limitation of
> qemu, but not of the format.
>
>> +
>> +          4 -  7:  dirty_bitmap_directory_size
>> +                   Size of the Dirty Bitmap Directory in bytes. Valid values:
>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
> Same here.
>
>> +
>> +          8 - 15:  dirty_bitmap_directory_offset
>> +                   Offset into the image file at which the Dirty Bitmap
>> +                   Directory starts. Must be aligned to a cluster boundary.
>> +
>> +
>>   == Host cluster management ==
>>   
>>   qcow2 manages the allocation of host clusters by maintaining a reference count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>   
>>           variable:   Padding to round up the snapshot table entry size to the
>>                       next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
> I think I've made my point clear enough in the huge wall of text above,
> but I'll just repeat it once more: This should explain what dirty
> bitmaps are and how they are to be interpreted.
>
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
>> +
>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>> +Directory Entry) and may use multiple clusters, however it must be contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image file can
>> +be obtained as follows:
>> +
>> +    byte_offset =
>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>> +
>> +Taking into accout the granularity of the bitmap, an offset in bits into the
>> +image file can be obtained like this:
>> +
>> +    bit_offset =
>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) % 8
>> +
>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
>> +"physical" bit with number (bit_nr / granularity).
>> +
>> +Dirty Bitmap Table entry:
>> +
>> +    Bit  0 -  8:    Reserved
>> +
>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
>> +                    cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated, and should be read as all zeros.
>> +
>> +        56 - 63:    Reserved
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image file,
>> +whose starting offset and length are given by the header extension fields
>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries of
>> +the bitmap directory have variable length, depending on the length of the
>> +bitmap name.
>> +
>> +Dirty Bitmap Directory Entry:
>> +
>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>> +                    Offset into the image file at which the Dirty Bitmap Table
>> +                    for the bitmap starts. Must be aligned to a cluster
>> +                    boundary.
>> +
>> +         8 - 15:    nb_virtual_bits
>> +                    Number of "virtual" bits in the bitmap. Number of
>> +                    "physical" bits would be:
>> +                    (nb_virtual_bits + granularity - 1) / granularity
>> +
>> +        16 - 19:    dirty_bitmap_table_size
>> +                    Number of entries in the Dirty Bitmap Table of the bitmap.
>> +                    Valid values: 0 - 0x8000000.
>> +                    Also, (dirty_bitmap_table_size * cluster_size) should not
>> +                    be greater than 0x20000000 (512 MB)
> Again, is this a qemu limitation or is there another reason? Also, you
> should decide between the two limitations. The second one automatically
> limits the valid values to 0 - 1048576 at most (with 512-byte clusters).
>
>> +
>> +        20 - 23:    granularity_bits
>> +                    Granularity bits. Valid values are: 0 - 63.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
>> +
>> +                    Granularity of the bitmap is how many "virtual" bits
>> +                    accounts for one "physical" bit.
>> +
>> +        24 - 27:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap is in use and may be inconsistent.
> What does "in use" mean? You are not supposed to use a qcow2 file which
> is in use by qemu anyway.
>
>> +
>> +                      1: self
>> +                         The bitmap is a dirty bitmap for the containing image.
> As I said, I don't see why we should support this ever being not set, so
> I am very much in favor of dropping this.
>
>> +
>> +                      2: auto
>> +                         The bitmap should be autoloaded as block dirty bitmap.
>> +                         Only available if bit 1 (self) is set.
> The phrasing is too qemu-specific. Remember that this is not an
> explanation for how qemu is to interpret qcow2 files, but a
> *specification* of qcow2 files for *any* tool.
>
> So if I understand the intention behind this flag, a more general
> expression would be "The default bitmap". Then it is qemu's decision to
> always auto-load this default bitmap.
>
>> +
>> +                      3: read_only
>> +                         The bitmap should not be rewritten.
>> +
>> +                    Bits 4 - 31 are reserved.
>> +
>> +        28 - 29:    name_size
>> +                    Size of the bitmap name. Valid values: 0 - 1023.
>> +
>> +        variable:   The name of the bitmap (not null terminated).
>> +
>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry size to
>> +                    the next multiple of 8.
>>
> The interesting thing is that I have written a huge wall of text above
> and all my comments (except for "just put it into tar") can be addressed
> relatively easily. Just add documentation for what dirty bitmaps are, and
> a "variable: base_filename" field here, and that would be it.
>
> But there is a reason why I'm keeping the wall of text there: I feel
> like while these are very minor changes, they are fundamental design
> differences. Without these changes, you just add a binary data dump
> extension to qcow2, which is of no use to anyone but qemu (and not even
> qemu alone, it needs the user or a management tool to tell it what to do
> with it, unless the @auto flag is set).
>
> With these changes, it suddenly actually becomes an integral part of the
> qcow2 file which can be interpreted and used in a meaningful way by
> tools other than qemu itself.
>
> Max
>
From my point of view this is not a big deal, if it brings us to
agreement and lets us proceed. The bitmap will be available in QCOW2;
a Parallels image can also carry a bitmap inside, though there is no
code for that on the QEMU side yet.

This would be enough for me for a while.

So the question for John is whether the "bitmap in separate file"
feature is really necessary. This is mostly an API question.

Den

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-07 19:05     ` Denis V. Lunev
@ 2015-10-08 20:28       ` John Snow
  2015-10-08 20:56         ` Denis V. Lunev
  2015-10-09 17:07         ` [Qemu-devel] [PATCH " Max Reitz
  0 siblings, 2 replies; 53+ messages in thread
From: John Snow @ 2015-10-08 20:28 UTC (permalink / raw)
  To: Denis V. Lunev, Max Reitz, Vladimir Sementsov-Ogievskiy,
	qemu-devel, Qemu-block
  Cc: kwolf, pbonzini, stefanha



On 10/07/2015 03:05 PM, Denis V. Lunev wrote:
> On 10/07/2015 07:47 PM, Max Reitz wrote:
>> On 05.09.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>>> other drives (there may be qcow2 file with zero disk size but with
>>> several dirty bitmaps for other drives).
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   docs/specs/qcow2.txt | 127
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 126 insertions(+), 1 deletion(-)
>> Overall: I'm strongly against putting dirty bitmaps into qcow2 files, at
>> least not as it is envisioned by this series.
>>
>>
>> If you don't feel like reading why, and you'd rather read what I'd do if
>> you really, really want to put them into qcow2, files, skip ahead until
>> the "RANT OVER" line.
>>
>>
>> The first indication of why that is the case is that this patch does not
>> add any explanation to the qcow2 specification what these dirty bitmaps
>> are. Therefore, there are basically just binary data that is given a
>> name and dumped into a qcow2 file as if it were a tar file.
>>
>> One could argue that this is qemu and we know what dirty bitmaps are.
>> But qcow2.txt is located in docs/specs/, not just in docs/. It is not an
>> explanation, but a *specification*, and as such it should explain
>> everything related to qcow2.
>>
>> As a side notice, we already have a binary data dump in qcow2 files, and
>> that is the VM state. This is bad enough and if it would have been up to
>> me, it would have never been there. That's because it's something only
>> qemu can make use of, and not even different versions of qemu are
>> compatible there, so it was (in my opinion) a pretty bad idea to put it
>> into qcow2.
>>
>> So what this specification is definitely lacking is an explanation on
>> how any independent program (i.e. *not qemu*) is to interpret the dirty
>> bitmaps. I do believe this is possible, as opposed to the VM state. The
>> VM state, nobody can do anything with it, it's even difficult for qemu
>> itself sometimes.
>>
>> So let's imagine this specification would contain an explanation on what
>> dirty bitmaps are and what they mean. Actually, now that I think about
>> it, I cannot really imagine it, because I'm lacking that explanation.
>> What do they mean? As far as I can see from the series, they actually
>> don't mean anything. It's just a dump of data into a qcow2 file, and it
>> can be any bitmap, be it associated with the file itself or not.
>>
>> This is further pointed to by your feature proposal "Allow qcow2 images
>> without l1_table and other staff but only with dirty bitmaps with
>> minimum overhead". There is a file format for exactly that, and it's
>> called tar (yes, you are missing some metadata, but just add a JSON
>> description file to the archive and you're done).
>>
>> By the way: I heard John briefly touch this in his talk at KVM Forum
>> when he explained that this would make qcow2 files something like better
>> tar files, and I didn't like the idea back then either. I was hoping
>> that it would actually be differently, and was waiting for some
>> discussion to appear, but I didn't notice this series, because it
>> doesn't have "qcow2" in the cover letter's subject (and I wasn't CC'd,
>> but I don't really see why I should have been, as I'm not mentioned in
>> the MAINTAINERS file (what a lucky man I am!)). I only just noticed
>> today when I saw a lone reply from John on qemu-block to a patch with a
>> "qcow2:" prefix.
>>
>> So, what you are apparently planning to do is to dump dirty bitmaps into
>> any available qcow2 file. If the image you are operating on is a qcow2
>> file, great! If it isn't, you create some empty qcow2 file and dump the
>> bitmaps there.
>>
>> Then, I'm asking myself why you don't use tar files in the second case,
>> and then, why you don't use tar files in the first case. I do remember
>> John saying that there was a dicussion about it, but I don't know about
>> it, so I don't know why you dropped that idea in favor of making qcow2
>> files tar archives. The only reason I can think of off the top of my
>> head is that we have infrastructure for reading qcow2 files, but not for
>> tar files. However, this series is like just appending a tar file to a
>> qcow2 file, and then implementing a reader for tar archives inside of
>> the qcow2 driver, so it doesn't seem to be much simpler in practice.
>>
>> In any case, if my assumptions so far are more or less correct, no
>> outside program can do anything with the dirty bitmaps contained in the
>> qcow2 file, because they are just binary data which does not necessarily
>> have any connection to the qcow2 file itself. Not even qemu can make
>> sense of them, it appears, it needs the user or the management tool to
>> do so.
>>
>> I am strongly against putting binary data into a qcow2 file which does
>> not have any visible connection to the file's contents.
>>
>> Obviously, it is possible that there is some connection which I am just
>> not seeing, though.
>>
>>
>>
>> --- RANT OVER ---
>>
>> Okay, that was enough destructive criticism, now to get some
>> constructive arguments and ideas.
>>
>> So, there are two points I don't like: First, it's binary data which
>> isn't explained in the qcow2 specification. This can easily be fixed.
>>
>> Second, there is no obvious connection between the qcow2 file and a
>> dirty bitmap. I'd drop the idea of "If you use anything else than qcow2,
>> we create an empty qcow2 file and put the dirty bitmaps there". Please
>> don't do that. If you are using something else and want this feature,
>> that's your problem. If you need features, you use qcow2. That's it. If
>> you really want to support it for other file formats, but the data into
>> tar archives and not into qcow2 files.
>>
>> For comparison, this is like using a qcow2 file for implementing backing
>> files for raw images. The cluster offsets in L2 tables would then point
>> to offsets in the raw image (and the host offset would have to match the
>> guest offset), and by looking at which L2 table entries are unused, one
>> could deduce which sectors are to be read from the backing file. We
>> don't support that either, because you should just use qcow2 if you want
>> backing files.
>>
>> Next we need to know for every dirty bitmap what the reference disk is.
>> Since generally that reference disk is stored in some image file
>> somewhere, I'd add a filename for each of the dirty bitmaps which is the
>> base file in respect to which these clusters are considered dirty.
>>
>> As a measurement on how well you have done to associate a dirty bitmap
>> with a qcow2 file, imagine the following scenario: You are writing a
>> program independent of qemu, and that program is to make use of the
>> dirty bitmaps for incremental backups.
>>
>> With my proposal above, it would open the qcow2 file and pick some
>> bitmap based on name, base image, user choice or maybe some property of
>> the bitmap itself (e.g. lowest dirty bit count). Then, it would create a
>> new overlay file (the backup image), let's say a qcow2 file, and use the
>> base image filename of the selected dirty bitmap as the filename of the
>> backing file for the backup image. Then, it would copy all dirty
>> clusters from the original qcow2 file to the backup image, and that's it.
>>
>> Right now, with this patch alone, the tool has no idea what the base
>> image is, and some bitmaps may not even be related to the very qcow2
>> file they are in at all.
>>
>>
>> With that fixed, I could be moved to accept the concept of dirty bitmaps
>> in qcow2 files grudgingly. Maybe happily, if you give me a good reason
>> why we should not put them into tar files.
>>
>>
>> And I have some other comments in regards to the specification:
>>
>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>> index 121dfc8..5fc0365 100644
>>> --- a/docs/specs/qcow2.txt
>>> +++ b/docs/specs/qcow2.txt
>>> @@ -103,7 +103,13 @@ in the description of a field.
>>>                       write to an image with unknown auto-clear
>>> features if it
>>>                       clears the respective bits from this field first.
>>>   -                    Bits 0-63:  Reserved (set to 0)
>>> +                    Bit 0:      Dirty bitmaps bit. If this bit is
>>> set then
>>> +                                there is a _consistent_ Dirty
>>> bitmaps extension
>>> +                                in the image. If it is not set, but
>>> there is a
>>> +                                Dirty bitmaps extension, its data
>>> should be
>>> +                                considered as inconsistent.
>>> +
>>> +                    Bits 1-63:  Reserved (set to 0)
>>>              96 -  99:  refcount_order
>>>                       Describes the width of a reference count block
>>> entry (width
>>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like
>>> the following:
>>>                           0x00000000 - End of the header extension area
>>>                           0xE2792ACA - Backing file format name
>>>                           0x6803f857 - Feature name table
>>> +                        0x23852875 - Dirty bitmaps
>>>                           other      - Unknown header extension, can
>>> be safely
>>>                                        ignored
>>>   @@ -166,6 +173,24 @@ the header extension data. Each entry look
>>> like this:
>>>                       terminated if it has full length)
>>>     +== Dirty bitmaps ==
>>> +
>>> +Dirty bitmaps is an optional header extension. It provides an
>>> ability to store
>>> +dirty bitmaps in a qcow2 image. The fields are:
>>> +
>>> +          0 -  3:  nb_dirty_bitmaps
>>> +                   The number of dirty bitmaps contained in the
>>> image. Valid
>>> +                   values: 0 - 65535.
>> Why? Because that's what qemu supports? That's not a real reason. If so,
>> you may make a note of that (see the cluster_bits documentation), or
>> just omit it; for years, qemu only supported refcount_order = 4, but the
>> specification did not make a note of that. It was just a limitation of
>> qemu, but not of the format.
>>
>>> +
>>> +          4 -  7:  dirty_bitmap_directory_size
>>> +                   Size of the Dirty Bitmap Directory in bytes.
>>> Valid values:
>>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
>> Same here.
>>
>>> +
>>> +          8 - 15:  dirty_bitmap_directory_offset
>>> +                   Offset into the image file at which the Dirty Bitmap
>>> +                   Directory starts. Must be aligned to a cluster
>>> boundary.
>>> +
>>> +
>>>   == Host cluster management ==
>>>     qcow2 manages the allocation of host clusters by maintaining a
>>> reference count
>>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>>             variable:   Padding to round up the snapshot table entry
>>> size to the
>>>                       next multiple of 8.
>>> +
>>> +
>>> +== Dirty bitmaps ==
>>> +
>>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> I think I've made my point clear enough in the huge wall of text above,
>> but I'll just repeat it once more: This should explain what dirty
>> bitmaps are and how they are to be interpreted.
>>
>>> +
>>> +=== Cluster mapping ===
>>> +
>>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
>>> +
>>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>>> +Directory Entry) and may use multiple clusters, however it must be
>>> contiguous
>>> +in the image file.
>>> +
>>> +Given an offset (in bytes) into the bitmap, the offset into the
>>> image file can
>>> +be obtained as follows:
>>> +
>>> +    byte_offset =
>>> +        dirty_bitmap_table[offset / cluster_size] + (offset %
>>> cluster_size)
>>> +
>>> +Taking into accout the granularity of the bitmap, an offset in bits
>>> into the
>>> +image file can be obtained like this:
>>> +
>>> +    bit_offset =
>>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr /
>>> granularity) % 8
>>> +
>>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is
>>> covered by
>>> +"physical" bit with number (bit_nr / granularity).
>>> +
>>> +Dirty Bitmap Table entry:
>>> +
>>> +    Bit  0 -  8:    Reserved
>>> +
>>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be
>>> aligned to a
>>> +                    cluster boundary. If the offset is 0, the
>>> cluster is
>>> +                    unallocated, and should be read as all zeros.
>>> +
>>> +        56 - 63:    Reserved
>>> +
>>> +=== Dirty Bitmap Directory ===
>>> +
>>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
>>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the
>>> image file,
>>> +whose starting offset and length are given by the header extension
>>> fields
>>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The
>>> entries of
>>> +the bitmap directory have variable length, depending on the length
>>> of the
>>> +bitmap name.
>>> +
>>> +Dirty Bitmap Directory Entry:
>>> +
>>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>>> +                    Offset into the image file at which the Dirty
>>> Bitmap Table
>>> +                    for the bitmap starts. Must be aligned to a cluster
>>> +                    boundary.
>>> +
>>> +         8 - 15:    nb_virtual_bits
>>> +                    Number of "virtual" bits in the bitmap. Number of
>>> +                    "physical" bits would be:
>>> +                    (nb_virtual_bits + granularity - 1) / granularity
>>> +
>>> +        16 - 19:    dirty_bitmap_table_size
>>> +                    Number of entries in the Dirty Bitmap Table of
>>> the bitmap.
>>> +                    Valid values: 0 - 0x8000000.
>>> +                    Also, (dirty_bitmap_table_size * cluster_size)
>>> should not
>>> +                    be greater than 0x20000000 (512 MB)
>> Again, is this a qemu limitation or is there another reason? Also, you
>> should decide between the two limitations. The second one automatically
>> limits the number of values to 0 - 1048575 at maximum (512 byte
>> clusters).
>>
>>> +
>>> +        20 - 23:    granularity_bits
>>> +                    Granularity bits. Valid values are: 0 - 63.
>>> +
>>> +                    Granularity is calculated as
>>> +                        granularity = 1 << granularity_bits
>>> +
>>> +                    Granularity of the bitmap is how many "virtual"
>>> bits
>>> +                    accounts for one "physical" bit.
>>> +
>>> +        24 - 27:    flags
>>> +                    Bit
>>> +                      0: in_use
>>> +                         The bitmap is in use and may be inconsistent.
>> What does "in use" mean? You are not supposed to use a qcow2 file which
>> is in use by qemu anyway.
>>
>>> +
>>> +                      1: self
>>> +                         The bitmap is a dirty bitmap for the
>>> containing image.
>> As I said, I don't see why we should support this ever being not set, so
>> I am very much in favor of dropping this.
>>
>>> +
>>> +                      2: auto
>>> +                         The bitmap should be autoloaded as block
>>> dirty bitmap.
>>> +                         Only available if bit 1 (self) is set.
>> The phrasing is too qemu-specific. Remember that this is not an
>> explanation for how qemu is to interpret qcow2 files, but a
>> *specification* of qcow2 files for *any* tool.
>>
>> So if I understand the intention behind this flag, a more general
>> expression would be "The default bitmap". Then it is qemu's decision to
>> always auto-load this default bitmap.
>>
>>> +
>>> +                      3: read_only
>>> +                         The bitmap should not be rewritten.
>>> +
>>> +                    Bits 4 - 31 are reserved.
>>> +
>>> +        28 - 29:    name_size
>>> +                    Size of the bitmap name. Valid values: 0 - 1023.
>>> +
>>> +        variable:   The name of the bitmap (not null terminated).
>>> +
>>> +        variable:   Padding to round up the Dirty Bitmap Directory
>>> Entry size to
>>> +                    the next multiple of 8.
>>>
>> The interesting thing is that I have written a huge wall of text above
>> and all my comments (except for "just put it into tar") can be addressed
>> relatively easy. Just add documentation for what dirty bitmaps are, and
>> a "variable: base_filename" field here, and that would be it.
>>
>> But there is a reason why I'm keeping the wall of text there: I feel
>> like while these are very minor changes, they are fundamental design
>> differences. Without these changes, you just add a binary data dump
>> extension to qcow2, which is of no use to anyone but qemu (and not even
>> qemu alone, it needs the user or a management tool to tell it what to do
>> with it, unless the @auto flag is set).
>>
>> With these changes, it suddenly actually becomes an integral part of the
>> qcow2 file which can be interpreted and used in a meaningful way by
>> tools other than qemu itself.
>>
>> Max
>>
> This is actually not a very big deal from my point of view if this would
> put us into agreement and will allow to proceed further. The bitmap will
> be available in QCOW2, Parallels image can also have bitmap inside,
> not yet code on QEMU side.
> 
> This would be enough for me for a while.
> 
> Thus the question is on John side whether "bitmap in separate file"
> feature is really necessary. This is mostly API question.
> 
> Den

(NB: I never got Max's original reply, so this reply is more to Max than
to Denis or Vladimir.)

I'll see your wall of text and raise you my own wall of text...

We consider the ability to use persistent bitmaps to create incremental
backups for non-qcow2 images to be a necessary and vital component of a
complete incremental backup solution, especially considering that the
core mechanism of the feature does not really rely on qcow2 for anything
outside of convenience (e.g. backing files).

Vladimir's initial proposal of using .qcow2 to store the bitmap sounded
good to me, because I wanted to be able to store the bitmap in a qcow2
file anyway, and expanding the specification to allow it to store
/arbitrary/ bitmaps seemed like a natural fit to accomplish both goals
with a minimum of coding.

However, Max has raised some pretty good points here -- allow me to
paraphrase his Wall Of Text™:

- Since this patch is a modification of the qcow2 /specification/ which
is used by more than just QEMU, we must take care to avoid QEMU-specific
limitations and design a more universal approach to the specification
addendum.

- Specific caps on the number of bitmaps, the granularity of said
bitmaps, and the resultant size of said bitmaps should be addressed in
terms of the spec, not in terms of what's necessarily convenient or
sufficient for QEMU. I think this point will be easy to address with
some better spec wording.

- The bitmap language in the spec is generic and doesn't refer much to
anything. This is partially my fault, as I believe I likely guided
Vladimir towards using generic language that was tied more to the
HBitmap format than towards our specific implementation
(BdrvDirtyBitmap.) I recognize this as a bit of a misguided effort on my
part to keep things "generic," but what I succeeded in doing was keeping
it "useless" outside of QEMU. Example: "number of virtual bits" is
meaningless, but "number of 512 byte sectors" is not.
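
To make the "virtual bits" point concrete, here is a rough sketch of the
arithmetic in the draft, under the assumption (mine, not the draft's)
that one "virtual" bit stands for one 512 byte sector. The function names
are made up for the example:

```python
SECTOR_SIZE = 512  # assumption: one "virtual" bit covers one 512-byte sector


def granularity(granularity_bits):
    """granularity = 1 << granularity_bits, as written in the draft."""
    return 1 << granularity_bits


def nb_physical_bits(nb_virtual_bits, granularity_bits):
    """Stored bits: (nb_virtual_bits + granularity - 1) / granularity."""
    g = granularity(granularity_bits)
    return (nb_virtual_bits + g - 1) // g


def physical_bit_for_offset(byte_offset, granularity_bits):
    """Index of the stored bit that covers a given guest byte offset."""
    virtual_bit = byte_offset // SECTOR_SIZE
    return virtual_bit // granularity(granularity_bits)
```

Without the sector assumption spelled out in the spec, an external tool
has no way to get from a stored bit back to a byte range in the image.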

The last bit is the crux of our problem and the most deserving of our
attention:

- For a bitmap to be useful to an application outside of QEMU, all of
the necessary information for interpreting that bitmap must either be
present within the file or referenced. For bitmaps that describe the
file they are stored in, this is trivial with some specification editing.

For bitmaps stored for /other/ files, this gets... trickier. What is
this a bitmap for? What does it describe? What data does it describe?

Node names and drive names are meaningless outside of QEMU, and can of
course change between QEMU invocations or be shared between different
QEMU instances, so they are no help here.

We could store filenames, but networked devices and distributed
filesystems may have interesting relative pathnames that will not remain
reliable once the .qcow2 file is shuffled around or migrated, so storing
path-name references seems like a losing battle here, too. Maybe we only
have a file descriptor and no name at all -- what do we write for the
"global identifier that uniquely identifies the data we belong to"? Is
it even possible?

The only conclusion I can reach here is that storing bitmaps inside of a
.qcow2 that remain meaningful to external applications is not going to
be easily possible.

Perhaps we need to abandon the idea that we can store any bitmap we want
into a .qcow2.

However, I'm still a big fan of storing bitmaps that describe the data
they go alongside in the same .qcow2 as a convenience feature --
especially now since Vladimir has done the hard work for us all writing
the feature.

In simple, non-managed environments the case for storing the bitmap
inside the qcow2 it describes is pretty compelling:

- No extra files to track or manage
- The command line used to boot QEMU the first time can be used to boot
QEMU subsequently, and we get the persistent bitmap automatically
without further modification.
- Migration across a shared medium using .qcow2 files is trivial
- Backups managed by qcow2-unaware applications trivially bring along
our persistent bitmap data for us without additional configuration.

It's simple, the data is meaningful to external applications, and we've
got most of the code we need already, thanks to Vladimir.

Sadly, we still need a way to store bitmap data for files that do not
offer .bitmap_load and .bitmap_store primitives for us.

Presumably, if we devise our own "generic bitmap container" format, we
don't have to store things like node names, filenames, etc in this
container and we can use it to just store (name, granularity, size,
[data]) like we were trying to do in qcow2.

Matching bitmap IDs up to the data they belong to becomes the
responsibility of the user/management layer.
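
Just to illustrate how little such a container needs to hold, a minimal
sketch of one (name, granularity, size, [data]) record. The byte layout
below is entirely hypothetical and is not a proposed format; pack_bitmap
and unpack_bitmap are names invented for the example:

```python
import struct

# Hypothetical record header (big-endian, no padding), NOT a proposed format:
#   u16 name length, u32 granularity_bits, u64 size in bits, u64 data length
HEADER = struct.Struct(">HIQQ")


def pack_bitmap(name, granularity_bits, size, data):
    """Serialize one bitmap record: header, then name, then raw bitmap data."""
    raw_name = name.encode("utf-8")
    header = HEADER.pack(len(raw_name), granularity_bits, size, len(data))
    return header + raw_name + data


def unpack_bitmap(blob):
    """Parse a record produced by pack_bitmap()."""
    name_len, granularity_bits, size, data_len = HEADER.unpack_from(blob, 0)
    start = HEADER.size
    name = blob[start:start + name_len].decode("utf-8")
    data = bytes(blob[start + name_len:start + name_len + data_len])
    return name, granularity_bits, size, data
```

Note that nothing in the record says which drive the bitmap belongs to;
that mapping stays outside the file.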

Where this gets hairy, perhaps, is how to enforce that the drive data
that belongs to this bitmap isn't modified without our say-so? How do we
detect de-sync? The quick answer might be to store a hash alongside the
bitmap, and upon being re-applied to the drive if the hash doesn't
match, we throw an error/complain/etc -- but what about cases -- again
-- where we don't necessarily have a file we can trivially hash, like a
many-gigs-wide raw file being mounted over a networked file-system?
"external bitmaps" appear to pose a very real desync risk.

I suppose we would have had that problem anyway with the
qcow2-as-container idea.
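
For the cases where we do have a plain file to hash, the "quick answer"
could look roughly like the sketch below. SHA-256 and both function
names are my assumptions for the example; it also shows why this does
not scale, since every check reads the whole image:

```python
import hashlib


def image_digest(path, chunk_size=1 << 20):
    """Hash the whole image file in chunks; cost grows with image size."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_in_sync(path, stored_digest):
    """Refuse to re-apply a bitmap if the image no longer matches the hash."""
    if image_digest(path) != stored_digest:
        raise ValueError("image changed since the bitmap was stored")
```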

Perhaps the best we can say is "This is kind of a dangerous feature to
use, use at your own peril!" and strongly recommend that external
bitmaps are used only in conjunction with a management interface.

That's about all of the thoughts I have on the matter currently.
Does anybody else have strong feelings on where we should go from here?

(A) Argue with Max and push for qcow2-as-container
(B) Use qcow2 for self-reference bitmaps only, use an external format
for formats that do not support .bitmap_load or .bitmap_store
(C) Forget about the qcow2 extension entirely, use only the new external
format
(D) Something else?

My vote is for (B), and if I can find a bit of consensus on that, we can
draft an internal-use specification for the file, but I am very wary of
how we will manage de-sync, or whether we will be able to manage it at
all. ("Your fault for touching this file when QEMU was not running.")

Thoughts?

--js

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-08 20:28       ` John Snow
@ 2015-10-08 20:56         ` Denis V. Lunev
  2015-10-09 18:14           ` [Qemu-devel] [PATCH " Max Reitz
  2015-10-09 17:07         ` [Qemu-devel] [PATCH " Max Reitz
  1 sibling, 1 reply; 53+ messages in thread
From: Denis V. Lunev @ 2015-10-08 20:56 UTC (permalink / raw)
  To: John Snow, Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-devel,
	Qemu-block
  Cc: kwolf, pbonzini, stefanha

On 10/08/2015 11:28 PM, John Snow wrote:
>
> On 10/07/2015 03:05 PM, Denis V. Lunev wrote:
>> On 10/07/2015 07:47 PM, Max Reitz wrote:
>>> On 05.09.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>>>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>>>> other drives (there may be qcow2 file with zero disk size but with
>>>> several dirty bitmaps for other drives).
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    docs/specs/qcow2.txt | 127
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>    1 file changed, 126 insertions(+), 1 deletion(-)
>>> Overall: I'm strongly against putting dirty bitmaps into qcow2 files, at
>>> least not as it is envisioned by this series.
>>>
>>>
>>> If you don't feel like reading why, and you'd rather read what I'd do if
>>> you really, really want to put them into qcow2, files, skip ahead until
>>> the "RANT OVER" line.
>>>
>>>
>>> The first indication of why that is the case is that this patch does not
>>> add any explanation to the qcow2 specification what these dirty bitmaps
>>> are. Therefore, there are basically just binary data that is given a
>>> name and dumped into a qcow2 file as if it were a tar file.
>>>
>>> One could argue that this is qemu and we know what dirty bitmaps are.
>>> But qcow2.txt is located in docs/specs/, not just in docs/. It is not an
>>> explanation, but a *specification*, and as such it should explain
>>> everything related to qcow2.
>>>
>>> As a side notice, we already have a binary data dump in qcow2 files, and
>>> that is the VM state. This is bad enough and if it would have been up to
>>> me, it would have never been there. That's because it's something only
>>> qemu can make use of, and not even different versions of qemu are
>>> compatible there, so it was (in my opinion) a pretty bad idea to put it
>>> into qcow2.
>>>
>>> So what this specification is definitely lacking is an explanation on
>>> how any independent program (i.e. *not qemu*) is to interpret the dirty
>>> bitmaps. I do believe this is possible, as opposed to the VM state. The
>>> VM state, nobody can do anything with it, it's even difficult for qemu
>>> itself sometimes.
>>>
>>> So let's imagine this specification would contain an explanation on what
>>> dirty bitmaps are and what they mean. Actually, now that I think about
>>> it, I cannot really imagine it, because I'm lacking that explanation.
>>> What do they mean? As far as I can see from the series, they actually
>>> don't mean anything. It's just a dump of data into a qcow2 file, and it
>>> can be any bitmap, be it associated with the file itself or not.
>>>
>>> This is further pointed to by your feature proposal "Allow qcow2 images
>>> without l1_table and other staff but only with dirty bitmaps with
>>> minimum overhead". There is a file format for exactly that, and it's
>>> called tar (yes, you are missing some metadata, but just add a JSON
>>> description file to the archive and you're done).
>>>
>>> By the way: I heard John briefly touch this in his talk at KVM Forum
>>> when he explained that this would make qcow2 files something like better
>>> tar files, and I didn't like the idea back then either. I was hoping
>>> that it would actually be differently, and was waiting for some
>>> discussion to appear, but I didn't notice this series, because it
>>> doesn't have "qcow2" in the cover letter's subject (and I wasn't CC'd,
>>> but I don't really see why I should have been, as I'm not mentioned in
>>> the MAINTAINERS file (what a lucky man I am!)). I only just noticed
>>> today when I saw a lone reply from John on qemu-block to a patch with a
>>> "qcow2:" prefix.
>>>
>>> So, what you are apparently planning to do is to dump dirty bitmaps into
>>> any available qcow2 file. If the image you are operating on is a qcow2
>>> file, great! If it isn't, you create some empty qcow2 file and dump the
>>> bitmaps there.
>>>
>>> Then, I'm asking myself why you don't use tar files in the second case,
>>> and then, why you don't use tar files in the first case. I do remember
>>> John saying that there was a dicussion about it, but I don't know about
>>> it, so I don't know why you dropped that idea in favor of making qcow2
>>> files tar archives. The only reason I can think of off the top of my
>>> head is that we have infrastructure for reading qcow2 files, but not for
>>> tar files. However, this series is like just appending a tar file to a
>>> qcow2 file, and then implementing a reader for tar archives inside of
>>> the qcow2 driver, so it doesn't seem to be much simpler in practice.
>>>
>>> In any case, if my assumptions so far are more or less correct, no
>>> outside program can do anything with the dirty bitmaps contained in the
>>> qcow2 file, because they are just binary data which does not necessarily
>>> have any connection to the qcow2 file itself. Not even qemu can make
>>> sense of them, it appears, it needs the user or the management tool to
>>> do so.
>>>
>>> I am strongly against putting binary data into a qcow2 file which does
>>> not have any visible connection to the file's contents.
>>>
>>> Obviously, it is possible that there is some connection which I am just
>>> not seeing, though.
>>>
>>>
>>>
>>> --- RANT OVER ---
>>>
>>> Okay, that was enough destructive criticism, now to get some
>>> constructive arguments and ideas.
>>>
>>> So, there are two points I don't like: First, it's binary data which
>>> isn't explained in the qcow2 specification. This can easily be fixed.
>>>
>>> Second, there is no obvious connection between the qcow2 file and a
>>> dirty bitmap. I'd drop the idea of "If you use anything else than qcow2,
>>> we create an empty qcow2 file and put the dirty bitmaps there". Please
>>> don't do that. If you are using something else and want this feature,
>>> that's your problem. If you need features, you use qcow2. That's it. If
>>> you really want to support it for other file formats, put the data into
>>> tar archives and not into qcow2 files.
>>>
>>> For comparison, this is like using a qcow2 file for implementing backing
>>> files for raw images. The cluster offsets in L2 tables would then point
>>> to offsets in the raw image (and the host offset would have to match the
>>> guest offset), and by looking at which L2 table entries are unused, one
>>> could deduce which sectors are to be read from the backing file. We
>>> don't support that either, because you should just use qcow2 if you want
>>> backing files.
>>>
>>> Next we need to know for every dirty bitmap what the reference disk is.
>>> Since generally that reference disk is stored in some image file
>>> somewhere, I'd add a filename for each of the dirty bitmaps which is the
>>> base file in respect to which these clusters are considered dirty.
>>>
>>> As a measurement on how well you have done to associate a dirty bitmap
>>> with a qcow2 file, imagine the following scenario: You are writing a
>>> program independent of qemu, and that program is to make use of the
>>> dirty bitmaps for incremental backups.
>>>
>>> With my proposal above, it would open the qcow2 file and pick some
>>> bitmap based on name, base image, user choice or maybe some property of
>>> the bitmap itself (e.g. lowest dirty bit count). Then, it would create a
>>> new overlay file (the backup image), let's say a qcow2 file, and use the
>>> base image filename of the selected dirty bitmap as the filename of the
>>> backing file for the backup image. Then, it would copy all dirty
>>> clusters from the original qcow2 file to the backup image, and that's it.
>>>
>>> Right now, with this patch alone, the tool has no idea what the base
>>> image is, and some bitmaps may not even be related to the very qcow2
>>> file they are in at all.
>>>
>>>
>>> With that fixed, I could be moved to accept the concept of dirty bitmaps
>>> in qcow2 files grudgingly. Maybe happily, if you give me a good reason
>>> why we should not put them into tar files.
>>>
>>>
>>> And I have some other comments in regards to the specification:
>>>
>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>> index 121dfc8..5fc0365 100644
>>>> --- a/docs/specs/qcow2.txt
>>>> +++ b/docs/specs/qcow2.txt
>>>> @@ -103,7 +103,13 @@ in the description of a field.
>>>>                        write to an image with unknown auto-clear
>>>> features if it
>>>>                        clears the respective bits from this field first.
>>>>    -                    Bits 0-63:  Reserved (set to 0)
>>>> +                    Bit 0:      Dirty bitmaps bit. If this bit is
>>>> set then
>>>> +                                there is a _consistent_ Dirty
>>>> bitmaps extension
>>>> +                                in the image. If it is not set, but
>>>> there is a
>>>> +                                Dirty bitmaps extension, its data
>>>> should be
>>>> +                                considered as inconsistent.
>>>> +
>>>> +                    Bits 1-63:  Reserved (set to 0)
>>>>               96 -  99:  refcount_order
>>>>                        Describes the width of a reference count block
>>>> entry (width
>>>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like
>>>> the following:
>>>>                            0x00000000 - End of the header extension area
>>>>                            0xE2792ACA - Backing file format name
>>>>                            0x6803f857 - Feature name table
>>>> +                        0x23852875 - Dirty bitmaps
>>>>                            other      - Unknown header extension, can
>>>> be safely
>>>>                                         ignored
>>>>    @@ -166,6 +173,24 @@ the header extension data. Each entry look
>>>> like this:
>>>>                        terminated if it has full length)
>>>>      +== Dirty bitmaps ==
>>>> +
>>>> +Dirty bitmaps is an optional header extension. It provides an
>>>> ability to store
>>>> +dirty bitmaps in a qcow2 image. The fields are:
>>>> +
>>>> +          0 -  3:  nb_dirty_bitmaps
>>>> +                   The number of dirty bitmaps contained in the
>>>> image. Valid
>>>> +                   values: 0 - 65535.
>>> Why? Because that's what qemu supports? That's not a real reason. If so,
>>> you may make a note of that (see the cluster_bits documentation), or
>>> just omit it; for years, qemu only supported refcount_order = 4, but the
>>> specification did not make a note of that. It was just a limitation of
>>> qemu, but not of the format.
>>>
>>>> +
>>>> +          4 -  7:  dirty_bitmap_directory_size
>>>> +                   Size of the Dirty Bitmap Directory in bytes.
>>>> Valid values:
>>>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
>>> Same here.
>>>
>>>> +
>>>> +          8 - 15:  dirty_bitmap_directory_offset
>>>> +                   Offset into the image file at which the Dirty Bitmap
>>>> +                   Directory starts. Must be aligned to a cluster
>>>> boundary.
>>>> +
>>>> +
>>>>    == Host cluster management ==
>>>>      qcow2 manages the allocation of host clusters by maintaining a
>>>> reference count
>>>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>>>              variable:   Padding to round up the snapshot table entry
>>>> size to the
>>>>                        next multiple of 8.
>>>> +
>>>> +
>>>> +== Dirty bitmaps ==
>>>> +
>>>> +The feature supports storing dirty bitmaps in a qcow2 image.
>>> I think I've made my point clear enough in the huge wall of text above,
>>> but I'll just repeat it once more: This should explain what dirty
>>> bitmaps are and how they are to be interpreted.
>>>
>>>> +
>>>> +=== Cluster mapping ===
>>>> +
>>>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>>>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
>>>> +
>>>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
>>>> +Directory Entry) and may use multiple clusters, however it must be
>>>> contiguous
>>>> +in the image file.
>>>> +
>>>> +Given an offset (in bytes) into the bitmap, the offset into the
>>>> image file can
>>>> +be obtained as follows:
>>>> +
>>>> +    byte_offset =
>>>> +        dirty_bitmap_table[offset / cluster_size] + (offset %
>>>> cluster_size)
>>>> +
>>>> +Taking into account the granularity of the bitmap, an offset in bits
>>>> into the
>>>> +image file can be obtained like this:
>>>> +
>>>> +    bit_offset =
>>>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr /
>>>> granularity) % 8
>>>> +
>>>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is
>>>> covered by
>>>> +"physical" bit with number (bit_nr / granularity).
>>>> +
>>>> +Dirty Bitmap Table entry:
>>>> +
>>>> +    Bit  0 -  8:    Reserved
>>>> +
>>>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be
>>>> aligned to a
>>>> +                    cluster boundary. If the offset is 0, the
>>>> cluster is
>>>> +                    unallocated, and should be read as all zeros.
>>>> +
>>>> +        56 - 63:    Reserved
>>>> +
>>>> +=== Dirty Bitmap Directory ===
>>>> +
>>>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
>>>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the
>>>> image file,
>>>> +whose starting offset and length are given by the header extension
>>>> fields
>>>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The
>>>> entries of
>>>> +the bitmap directory have variable length, depending on the length
>>>> of the
>>>> +bitmap name.
>>>> +
>>>> +Dirty Bitmap Directory Entry:
>>>> +
>>>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>>>> +                    Offset into the image file at which the Dirty
>>>> Bitmap Table
>>>> +                    for the bitmap starts. Must be aligned to a cluster
>>>> +                    boundary.
>>>> +
>>>> +         8 - 15:    nb_virtual_bits
>>>> +                    Number of "virtual" bits in the bitmap. Number of
>>>> +                    "physical" bits would be:
>>>> +                    (nb_virtual_bits + granularity - 1) / granularity
>>>> +
>>>> +        16 - 19:    dirty_bitmap_table_size
>>>> +                    Number of entries in the Dirty Bitmap Table of
>>>> the bitmap.
>>>> +                    Valid values: 0 - 0x8000000.
>>>> +                    Also, (dirty_bitmap_table_size * cluster_size)
>>>> should not
>>>> +                    be greater than 0x20000000 (512 MB)
>>> Again, is this a qemu limitation or is there another reason? Also, you
>>> should decide between the two limitations. The second one automatically
>>> limits the number of values to 0 - 1048575 at maximum (512 byte
>>> clusters).
>>>
>>>> +
>>>> +        20 - 23:    granularity_bits
>>>> +                    Granularity bits. Valid values are: 0 - 63.
>>>> +
>>>> +                    Granularity is calculated as
>>>> +                        granularity = 1 << granularity_bits
>>>> +
>>>> +                    Granularity of the bitmap is how many "virtual"
>>>> bits
>>>> +                    accounts for one "physical" bit.
>>>> +
>>>> +        24 - 27:    flags
>>>> +                    Bit
>>>> +                      0: in_use
>>>> +                         The bitmap is in use and may be inconsistent.
>>> What does "in use" mean? You are not supposed to use a qcow2 file which
>>> is in use by qemu anyway.
>>>
>>>> +
>>>> +                      1: self
>>>> +                         The bitmap is a dirty bitmap for the
>>>> containing image.
>>> As I said, I don't see why we should support this ever being not set, so
>>> I am very much in favor of dropping this.
>>>
>>>> +
>>>> +                      2: auto
>>>> +                         The bitmap should be autoloaded as block
>>>> dirty bitmap.
>>>> +                         Only available if bit 1 (self) is set.
>>> The phrasing is too qemu-specific. Remember that this is not an
>>> explanation for how qemu is to interpret qcow2 files, but a
>>> *specification* of qcow2 files for *any* tool.
>>>
>>> So if I understand the intention behind this flag, a more general
>>> expression would be "The default bitmap". Then it is qemu's decision to
>>> always auto-load this default bitmap.
>>>
>>>> +
>>>> +                      3: read_only
>>>> +                         The bitmap should not be rewritten.
>>>> +
>>>> +                    Bits 4 - 31 are reserved.
>>>> +
>>>> +        28 - 29:    name_size
>>>> +                    Size of the bitmap name. Valid values: 0 - 1023.
>>>> +
>>>> +        variable:   The name of the bitmap (not null terminated).
>>>> +
>>>> +        variable:   Padding to round up the Dirty Bitmap Directory
>>>> Entry size to
>>>> +                    the next multiple of 8.
>>>>
>>> The interesting thing is that I have written a huge wall of text above
>>> and all my comments (except for "just put it into tar") can be addressed
>>> relatively easy. Just add documentation for what dirty bitmaps are, and
>>> a "variable: base_filename" field here, and that would be it.
>>>
>>> But there is a reason why I'm keeping the wall of text there: I feel
>>> like while these are very minor changes, they are fundamental design
>>> differences. Without these changes, you just add a binary data dump
>>> extension to qcow2, which is of no use to anyone but qemu (and not even
>>> qemu alone, it needs the user or a management tool to tell it what to do
>>> with it, unless the @auto flag is set).
>>>
>>> With these changes, it suddenly actually becomes an integral part of the
>>> qcow2 file which can be interpreted and used in a meaningful way by
>>> tools other than qemu itself.
>>>
>>> Max
>>>
>> This is actually not a very big deal from my point of view if it would
>> bring us into agreement and allow us to proceed further. The bitmap will
>> be available in QCOW2; a Parallels image can also have a bitmap inside,
>> though there is no code for that on the QEMU side yet.
>>
>> This would be enough for me for a while.
>>
>> Thus the question for John is whether the "bitmap in separate file"
>> feature is really necessary. This is mostly an API question.
>>
>> Den
> (NB: I never got Max's original reply, so this reply is more to Max than
> to Denis or Vladimir.)
>
> I'll see your wall of text and raise you my own wall of text...
>
> We consider the ability to use persistent bitmaps to create incremental
> backups for non-qcow2 images to be a necessary and vital component of
> a complete incremental backup solution, especially considering the core
> mechanism of the feature does not really rely on qcow2 for anything
> outside of convenience (e.g. backing files.)
>
> Vladimir's initial proposal of using .qcow2 to store the bitmap sounded
> good to me, because I wanted to be able to store the bitmap in a qcow2
> file anyway, and expanding the specification to allow it to store
> /arbitrary/ bitmaps seemed like a natural fit to accomplish both goals
> with a minimum of coding.
>
> However, Max has raised some pretty good points here -- allow me to
> paraphrase his Wall Of Text™:
>
> - Since this patch is a modification of the qcow2 /specification/ which
> is used by more than just QEMU, we must take care to avoid QEMU-isms
> limitations and design a more universal approach to the specification
> addendum.
>
> - Specific caps on the number of bitmaps, the granularity of said
> bitmaps, and the resultant size of said bitmaps should be addressed in
> terms of the spec, not in terms of what's necessarily convenient or
> sufficient for QEMU. I think this point will be easy to address with
> some better spec wording.
>
> - The bitmap language in the spec is generic and doesn't refer much to
> anything. This is partially my fault, as I believe I likely guided
> Vladimir towards using generic language that was tied more to the
> HBitmap format than towards our specific implementation
> (BdrvDirtyBitmap.) I recognize this as a bit of a misguided effort on my
> part to keep things "generic," but what I succeeded in doing was keeping
> it "useless" outside of QEMU. Example: "number of virtual bits" is
> meaningless, but "number of 512 byte sectors" is not.
>
> The last bit is the crux of our problem and the most deserving of our
> attention:
>
> - For a bitmap to be useful to an application outside of QEMU, all of
> the necessary information for interpreting that bitmap must either be
> present within the file or referenced. For bitmaps that describe the
> file they are stored in, this is trivial with some specification editing.
>
> For bitmaps stored for /other/ files, this gets... trickier. What is
> this a bitmap for? What does it describe? What data does it describe?
>
> Node-names and drive names here are useless outside of QEMU and can of
> course change between QEMU invocations or be shared between different
> QEMU instances, so this is useless ...
>
> We could store filenames, but networked devices and distributed
> filesystems may have interesting relative pathnames that will not remain
> reliable once the .qcow2 file is shuffled around or migrated, so storing
> path-name references seems like a losing battle here, too. Maybe we only
> have a file descriptor and no name at all -- what do we write for the
> "global identifier that uniquely identifies the data we belong to"? Is
> it even possible?
>
> The only conclusion I can reach here is that storing bitmaps inside of a
> .qcow2 that remain meaningful to external applications is not going to
> be easily possible.
>
> Perhaps we need to abandon the idea that we can store any bitmap we want
> into a .qcow2.
>
> However, I'm still a big fan of storing bitmaps that describe the data
> they go alongside in the same .qcow2 as a convenience feature --
> especially now since Vladimir has done the hard work for us all writing
> the feature.
>
> For simple use cases in non-managed environments the use case for
> storing the bitmap inside the qcow2 it describes is pretty compelling:
>
> - No extra files to track or manage
> - The command line used to boot QEMU the first time can be used to boot
> QEMU subsequently, and we get the persistent bitmap automatically
> without further modification.
> - Migration across a shared medium using .qcow2 files is trivial
> - Backups managed by qcow2-unaware applications trivially bring along
> our persistent bitmap data for us without additional configuration.
>
> It's simple, the data is meaningful to external applications, and we've
> got most of the code we need already, thanks to Vladimir.
>
> Sadly, we still need a way to store bitmap data for files that do not
> offer .bitmap_load and .bitmap_store primitives for us.
>
> Presumably, if we devise our own "generic bitmap container" format, we
> don't have to store things like node names, filenames, etc in this
> container and we can use it to just store (name, granularity, size,
> [data]) like we were trying to do in qcow2.
>
> Matching bitmap IDs up to the data they belong to becomes the
> responsibility of the user/management layer.
>
> Where this gets hairy, perhaps, is how to enforce that the drive data
> that belongs to this bitmap isn't modified without our say-so? How do we
> detect de-sync? The quick answer might be to store a hash alongside the
> bitmap, and upon being re-applied to the drive if the hash doesn't
> match, we throw an error/complain/etc -- but what about cases -- again
> -- where we don't necessarily have a file we can trivially hash, like a
> many-gigs-wide raw file being mounted over a networked file-system?
> "external bitmaps" appear to pose a very real desync risk.
>
> I suppose we would have had that problem anyway with the
> qcow2-as-container idea.
>
> Perhaps the best we can say is "This is kind of a dangerous feature to
> use, use at your own peril!" and strongly recommend that external
> bitmaps are used only in conjunction with a management interface.
>
> That's about all of the thoughts I have on the matter currently.
> Does anybody else have strong feelings on where we should go from here?
>
> (A) Argue with Max and push for qcow2-as-container
> (B) Use qcow2 for self-reference bitmaps only, use an external format
> for formats that do not support .bitmap_load or .bitmap_store
> (C) Forget about the qcow2 extension entirely, use only the new external
> format
> (D) Something else?
>
> My vote is for (B), and if I can find a bit of consensus on that, we can
> draft an internal-use specification for the file, but I am very wary of
> how we will manage de-sync or if we will be able to manage it at all.
> ("Your fault for touching this file when QEMU was not running.")
>
> Thoughts?
>
> --js
The better way is (A) if it is possible at all, but we can follow (B)
if (A) is not possible.

At least we know what to do. Frankly speaking, the only format that we
sadly really need to support is the raw image, which does not have an
obvious container to keep the bitmap in.

Here are some arguments which may or may not be valuable to Max.

We have to have a bitmap inside the QCOW2 file for the reasons
listed above by John. They are really valuable. So far
we have been able to keep a lot of binary data inside the
image, and VM management has been really quite simple: we
just have to copy the image from one host to another. It seems
important to me to keep this feature rolling. Thus the bitmap
will stay inside the QCOW2 image.

Actually, everything beyond that is a matter of the external API:
whether or not we allow creating an image without data. End users
will do their best to trick us into saving a dirty bitmap
if bitmap-based backup becomes useful and they are
not using QCOW2.

Whether we try to stop a train at full speed with a sheet of paper,
that is, whether we prevent the creation of such files, is a real question.
Any additional external format will cost us a LOT of
effort, and I do not see a volunteer who will do this job;
this is the unfortunate side of things.

From my side, I am really uncomfortable with dropping the work
done by Vladimir, for a lot of reasons, and one of
them is the time frame. We have already spent around 9 months
of work to get here. I am feeling like a father :)

Max, do you have the force with you to drive the creation of this
new format?

Anyway, we have all written several really lengthy letters.
Maybe it would be wise to discuss things verbally somehow?

Den


* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-08 20:28       ` John Snow
  2015-10-08 20:56         ` Denis V. Lunev
@ 2015-10-09 17:07         ` Max Reitz
  2015-10-09 20:14           ` [Qemu-devel] [Qemu-block] " Eric Blake
  1 sibling, 1 reply; 53+ messages in thread
From: Max Reitz @ 2015-10-09 17:07 UTC (permalink / raw)
  To: John Snow, Denis V. Lunev, Vladimir Sementsov-Ogievskiy,
	qemu-devel, Qemu-block
  Cc: kwolf, pbonzini, stefanha


On 08.10.2015 22:28, John Snow wrote:

[...]

> (NB: I never got Max's original reply, so this reply is more to Max than
> to Denis or Vladimir.)

Let's hope you'll see this one, then. :-)

> I'll see your wall of text and raise you my own wall of text...
> 
> We consider the ability to use persistent bitmaps to create incremental
> backups for non-qcow2 images to be a necessary and vital component of
> a complete incremental backup solution, especially considering the core
> mechanism of the feature does not really rely on qcow2 for anything
> outside of convenience (e.g. backing files.)
> 
> Vladimir's initial proposal of using .qcow2 to store the bitmap sounded
> good to me, because I wanted to be able to store the bitmap in a qcow2
> file anyway, and expanding the specification to allow it to store
> /arbitrary/ bitmaps seemed like a natural fit to accomplish both goals
> with a minimum of coding.
> 
> However, Max has raised some pretty good points here -- allow me to
> paraphrase his Wall Of Text™:
> 
> - Since this patch is a modification of the qcow2 /specification/ which
> is used by more than just QEMU, we must take care to avoid QEMU-isms
> limitations and design a more universal approach to the specification
> addendum.
> 
> - Specific caps on the number of bitmaps, the granularity of said
> bitmaps, and the resultant size of said bitmaps should be addressed in
> terms of the spec, not in terms of what's necessarily convenient or
> sufficient for QEMU. I think this point will be easy to address with
> some better spec wording.
> 
> - The bitmap language in the spec is generic and doesn't refer much to
> anything. This is partially my fault, as I believe I likely guided
> Vladimir towards using generic language that was tied more to the
> HBitmap format than towards our specific implementation
> (BdrvDirtyBitmap.) I recognize this as a bit of a misguided effort on my
> part to keep things "generic," but what I succeeded in doing was keeping
> it "useless" outside of QEMU. Example: "number of virtual bits" is
> meaningless, but "number of 512 byte sectors" is not.

And these are all things that are only a question of the implementation,
so to speak. While it may not be easy writing up the necessary bits for
the specification, I don't think there'll be much discussion on it.

Except maybe the last bit, because "512 byte sector" basically is
meaningless when talking about a qcow2 file (which works in terms of
clusters), but that's where the second part comes in:

> The last bit is the crux of our problem and the most deserving of our
> attention:
> 
> - For a bitmap to be useful to an application outside of QEMU, all of
> the necessary information for interpreting that bitmap must either be
> present within the file or referenced. For bitmaps that describe the
> file they are stored in, this is trivial with some specification editing.
> 
> For bitmaps stored for /other/ files, this gets... trickier. What is
> this a bitmap for? What does it describe? What data does it describe?
> 
> Node-names and drive names here are useless outside of QEMU and can of
> course change between QEMU invocations or be shared between different
> QEMU instances, so this is useless ...
> 
> We could store filenames, but networked devices and distributed
> filesystems may have interesting relative pathnames that will not remain
> reliable once the .qcow2 file is shuffled around or migrated, so storing
> path-name references seems like a losing battle here, too. Maybe we only
> have a file descriptor and no name at all -- what do we write for the
> "global identifier that uniquely identifies the data we belong to"? Is
> it even possible?

I'd be fine with filenames. It works reasonably well for backing files,
and it's basically the same problem there.

Anyway, even if you could describe the image the dirty bitmap is for,
I'd still oppose putting all that into qcow2. Imagine you're writing a
qcow2 interpreting tool and reading the specification, then:

“This field contains the filename of the image this dirty bitmap is for.
This field contains the filename of the clean image. This field contains
the resolution of the dirty bitmap in units of 512 bytes.”

While this may make sense from the perspective of qemu, it doesn't make
any sense from the perspective of qcow2. As said tool writer, you'd be
asking yourself: “OK, so this information is completely useless because
it says nothing about the qcow2 file itself? Actually, it doesn't even
have any connection to this file.”

It's actually not better than a binary data dump without any information
on how to interpret it, then. Because you cannot interpret it, even
though you know how to; if nothing else, that's because you're writing a
qcow2 tool and the other image is very unlikely to be a qcow2 image as well.

> The only conclusion I can reach here is that storing bitmaps inside of a
> .qcow2 that remain meaningful to external applications is not going to
> be easily possible.
> 
> Perhaps we need to abandon the idea that we can store any bitmap we want
> into a .qcow2.
> 
> However, I'm still a big fan of storing bitmaps that describe the data
> they go alongside in the same .qcow2 as a convenience feature --

I'd be fine with that.

> especially now since Vladimir has done the hard work for us all writing
> the feature.

It's just that if we need a new format for all the other image formats
anyway, from the effort side of things, having a special implementation
for qcow2 won't make the implementation any easier, even if we already
have it.

> For simple use cases in non-managed environments the use case for
> storing the bitmap inside the qcow2 it describes is pretty compelling:
> 
> - No extra files to track or manage
> - The command line used to boot QEMU the first time can be used to boot
> QEMU subsequently, and we get the persistent bitmap automatically
> without further modification.
> - Migration across a shared medium using .qcow2 files is trivial
> - Backups managed by qcow2-unaware applications trivially bring along
> our persistent bitmap data for us without additional configuration.

OK, those points look good enough to justify making qcow2 a special case.

Even though I'm still not really convinced in regards to the command
line, because I still think it's a management tool level feature.

> It's simple, the data is meaningful to external applications, and we've
> got most of the code we need already, thanks to Vladimir.
> 
> Sadly, we still need a way to store bitmap data for files that do not
> offer .bitmap_load and .bitmap_store primitives for us.
> 
> Presumably, if we devise our own "generic bitmap container" format, we
> don't have to store things like node names, filenames, etc in this
> container and we can use it to just store (name, granularity, size,
> [data]) like we were trying to do in qcow2.
> 
> Matching bitmap IDs up to the data they belong to becomes the
> responsibility of the user/management layer.

Yep.

> Where this gets hairy, perhaps, is how to enforce that the drive data
> that belongs to this bitmap isn't modified without our say-so? How do we
> detect de-sync? The quick answer might be to store a hash alongside the
> bitmap, and upon being re-applied to the drive if the hash doesn't
> match, we throw an error/complain/etc -- but what about cases -- again
> -- where we don't necessarily have a file we can trivially hash, like a
> many-gigs-wide raw file being mounted over a networked file-system?
> "external bitmaps" appear to pose a very real desync risk.
> 
> I suppose we would have had that problem anyway with the
> qcow2-as-container idea.

Exactly. But I'd say we can worry about that later. If we have something
like a JSON description file inside of the container, we can always add
timestamps or hashes later on.
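
A minimal sketch of what such a description could look like, assuming a
hypothetical JSON layout (the field names `name`, `granularity`, `size` and
`image_sha256` are made up for illustration; nothing in this thread defines
them). It records the (name, granularity, size) tuple discussed above and
shows how an optional hash could later be used to detect de-sync:

```python
import hashlib
import json


def describe_bitmap(name, granularity, size, image_bytes=None):
    """Build a description record for one externally stored bitmap.

    All field names are hypothetical. The hash is optional so that it can
    be omitted when the image is too large or cannot be hashed at all.
    """
    desc = {"name": name, "granularity": granularity, "size": size}
    if image_bytes is not None:
        desc["image_sha256"] = hashlib.sha256(image_bytes).hexdigest()
    return desc


def check_sync(desc, image_bytes):
    """Return True/False if a hash was recorded, None if nothing can be said."""
    recorded = desc.get("image_sha256")
    if recorded is None:
        return None
    return recorded == hashlib.sha256(image_bytes).hexdigest()


image = b"\x00" * 4096
desc = describe_bitmap("bitmap0", granularity=65536, size=4096, image_bytes=image)
serialized = json.dumps(desc, sort_keys=True)  # what the container would store
restored = json.loads(serialized)

print(check_sync(restored, image))                 # image left untouched
print(check_sync(restored, b"\x01" + image[1:]))   # image modified externally
```

Hashing the whole image is only practical for small files; for large or
networked images a timestamp or a generation counter in the same record
might be the cheaper choice, at the cost of weaker guarantees.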

> Perhaps the best we can say is "This is kind of a dangerous feature to
> use, use at your own peril!" and strongly recommend that external
> bitmaps are used only in conjunction with a management interface.

Yes, that's what I'd do.

> That's about all of the thoughts I have on the matter currently.
> Does anybody else have strong feelings on where we should go from here?
> 
> (A) Argue with Max and push for qcow2-as-container

:-)

> (B) Use qcow2 for self-reference bitmaps only, use an external format
> for formats that do not support .bitmap_load or .bitmap_store
> (C) Forget about the qcow2 extension entirely, use only the new external
> format
> (D) Something else?
> 
> My vote is for (B),

Sounds good to me.

>                     and if I can find a bit of consensus on that, we can
> draft an internal-use specification for the file, but I am very wary

“very wary”, yes, I noticed. Because I am very wary of word combinations
like that.

>                                                                      of
> how we will manage de-sync or if we will be able to manage it at all.
> ("Your fault for touching this file when QEMU was not running.")

Well, yes, the problem is different from the "Your fault for touching
a qcow2 file when qemu was using it", but I mean, what would the use
case be?

If you're not using a management tool, well, then it *is* your fault.

If you are using a management tool, then that means you are using
incremental backups and from time to time you are writing data to the
image from outside qemu and expect it to be caught by your management
tool automatically. Sounds like "your fault" to me, too.

> Thoughts?

All above. :-)

> --js

Max




* Re: [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-08 20:56         ` Denis V. Lunev
@ 2015-10-09 18:14           ` Max Reitz
  0 siblings, 0 replies; 53+ messages in thread
From: Max Reitz @ 2015-10-09 18:14 UTC (permalink / raw)
  To: Denis V. Lunev, John Snow, Vladimir Sementsov-Ogievskiy,
	qemu-devel, Qemu-block
  Cc: kwolf, pbonzini, stefanha


On 08.10.2015 22:56, Denis V. Lunev wrote:
> On 10/08/2015 11:28 PM, John Snow wrote:

[...]

>> That's about all of the thoughts I have on the matter currently.
>> Does anybody else have strong feelings on where we should go from here?
>>
>> (A) Argue with Max and push for qcow2-as-container
>> (B) Use qcow2 for self-reference bitmaps only, use an external format
>> for formats that do not support .bitmap_load or .bitmap_store
>> (C) Forget about the qcow2 extension entirely, use only the new external
>> format
>> (D) Something else?
>>
>> My vote is for (B), and if I can find a bit of consensus on that, we can
>> draft an internal-use specification for the file, but I am very wary of
>> how we will manage de-sync or if we will be able to manage it at all.
>> ("Your fault for touching this file when QEMU was not running.")
>>
>> Thoughts?
>>
>> --js
> the better way is (A) if possible at all but we can follow (B)
> if (A) is not possible at all.

Well, arguing and pushing is always possible, the question is whether
it's more effort and whether it'll succeed.

Kevin has not participated in this discussion yet, but I seem to
remember that he wasn't happy about the specific wording “using qcow2 as
tar” either. Him being the qcow2 maintainer means that he'll have the
final say.

> At least we know what to do. Frankly speaking, the only format
> that we sadly really need to support is the raw image, which does
> not have an obvious container to keep the bitmap.

I'm not opposed to only supporting qcow2, you and John are. :-)

> Here are some arguments which could be valuable or may
> be not valuable to Max.

Any honest argument is valuable.

> We have to have a bitmap inside the QCOW2 file for the reasons
> listed above by John. They are really valuable. For the time
> being we were able to keep a lot of binary data inside the
> image and VM management was really quite simple. We have
> just to copy the image from one host to another. It seems
> important to me to keep this feature rolling. Thus the bitmap
> will stay inside QCOW2 image.

As long as the bitmap directly relates to the qcow2 file, i.e. its
entries describe the dirtiness of clusters in the qcow2 file, all good.

> Actually all later things are a matter of external API. Would
> we allow creating an image without data or not? End-users
> will try to fake us with all their brains to save a dirty bitmap
> if bitmap-based backup becomes useful and if they do
> not use QCOW2.

Well, the first argument would be “use qcow2”. That was the topic of
Kevin's and my talk at KVM Forum: If you need features only qcow2
provides, you should just use qcow2.

There is not a significant performance difference between preallocated
qcow2 and raw images. Thus, if you want raw performance and need dirty
bitmaps, you can just create a preallocation=metadata qcow2 image and
that's it.

Other than that: What do you mean by “fake us with all their brains”? It
just wouldn't be possible if we didn't allow foreign bitmaps in qcow2
files and didn't have an external container format for non-qcow2 files.

Maybe they can try to press us with all their might, but first I'd still
consider the “use preallocated qcow2 images” to be sufficient, and
second, well, if they press hard enough, that'd be enough to devise an
external container format.

> Are we stopping the train at full speed using a sheet of paper
> or not, preventing to create such files or not is a real question.
> Any other extra different external format will cost us a LOT of
> efforts

Will they really? The greatest facilitation I can see one gains by
putting it all into qcow2 is that you can rely on features already
present in the block layer.

The implementation would be ugly, but other than that I fail to see a
problem right now. It'd be ugly because there's no infrastructure we can
use to put a general tar driver in; so we'd have to fuse the tar driver
with the bitmap storage driver. But even that doesn't seem too bad to me.

>         and I do not see a volunteer who will perform this job
> and this is an unfortunate side of things.

I'd probably take up the challenge if I didn't have close to a hundred
patches in flight on qemu-devel, and if I weren't working part time.

I think it would be fine to implement the feature for qcow2 only for
now. Later, we can do one of three things:

(1) Add the “proprietary” external container format for non-qcow2
    images.

(2) Add foreign bitmaps to qcow2 as envisioned in this series.

(3) Make qcow2 that proprietary external container format. That means,
    these files would technically be qcow2 files, but we would modify
    them in a way that they cannot be opened by programs other than qemu
    (e.g. make the version number -1). These files would not contain an
    L1 table, but only bitmap data for foreign images.

Way (3) would be a compromise I'd still oppose but not as strongly as
way (2).

Way (2) is abuse without acknowledging it. Way (3) at least shows that
we know that what we're doing isn't right and that no program other than
qemu can work with these files, even though qcow2 is supposed to be an
open format.

> From my side I am really uncomfortable to drop the work
> performed by Vladimir for a lot of reasons

There is no need to drop it. For qcow2 it's fine. The question remains
whether it is for other formats, and I don't think it is as it is.

>                                            and one of
> them is time frame. We have already spent around 9 months
> of work to get here. I am feeling like a father :)

I can relate, a lot of the patches I have on qemu-devel I have worked on
for close to a year. But still, if I don't like it I won't say I do just
because it has been a lot of work.

> Max, do you have the force with you to drive creation of this
> new format stuff?

Yes, I do. I'm so opposed to dumping binary blobs into regular qcow2
files that if need be I'd put all other stuff I have going on on hold
and implement it myself.

That is, unless e.g. Kevin is completely fine with it. He's the
maintainer, he has to decide.

> Anyway, we all have written several really lengthy letters.
> Maybe it would be wise to discuss things verbally somehow?

I don't know. We have written lengthy letters because I didn't notice
the initial discussion, I suppose. That's why I felt I had to make my
point very clear why I'm opposing at a point when a consensus between
some people had already been reached and patches existed.

If you feel like a verbal discussion (IRC? Phone?) might be better than
just emails, I'm in. But I feel it will mostly be me saying “I don't
want it” and you saying “But it's been so long and it's the simplest way
to implement it”.

To me, it looks like we acknowledge each other's arguments, but we weigh
them differently. I think a qcow2 file which we pretend is a standard
qcow2 file but of no use to anyone but qemu is not a qcow2 file, and
that this therefore is nothing we can reasonably add to the
specification. I don't think having the code for it and having worked on
it for a long time is a very strong argument.*

You on the other hand do consider the effort and time already spent a
very strong argument, but don't consider keeping qcow2 open as strong as me.

I don't know whether a verbal discussion will help us with that.
Ideally, we want to find a compromise with which both of us are happy,
of course. Unideally, Kevin will just pick one because he's the maintainer.


* While I do accept some blame for not noticing the discussion, I will
not accept “You did not notice, the code is here, so it's your problem
now.” ;-)

git blame would have told you that I do some work on qcow2, so inviting
me to the discussion early might have been a good idea. You probably did
the right thing by inviting Kevin, because he's done the most work, he's
the maintainer, and he's done most of the qcow2 v3 design, as far as I
know. But the thing is that he gets so much mail that he often doesn't
notice some discussions until late.

Anyway, that wouldn't have been a problem if you had put “qcow2” in the
subject of the cover letter. I personally skip through qemu-devel and
qemu-block based on the subject of the cover letter, and if it doesn't
seem directly relevant to me, I'll mark it read and that's it.

So it's not my fault alone that I did not participate in discussions
leading up to this point, and this is why while I do have compassion for
patches which have been worked on for a long time (because I can
relate), I am not really willing to “go weak” on these patches.

Please don't take this as a “not my fault, you're to blame!”, though. :-)

I'm just saying that it's very unfortunate that we're only discussing
this now, and that it's not really anyone's fault.

(And I'm saying this so you know why I don't feel like “We've worked on
it for so long” is a strong point.)

Max



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
  2015-10-09 17:07         ` [Qemu-devel] [PATCH " Max Reitz
@ 2015-10-09 20:14           ` Eric Blake
  0 siblings, 0 replies; 53+ messages in thread
From: Eric Blake @ 2015-10-09 20:14 UTC (permalink / raw)
  To: Max Reitz, John Snow, Denis V. Lunev,
	Vladimir Sementsov-Ogievskiy, qemu-devel, Qemu-block
  Cc: kwolf, pbonzini, stefanha


On 10/09/2015 11:07 AM, Max Reitz wrote:

> 
> Except maybe the last bit, because "512 byte sector" basically is
> meaningless when talking about a qcow2 file (which works in terms of
> clusters),

At KVM Forum, Kevin was mentioning an idea of adding an incompatible
feature to qcow2 that would let it track per-sector dirty/zero/backing
information within a cluster (things would still be allocated by
cluster, but you could get fine-grained COW and other perks), by having
the feature bit turn on an alternative L2 table entry representation
that occupies more than 64 bits.  If that happens, then qcow2 would have
a bit more per-sector smarts in addition to its existing per-cluster
focus.  But not something that should hold up this discussion.

>> We could store filenames, but networked devices and distributed
>> filesystems may have interesting relative pathnames that will not remain
>> reliable once the .qcow2 file is shuffled around or migrated, so storing
>> path-name references seems like a losing battle here, too. Maybe we only
>> have a file descriptor and no name at all -- what do we write for the
>> "global identifier that uniquely identifies the data we belong to"? Is
>> it even possible?
> 
> I'd be fine with filenames. It works reasonably well for backing files,
> and it's basically the same problem there.

Filenames with the escape clause of json:{...} pseudo-filenames
(matching what we already allow for complex backing files) is fine by me.

>> Does anybody else have strong feelings on where we should go from here?
>>
>> (A) Argue with Max and push for qcow2-as-container
> 
> :-)
> 
>> (B) Use qcow2 for self-reference bitmaps only, use an external format
>> for formats that do not support .bitmap_load or .bitmap_store
>> (C) Forget about the qcow2 extension entirely, use only the new external
>> format
>> (D) Something else?
>>
>> My vote is for (B),
> 
> Sounds good to me.

I'm also leaning towards (B) at the moment, but could possibly still be
swayed by persuasive arguments.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts
  2015-10-06 20:16   ` John Snow
@ 2016-02-16 17:04     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-02-16 17:04 UTC (permalink / raw)
  To: John Snow, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den

On 06.10.2015 23:16, John Snow wrote:
>
> On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
>> Add data structures and constraints according to docs/specs/qcow2.txt
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/Makefile.objs        |  2 +-
>>   block/qcow2-dirty-bitmap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>   block/qcow2.h              | 28 ++++++++++++++++++++++++++++
>>   3 files changed, 71 insertions(+), 1 deletion(-)
>>   create mode 100644 block/qcow2-dirty-bitmap.c
>>
>> diff --git a/block/Makefile.objs b/block/Makefile.objs
>> index 58ef2ef..c6e1f4b 100644
>> --- a/block/Makefile.objs
>> +++ b/block/Makefile.objs
>> @@ -1,5 +1,5 @@
>>   block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
>> -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
>> +block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-dirty-bitmap.o
>>   block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>>   block-obj-y += qed-check.o
>>   block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
>> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
>> new file mode 100644
>> index 0000000..fd4e0ef
>> --- /dev/null
>> +++ b/block/qcow2-dirty-bitmap.c
>> @@ -0,0 +1,42 @@
>> +/*
>> + * Dirty bitmaps for the QCOW version 2 format
>> + *
>> + * Copyright (c) 2014-2015 Vladimir Sementsov-Ogievskiy
>> + *
>> + * This file is derived from qcow2-snapshot.c, original copyright:
>> + * Copyright (c) 2004-2006 Fabrice Bellard
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this software and associated documentation files (the "Software"), to deal
>> + * in the Software without restriction, including without limitation the rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +/* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
>> + * constants. Please do not use this _internal_ abbreviation for other needs
>> + * and/or outside of this file. */
>> +
>> +/* Dirty Bitmap Directory Entry constraints */
>> +#define DBM_MAX_TABLE_SIZE 0x8000000
>> +#define DBM_MAX_PHYS_SIZE 0x20000000 /* 512 MiB */
>> +#define DBM_MAX_GRANULARITY_BITS 63
>> +#define DBM_MAX_NAME_SIZE 1023
>> +
>> +/* Dirty Bitmap Directory Entry flags */
>> +#define DBM_RESERVED_FLAGS 0xffffffff
>> +
>> +/* bits [0, 8] U [56, 63] are reserved */
>> +#define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
>> diff --git a/block/qcow2.h b/block/qcow2.h
>> index 72e1328..a2a5d4a 100644
>> --- a/block/qcow2.h
>> +++ b/block/qcow2.h
>> @@ -52,6 +52,10 @@
>>    * space for snapshot names and IDs */
>>   #define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
>>   
>> +/* Dirty Bitmap Header Extension constraints */
>> +#define QCOW_MAX_DIRTY_BITMAPS 65536
>> +#define QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE (1024 * QCOW_MAX_DIRTY_BITMAPS)
>> +
>>   /* indicate that the refcount of the referenced cluster is exactly one. */
>>   #define QCOW_OFLAG_COPIED     (1ULL << 63)
>>   /* indicate that the cluster is compressed (they never have the copied flag) */
>> @@ -141,6 +145,19 @@ typedef struct QEMU_PACKED QCowSnapshotHeader {
>>       /* name follows  */
>>   } QCowSnapshotHeader;
>>   
>> +typedef struct QEMU_PACKED QCowDirtyBitmapHeader {
>> +    /* header is 8 byte aligned */
>> +    uint64_t dirty_bitmap_table_offset;
>> +    uint64_t nb_virtual_bits;
>> +
>> +    uint32_t dirty_bitmap_table_size;
>> +    uint32_t granularity_bits;
>> +
>> +    uint32_t flags;
> looks like DBM_RESERVED_FLAGS was meant to apply to this field, but I
> think the mask is incorrect and should be 0xfffffff0.

I've decided to keep the patches consistent, so DBM_RESERVED_FLAGS is 
changed in the following patches, where the flags are added.
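
A minimal sketch of how a reserved-flags mask relates to the defined flags. It assumes, as the suggested value 0xfffffff0 implies, that the low four bits are in use; the EX_* names are hypothetical and not taken from the series. With no flags defined, the same construction yields 0xffffffff, which is the value in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low four bits are defined flags, everything
 * else is reserved. These names are illustrative only. */
#define EX_DEFINED_FLAGS  0x0000000fu
#define EX_RESERVED_FLAGS (~EX_DEFINED_FLAGS & 0xffffffffu)  /* 0xfffffff0 */

/* Returns 1 if no reserved bit is set in flags, 0 otherwise. */
static int ex_flags_valid(uint32_t flags)
{
    return (flags & EX_RESERVED_FLAGS) == 0;
}
```

Deriving the reserved mask from the defined bits keeps the two from drifting apart when a later patch adds a flag.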

>
>> +    uint16_t name_size;
>> +    /* name follows  */
>> +} QCowDirtyBitmapHeader;
>> +
>>   typedef struct QEMU_PACKED QCowSnapshotExtraData {
>>       uint64_t vm_state_size_large;
>>       uint64_t disk_size;
>> @@ -159,6 +176,11 @@ typedef struct QCowSnapshot {
>>       uint64_t vm_clock_nsec;
>>   } QCowSnapshot;
>>   
>> +typedef struct QCowDirtyBitmap {
>> +    uint64_t offset;
>> +    char *name;
>> +} QCowDirtyBitmap;
>> +
>>   struct Qcow2Cache;
>>   typedef struct Qcow2Cache Qcow2Cache;
>>   
>> @@ -221,6 +243,12 @@ typedef uint64_t Qcow2GetRefcountFunc(const void *refcount_array,
>>   typedef void Qcow2SetRefcountFunc(void *refcount_array,
>>                                     uint64_t index, uint64_t value);
>>   
>> +typedef struct Qcow2DirtyBitmapHeaderExt {
>> +    uint32_t nb_dirty_bitmaps;
>> +    uint32_t dirty_bitmap_directory_size;
>> +    uint64_t dirty_bitmap_directory_offset;
>> +} QEMU_PACKED Qcow2DirtyBitmapHeaderExt;
>> +
>>   typedef struct BDRVQcowState {
>>       int cluster_bits;
>>       int cluster_size;
>>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory
  2015-10-06 21:27   ` John Snow
@ 2016-02-16 18:51     ` Vladimir Sementsov-Ogievskiy
  2016-02-17 15:03     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-02-16 18:51 UTC (permalink / raw)
  To: John Snow, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den

On 07.10.2015 00:27, John Snow wrote:
>
> On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
>> Adds qcow2_read_dirty_bitmaps, reading Dirty Bitmap Directory as
>> specified in docs/specs/qcow2.txt
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
>>   block/qcow2.h              |  10 +++
>>   2 files changed, 165 insertions(+)
>>
>> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
>> index fd4e0ef..1260d1d 100644
>> --- a/block/qcow2-dirty-bitmap.c
>> +++ b/block/qcow2-dirty-bitmap.c
>> @@ -25,6 +25,9 @@
>>    * THE SOFTWARE.
>>    */
>>   
>> +#include "block/block_int.h"
>> +#include "block/qcow2.h"
>> +
>>   /* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
>>    * constants. Please do not use this _internal_ abbreviation for other needs
>>    * and/or outside of this file. */
>> @@ -40,3 +43,155 @@
>>   
>>   /* bits [0, 8] U [56, 63] are reserved */
>>   #define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
>> +
>> +void qcow2_free_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
> BDRVQcow2State here and everywhere else in this patch, now.
>
>> +    int i;
>> +
>> +    for (i = 0; i < s->nb_dirty_bitmaps; i++) {
>> +        g_free(s->dirty_bitmaps[i].name);
>> +    }
>> +    g_free(s->dirty_bitmaps);
>> +    s->dirty_bitmaps = NULL;
>> +    s->nb_dirty_bitmaps = 0;
>> +
>> +    g_free(s->dirty_bitmap_directory);
>> +    s->dirty_bitmap_directory = NULL;
>> +}
>> +
>> +static void bitmap_header_to_cpu(QCowDirtyBitmapHeader *h)
>> +{
>> +    be64_to_cpus(&h->dirty_bitmap_table_offset);
>> +    be64_to_cpus(&h->nb_virtual_bits);
>> +    be32_to_cpus(&h->dirty_bitmap_table_size);
>> +    be32_to_cpus(&h->granularity_bits);
>> +    be32_to_cpus(&h->flags);
>> +    be16_to_cpus(&h->name_size);
> I realize you probably got these functions by example from the other
> qcow2 files, but what exactly is cpu*s* here? What does the *s* stand for?
>
> I guess it refers to the in-place swapping variants that the Linux

yes, in-place swapping

> kernel defines?
>
> hmm, just a curiosity on my part ...
>
> the function looks correct, anyway. :)
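
For readers with the same question, here is a sketch of what an in-place ("s"-suffixed) conversion does, written in portable C rather than with QEMU's own helpers. The ex_* names are hypothetical; the code only illustrates the idea of overwriting a big-endian field with its host-order value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit value from memory, independent of host order. */
static uint32_t ex_load_be32(const void *p)
{
    const uint8_t *b = p;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* In-place variant: *v holds big-endian bytes on entry and the host-order
 * value on return, which is the behaviour of a call like
 * be32_to_cpus(&h->flags) in the patch. */
static void ex_be32_to_cpus(uint32_t *v)
{
    *v = ex_load_be32(v);
}
```

The non-"s" form returns the converted value, while the "s" form takes a pointer and rewrites the field where it sits.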
>
>> +}
>> +
>> +static int calc_dir_entry_size(size_t name_size)
>> +{
>> +    return align_offset(sizeof(QCowDirtyBitmapHeader) + name_size, 8);
> Matches spec.
>
>> +}
>> +
>> +static int dir_entry_size(QCowDirtyBitmapHeader *h)
>> +{
>> +    return calc_dir_entry_size(h->name_size);
> OK.
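
As a worked example of the entry-size rule: the packed header quoted in patch 04 sums to 8 + 8 + 4 + 4 + 4 + 2 = 30 bytes, and each entry is padded to a multiple of 8. The sketch below takes the header size as a parameter rather than reproducing the struct; the ex_* helpers are hypothetical stand-ins for align_offset() and calc_dir_entry_size().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of n; n must be a power of two. */
static uint64_t ex_align_up(uint64_t offset, uint64_t n)
{
    return (offset + n - 1) & ~(n - 1);
}

/* Size of one directory entry: fixed header plus name, padded to 8 bytes. */
static uint64_t ex_dir_entry_size(size_t header_size, size_t name_size)
{
    return ex_align_up((uint64_t)header_size + name_size, 8);
}
```

With a 30-byte header, a 5-byte name gives 35 bytes rounded up to 40, and a 2-byte name gives exactly 32.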
>
>> +}
>> +
>> +static int check_constraints(int cluster_size,
>> +                             QCowDirtyBitmapHeader *h)
>> +{
>> +    uint64_t phys_bitmap_bytes =
>> +        (uint64_t)h->dirty_bitmap_table_size * cluster_size;
>> +    uint64_t max_virtual_bits = (phys_bitmap_bytes * 8) << h->granularity_bits;
>> +
>> +    int fail =
>> +            (h->dirty_bitmap_table_offset % cluster_size) ||
>> +            (h->dirty_bitmap_table_size > DBM_MAX_TABLE_SIZE) ||
>> +            (phys_bitmap_bytes > DBM_MAX_PHYS_SIZE) ||
>> +            (h->nb_virtual_bits > max_virtual_bits) ||
>> +            (h->granularity_bits > DBM_MAX_GRANULARITY_BITS) ||
>> +            (h->flags & DBM_RESERVED_FLAGS) ||
>> +            (h->name_size > DBM_MAX_NAME_SIZE);
>> +
> Function is a little dense, but appears to be correct -- apart from the
> DBM_RESERVED_FLAGS issue I mentioned earlier.

With this patch there are no flags; they will be added in the 
following patches, and this mask will be changed.


>
>> +    return fail ? -EINVAL : 0;
>> +}
>> +
>> +static int directory_read(BlockDriverState *bs)
>> +{
>> +    int ret;
>> +    BDRVQcowState *s = bs->opaque;
>> +    uint8_t *entry, *end;
>> +
>> +    if (s->dirty_bitmap_directory != NULL) {
>> +        /* already read */
>> +        return -EEXIST;
>> +    }
>> +
>> +    s->dirty_bitmap_directory = g_try_malloc0(s->dirty_bitmap_directory_size);
>> +    if (s->dirty_bitmap_directory == NULL) {
>> +        return -ENOMEM;
>> +    }
>> +
> I assume we're trying here in case the directory size is garbage, as a
> method of preventing garbage from crashing our program. Since
> dirty_bitmap_directory_size was in theory already read in (by a function
> checked in later in this series), did we not validate that input value?

Hmm, it is verified, but the allowed range is large. I'm not sure, but 
I think someone asked me to use _try_ for user-defined or large 
allocations, you or Stefan.
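
The pattern under discussion can be sketched in plain C: bound the size that came from the image, then use an allocation that reports failure instead of aborting (the role g_try_malloc0() plays in the patch, as opposed to g_malloc0()). The limit below is the QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE value from patch 04; the ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Upper bound from the quoted patch: 1024 bytes * 65536 bitmaps. */
#define EX_MAX_DIRECTORY_SIZE (1024u * 65536u)   /* 64 MiB */

/* Allocate a zeroed buffer for a directory whose size came from the image.
 * Returns NULL if the size is out of range or the allocation fails, so the
 * caller can return an error such as -EINVAL or -ENOMEM. */
static uint8_t *ex_alloc_directory(uint64_t size)
{
    if (size == 0 || size > EX_MAX_DIRECTORY_SIZE) {
        return NULL;
    }
    return calloc(1, (size_t)size);
}
```

Even with the range check in place, 64 MiB is large enough that a fallible allocation is a reasonable choice for a value read from an untrusted file.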

>
>> +    ret = bdrv_pread(bs->file,
>> +                     s->dirty_bitmap_directory_offset,
>> +                     s->dirty_bitmap_directory,
>> +                     s->dirty_bitmap_directory_size);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
> Alright, so we read the entire directory into memory... which can be as
> large as 64K * 1024, or 64MiB. A non-trivial size.

But, on the other hand, in normal cases with 1-2 bitmaps it will be 
small, and I'm not sure that it is a good idea to implement a more 
complex solution now.

Also, if all 64K bitmaps are loaded into memory, they will take much 
more memory than 64 MiB.

>
>> +    entry = s->dirty_bitmap_directory;
>> +    end = s->dirty_bitmap_directory + s->dirty_bitmap_directory_size;
>> +    while (entry < end) {
>> +        QCowDirtyBitmapHeader *h = (QCowDirtyBitmapHeader *)entry;
>> +        bitmap_header_to_cpu(h);
>> +
> OK, so we're interpreting the values in-place in memory, but leaving
> them in the table.
>
>> +        ret = check_constraints(s->cluster_size, h);
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +
>> +        entry += dir_entry_size(h);
>> +    }
>> +
>> +    return 0;
>> +
>> +fail:
>> +    g_free(s->dirty_bitmap_directory);
>> +    s->dirty_bitmap_directory = NULL;
>> +
>> +    return ret;
>> +}
>> +
>> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +    int ret;
>> +    BDRVQcowState *s = bs->opaque;
>> +    size_t offset;
>> +    QCowDirtyBitmap *bm, *end;
>> +
>> +    if (s->dirty_bitmap_directory != NULL || s->dirty_bitmaps != NULL) {
>> +        /* already read */
>> +        return -EEXIST;
>> +    }
>> +
>> +    if (s->nb_dirty_bitmaps == 0) {
>> +        /* No bitmaps - nothing to do */
>> +        return 0;
>> +    }
>> +
> OK, so this assumes that the extension header has been read, but that
> code comes later in this series.
>
>> +    ret = directory_read(bs);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
> At the end of this call we have interpreted the header into a CPU native
> format, but not performed any processing on it whatsoever.
>
>> +    s->dirty_bitmaps = g_try_new0(QCowDirtyBitmap, s->nb_dirty_bitmaps);
>> +    if (s->dirty_bitmaps == NULL) {
>> +        ret = -ENOMEM;
>> +        goto out;
>> +    }
>> +
> I think we could actually allocate this block of memory sooner (we
> already have read and validated nb_dirty_bitmaps) and then during the
> initial read, after validation, we can just fill the QcowDirtyBitmap
> structures as we go.
>
> If we keep "int n" as we parse bitmaps in the header, we can just unwind
> on failure with:
>
> for (i = n; i >= 0; i--) {
>     bm = &s->dirty_bitmaps[i];
>     g_free(bm->name);
> }
> g_free(s->dirty_bitmaps);
>
> Then we don't have to re-crawl through the structure looking for names,
> getting sizes again, etc. It should be a little faster.

ok
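
A sketch of the single-pass idea, using simplified stand-in types (ExRawEntry, ExBitmap and the `valid` field are hypothetical and replace the real header and check_constraints()). Each entry is validated and its name copied in the same loop; on failure only the names copied so far are released.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct ExRawEntry {
    const char *name;     /* name bytes, not NUL-terminated in the format */
    uint16_t name_size;
    int valid;            /* stands in for check_constraints() passing */
} ExRawEntry;

typedef struct ExBitmap {
    char *name;
} ExBitmap;

/* Validate and copy all entries in one pass. Returns NULL on any failure
 * after freeing whatever was allocated up to that point. */
static ExBitmap *ex_parse_all(const ExRawEntry *raw, size_t count)
{
    ExBitmap *bitmaps = calloc(count ? count : 1, sizeof(*bitmaps));
    size_t n;

    if (bitmaps == NULL) {
        return NULL;
    }
    for (n = 0; n < count; n++) {
        if (!raw[n].valid) {
            goto fail;
        }
        bitmaps[n].name = malloc((size_t)raw[n].name_size + 1);
        if (bitmaps[n].name == NULL) {
            goto fail;
        }
        memcpy(bitmaps[n].name, raw[n].name, raw[n].name_size);
        bitmaps[n].name[raw[n].name_size] = '\0';
    }
    return bitmaps;

fail:
    /* Entries 0..n-1 own a name. Counting down with an unsigned index
     * avoids a signed "i >= 0" loop condition. */
    while (n > 0) {
        free(bitmaps[--n].name);
    }
    free(bitmaps);
    return NULL;
}

/* Release a successfully parsed array. */
static void ex_free_all(ExBitmap *bitmaps, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        free(bitmaps[i].name);
    }
    free(bitmaps);
}
```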

>
>> +    offset = 0;
>> +    end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
>> +    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
>> +        QCowDirtyBitmapHeader *h =
>> +                (QCowDirtyBitmapHeader *)(s->dirty_bitmap_directory + offset);
>> +
>> +        bm->offset = offset;
>> +        bm->name = g_malloc(h->name_size + 1);
>> +        memcpy(bm->name, h + 1, h->name_size);
>> +        bm->name[h->name_size] = '\0';
> You can replace the last three lines if you want with just:
>
> bm->name = g_strndup((const char *)(h + 1), h->name_size);

cool, thanks

>
>> +
>> +        offset += dir_entry_size(h);
>> +    }
>> +    ret = 0;
>> +
>> +out:
>> +    if (ret < 0) {
>> +        qcow2_free_dirty_bitmaps(bs);
>> +    }
>> +    return ret;
>> +}
>> diff --git a/block/qcow2.h b/block/qcow2.h
>> index a2a5d4a..5016fa1 100644
>> --- a/block/qcow2.h
>> +++ b/block/qcow2.h
>> @@ -288,6 +288,12 @@ typedef struct BDRVQcowState {
>>       unsigned int nb_snapshots;
>>       QCowSnapshot *snapshots;
>>   
>> +    uint64_t dirty_bitmap_directory_offset;
>> +    size_t dirty_bitmap_directory_size;
> I guess these two are from the extension header.
>
>> +    uint8_t *dirty_bitmap_directory;
>> +    unsigned int nb_dirty_bitmaps;
> This one is also from the extension header. Pointing out only for review
> purposes that these values are set "elsewhere" in future patches.
>
>> +    QCowDirtyBitmap *dirty_bitmaps;
>> +
>>       int flags;
>>       int qcow_version;
>>       bool use_lazy_refcounts;
>> @@ -598,6 +604,10 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
>>   void qcow2_free_snapshots(BlockDriverState *bs);
>>   int qcow2_read_snapshots(BlockDriverState *bs);
>>   
>> +/* qcow2-dirty-bitmap.c functions */
>> +void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
>> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
>> +
>>   /* qcow2-cache.c functions */
>>   Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
>>   int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
>>
> Patch order is a little strange in that we expect to have parsed the
> header already, but nothing criminal if this was just the easiest way to
> do it. I'll defer to your judgment.

I'm just trying not to change the behaviour of QEMU for as long as possible, 
so the sequence of patches is
- prepare functionality
- enable it

Maybe it is not the best approach...

>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load()
  2015-10-06 23:01   ` John Snow
  2015-10-07 17:05     ` Eric Blake
@ 2016-02-16 19:04     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-02-16 19:04 UTC (permalink / raw)
  To: John Snow, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den

On 07.10.2015 02:01, John Snow wrote:
>
> On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
>> This function loads block dirty bitmap from qcow2.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
>>   block/qcow2.c              |   2 +
>>   block/qcow2.h              |   5 ++
>>   include/block/block_int.h  |   5 ++
>>   4 files changed, 167 insertions(+)
>>
>> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
>> index 1260d1d..ea50137 100644
>> --- a/block/qcow2-dirty-bitmap.c
>> +++ b/block/qcow2-dirty-bitmap.c
>> @@ -99,6 +99,13 @@ static int check_constraints(int cluster_size,
>>       return fail ? -EINVAL : 0;
>>   }
>>   
>> +static QCowDirtyBitmapHeader *bitmap_header(BDRVQcowState *s,
>> +                                            QCowDirtyBitmap *bitmap)
>> +{
> BDRVQcow2State here and everywhere below, again.
>
>> +    return (QCowDirtyBitmapHeader *)
>> +           (s->dirty_bitmap_directory + bitmap->offset);
>> +}
>> +
>>   static int directory_read(BlockDriverState *bs)
>>   {
>>       int ret;
>> @@ -195,3 +202,151 @@ out:
>>       }
>>       return ret;
>>   }
>> +
>> +static QCowDirtyBitmap *find_dirty_bitmap_by_name(BlockDriverState *bs,
>> +                                                  const char *name)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
>> +    QCowDirtyBitmap *bm, *end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
>> +
>> +    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
>> +        if (strcmp(bm->name, name) == 0) {
>> +            return bm;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
> Whoops. This says to me we really need to prohibit bitmaps with the same
> name from being stored in the same file, and mention this in the spec,
> and test for it on load.
>
> Perhaps we can create a hash-table and fail verification on open if
> there's a collision. We can then use that hash-table here for
> find_dirty_bitmap_by_name to speed up lookup since we already went
> through the trouble of loading it.
>
> Might help for large cases where we're approaching 64K bitmaps, will not
> be too big of a performance hit for casual use.

So, it (hash table approach) may be implemented later
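
For reference, a small sketch of the duplicate-name check suggested above, using a throwaway open-addressing hash set in plain C. This is only one possible shape for it, not code from the series; QEMU would more likely use GLib's GHashTable, and all ex_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a, a simple string hash; any reasonable hash would do here. */
static uint32_t ex_hash(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 if names[] contains a duplicate, 0 if all names are unique,
 * -1 on allocation failure. Open addressing with linear probing. */
static int ex_has_duplicate_name(const char *const *names, size_t count)
{
    size_t cap = 8, i;
    const char **slots;
    int dup = 0;

    while (cap < count * 2) {
        cap *= 2;                 /* power of two, load factor <= 0.5 */
    }
    slots = calloc(cap, sizeof(*slots));
    if (slots == NULL) {
        return -1;
    }
    for (i = 0; i < count && !dup; i++) {
        size_t pos = ex_hash(names[i]) & (cap - 1);

        while (slots[pos] != NULL) {
            if (strcmp(slots[pos], names[i]) == 0) {
                dup = 1;
                break;
            }
            pos = (pos + 1) & (cap - 1);
        }
        if (!dup) {
            slots[pos] = names[i];
        }
    }
    free(slots);
    return dup;
}
```

A check like this would run once while the directory is loaded; the same table could then serve name lookups instead of the linear strcmp() scan.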

>
>> +/* dirty sectors in cluster is a number of sectors in the image, corresponding
>> + * to one cluster of bitmap data */
>> +static uint64_t dirty_sectors_in_cluster(const BDRVQcowState *s,
>> +                                         const BdrvDirtyBitmap *bitmap)
>> +{
>> +    uint32_t sector_granularity =
>> +            bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
>> +
>> +    return (uint64_t)sector_granularity * (s->cluster_size << 3);
>> +}
>> +
>> +/* load_bitmap()
>> + * load dirty bitmap from Dirty Bitmap Table
>> + * Dirty Bitmap Table entries are assumed to be in big endian format */
>> +static int load_bitmap(BlockDriverState *bs,
>> +                       const uint64_t *dirty_bitmap_table,
>> +                       uint32_t dirty_bitmap_table_size,
>> +                       BdrvDirtyBitmap *bitmap)
>> +{
>> +    int ret = 0;
>> +    BDRVQcowState *s = bs->opaque;
>> +    uint64_t sector, dsc;
>> +    uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
> I found some of this hard to unwind, bear with me:
>
> AKA, the number of sectors that bitmap tracks ...
>
>> +    int cl_size = s->cluster_size;
>> +    uint8_t *buf = NULL;
>> +    uint32_t i, tab_size =
>> +            size_to_clusters(s, bdrv_dirty_bitmap_data_size(bitmap, bm_size));
>> +
> bdrv_dirty_bitmap_data_size(bitmap, COUNT) calculates for us how much
> actual real size the lowest level of the hbitmap actually takes.
>
> Then size_to_clusters tells us how many clusters we need to store that,
> and therefore should map back to be the same as the predicted value,
> dirty_bitmap_table_size.
>
>> +    if (tab_size > dirty_bitmap_table_size) {
>> +        return -EINVAL;
>> +    }
>> +
> I assume this is not == because the real table size might have padding
> or other such things, but if the calculated tab size is bigger than the
> actual then we have a problem.
>
> But I think that you've passed in "dirty_bitmap_table_size" as the total
> byte count of the table, but "tab_size" is computed here as the number
> of entries. I think you should multiply tab_size by sizeof(uint64_t) and
> test if they're equal.
>
>> +    bdrv_clear_dirty_bitmap(bitmap);
>> +
> Clear takes the aio_context for the associated BDS and then releases it...
>
>> +    buf = g_malloc0(cl_size);
>> +    dsc = dirty_sectors_in_cluster(s, bitmap);
>> +    for (i = 0, sector = 0; i < tab_size; ++i, sector += dsc) {
>> +        uint64_t end = MIN(bm_size, sector + dsc);
>> +        uint64_t offset = be64_to_cpu(dirty_bitmap_table[i]);
>> +
>> +        if (offset & DBM_TABLE_ENTRY_RESERVED_MASK) {
>> +            ret = -EINVAL;
>> +            goto finish;
>> +        }
>> +
>> +        /* zero offset means cluster unallocated */
>> +        if (offset) {
>> +            ret = bdrv_pread(bs->file, offset, buf, cl_size);
>> +            if (ret < 0) {
>> +                goto finish;
>> +            }
>> +            bdrv_dirty_bitmap_deserialize_part(bitmap, buf, sector, end);
> ...but at this point, I believe we're editing this bitmap without its
> associated lock, which might be a problem when we go to add QMP commands
> later.


In the next version, load_bitmap is changed a lot and no longer uses
part-serialization.
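
For reference, the arithmetic behind dirty_sectors_in_cluster() in the quoted patch can be worked through as a standalone sketch. It assumes 512-byte sectors and ignores any padding the real serialization may add. SKETCH_SECTOR_BITS, sectors_per_bitmap_cluster() and bitmap_table_entries() are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_BITS 9   /* assumed 512-byte sectors */

/* Image sectors described by one cluster of bitmap data:
 * each bit covers one granule, and a cluster holds cluster_size * 8 bits. */
static uint64_t sectors_per_bitmap_cluster(uint32_t granularity_bytes,
                                           uint32_t cluster_size)
{
    uint64_t sector_granularity = granularity_bytes >> SKETCH_SECTOR_BITS;

    assert(sector_granularity > 0);
    return sector_granularity * ((uint64_t)cluster_size << 3);
}

/* Number of bitmap data clusters (table entries) needed for nb_sectors */
static uint64_t bitmap_table_entries(uint64_t nb_sectors,
                                     uint32_t granularity_bytes,
                                     uint32_t cluster_size)
{
    uint64_t dsc = sectors_per_bitmap_cluster(granularity_bytes, cluster_size);

    return (nb_sectors + dsc - 1) / dsc;   /* round up */
}
```

With 64 KiB granularity and 64 KiB clusters, one cluster of bitmap data covers 128 * 524288 = 67108864 sectors (32 GiB), so a 1 TiB image (2147483648 sectors) needs 32 table entries.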


>> +        }
>> +    }
>> +    ret = 0;
>> +
>> +    bdrv_dirty_bitmap_deserialize_finish(bitmap);
>> +
>> +finish:
>> +    g_free(buf);
>> +
>> +    return ret;
>> +}
>> +
>> +BdrvDirtyBitmap * qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
>> +                                          BlockDriverState *bs_file,
>> +                                          const char *name,
>> +                                          Error **errp)
>> +{
>> +    BDRVQcowState *s = bs_file->opaque;
>> +    int ret;
>> +    QCowDirtyBitmap *bm;
>> +    QCowDirtyBitmapHeader *bmh;
>> +    uint64_t *dirty_bitmap_table = NULL;
>> +    uint32_t granularity;
>> +    uint64_t size = bdrv_nb_sectors(bs_for);
>> +    BdrvDirtyBitmap *bitmap = NULL;
>> +
>> +    bm = find_dirty_bitmap_by_name(bs_file, name);
>> +    if (bm == NULL) {
>> +        error_setg(errp, "Could not find bitmap '%s' in the node '%s'", name,
>> +                   bdrv_get_device_or_node_name(bs_file));
>> +        return NULL;
>> +    }
>> +    bmh = bitmap_header(s, bm);
>> +
>> +    if (size != bmh->nb_virtual_bits) {
>> +        error_setg(errp,
>> +                   "Bitmap '%s' in the node '%s' has size = %" PRIu64
>> +                   "when requested size (for node %s) = %" PRIu64,
>> +                   name, bdrv_get_device_or_node_name(bs_file),
>> +                   bmh->nb_virtual_bits,
>> +                   bdrv_get_device_or_node_name(bs_for), size);
>> +        return NULL;
>> +    }
>> +
>> +
>> +    dirty_bitmap_table = g_try_malloc(bmh->dirty_bitmap_table_size * sizeof(uint64_t));
>> +    if (dirty_bitmap_table == NULL) {
>> +        error_setg_errno(errp, -ENOMEM, "Could not allocate Dirty Bitmap Table");
>> +        return NULL;
>> +    }
>> +
>> +    ret = bdrv_pread(bs_file->file, bmh->dirty_bitmap_table_offset, dirty_bitmap_table,
>> +                     bmh->dirty_bitmap_table_size * sizeof(uint64_t));
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "Could not read dirty_bitmap_table table from image");
>> +        goto finish;
>> +    }
>> +
>> +    granularity = BDRV_SECTOR_SIZE << bmh->granularity_bits;
>> +    bitmap = bdrv_create_dirty_bitmap(bs_for, granularity, name, errp);
>> +    if (bitmap == NULL) {
>> +        error_setg_errno(errp, -ENOMEM, "Could not create dirty bitmap");
> why -ENOMEM? create can fail for a number of reasons ... since we've
> been given an errp parameter in this function, and we can trust
> bdrv_create_dirty_bitmap to have set it, we can just return NULL here
> and the caller can check errp to see what went wrong.
>
>> +        goto finish;
>> +    }
>> +
> Do we need to mark this bitmap as temporarily unusable until we complete
> the load? I guess not in the context of bdrv_open at boot time ...

Also, keep in mind that for now we only have bitmaps that belong to the
same BDS that contains them.

>
>> +    ret = load_bitmap(bs_file, dirty_bitmap_table, bmh->dirty_bitmap_table_size, bitmap);
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "Could not read bitmap from image");
>> +        goto finish;
>> +    }
>> +
>> +finish:
>> +    if (*errp != NULL) {
>> +        bdrv_release_dirty_bitmap(bs_for, bitmap);
>> +        bitmap = NULL;
>> +    }
>> +    g_free(dirty_bitmap_table);
>
> I think we're not supposed to be reaching into errp to check its
> implementation detail like this ... the usual paradigm I see is just
> "goto fail" or similar statements instead of checking for
> error-or-success in a shared return block.
>
> finish:
>    g_free(dirty_bitmap_table);
>    return bitmap;
> fail:
>    g_free(dirty_bitmap_table);
>    bdrv_release_dirty_bitmap(bs_for, bitmap);
>    return NULL;

ok
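
A minimal standalone sketch of the cleanup pattern suggested above, assuming plain malloc/free in place of the QEMU helpers. SketchBitmap and bitmap_load_sketch() are hypothetical, and the simulate_read_error parameter exists only to exercise the failure path. The function never inspects an error object; each failure jumps to one label that undoes whatever was allocated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object being loaded */
typedef struct SketchBitmap {
    size_t entries;
} SketchBitmap;

static SketchBitmap *bitmap_load_sketch(size_t entries, int simulate_read_error)
{
    uint64_t *table = NULL;
    SketchBitmap *bitmap = NULL;

    assert(entries > 0);

    table = calloc(entries, sizeof(*table));
    if (table == NULL) {
        goto fail;
    }

    bitmap = calloc(1, sizeof(*bitmap));
    if (bitmap == NULL) {
        goto fail;
    }

    if (simulate_read_error) {
        goto fail;              /* stands in for a failed read */
    }
    bitmap->entries = entries;

    /* success: only the temporary table is released */
    free(table);
    return bitmap;

fail:
    /* free(NULL) is a no-op, so one label covers every failure point */
    free(table);
    free(bitmap);
    return NULL;
}
```

The caller only has to test the return value for NULL, which matches the shape of the finish/fail labels proposed in the review.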

>
>> +
>> +    return bitmap;
>> +}
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index 76c331b..58ebdd3 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -2965,6 +2965,8 @@ BlockDriver bdrv_qcow2 = {
>>       .bdrv_get_info          = qcow2_get_info,
>>       .bdrv_get_specific_info = qcow2_get_specific_info,
>>   
>> +    .bdrv_dirty_bitmap_load = qcow2_dirty_bitmap_load,
>> +
>>       .bdrv_save_vmstate    = qcow2_save_vmstate,
>>       .bdrv_load_vmstate    = qcow2_load_vmstate,
>>   
>> diff --git a/block/qcow2.h b/block/qcow2.h
>> index 5016fa1..51d1907 100644
>> --- a/block/qcow2.h
>> +++ b/block/qcow2.h
>> @@ -608,6 +608,11 @@ int qcow2_read_snapshots(BlockDriverState *bs);
>>   void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
>>   int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
>>   
>> +BdrvDirtyBitmap *qcow2_dirty_bitmap_load(BlockDriverState *bs_for,
>> +                                         BlockDriverState *bs_file,
>> +                                         const char *name,
>> +                                         Error **errp);
>> +
>>   /* qcow2-cache.c functions */
>>   Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
>>   int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 14ad4c3..f982adc 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -204,6 +204,11 @@ struct BlockDriver {
>>       int (*bdrv_get_info)(BlockDriverState *bs, BlockDriverInfo *bdi);
>>       ImageInfoSpecific *(*bdrv_get_specific_info)(BlockDriverState *bs);
>>   
>> +    BdrvDirtyBitmap *(*bdrv_dirty_bitmap_load)(BlockDriverState *bs_for,
>> +                                               BlockDriverState *bs_file,
>> +                                               const char *name,
>> +                                               Error **errp);
>> +
>>       int (*bdrv_save_vmstate)(BlockDriverState *bs, QEMUIOVector *qiov,
>>                                int64_t pos);
>>       int (*bdrv_load_vmstate)(BlockDriverState *bs, uint8_t *buf,
>>
> Looking good, thanks!
> --js


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory
  2015-10-06 21:27   ` John Snow
  2016-02-16 18:51     ` Vladimir Sementsov-Ogievskiy
@ 2016-02-17 15:03     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 53+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-02-17 15:03 UTC (permalink / raw)
  To: John Snow, qemu-devel; +Cc: kwolf, pbonzini, stefanha, den

On 07.10.2015 00:27, John Snow wrote:
>
> On 09/05/2015 12:43 PM, Vladimir Sementsov-Ogievskiy wrote:
>> Adds qcow2_read_dirty_bitmaps, reading Dirty Bitmap Directory as
>> specified in docs/specs/qcow2.txt
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/qcow2-dirty-bitmap.c | 155 +++++++++++++++++++++++++++++++++++++++++++++
>>   block/qcow2.h              |  10 +++
>>   2 files changed, 165 insertions(+)
>>
>> diff --git a/block/qcow2-dirty-bitmap.c b/block/qcow2-dirty-bitmap.c
>> index fd4e0ef..1260d1d 100644
>> --- a/block/qcow2-dirty-bitmap.c
>> +++ b/block/qcow2-dirty-bitmap.c
>> @@ -25,6 +25,9 @@
>>    * THE SOFTWARE.
>>    */
>>   
>> +#include "block/block_int.h"
>> +#include "block/qcow2.h"
>> +
>>   /* NOTICE: DBM here means Dirty Bitmap and used as a namespace for _internal_
>>    * constants. Please do not use this _internal_ abbreviation for other needs
>>    * and/or outside of this file. */
>> @@ -40,3 +43,155 @@
>>   
>>   /* bits [0, 8] U [56, 63] are reserved */
>>   #define DBM_TABLE_ENTRY_RESERVED_MASK 0xff000000000001ff
>> +
>> +void qcow2_free_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
> BDRVQcow2State here and everywhere else in this patch, now.
>
>> +    int i;
>> +
>> +    for (i = 0; i < s->nb_dirty_bitmaps; i++) {
>> +        g_free(s->dirty_bitmaps[i].name);
>> +    }
>> +    g_free(s->dirty_bitmaps);
>> +    s->dirty_bitmaps = NULL;
>> +    s->nb_dirty_bitmaps = 0;
>> +
>> +    g_free(s->dirty_bitmap_directory);
>> +    s->dirty_bitmap_directory = NULL;
>> +}
>> +
>> +static void bitmap_header_to_cpu(QCowDirtyBitmapHeader *h)
>> +{
>> +    be64_to_cpus(&h->dirty_bitmap_table_offset);
>> +    be64_to_cpus(&h->nb_virtual_bits);
>> +    be32_to_cpus(&h->dirty_bitmap_table_size);
>> +    be32_to_cpus(&h->granularity_bits);
>> +    be32_to_cpus(&h->flags);
>> +    be16_to_cpus(&h->name_size);
> I realize you probably got these functions by example from the other
> qcow2 files, but what exactly is cpu*s* here? What does the *s* stand for?
>
> I guess it refers to the in-place swapping variants that the Linux
> kernel defines?
>
> hmm, just a curiosity on my part ...
>
> the function looks correct, anyway. :)
>
>> +}
>> +
>> +static int calc_dir_entry_size(size_t name_size)
>> +{
>> +    return align_offset(sizeof(QCowDirtyBitmapHeader) + name_size, 8);
> Matches spec.
>
>> +}
>> +
>> +static int dir_entry_size(QCowDirtyBitmapHeader *h)
>> +{
>> +    return calc_dir_entry_size(h->name_size);
> OK.
>
>> +}
>> +
>> +static int check_constraints(int cluster_size,
>> +                             QCowDirtyBitmapHeader *h)
>> +{
>> +    uint64_t phys_bitmap_bytes =
>> +        (uint64_t)h->dirty_bitmap_table_size * cluster_size;
>> +    uint64_t max_virtual_bits = (phys_bitmap_bytes * 8) << h->granularity_bits;
>> +
>> +    int fail =
>> +            (h->dirty_bitmap_table_offset % cluster_size) ||
>> +            (h->dirty_bitmap_table_size > DBM_MAX_TABLE_SIZE) ||
>> +            (phys_bitmap_bytes > DBM_MAX_PHYS_SIZE) ||
>> +            (h->nb_virtual_bits > max_virtual_bits) ||
>> +            (h->granularity_bits > DBM_MAX_GRANULARITY_BITS) ||
>> +            (h->flags & DBM_RESERVED_FLAGS) ||
>> +            (h->name_size > DBM_MAX_NAME_SIZE);
>> +
> Function is a little dense, but appears to be correct -- apart from the
> DBM_RESERVED_FLAGS issue I mentioned earlier.
>
>> +    return fail ? -EINVAL : 0;
>> +}
>> +
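
A hedged sketch of the size constraint in check_constraints(), written so that the granularity is validated before any shift happens and the comparison itself cannot overflow. SKETCH_MAX_GRANULARITY_BITS is an assumed limit standing in for DBM_MAX_GRANULARITY_BITS, whose real value is not shown here, and virtual_bits_fit() is a hypothetical helper. The 2 MiB cluster bound is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_GRANULARITY_BITS 31u   /* assumed limit, not the real one */

/* True if a bitmap stored in table_size clusters can describe
 * nb_virtual_bits bits at the given granularity. */
static bool virtual_bits_fit(uint32_t table_size, uint32_t cluster_size,
                             uint32_t granularity_bits,
                             uint64_t nb_virtual_bits)
{
    uint64_t phys_bits, needed_bits, mask;

    /* assumed upper bound on the cluster size, keeps phys_bits in range */
    assert(cluster_size <= (1u << 21));

    if (granularity_bits > SKETCH_MAX_GRANULARITY_BITS) {
        return false;
    }

    phys_bits = (uint64_t)table_size * cluster_size * 8;

    /* Round nb_virtual_bits up to whole granules instead of shifting
     * phys_bits left, which could overflow 64 bits. */
    mask = ((uint64_t)1 << granularity_bits) - 1;
    needed_bits = (nb_virtual_bits >> granularity_bits) +
                  ((nb_virtual_bits & mask) != 0);

    return needed_bits <= phys_bits;
}
```

For example, 32 clusters of 64 KiB hold 16777216 bits. With granularity_bits = 7 that is enough for exactly 2147483648 virtual bits and not one more.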
>> +static int directory_read(BlockDriverState *bs)
>> +{
>> +    int ret;
>> +    BDRVQcowState *s = bs->opaque;
>> +    uint8_t *entry, *end;
>> +
>> +    if (s->dirty_bitmap_directory != NULL) {
>> +        /* already read */
>> +        return -EEXIST;
>> +    }
>> +
>> +    s->dirty_bitmap_directory = g_try_malloc0(s->dirty_bitmap_directory_size);
>> +    if (s->dirty_bitmap_directory == NULL) {
>> +        return -ENOMEM;
>> +    }
>> +
> I assume we're trying here in case the directory size is garbage, as a
> method of preventing garbage from crashing our program. Since
> dirty_bitmap_directory_size was in theory already read in (by a function
> checked in later in this series), did we not validate that input value?
>
>> +    ret = bdrv_pread(bs->file,
>> +                     s->dirty_bitmap_directory_offset,
>> +                     s->dirty_bitmap_directory,
>> +                     s->dirty_bitmap_directory_size);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
> Alright, so we read the entire directory into memory... which can be as
> large as 64K * 1024, or 64MiB. A non-trivial size.
>
>> +    entry = s->dirty_bitmap_directory;
>> +    end = s->dirty_bitmap_directory + s->dirty_bitmap_directory_size;
>> +    while (entry < end) {
>> +        QCowDirtyBitmapHeader *h = (QCowDirtyBitmapHeader *)entry;
>> +        bitmap_header_to_cpu(h);
>> +
> OK, so we're interpreting the values in-place in memory, but leaving
> them in the table.
>
>> +        ret = check_constraints(s->cluster_size, h);
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +
>> +        entry += dir_entry_size(h);
>> +    }
>> +
>> +    return 0;
>> +
>> +fail:
>> +    g_free(s->dirty_bitmap_directory);
>> +    s->dirty_bitmap_directory = NULL;
>> +
>> +    return ret;
>> +}
>> +
>> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +    int ret;
>> +    BDRVQcowState *s = bs->opaque;
>> +    size_t offset;
>> +    QCowDirtyBitmap *bm, *end;
>> +
>> +    if (s->dirty_bitmap_directory != NULL || s->dirty_bitmaps != NULL) {
>> +        /* already read */
>> +        return -EEXIST;
>> +    }
>> +
>> +    if (s->nb_dirty_bitmaps == 0) {
>> +        /* No bitmaps - nothing to do */
>> +        return 0;
>> +    }
>> +
> OK, so this assumes that the extension header has been read, but that
> code comes later in this series.
>
>> +    ret = directory_read(bs);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
> At the end of this call we have interpreted the header into a CPU native
> format, but not performed any processing on it whatsoever.

The bitmap directory is stored in RAM in CPU-native format.
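
To illustrate what converting in place means here, a minimal sketch of pointer-taking conversions in the style of be32_to_cpus() and be16_to_cpus(). The sketch_ functions below are hypothetical stand-ins written in portable C; they are not the QEMU implementations.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret the 4 bytes at *v as big endian and store the native value back */
static void sketch_be32_to_cpus(uint32_t *v)
{
    uint8_t b[4];

    assert(v != NULL);
    memcpy(b, v, sizeof(b));
    *v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
         ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Same for a 16-bit field */
static void sketch_be16_to_cpus(uint16_t *v)
{
    uint8_t b[2];

    assert(v != NULL);
    memcpy(b, v, sizeof(b));
    *v = (uint16_t)((b[0] << 8) | b[1]);
}
```

Because the value is rewritten where it sits, a header converted this way must not be converted a second time, and it has to be converted back before being written to disk.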

>
>> +    s->dirty_bitmaps = g_try_new0(QCowDirtyBitmap, s->nb_dirty_bitmaps);
>> +    if (s->dirty_bitmaps == NULL) {
>> +        ret = -ENOMEM;
>> +        goto out;
>> +    }
>> +
> I think we could actually allocate this block of memory sooner (we
> already have read and validated nb_dirty_bitmaps) and then during the
> initial read, after validation, we can just fill the QcowDirtyBitmap
> structures as we go.
>
> If we keep "int n" as we parse bitmaps in the header, we can just unwind
> on failure with:
>
> for (i = n; i >= 0; i--) {
>     bm = &s->dirty_bitmaps[i];
>     g_free(bm->name);
> }
> g_free(s->dirty_bitmaps);
>
> Then we don't have to re-crawl through the structure looking for names,
> getting sizes again, etc. It should be a little faster.
>
>> +    offset = 0;
>> +    end = s->dirty_bitmaps + s->nb_dirty_bitmaps;
>> +    for (bm = s->dirty_bitmaps; bm < end; ++bm) {
>> +        QCowDirtyBitmapHeader *h =
>> +                (QCowDirtyBitmapHeader *)(s->dirty_bitmap_directory + offset);
>> +
>> +        bm->offset = offset;
>> +        bm->name = g_malloc(h->name_size + 1);
>> +        memcpy(bm->name, h + 1, h->name_size);
>> +        bm->name[h->name_size] = '\0';
> You can replace the last three lines if you want with just:
>
> bm->name = g_strndup((const char *)(h + 1), h->name_size);
>
>> +
>> +        offset += dir_entry_size(h);
>> +    }
>> +    ret = 0;
>> +
>> +out:
>> +    if (ret < 0) {
>> +        qcow2_free_dirty_bitmaps(bs);
>> +    }
>> +    return ret;
>> +}
>> diff --git a/block/qcow2.h b/block/qcow2.h
>> index a2a5d4a..5016fa1 100644
>> --- a/block/qcow2.h
>> +++ b/block/qcow2.h
>> @@ -288,6 +288,12 @@ typedef struct BDRVQcowState {
>>       unsigned int nb_snapshots;
>>       QCowSnapshot *snapshots;
>>   
>> +    uint64_t dirty_bitmap_directory_offset;
>> +    size_t dirty_bitmap_directory_size;
> I guess these two are from the extension header.
>
>> +    uint8_t *dirty_bitmap_directory;
>> +    unsigned int nb_dirty_bitmaps;
> This one is also from the extension header. Pointing out only for review
> purposes that these values are set "elsewhere" in future patches.
>
>> +    QCowDirtyBitmap *dirty_bitmaps;
>> +
>>       int flags;
>>       int qcow_version;
>>       bool use_lazy_refcounts;
>> @@ -598,6 +604,10 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
>>   void qcow2_free_snapshots(BlockDriverState *bs);
>>   int qcow2_read_snapshots(BlockDriverState *bs);
>>   
>> +/* qcow2-dirty-bitmap.c functions */
>> +void qcow2_free_dirty_bitmaps(BlockDriverState *bs);
>> +int qcow2_read_dirty_bitmaps(BlockDriverState *bs);
>> +
>>   /* qcow2-cache.c functions */
>>   Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
>>   int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
>>
> Patch order is a little strange in that we expect to have parsed the
> header already, but nothing criminal if this was just the easiest way to
> do it. I'll defer to your judgment.
>
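
A sketch of a single-pass directory walk, along the lines of the suggestion above to handle entries as they are parsed. It also checks that each entry fits before it is touched, since the quoted loop, as far as can be seen from the patch, reads a header without first checking that a whole header remains before the end of the buffer. SketchHeader is a hypothetical 16-byte layout, not the real QCowDirtyBitmapHeader, and the fields are assumed to be in native byte order already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size entry header; the name follows it directly */
typedef struct SketchHeader {
    uint64_t table_offset;
    uint32_t flags;
    uint16_t name_size;
    uint16_t pad;
} SketchHeader;

/* Entries are padded to a multiple of 8 bytes */
static size_t align8(size_t n)
{
    return (n + 7) & ~(size_t)7;
}

/* Returns the number of entries, or -1 if an entry overruns the buffer */
static int count_dir_entries(const uint8_t *dir, size_t dir_size)
{
    size_t off = 0;
    int n = 0;

    assert(dir != NULL || dir_size == 0);

    while (off < dir_size) {
        SketchHeader h;
        size_t entry_size;

        if (dir_size - off < sizeof(h)) {
            return -1;          /* truncated header */
        }
        memcpy(&h, dir + off, sizeof(h));

        entry_size = align8(sizeof(h) + h.name_size);
        if (entry_size > dir_size - off) {
            return -1;          /* name runs past the end */
        }

        /* a real loader would validate the entry and copy its name here */
        off += entry_size;
        n++;
    }
    return n;
}
```

The returned count could also be compared against nb_dirty_bitmaps from the extension header as a consistency check.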


-- 
Best regards,
Vladimir


end of thread, other threads:[~2016-02-17 15:04 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-05 16:43 [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 01/17] block: fix bdrv_dirty_bitmap_granularity() Vladimir Sementsov-Ogievskiy
2015-09-15 15:36   ` Eric Blake
2015-10-05 22:47   ` John Snow
2015-09-05 16:43 ` [Qemu-devel] [PATCH 02/17] block: add bdrv_dirty_bitmap_size() Vladimir Sementsov-Ogievskiy
2015-09-15 15:37   ` Eric Blake
2015-10-05 22:48   ` John Snow
2015-09-05 16:43 ` [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification Vladimir Sementsov-Ogievskiy
2015-09-05 17:33   ` Vladimir Sementsov-Ogievskiy
2015-10-06 20:22     ` John Snow
2015-10-06 20:33       ` Eric Blake
2015-09-15 16:24   ` Eric Blake
2015-09-16  8:52     ` Vladimir Sementsov-Ogievskiy
2015-10-06  0:09     ` John Snow
2015-10-07 16:47   ` Max Reitz
2015-10-07 19:05     ` Denis V. Lunev
2015-10-08 20:28       ` John Snow
2015-10-08 20:56         ` Denis V. Lunev
2015-10-09 18:14           ` [Qemu-devel] [PATCH " Max Reitz
2015-10-09 17:07         ` [Qemu-devel] [PATCH " Max Reitz
2015-10-09 20:14           ` [Qemu-devel] [Qemu-block] " Eric Blake
2015-09-05 16:43 ` [Qemu-devel] [PATCH 04/17] qcow2: Dirty Bitmaps Ext: structs and consts Vladimir Sementsov-Ogievskiy
2015-10-06 20:12   ` John Snow
2015-10-06 20:16   ` John Snow
2016-02-16 17:04     ` Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 05/17] qcow2-dirty-bitmap: read dirty bitmap directory Vladimir Sementsov-Ogievskiy
2015-10-06 21:27   ` John Snow
2016-02-16 18:51     ` Vladimir Sementsov-Ogievskiy
2016-02-17 15:03     ` Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 06/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load() Vladimir Sementsov-Ogievskiy
2015-10-06 23:01   ` John Snow
2015-10-07 17:05     ` Eric Blake
2016-02-16 19:04     ` Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 07/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_store() Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 08/17] qcow2: add dirty bitmaps extension Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 09/17] qcow2-dirty-bitmap: add qcow2_dirty_bitmap_load_check() Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 10/17] block: store persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 11/17] block: add bdrv_load_dirty_bitmap() Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 12/17] qcow2-dirty-bitmap: add autoclear bit Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 13/17] qemu: command line option for dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 14/17] qcow2-dirty-bitmap: add IN_USE flag Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 15/17] qcow2-dirty-bitmaps: handle store reqursion Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 16/17] iotests: add VM.test_launcn() Vladimir Sementsov-Ogievskiy
2015-09-05 16:43 ` [Qemu-devel] [PATCH 17/17] iotests: test internal persistent dirty bitmap Vladimir Sementsov-Ogievskiy
2015-09-05 16:48 ` [Qemu-devel] [PATCH v3 RFC 0/17] block: persistent dirty bitmaps Vladimir Sementsov-Ogievskiy
2015-09-05 16:51 ` Vladimir Sementsov-Ogievskiy
2015-09-05 16:53 ` Vladimir Sementsov-Ogievskiy
2015-09-05 16:57 ` Vladimir Sementsov-Ogievskiy
2015-09-05 17:03 ` Vladimir Sementsov-Ogievskiy
2015-09-05 17:09 ` Vladimir Sementsov-Ogievskiy
2015-09-05 17:16 ` Vladimir Sementsov-Ogievskiy
2015-09-05 17:25 ` Vladimir Sementsov-Ogievskiy
2015-09-05 17:30 ` Vladimir Sementsov-Ogievskiy
