All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment
@ 2013-09-02 10:04 Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 1/5] block: " Max Reitz
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-02 10:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

This series adds support to qemu-img, block and qcow2 for amending image
options on existing image files.

Depends on:
 - option: Add assigned flag to QEMUOptionParameter
 - qcow2-refcount: Snapshot update for zero clusters (series, v3)
 - Add metadata overlap checks (series, v5)

v4:
 - rebased on the metadata overlap check series (and fit to it)
 - split patch 2 into three distinct patches (2, 3 and 4)
 - extended test for zero expansion on backed and inactive backed clusters
   (and fixed according to the metadata overlap check series (i.e.,
   adjusted header length))
 - fixed zero expansion with shared L2 tables

v3:
 - deallocate non-preallocated zero clusters on non-backed images instead
   of zero expanding them
 - qcow2 version downgrade: error out on refcount_order != 4
 - implemented Eric's comments regarding the qemu-img amend and img_amend
   itself

v2:
 - Generally implemented Kevin's comments, especially:
   - Zero cluster expansion for inactive L2 tables
   - Correct handling of preallocated zero clusters
   - More test cases

Max Reitz (5):
  block: Image file option amendment
  qcow2-cluster: Expand zero clusters
  qcow2: Save refcount order in BDRVQcowState
  qcow2: Implement bdrv_amend_options
  qemu-iotest: qcow2 image option amendment

 block.c                    |   8 ++
 block/qcow2-cluster.c      | 228 +++++++++++++++++++++++++++++
 block/qcow2-refcount.c     |  29 ++--
 block/qcow2.c              | 198 ++++++++++++++++++++++++-
 block/qcow2.h              |   6 +
 include/block/block.h      |   2 +
 include/block/block_int.h  |   3 +
 qemu-img-cmds.hx           |   6 +
 qemu-img.c                 |  84 +++++++++++
 qemu-img.texi              |   5 +
 tests/qemu-iotests/061     | 178 +++++++++++++++++++++++
 tests/qemu-iotests/061.out | 349 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 13 files changed, 1082 insertions(+), 15 deletions(-)
 create mode 100755 tests/qemu-iotests/061
 create mode 100644 tests/qemu-iotests/061.out

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 1/5] block: Image file option amendment
  2013-09-02 10:04 [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment Max Reitz
@ 2013-09-02 10:04 ` Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters Max Reitz
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-02 10:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

This patch adds the "amend" option to qemu-img which allows changing
image options on existing image files. It also adds the generic bdrv
implementation which is basically just a wrapper for the image format
specific function.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c                   |  8 +++++
 include/block/block.h     |  2 ++
 include/block/block_int.h |  3 ++
 qemu-img-cmds.hx          |  6 ++++
 qemu-img.c                | 84 +++++++++++++++++++++++++++++++++++++++++++++++
 qemu-img.texi             |  5 +++
 6 files changed, 108 insertions(+)

diff --git a/block.c b/block.c
index a387c1a..9c40a15 100644
--- a/block.c
+++ b/block.c
@@ -4674,3 +4674,11 @@ void bdrv_add_before_write_notifier(BlockDriverState *bs,
 {
     notifier_with_return_list_add(&bs->before_write_notifiers, notifier);
 }
+
+int bdrv_amend_options(BlockDriverState *bs, QEMUOptionParameter *options)
+{
+    if (bs->drv->bdrv_amend_options == NULL) {
+        return -ENOTSUP;
+    }
+    return bs->drv->bdrv_amend_options(bs, options);
+}
diff --git a/include/block/block.h b/include/block/block.h
index e6b391c..c284b4a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -223,6 +223,8 @@ typedef enum {
 
 int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
 
+int bdrv_amend_options(BlockDriverState *bs_new, QEMUOptionParameter *options);
+
 /* async block I/O */
 typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
                                      int sector_num);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8012e25..3c93766 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -205,6 +205,9 @@ struct BlockDriver {
     int (*bdrv_check)(BlockDriverState* bs, BdrvCheckResult *result,
         BdrvCheckMode fix);
 
+    int (*bdrv_amend_options)(BlockDriverState *bs,
+        QEMUOptionParameter *options);
+
     void (*bdrv_debug_event)(BlockDriverState *bs, BlkDebugEvent event);
 
     /* TODO Better pass a option string/QDict/QemuOpts to add any rule? */
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 4ca7e95..5a066b5 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -61,5 +61,11 @@ DEF("resize", img_resize,
     "resize [-q] filename [+ | -]size")
 STEXI
 @item resize [-q] @var{filename} [+ | -]@var{size}
+ETEXI
+
+DEF("amend", img_amend,
+    "amend [-q] [-f fmt] -o options filename")
+STEXI
+@item amend [-q] [-f @var{fmt}] -o @var{options} @var{filename}
 @end table
 ETEXI
diff --git a/qemu-img.c b/qemu-img.c
index b9a848d..7a8f064 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2308,6 +2308,90 @@ out:
     return 0;
 }
 
+static int img_amend(int argc, char **argv)
+{
+    int c, ret = 0;
+    char *options = NULL;
+    QEMUOptionParameter *create_options = NULL, *options_param = NULL;
+    const char *fmt = NULL, *filename;
+    bool quiet = false;
+    BlockDriverState *bs = NULL;
+
+    for (;;) {
+        c = getopt(argc, argv, "hqf:o:");
+        if (c == -1) {
+            break;
+        }
+
+        switch (c) {
+            case 'h':
+            case '?':
+                help();
+                break;
+            case 'o':
+                options = optarg;
+                break;
+            case 'f':
+                fmt = optarg;
+                break;
+            case 'q':
+                quiet = true;
+                break;
+        }
+    }
+
+    if (optind != argc - 1) {
+        help();
+    }
+
+    if (!options) {
+        help();
+    }
+
+    filename = argv[argc - 1];
+
+    bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS | BDRV_O_RDWR, true, quiet);
+    if (!bs) {
+        error_report("Could not open image '%s'", filename);
+        ret = -1;
+        goto out;
+    }
+
+    fmt = bs->drv->format_name;
+
+    if (is_help_option(options)) {
+        ret = print_block_option_help(filename, fmt);
+        goto out;
+    }
+
+    create_options = append_option_parameters(create_options,
+            bs->drv->create_options);
+    options_param = parse_option_parameters(options, create_options,
+            options_param);
+    if (options_param == NULL) {
+        error_report("Invalid options for file format '%s'", fmt);
+        ret = -1;
+        goto out;
+    }
+
+    ret = bdrv_amend_options(bs, options_param);
+    if (ret < 0) {
+        error_report("Error while amending options: %s", strerror(-ret));
+        goto out;
+    }
+
+out:
+    if (bs) {
+        bdrv_delete(bs);
+    }
+    free_option_parameters(create_options);
+    free_option_parameters(options_param);
+    if (ret) {
+        return 1;
+    }
+    return 0;
+}
+
 static const img_cmd_t img_cmds[] = {
 #define DEF(option, callback, arg_string)        \
     { option, callback },
diff --git a/qemu-img.texi b/qemu-img.texi
index 69f1bda..8697f23 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -282,6 +282,11 @@ sizes accordingly.  Failure to do so will result in data loss!
 After using this command to grow a disk image, you must use file system and
 partitioning tools inside the VM to actually begin using the new space on the
 device.
+
+@item amend [-f @var{fmt}] -o @var{options} @var{filename}
+
+Amends the image format specific @var{options} for the image file
+@var{filename}. Not all file formats support this operation.
 @end table
 @c man end
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters
  2013-09-02 10:04 [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 1/5] block: " Max Reitz
@ 2013-09-02 10:04 ` Max Reitz
  2013-09-02 15:13   ` Kevin Wolf
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 3/5] qcow2: Save refcount order in BDRVQcowState Max Reitz
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Max Reitz @ 2013-09-02 10:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add functionality for expanding zero clusters. This is necessary for
downgrading the image version to one without zero cluster support.

For non-backed images, this function may also just discard zero clusters
instead of truly expanding them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c  | 228 +++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2-refcount.c |  29 ++++---
 block/qcow2.h          |   5 ++
 3 files changed, 248 insertions(+), 14 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 2d5aa92..c90fb51 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1497,3 +1497,231 @@ fail:
 
     return ret;
 }
+
+/*
+ * Expands all zero clusters in a specific L1 table (or deallocates them, for
+ * non-backed non-pre-allocated zero clusters).
+ *
+ * expanded_clusters is a bitmap where every bit corresponds to one cluster in
+ * the image file; a bit gets set if the corresponding cluster has been used for
+ * zero expansion (i.e., has been filled with zeroes and is referenced from an
+ * L2 table). nb_clusters contains the total cluster count of the image file,
+ * i.e., the number of bits in expanded_clusters.
+ */
+static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
+                                      int l1_size, uint8_t *expanded_clusters,
+                                      uint64_t nb_clusters)
+{
+    BDRVQcowState *s = bs->opaque;
+    bool is_active_l1 = (l1_table == s->l1_table);
+    uint64_t *l2_table = NULL;
+    int ret;
+    int i, j;
+
+    if (!is_active_l1) {
+        /* inactive L2 tables require a buffer to be stored in when loading
+         * them from disk */
+        l2_table = qemu_blockalign(bs, s->cluster_size);
+    }
+
+    for (i = 0; i < l1_size; i++) {
+        uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
+        bool l2_dirty = false;
+
+        if (!l2_offset) {
+            /* unallocated */
+            continue;
+        }
+
+        if (is_active_l1) {
+            /* get active L2 tables from cache */
+            ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
+                    (void **)&l2_table);
+        } else {
+            /* load inactive L2 tables from disk */
+            ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE,
+                    (void *)l2_table, s->cluster_sectors);
+        }
+        if (ret < 0) {
+            goto fail;
+        }
+
+        for (j = 0; j < s->l2_size; j++) {
+            uint64_t l2_entry = be64_to_cpu(l2_table[j]);
+            int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
+            int cluster_type = qcow2_get_cluster_type(l2_entry);
+
+            if (cluster_type == QCOW2_CLUSTER_NORMAL) {
+                cluster_index = offset >> s->cluster_bits;
+                assert((cluster_index >= 0) && (cluster_index < nb_clusters));
+                if (expanded_clusters[cluster_index / 8] &
+                    (1 << (cluster_index % 8))) {
+                    /* Probably a shared L2 table; this cluster was a zero
+                     * cluster which has been expanded, its refcount
+                     * therefore most likely requires an update. */
+                    ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
+                                                        QCOW2_DISCARD_NEVER);
+                    if (ret < 0) {
+                        goto fail;
+                    }
+                    /* Since we just increased the refcount, the COPIED flag may
+                     * no longer be set. */
+                    l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
+                    l2_dirty = true;
+                }
+                continue;
+            }
+            else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) {
+                continue;
+            }
+
+            if (!offset) {
+                /* not preallocated */
+                if (!bs->backing_hd) {
+                    /* not backed; therefore we can simply deallocate the
+                     * cluster */
+                    l2_table[j] = 0;
+                    l2_dirty = true;
+                    continue;
+                }
+
+                offset = qcow2_alloc_clusters(bs, s->cluster_size);
+                if (offset < 0) {
+                    ret = offset;
+                    goto fail;
+                }
+            }
+
+            ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_DEFAULT,
+                                                offset, s->cluster_size);
+            if (ret < 0) {
+                qcow2_free_clusters(bs, offset, s->cluster_size,
+                        QCOW2_DISCARD_ALWAYS);
+                goto fail;
+            }
+
+            ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
+                                    s->cluster_sectors);
+            if (ret < 0) {
+                qcow2_free_clusters(bs, offset, s->cluster_size,
+                        QCOW2_DISCARD_ALWAYS);
+                goto fail;
+            }
+
+            l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
+            l2_dirty = true;
+
+            cluster_index = offset >> s->cluster_bits;
+            assert((cluster_index >= 0) && (cluster_index < nb_clusters));
+            expanded_clusters[cluster_index / 8] |= 1 << (cluster_index % 8);
+        }
+
+        if (is_active_l1) {
+            if (l2_dirty) {
+                qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+                qcow2_cache_depends_on_flush(s->l2_table_cache);
+            }
+            ret = qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
+            if (ret < 0) {
+                l2_table = NULL;
+                goto fail;
+            }
+        } else {
+            if (l2_dirty) {
+                ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_DEFAULT &
+                        ~(QCOW2_OL_INACTIVE_L2 | QCOW2_OL_ACTIVE_L2), l2_offset,
+                        s->cluster_size);
+                if (ret < 0) {
+                    goto fail;
+                }
+
+                ret = bdrv_write(bs->file, l2_offset / BDRV_SECTOR_SIZE,
+                        (void *)l2_table, s->cluster_sectors);
+                if (ret < 0) {
+                    goto fail;
+                }
+            }
+        }
+    }
+
+    ret = 0;
+
+fail:
+    if (l2_table) {
+        if (!is_active_l1) {
+            qemu_vfree(l2_table);
+        } else {
+            if (ret < 0) {
+                qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
+            } else {
+                ret = qcow2_cache_put(bs, s->l2_table_cache,
+                        (void **)&l2_table);
+            }
+        }
+    }
+    return ret;
+}
+
+/*
+ * For backed images, expands all zero clusters on the image. For non-backed
+ * images, deallocates all non-pre-allocated zero clusters (and claims the
+ * allocation for pre-allocated ones). This is important for downgrading to a
+ * qcow2 version which doesn't yet support metadata zero clusters.
+ */
+int qcow2_expand_zero_clusters(BlockDriverState *bs)
+{
+    BDRVQcowState *s = bs->opaque;
+    uint64_t *l1_table = NULL;
+    uint64_t nb_clusters;
+    uint8_t *expanded_clusters;
+    int ret;
+    int i, j;
+
+    nb_clusters = bs->total_sectors >> (s->cluster_bits - BDRV_SECTOR_BITS);
+    expanded_clusters = g_malloc0(nb_clusters / 8);
+
+    ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
+                                     expanded_clusters, nb_clusters);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    /* Inactive L1 tables may point to active L2 tables - therefore it is
+     * necessary to flush the L2 table cache before trying to access the L2
+     * tables pointed to by inactive L1 entries (else we might try to expand
+     * zero clusters that have already been expanded) */
+    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    for (i = 0; i < s->nb_snapshots; i++) {
+        int l1_sectors = (s->snapshots[i].l1_size * sizeof(uint64_t) +
+                BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE;
+
+        l1_table = g_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);
+
+        ret = bdrv_read(bs->file, s->snapshots[i].l1_table_offset /
+                BDRV_SECTOR_SIZE, (void *)l1_table, l1_sectors);
+        if (ret < 0) {
+            goto fail;
+        }
+
+        for (j = 0; j < s->snapshots[i].l1_size; j++) {
+            be64_to_cpus(&l1_table[j]);
+        }
+
+        ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
+                                         expanded_clusters, nb_clusters);
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    ret = 0;
+
+fail:
+    g_free(expanded_clusters);
+    g_free(l1_table);
+    return ret;
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ba129de..4264148 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -601,10 +601,10 @@ fail:
  * If the return value is non-negative, it is the new refcount of the cluster.
  * If it is negative, it is -errno and indicates an error.
  */
-static int update_cluster_refcount(BlockDriverState *bs,
-                                   int64_t cluster_index,
-                                   int addend,
-                                   enum qcow2_discard_type type)
+int qcow2_update_cluster_refcount(BlockDriverState *bs,
+                                  int64_t cluster_index,
+                                  int addend,
+                                  enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
     int ret;
@@ -733,8 +733,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if (free_in_cluster == 0)
             s->free_byte_offset = 0;
         if ((offset & (s->cluster_size - 1)) != 0)
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
-                                    QCOW2_DISCARD_NEVER);
+            qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+                                          QCOW2_DISCARD_NEVER);
     } else {
         offset = qcow2_alloc_clusters(bs, s->cluster_size);
         if (offset < 0) {
@@ -744,8 +744,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if ((cluster_offset + s->cluster_size) == offset) {
             /* we are lucky: contiguous data */
             offset = s->free_byte_offset;
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
-                                    QCOW2_DISCARD_NEVER);
+            qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+                                          QCOW2_DISCARD_NEVER);
             s->free_byte_offset += size;
         } else {
             s->free_byte_offset = offset;
@@ -754,8 +754,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
     }
 
     /* The cluster refcount was incremented, either by qcow2_alloc_clusters()
-     * or explicitly by update_cluster_refcount().  Refcount blocks must be
-     * flushed before the caller's L2 table updates.
+     * or explicitly by qcow2_update_cluster_refcount().  Refcount blocks must
+     * be flushed before the caller's L2 table updates.
      */
     qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
     return offset;
@@ -896,8 +896,9 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                             break;
                         }
                         if (addend != 0) {
-                            refcount = update_cluster_refcount(bs, cluster_index, addend,
-                                                               QCOW2_DISCARD_SNAPSHOT);
+                            refcount = qcow2_update_cluster_refcount(bs,
+                                    cluster_index, addend,
+                                    QCOW2_DISCARD_SNAPSHOT);
                         } else {
                             refcount = get_refcount(bs, cluster_index);
                         }
@@ -936,8 +937,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 
 
             if (addend != 0) {
-                refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend,
-                                                   QCOW2_DISCARD_SNAPSHOT);
+                refcount = qcow2_update_cluster_refcount(bs, l2_offset >>
+                        s->cluster_bits, addend, QCOW2_DISCARD_SNAPSHOT);
             } else {
                 refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
             }
diff --git a/block/qcow2.h b/block/qcow2.h
index 10b7bf4..25176ac 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -406,6 +406,9 @@ int qcow2_update_header(BlockDriverState *bs);
 int qcow2_refcount_init(BlockDriverState *bs);
 void qcow2_refcount_close(BlockDriverState *bs);
 
+int qcow2_update_cluster_refcount(BlockDriverState *bs, int64_t cluster_index,
+                                  int addend, enum qcow2_discard_type type);
+
 int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size);
 int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     int nb_clusters);
@@ -453,6 +456,8 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
     int nb_sectors);
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
 
+int qcow2_expand_zero_clusters(BlockDriverState *bs);
+
 /* qcow2-snapshot.c functions */
 int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
 int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 3/5] qcow2: Save refcount order in BDRVQcowState
  2013-09-02 10:04 [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 1/5] block: " Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters Max Reitz
@ 2013-09-02 10:04 ` Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 4/5] qcow2: Implement bdrv_amend_options Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment Max Reitz
  4 siblings, 0 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-02 10:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.c | 3 ++-
 block/qcow2.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index aeb2ebb..28b104e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -455,6 +455,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
         ret = -ENOTSUP;
         goto fail;
     }
+    s->refcount_order = header.refcount_order;
 
     if (header.cluster_bits < MIN_CLUSTER_BITS ||
         header.cluster_bits > MAX_CLUSTER_BITS) {
@@ -1133,7 +1134,7 @@ int qcow2_update_header(BlockDriverState *bs)
         .incompatible_features  = cpu_to_be64(s->incompatible_features),
         .compatible_features    = cpu_to_be64(s->compatible_features),
         .autoclear_features     = cpu_to_be64(s->autoclear_features),
-        .refcount_order         = cpu_to_be32(3 + REFCOUNT_SHIFT),
+        .refcount_order         = cpu_to_be32(s->refcount_order),
         .header_length          = cpu_to_be32(header_length),
     };
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 25176ac..b070709 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -199,6 +199,7 @@ typedef struct BDRVQcowState {
     int flags;
     int qcow_version;
     bool use_lazy_refcounts;
+    int refcount_order;
 
     bool discard_passthrough[QCOW2_DISCARD_MAX];
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 4/5] qcow2: Implement bdrv_amend_options
  2013-09-02 10:04 [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment Max Reitz
                   ` (2 preceding siblings ...)
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 3/5] qcow2: Save refcount order in BDRVQcowState Max Reitz
@ 2013-09-02 10:04 ` Max Reitz
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment Max Reitz
  4 siblings, 0 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-02 10:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.

Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.c | 195 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 195 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 28b104e..254cf8c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1813,6 +1813,200 @@ static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
     return ret;
 }
 
+/*
+ * Downgrades an image's version. To achieve this, any incompatible features
+ * have to be removed.
+ */
+static int qcow2_downgrade(BlockDriverState *bs, int target_version)
+{
+    BDRVQcowState *s = bs->opaque;
+    int current_version = s->qcow_version;
+    int ret;
+
+    if (target_version == current_version) {
+        return 0;
+    } else if (target_version > current_version) {
+        return -EINVAL;
+    } else if (target_version != 2) {
+        return -EINVAL;
+    }
+
+    if (s->refcount_order != 4) {
+        /* we would have to convert the image to a refcount_order == 4 image
+         * here; however, since qemu (at the time of writing this) does not
+         * support anything different than 4 anyway, there is no point in doing
+         * so right now; however, we should error out (if qemu supports this in
+         * the future and this code has not been adapted) */
+        error_report("qcow2_downgrade: Image refcount orders other than 4 are"
+                     "currently not supported.");
+        return -ENOTSUP;
+    }
+
+    /* clear incompatible features */
+    if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
+        ret = qcow2_mark_clean(bs);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    /* with QCOW2_INCOMPAT_CORRUPT, it is pretty much impossible to get here in
+     * the first place; if that happens nonetheless, returning -ENOTSUP is the
+     * best thing to do anyway */
+
+    if (s->incompatible_features) {
+        return -ENOTSUP;
+    }
+
+    /* since we can ignore compatible features, we can set them to 0 as well */
+    s->compatible_features = 0;
+    /* if lazy refcounts have been used, they have already been fixed through
+     * clearing the dirty flag */
+
+    /* clearing autoclear features is trivial */
+    s->autoclear_features = 0;
+
+    ret = qcow2_expand_zero_clusters(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    s->qcow_version = target_version;
+    ret = qcow2_update_header(bs);
+    if (ret < 0) {
+        s->qcow_version = current_version;
+        return ret;
+    }
+    return 0;
+}
+
+static int qcow2_amend_options(BlockDriverState *bs,
+                               QEMUOptionParameter *options)
+{
+    BDRVQcowState *s = bs->opaque;
+    int old_version = s->qcow_version, new_version = old_version;
+    uint64_t new_size = 0;
+    const char *backing_file = NULL, *backing_format = NULL;
+    bool lazy_refcounts = s->use_lazy_refcounts;
+    int ret;
+    int i;
+
+    for (i = 0; options[i].name; i++)
+    {
+        if (!strcmp(options[i].name, "compat")) {
+            if (!options[i].value.s) {
+                /* preserve default */
+            } else if (!strcmp(options[i].value.s, "0.10")) {
+                new_version = 2;
+            } else if (!strcmp(options[i].value.s, "1.1")) {
+                new_version = 3;
+            } else {
+                fprintf(stderr, "Unknown compatibility level %s.\n",
+                        options[i].value.s);
+                return -EINVAL;
+            }
+        } else if (!strcmp(options[i].name, "preallocation")) {
+            if (options[i].assigned) {
+                fprintf(stderr, "Cannot change preallocation mode.\n");
+                return -ENOTSUP;
+            }
+        } else if (!strcmp(options[i].name, "size")) {
+            new_size = options[i].value.n;
+        } else if (!strcmp(options[i].name, "backing_file")) {
+            backing_file = options[i].value.s;
+        } else if (!strcmp(options[i].name, "backing_fmt")) {
+            backing_format = options[i].value.s;
+        } else if (!strcmp(options[i].name, "encryption")) {
+            if (options[i].assigned &&
+                (options[i].value.n != !!s->crypt_method)) {
+                fprintf(stderr, "Changing the encryption flag is not "
+                        "supported.\n");
+                return -ENOTSUP;
+            }
+        } else if (!strcmp(options[i].name, "cluster_size")) {
+            if (options[i].assigned && (options[i].value.n != s->cluster_size))
+            {
+                fprintf(stderr, "Changing the cluster size is not "
+                        "supported.\n");
+                return -ENOTSUP;
+            }
+        } else if (!strcmp(options[i].name, "lazy_refcounts")) {
+            if (options[i].assigned) {
+                lazy_refcounts = options[i].value.n;
+            }
+        } else {
+            /* if this assertion fails, this probably means a new option was
+             * added without having it covered here */
+            assert(false);
+        }
+    }
+
+    if (new_version != old_version) {
+        if (new_version > old_version) {
+            /* Upgrade */
+            s->qcow_version = new_version;
+            ret = qcow2_update_header(bs);
+            if (ret < 0) {
+                s->qcow_version = old_version;
+                return ret;
+            }
+        } else {
+            ret = qcow2_downgrade(bs, new_version);
+            if (ret < 0) {
+                return ret;
+            }
+        }
+    }
+
+    if (new_size) {
+        ret = qcow2_truncate(bs, new_size);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if (backing_file || backing_format) {
+        ret = qcow2_change_backing_file(bs, backing_file ?: bs->backing_file,
+                                        backing_format ?: bs->backing_format);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if (s->use_lazy_refcounts != lazy_refcounts) {
+        if (lazy_refcounts) {
+            if (s->qcow_version < 3) {
+                fprintf(stderr, "Lazy refcounts only supported with compatibility "
+                        "level 1.1 and above (use compat=1.1 or greater)\n");
+                return -EINVAL;
+            }
+            s->compatible_features |= QCOW2_COMPAT_LAZY_REFCOUNTS;
+            ret = qcow2_update_header(bs);
+            if (ret < 0) {
+                s->compatible_features &= ~QCOW2_COMPAT_LAZY_REFCOUNTS;
+                return ret;
+            }
+            s->use_lazy_refcounts = true;
+        } else {
+            /* make image clean first */
+            ret = qcow2_mark_clean(bs);
+            if (ret < 0) {
+                return ret;
+            }
+            /* now disallow lazy refcounts */
+            s->compatible_features &= ~QCOW2_COMPAT_LAZY_REFCOUNTS;
+            ret = qcow2_update_header(bs);
+            if (ret < 0) {
+                s->compatible_features |= QCOW2_COMPAT_LAZY_REFCOUNTS;
+                return ret;
+            }
+            s->use_lazy_refcounts = false;
+        }
+    }
+
+    return 0;
+}
+
 static QEMUOptionParameter qcow2_create_options[] = {
     {
         .name = BLOCK_OPT_SIZE,
@@ -1896,6 +2090,7 @@ static BlockDriver bdrv_qcow2 = {
 
     .create_options = qcow2_create_options,
     .bdrv_check = qcow2_check,
+    .bdrv_amend_options = qcow2_amend_options,
 };
 
 static void bdrv_qcow2_init(void)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment
  2013-09-02 10:04 [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment Max Reitz
                   ` (3 preceding siblings ...)
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 4/5] qcow2: Implement bdrv_amend_options Max Reitz
@ 2013-09-02 10:04 ` Max Reitz
  2013-09-02 15:36   ` Kevin Wolf
  2013-09-03  8:23   ` Kevin Wolf
  4 siblings, 2 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-02 10:04 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add tests for qemu-img amend on qcow2 image files.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/061     | 178 +++++++++++++++++++++++
 tests/qemu-iotests/061.out | 349 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 528 insertions(+)
 create mode 100755 tests/qemu-iotests/061
 create mode 100644 tests/qemu-iotests/061.out

diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
new file mode 100755
index 0000000..86404e6
--- /dev/null
+++ b/tests/qemu-iotests/061
@@ -0,0 +1,178 @@
+#!/bin/bash
+#
+# Test case for image option amendment in qcow2.
+#
+# Copyright (C) 2013 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mreitz@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# This tests qocw2-specific low-level functionality
+_supported_fmt qcow2
+_supported_proto generic
+_supported_os Linux
+
+echo
+echo "=== Testing version downgrade with zero expansion ==="
+echo
+IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
+$QEMU_IO -c "write -z 0 128k" "$TEST_IMG" | _filter_qemu_io
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+_check_test_img
+
+echo
+echo "=== Testing dirty version downgrade ==="
+echo
+IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 0 128k" -c flush -c abort "$TEST_IMG" | _filter_qemu_io
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IO -c "read -P 0x2a 0 128k" "$TEST_IMG" | _filter_qemu_io
+_check_test_img
+
+echo
+echo "=== Testing version downgrade with unknown compat/autoclear flags ==="
+echo
+IMGOPTS="compat=1.1" _make_test_img 64M
+./qcow2.py "$TEST_IMG" set-feature-bit compatible 42
+./qcow2.py "$TEST_IMG" set-feature-bit autoclear 42
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+./qcow2.py "$TEST_IMG" dump-header
+_check_test_img
+
+echo
+echo "=== Testing version upgrade and resize ==="
+echo
+IMGOPTS="compat=0.10" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "compat=1.1,lazy_refcounts=on,size=128M" "$TEST_IMG"
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
+_check_test_img
+
+echo
+echo "=== Testing dirty lazy_refcounts=off ==="
+echo
+IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 0 128k" -c flush -c abort "$TEST_IMG" | _filter_qemu_io
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "lazy_refcounts=off" "$TEST_IMG"
+./qcow2.py "$TEST_IMG" dump-header
+$QEMU_IO -c "read -P 0x2a 0 128k" "$TEST_IMG" | _filter_qemu_io
+_check_test_img
+
+echo
+echo "=== Testing backing file ==="
+echo
+IMGOPTS="compat=1.1" _make_test_img 64M
+IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 0 128k" "$TEST_IMG.base" | _filter_qemu_io
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG amend -o "backing_file=$TEST_IMG.base,backing_fmt=qcow2" "$TEST_IMG"
+$QEMU_IO -c "read -P 0x2a 0 128k" "$TEST_IMG" | _filter_qemu_io
+_check_test_img
+
+echo
+echo "=== Testing invalid configurations ==="
+echo
+IMGOPTS="compat=0.10" _make_test_img 64M
+$QEMU_IMG amend -o "lazy_refcounts=on" "$TEST_IMG"
+$QEMU_IMG amend -o "compat=1.1" "$TEST_IMG" # actually valid
+$QEMU_IMG amend -o "compat=0.10,lazy_refcounts=on" "$TEST_IMG"
+$QEMU_IMG amend -o "compat=0.42" "$TEST_IMG"
+$QEMU_IMG amend -o "foo=bar" "$TEST_IMG"
+$QEMU_IMG amend -o "cluster_size=1k" "$TEST_IMG"
+$QEMU_IMG amend -o "encryption=on" "$TEST_IMG"
+$QEMU_IMG amend -o "preallocation=on" "$TEST_IMG"
+
+echo
+echo "=== Testing correct handling of unset value ==="
+echo
+IMGOPTS="compat=1.1,cluster_size=1k" _make_test_img 64M
+echo "Should work:"
+$QEMU_IMG amend -o "lazy_refcounts=on" "$TEST_IMG"
+echo "Should not work:" # Just to know which of these tests actually fails
+$QEMU_IMG amend -o "cluster_size=64k" "$TEST_IMG"
+
+echo
+echo "=== Testing zero expansion on inactive clusters ==="
+echo
+IMGOPTS="compat=1.1" _make_test_img 64M
+$QEMU_IO -c "write -z 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -c foo "$TEST_IMG"
+$QEMU_IO -c "write -P 0x2a 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0x2a 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -a foo "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 128k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing zero expansion on backed image ==="
+echo
+IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 0 128k" "$TEST_IMG.base" | _filter_qemu_io
+IMGOPTS="compat=1.1,backing_file=$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "read -P 0x2a 0 128k" -c "write -z 0 64k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 64k" -c "read -P 0x2a 64k 64k" "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "=== Testing zero expansion on backed inactive clusters ==="
+echo
+IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "write -P 0x2a 0 128k" "$TEST_IMG.base" | _filter_qemu_io
+IMGOPTS="compat=1.1,backing_file=$TEST_IMG.base" _make_test_img 64M
+$QEMU_IO -c "write -z 0 64k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -c foo "$TEST_IMG"
+$QEMU_IO -c "write -P 0x42 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0x42 0 128k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -a foo "$TEST_IMG"
+_check_test_img
+$QEMU_IO -c "read -P 0 0 64k" -c "read -P 0x2a 64k 64k" "$TEST_IMG" | _filter_qemu_io
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
new file mode 100644
index 0000000..24432d6
--- /dev/null
+++ b/tests/qemu-iotests/061.out
@@ -0,0 +1,349 @@
+QA output created by 061
+
+=== Testing version downgrade with zero expansion ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x1
+autoclear_features        0x0
+refcount_order            4
+header_length             104
+
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x0
+autoclear_features        0x0
+refcount_order            4
+header_length             72
+
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+=== Testing dirty version downgrade ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x1
+compatible_features       0x1
+autoclear_features        0x0
+refcount_order            4
+header_length             104
+
+Repairing cluster 5 refcount=0 reference=1
+Repairing cluster 6 refcount=0 reference=1
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x0
+autoclear_features        0x0
+refcount_order            4
+header_length             72
+
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+=== Testing version downgrade with unknown compat/autoclear flags ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x40000000000
+autoclear_features        0x40000000000
+refcount_order            4
+header_length             104
+
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x0
+autoclear_features        0x0
+refcount_order            4
+header_length             72
+
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
+No errors were found on the image.
+
+=== Testing version upgrade and resize ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 65536/65536 bytes at offset 44040192
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x0
+autoclear_features        0x0
+refcount_order            4
+header_length             72
+
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x1
+autoclear_features        0x0
+refcount_order            4
+header_length             104
+
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
+read 65536/65536 bytes at offset 44040192
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+=== Testing dirty lazy_refcounts=off ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x1
+compatible_features       0x1
+autoclear_features        0x0
+refcount_order            4
+header_length             104
+
+Repairing cluster 5 refcount=0 reference=1
+Repairing cluster 6 refcount=0 reference=1
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              0
+snapshot_offset           0x0
+incompatible_features     0x0
+compatible_features       0x0
+autoclear_features        0x0
+refcount_order            4
+header_length             104
+
+Header extension:
+magic                     0x6803f857
+length                    144
+data                      <binary>
+
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+=== Testing backing file ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+
+=== Testing invalid configurations ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater)
+qemu-img: Error while amending options: Invalid argument
+Lazy refcounts only supported with compatibility level 1.1 and above (use compat=1.1 or greater)
+qemu-img: Error while amending options: Invalid argument
+Unknown compatibility level 0.42.
+qemu-img: Error while amending options: Invalid argument
+Unknown option 'foo'
+qemu-img: Invalid options for file format 'qcow2'
+Changing the cluster size is not supported.
+qemu-img: Error while amending options: Operation not supported
+Changing the encryption flag is not supported.
+qemu-img: Error while amending options: Operation not supported
+Cannot change preallocation mode.
+qemu-img: Error while amending options: Operation not supported
+
+=== Testing correct handling of unset value ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+Should work:
+Should not work:
+Changing the cluster size is not supported.
+qemu-img: Error while amending options: Operation not supported
+
+=== Testing zero expansion on inactive clusters ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Testing zero expansion on backed image ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Testing zero expansion on backed inactive clusters ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+No errors were found on the image.
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index d46bd6b..ff3b2e6 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -65,4 +65,5 @@
 056 rw auto backing
 059 rw auto
 060 rw auto
+061 rw auto
 062 rw auto
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters Max Reitz
@ 2013-09-02 15:13   ` Kevin Wolf
  2013-09-03  7:11     ` Max Reitz
  0 siblings, 1 reply; 11+ messages in thread
From: Kevin Wolf @ 2013-09-02 15:13 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

Am 02.09.2013 um 12:04 hat Max Reitz geschrieben:
> Add functionality for expanding zero clusters. This is necessary for
> downgrading the image version to one without zero cluster support.
> 
> For non-backed images, this function may also just discard zero clusters
> instead of truly expanding them.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cluster.c  | 228 +++++++++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2-refcount.c |  29 ++++---
>  block/qcow2.h          |   5 ++
>  3 files changed, 248 insertions(+), 14 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 2d5aa92..c90fb51 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1497,3 +1497,231 @@ fail:
>  
>      return ret;
>  }
> +
> +/*
> + * Expands all zero clusters in a specific L1 table (or deallocates them, for
> + * non-backed non-pre-allocated zero clusters).
> + *
> + * expanded_clusters is a bitmap where every bit corresponds to one cluster in
> + * the image file; a bit gets set if the corresponding cluster has been used for
> + * zero expansion (i.e., has been filled with zeroes and is referenced from an
> + * L2 table). nb_clusters contains the total cluster count of the image file,
> + * i.e., the number of bits in expanded_clusters.
> + */
> +static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
> +                                      int l1_size, uint8_t *expanded_clusters,
> +                                      uint64_t nb_clusters)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    bool is_active_l1 = (l1_table == s->l1_table);
> +    uint64_t *l2_table = NULL;
> +    int ret;
> +    int i, j;
> +
> +    if (!is_active_l1) {
> +        /* inactive L2 tables require a buffer to be stored in when loading
> +         * them from disk */
> +        l2_table = qemu_blockalign(bs, s->cluster_size);
> +    }
> +
> +    for (i = 0; i < l1_size; i++) {
> +        uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
> +        bool l2_dirty = false;
> +
> +        if (!l2_offset) {
> +            /* unallocated */
> +            continue;
> +        }
> +
> +        if (is_active_l1) {
> +            /* get active L2 tables from cache */
> +            ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
> +                    (void **)&l2_table);
> +        } else {
> +            /* load inactive L2 tables from disk */
> +            ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE,
> +                    (void *)l2_table, s->cluster_sectors);
> +        }
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +
> +        for (j = 0; j < s->l2_size; j++) {
> +            uint64_t l2_entry = be64_to_cpu(l2_table[j]);
> +            int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
> +            int cluster_type = qcow2_get_cluster_type(l2_entry);
> +
> +            if (cluster_type == QCOW2_CLUSTER_NORMAL) {
> +                cluster_index = offset >> s->cluster_bits;
> +                assert((cluster_index >= 0) && (cluster_index < nb_clusters));
> +                if (expanded_clusters[cluster_index / 8] &
> +                    (1 << (cluster_index % 8))) {
> +                    /* Probably a shared L2 table; this cluster was a zero
> +                     * cluster which has been expanded, its refcount
> +                     * therefore most likely requires an update. */
> +                    ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
> +                                                        QCOW2_DISCARD_NEVER);
> +                    if (ret < 0) {
> +                        goto fail;
> +                    }
> +                    /* Since we just increased the refcount, the COPIED flag may
> +                     * no longer be set. */
> +                    l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
> +                    l2_dirty = true;
> +                }
> +                continue;
> +            }
> +            else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) {
> +                continue;
> +            }
> +
> +            if (!offset) {
> +                /* not preallocated */
> +                if (!bs->backing_hd) {
> +                    /* not backed; therefore we can simply deallocate the
> +                     * cluster */
> +                    l2_table[j] = 0;
> +                    l2_dirty = true;
> +                    continue;
> +                }
> +
> +                offset = qcow2_alloc_clusters(bs, s->cluster_size);
> +                if (offset < 0) {
> +                    ret = offset;
> +                    goto fail;
> +                }
> +            }
> +
> +            ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_DEFAULT,
> +                                                offset, s->cluster_size);
> +            if (ret < 0) {
> +                qcow2_free_clusters(bs, offset, s->cluster_size,
> +                        QCOW2_DISCARD_ALWAYS);
> +                goto fail;
> +            }
> +
> +            ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
> +                                    s->cluster_sectors);
> +            if (ret < 0) {
> +                qcow2_free_clusters(bs, offset, s->cluster_size,
> +                        QCOW2_DISCARD_ALWAYS);
> +                goto fail;
> +            }
> +
> +            l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
> +            l2_dirty = true;
> +
> +            cluster_index = offset >> s->cluster_bits;
> +            assert((cluster_index >= 0) && (cluster_index < nb_clusters));
> +            expanded_clusters[cluster_index / 8] |= 1 << (cluster_index % 8);
> +        }
> +
> +        if (is_active_l1) {
> +            if (l2_dirty) {
> +                qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
> +                qcow2_cache_depends_on_flush(s->l2_table_cache);
> +            }
> +            ret = qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
> +            if (ret < 0) {
> +                l2_table = NULL;
> +                goto fail;
> +            }
> +        } else {
> +            if (l2_dirty) {
> +                ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_DEFAULT &
> +                        ~(QCOW2_OL_INACTIVE_L2 | QCOW2_OL_ACTIVE_L2), l2_offset,
> +                        s->cluster_size);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +
> +                ret = bdrv_write(bs->file, l2_offset / BDRV_SECTOR_SIZE,
> +                        (void *)l2_table, s->cluster_sectors);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +            }
> +        }
> +    }
> +
> +    ret = 0;
> +
> +fail:
> +    if (l2_table) {
> +        if (!is_active_l1) {
> +            qemu_vfree(l2_table);
> +        } else {
> +            if (ret < 0) {
> +                qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
> +            } else {
> +                ret = qcow2_cache_put(bs, s->l2_table_cache,
> +                        (void **)&l2_table);
> +            }
> +        }
> +    }
> +    return ret;
> +}
> +
> +/*
> + * For backed images, expands all zero clusters on the image. For non-backed
> + * images, deallocates all non-pre-allocated zero clusters (and claims the
> + * allocation for pre-allocated ones). This is important for downgrading to a
> + * qcow2 version which doesn't yet support metadata zero clusters.
> + */
> +int qcow2_expand_zero_clusters(BlockDriverState *bs)
> +{
> +    BDRVQcowState *s = bs->opaque;
> +    uint64_t *l1_table = NULL;
> +    uint64_t nb_clusters;
> +    uint8_t *expanded_clusters;
> +    int ret;
> +    int i, j;
> +
> +    nb_clusters = bs->total_sectors >> (s->cluster_bits - BDRV_SECTOR_BITS);

Shouldn't this value actually be rounded up?

> +    expanded_clusters = g_malloc0(nb_clusters / 8);

Just like the size here?

> +
> +    ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
> +                                     expanded_clusters, nb_clusters);
> +    if (ret < 0) {
> +        goto fail;
> +    }
> +
> +    /* Inactive L1 tables may point to active L2 tables - therefore it is
> +     * necessary to flush the L2 table cache before trying to access the L2
> +     * tables pointed to by inactive L1 entries (else we might try to expand
> +     * zero clusters that have already been expanded) */
> +    ret = qcow2_cache_flush(bs, s->l2_table_cache);
> +    if (ret < 0) {
> +        goto fail;
> +    }

I'm afraid that this is not enough in fact. If you modify an active L2
table through an inactive L1, you bypass the cache and use bdrv_write().
The cache will still have a stale copy of the table.

Can't we just access inactive L2 tables through the cache as well? Or
can we get into trouble in other places if we do that? If so, we'd have
to implement a function that flushes _and empties_ the cache. (And then
we could always go the bdrv_read/write path, so there's some code to
save either way)

> +    for (i = 0; i < s->nb_snapshots; i++) {
> +        int l1_sectors = (s->snapshots[i].l1_size * sizeof(uint64_t) +
> +                BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE;
> +
> +        l1_table = g_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);
> +
> +        ret = bdrv_read(bs->file, s->snapshots[i].l1_table_offset /
> +                BDRV_SECTOR_SIZE, (void *)l1_table, l1_sectors);
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +
> +        for (j = 0; j < s->snapshots[i].l1_size; j++) {
> +            be64_to_cpus(&l1_table[j]);
> +        }
> +
> +        ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
> +                                         expanded_clusters, nb_clusters);
> +        if (ret < 0) {
> +            goto fail;
> +        }
> +    }
> +
> +    ret = 0;
> +
> +fail:
> +    g_free(expanded_clusters);
> +    g_free(l1_table);
> +    return ret;
> +}

Kevin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment Max Reitz
@ 2013-09-02 15:36   ` Kevin Wolf
  2013-09-03  7:12     ` Max Reitz
  2013-09-03  8:23   ` Kevin Wolf
  1 sibling, 1 reply; 11+ messages in thread
From: Kevin Wolf @ 2013-09-02 15:36 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

Am 02.09.2013 um 12:04 hat Max Reitz geschrieben:
> Add tests for qemu-img amend on qcow2 image files.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/061     | 178 +++++++++++++++++++++++
>  tests/qemu-iotests/061.out | 349 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 528 insertions(+)
>  create mode 100755 tests/qemu-iotests/061
>  create mode 100644 tests/qemu-iotests/061.out

> +echo
> +echo "=== Testing version upgrade and resize ==="
> +echo
> +IMGOPTS="compat=0.10" _make_test_img 64M
> +$QEMU_IO -c "write -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
> +./qcow2.py "$TEST_IMG" dump-header
> +$QEMU_IMG amend -o "compat=1.1,lazy_refcounts=on,size=128M" "$TEST_IMG"
> +./qcow2.py "$TEST_IMG" dump-header
> +$QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
> +_check_test_img

Your reference output isn't correct for this test: It expects a 64 MB
image after the amend. Looks like there's a bug somewhere.

Kevin

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters
  2013-09-02 15:13   ` Kevin Wolf
@ 2013-09-03  7:11     ` Max Reitz
  0 siblings, 0 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-03  7:11 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi

Am 02.09.2013 17:13, schrieb Kevin Wolf:
> Am 02.09.2013 um 12:04 hat Max Reitz geschrieben:
>> Add functionality for expanding zero clusters. This is necessary for
>> downgrading the image version to one without zero cluster support.
>>
>> For non-backed images, this function may also just discard zero clusters
>> instead of truly expanding them.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qcow2-cluster.c  | 228 +++++++++++++++++++++++++++++++++++++++++++++++++
>>   block/qcow2-refcount.c |  29 ++++---
>>   block/qcow2.h          |   5 ++
>>   3 files changed, 248 insertions(+), 14 deletions(-)
>>
>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>> index 2d5aa92..c90fb51 100644
>> --- a/block/qcow2-cluster.c
>> +++ b/block/qcow2-cluster.c
>> @@ -1497,3 +1497,231 @@ fail:
>>   
>>       return ret;
>>   }
>> +
>> +/*
>> + * Expands all zero clusters in a specific L1 table (or deallocates them, for
>> + * non-backed non-pre-allocated zero clusters).
>> + *
>> + * expanded_clusters is a bitmap where every bit corresponds to one cluster in
>> + * the image file; a bit gets set if the corresponding cluster has been used for
>> + * zero expansion (i.e., has been filled with zeroes and is referenced from an
>> + * L2 table). nb_clusters contains the total cluster count of the image file,
>> + * i.e., the number of bits in expanded_clusters.
>> + */
>> +static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>> +                                      int l1_size, uint8_t *expanded_clusters,
>> +                                      uint64_t nb_clusters)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
>> +    bool is_active_l1 = (l1_table == s->l1_table);
>> +    uint64_t *l2_table = NULL;
>> +    int ret;
>> +    int i, j;
>> +
>> +    if (!is_active_l1) {
>> +        /* inactive L2 tables require a buffer to be stored in when loading
>> +         * them from disk */
>> +        l2_table = qemu_blockalign(bs, s->cluster_size);
>> +    }
>> +
>> +    for (i = 0; i < l1_size; i++) {
>> +        uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>> +        bool l2_dirty = false;
>> +
>> +        if (!l2_offset) {
>> +            /* unallocated */
>> +            continue;
>> +        }
>> +
>> +        if (is_active_l1) {
>> +            /* get active L2 tables from cache */
>> +            ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
>> +                    (void **)&l2_table);
>> +        } else {
>> +            /* load inactive L2 tables from disk */
>> +            ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE,
>> +                    (void *)l2_table, s->cluster_sectors);
>> +        }
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +
>> +        for (j = 0; j < s->l2_size; j++) {
>> +            uint64_t l2_entry = be64_to_cpu(l2_table[j]);
>> +            int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
>> +            int cluster_type = qcow2_get_cluster_type(l2_entry);
>> +
>> +            if (cluster_type == QCOW2_CLUSTER_NORMAL) {
>> +                cluster_index = offset >> s->cluster_bits;
>> +                assert((cluster_index >= 0) && (cluster_index < nb_clusters));
>> +                if (expanded_clusters[cluster_index / 8] &
>> +                    (1 << (cluster_index % 8))) {
>> +                    /* Probably a shared L2 table; this cluster was a zero
>> +                     * cluster which has been expanded, its refcount
>> +                     * therefore most likely requires an update. */
>> +                    ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
>> +                                                        QCOW2_DISCARD_NEVER);
>> +                    if (ret < 0) {
>> +                        goto fail;
>> +                    }
>> +                    /* Since we just increased the refcount, the COPIED flag may
>> +                     * no longer be set. */
>> +                    l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
>> +                    l2_dirty = true;
>> +                }
>> +                continue;
>> +            }
>> +            else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) {
>> +                continue;
>> +            }
>> +
>> +            if (!offset) {
>> +                /* not preallocated */
>> +                if (!bs->backing_hd) {
>> +                    /* not backed; therefore we can simply deallocate the
>> +                     * cluster */
>> +                    l2_table[j] = 0;
>> +                    l2_dirty = true;
>> +                    continue;
>> +                }
>> +
>> +                offset = qcow2_alloc_clusters(bs, s->cluster_size);
>> +                if (offset < 0) {
>> +                    ret = offset;
>> +                    goto fail;
>> +                }
>> +            }
>> +
>> +            ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_DEFAULT,
>> +                                                offset, s->cluster_size);
>> +            if (ret < 0) {
>> +                qcow2_free_clusters(bs, offset, s->cluster_size,
>> +                        QCOW2_DISCARD_ALWAYS);
>> +                goto fail;
>> +            }
>> +
>> +            ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE,
>> +                                    s->cluster_sectors);
>> +            if (ret < 0) {
>> +                qcow2_free_clusters(bs, offset, s->cluster_size,
>> +                        QCOW2_DISCARD_ALWAYS);
>> +                goto fail;
>> +            }
>> +
>> +            l2_table[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
>> +            l2_dirty = true;
>> +
>> +            cluster_index = offset >> s->cluster_bits;
>> +            assert((cluster_index >= 0) && (cluster_index < nb_clusters));
>> +            expanded_clusters[cluster_index / 8] |= 1 << (cluster_index % 8);
>> +        }
>> +
>> +        if (is_active_l1) {
>> +            if (l2_dirty) {
>> +                qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
>> +                qcow2_cache_depends_on_flush(s->l2_table_cache);
>> +            }
>> +            ret = qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
>> +            if (ret < 0) {
>> +                l2_table = NULL;
>> +                goto fail;
>> +            }
>> +        } else {
>> +            if (l2_dirty) {
>> +                ret = qcow2_pre_write_overlap_check(bs, QCOW2_OL_DEFAULT &
>> +                        ~(QCOW2_OL_INACTIVE_L2 | QCOW2_OL_ACTIVE_L2), l2_offset,
>> +                        s->cluster_size);
>> +                if (ret < 0) {
>> +                    goto fail;
>> +                }
>> +
>> +                ret = bdrv_write(bs->file, l2_offset / BDRV_SECTOR_SIZE,
>> +                        (void *)l2_table, s->cluster_sectors);
>> +                if (ret < 0) {
>> +                    goto fail;
>> +                }
>> +            }
>> +        }
>> +    }
>> +
>> +    ret = 0;
>> +
>> +fail:
>> +    if (l2_table) {
>> +        if (!is_active_l1) {
>> +            qemu_vfree(l2_table);
>> +        } else {
>> +            if (ret < 0) {
>> +                qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table);
>> +            } else {
>> +                ret = qcow2_cache_put(bs, s->l2_table_cache,
>> +                        (void **)&l2_table);
>> +            }
>> +        }
>> +    }
>> +    return ret;
>> +}
>> +
>> +/*
>> + * For backed images, expands all zero clusters on the image. For non-backed
>> + * images, deallocates all non-pre-allocated zero clusters (and claims the
>> + * allocation for pre-allocated ones). This is important for downgrading to a
>> + * qcow2 version which doesn't yet support metadata zero clusters.
>> + */
>> +int qcow2_expand_zero_clusters(BlockDriverState *bs)
>> +{
>> +    BDRVQcowState *s = bs->opaque;
>> +    uint64_t *l1_table = NULL;
>> +    uint64_t nb_clusters;
>> +    uint8_t *expanded_clusters;
>> +    int ret;
>> +    int i, j;
>> +
>> +    nb_clusters = bs->total_sectors >> (s->cluster_bits - BDRV_SECTOR_BITS);
> Shouldn't this value actually be rounded up?
Oh, yes, of course.

>> +    expanded_clusters = g_malloc0(nb_clusters / 8);
> Just like the size here?
>
>> +
>> +    ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
>> +                                     expanded_clusters, nb_clusters);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    /* Inactive L1 tables may point to active L2 tables - therefore it is
>> +     * necessary to flush the L2 table cache before trying to access the L2
>> +     * tables pointed to by inactive L1 entries (else we might try to expand
>> +     * zero clusters that have already been expanded) */
>> +    ret = qcow2_cache_flush(bs, s->l2_table_cache);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
> I'm afraid that this is not enough in fact. If you modify an active L2
> table through an inactive L1, you bypass the cache and use bdrv_write().
> The cache will still have a stale copy of the table.
Hm, yes, that didn't occur to me because the image isn't being written 
to after this zero cluster expansion; at least not in the tests I ran.

> Can't we just access inactive L2 tables through the cache as well? Or
> can we get into trouble in other places if we do that? If so, we'd have
> to implement a function that flushes _and empties_ the cache. (And then
> we could always go the bdrv_read/write path, so there's some code to
> save either way)
I'd personally go for emptying the cache.

>> +    for (i = 0; i < s->nb_snapshots; i++) {
>> +        int l1_sectors = (s->snapshots[i].l1_size * sizeof(uint64_t) +
>> +                BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE;
>> +
>> +        l1_table = g_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);
>> +
>> +        ret = bdrv_read(bs->file, s->snapshots[i].l1_table_offset /
>> +                BDRV_SECTOR_SIZE, (void *)l1_table, l1_sectors);
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +
>> +        for (j = 0; j < s->snapshots[i].l1_size; j++) {
>> +            be64_to_cpus(&l1_table[j]);
>> +        }
>> +
>> +        ret = expand_zero_clusters_in_l1(bs, l1_table, s->snapshots[i].l1_size,
>> +                                         expanded_clusters, nb_clusters);
>> +        if (ret < 0) {
>> +            goto fail;
>> +        }
>> +    }
>> +
>> +    ret = 0;
>> +
>> +fail:
>> +    g_free(expanded_clusters);
>> +    g_free(l1_table);
>> +    return ret;
>> +}
> Kevin

Max

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment
  2013-09-02 15:36   ` Kevin Wolf
@ 2013-09-03  7:12     ` Max Reitz
  0 siblings, 0 replies; 11+ messages in thread
From: Max Reitz @ 2013-09-03  7:12 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi

Am 02.09.2013 17:36, schrieb Kevin Wolf:
> Am 02.09.2013 um 12:04 hat Max Reitz geschrieben:
>> Add tests for qemu-img amend on qcow2 image files.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   tests/qemu-iotests/061     | 178 +++++++++++++++++++++++
>>   tests/qemu-iotests/061.out | 349 +++++++++++++++++++++++++++++++++++++++++++++
>>   tests/qemu-iotests/group   |   1 +
>>   3 files changed, 528 insertions(+)
>>   create mode 100755 tests/qemu-iotests/061
>>   create mode 100644 tests/qemu-iotests/061.out
>> +echo
>> +echo "=== Testing version upgrade and resize ==="
>> +echo
>> +IMGOPTS="compat=0.10" _make_test_img 64M
>> +$QEMU_IO -c "write -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
>> +./qcow2.py "$TEST_IMG" dump-header
>> +$QEMU_IMG amend -o "compat=1.1,lazy_refcounts=on,size=128M" "$TEST_IMG"
>> +./qcow2.py "$TEST_IMG" dump-header
>> +$QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
>> +_check_test_img
> Your reference output isn't correct for this test: It expects a 64 MB
> image after the amend. Looks like there's a bug somewhere.
>
> Kevin
Oh, right, thanks for catching it!

Max

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment
  2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment Max Reitz
  2013-09-02 15:36   ` Kevin Wolf
@ 2013-09-03  8:23   ` Kevin Wolf
  1 sibling, 0 replies; 11+ messages in thread
From: Kevin Wolf @ 2013-09-03  8:23 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

Am 02.09.2013 um 12:04 hat Max Reitz geschrieben:
> Add tests for qemu-img amend on qcow2 image files.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/061     | 178 +++++++++++++++++++++++
>  tests/qemu-iotests/061.out | 349 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 528 insertions(+)
>  create mode 100755 tests/qemu-iotests/061
>  create mode 100644 tests/qemu-iotests/061.out

It might be worth adding test cases for...

* Leaving an encrypted image encrypted, implicitly or explicitly
* Zero cluster expansion with an (active/inactive) L2 table with
  refcount > 1
* State after a failed amend operation (or do we even promise anything?
  I guess if you pass multiple options, some may be applied and some not)

What's there looks good (except for the one bug I mentioned)

Kevin

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2013-09-03  8:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-02 10:04 [Qemu-devel] [PATCH v4 0/5] block/qcow2: Image file option amendment Max Reitz
2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 1/5] block: " Max Reitz
2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 2/5] qcow2-cluster: Expand zero clusters Max Reitz
2013-09-02 15:13   ` Kevin Wolf
2013-09-03  7:11     ` Max Reitz
2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 3/5] qcow2: Save refcount order in BDRVQcowState Max Reitz
2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 4/5] qcow2: Implement bdrv_amend_options Max Reitz
2013-09-02 10:04 ` [Qemu-devel] [PATCH v4 5/5] qemu-iotest: qcow2 image option amendment Max Reitz
2013-09-02 15:36   ` Kevin Wolf
2013-09-03  7:12     ` Max Reitz
2013-09-03  8:23   ` Kevin Wolf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.