* [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP
@ 2014-06-07 18:51 Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard Max Reitz
                   ` (14 more replies)
  0 siblings, 15 replies; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

qemu-img should use QMP commands whenever possible in order to ensure
feature completeness of both online and offline image operations. For
the "commit" command, this is relatively easy, so implement it first
(in the hope that indeed others will follow).

As qemu-img does not have access to QMP (due to QMP being intertwined
with basically everything in qemu), we cannot use QMP directly, but we
can at least use the functions that the corresponding QMP command uses
(which would be "block-commit", in this case).


With Stefan's pull request for his dataplane series now out, I thought
this a good opportunity to send a rebase of this series.

Patches 7 and 9 through 13 are yet to be reviewed, if I can trust grep.


v8: Rebased on Stefan's block branch
- Patch 5: modify qapi/block-core.json rather than qapi-schema.json
- Patch 12: test number 092 is taken, move to 094
- Patch 13: test number 093 will be taken (by Fam), move to 095

v7:
- Patch 1: rephrased comment [Eric]
- Patch 3:
  - give BDRV_REQ_MAY_UNMAP to bdrv_write_zeroes() [Eric]
  - adjusted comment about calculation of total cluster count [Eric]
  - added error handling for bdrv_discard() [Eric]
- Patch 7: replaced qemu_aio_wait() by aio_poll()
  [Stefan's dataplane series]
- Patch 8: rephrased comment and manpage content [Eric]
- Patch 12: more test passes for implicit and explicit -d


git-backport-diff output against v7:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/14:[----] [--] 'qcow2: Allow "full" discard'
002/14:[----] [--] 'qcow2: Implement bdrv_make_empty()'
003/14:[----] [--] 'qcow2: Optimize bdrv_make_empty()'
004/14:[----] [--] 'blockjob: Introduce block_job_complete_sync()'
005/14:[----] [--] 'blockjob: Add "ready" field'
006/14:[----] [--] 'block/mirror: Improve progress report'
007/14:[----] [-C] 'qemu-img: Implement commit like QMP'
008/14:[----] [--] 'qemu-img: Empty image after commit'
009/14:[----] [--] 'qemu-img: Enable progress output for commit'
010/14:[----] [--] 'qemu-img: Specify backing file for commit'
011/14:[----] [--] 'iotests: Add _filter_qemu_img_map'
012/14:[0002] [FC] 'iotests: Add test for backing-chain commits'
013/14:[0002] [FC] 'iotests: Add test for qcow2's bdrv_make_empty'
014/14:[----] [--] 'iotests: Omit length/offset test in 040 and 041'


Max Reitz (14):
  qcow2: Allow "full" discard
  qcow2: Implement bdrv_make_empty()
  qcow2: Optimize bdrv_make_empty()
  blockjob: Introduce block_job_complete_sync()
  blockjob: Add "ready" field
  block/mirror: Improve progress report
  qemu-img: Implement commit like QMP
  qemu-img: Empty image after commit
  qemu-img: Enable progress output for commit
  qemu-img: Specify backing file for commit
  iotests: Add _filter_qemu_img_map
  iotests: Add test for backing-chain commits
  iotests: Add test for qcow2's bdrv_make_empty
  iotests: Omit length/offset test in 040 and 041

 block/Makefile.objs              |   2 +-
 block/mirror.c                   |  32 ++--
 block/qcow2-cluster.c            |  27 ++-
 block/qcow2-snapshot.c           |   2 +-
 block/qcow2.c                    | 388 ++++++++++++++++++++++++++++++++++++++-
 block/qcow2.h                    |   2 +-
 blockjob.c                       |  46 ++++-
 include/block/blockjob.h         |  20 ++
 qapi/block-core.json             |   4 +-
 qemu-img-cmds.hx                 |   4 +-
 qemu-img.c                       | 149 ++++++++++++---
 qemu-img.texi                    |  13 +-
 tests/qemu-iotests/040           |   4 +-
 tests/qemu-iotests/041           |   3 +-
 tests/qemu-iotests/094           | 122 ++++++++++++
 tests/qemu-iotests/094.out       | 119 ++++++++++++
 tests/qemu-iotests/095           |  72 ++++++++
 tests/qemu-iotests/095.out       |  26 +++
 tests/qemu-iotests/common.filter |   7 +
 tests/qemu-iotests/group         |   2 +
 tests/qemu-iotests/iotests.py    |   3 +-
 21 files changed, 980 insertions(+), 67 deletions(-)
 create mode 100755 tests/qemu-iotests/094
 create mode 100644 tests/qemu-iotests/094.out
 create mode 100755 tests/qemu-iotests/095
 create mode 100644 tests/qemu-iotests/095.out

-- 
2.0.0


* [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-30 10:00   ` Kevin Wolf
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 02/14] qcow2: Implement bdrv_make_empty() Max Reitz
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Normally, discarded sectors should read back as zero. However, there are
cases in which sectors (or rather clusters) should be discarded as if
they had never been written in the first place, that is, reading them
should fall through to the backing file again.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c  | 27 +++++++++++++++++----------
 block/qcow2-snapshot.c |  2 +-
 block/qcow2.c          |  2 +-
 block/qcow2.h          |  2 +-
 4 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4208dc0..ce52f9b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1351,7 +1351,7 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset)
  * clusters.
  */
 static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
-    unsigned int nb_clusters, enum qcow2_discard_type type)
+    unsigned int nb_clusters, enum qcow2_discard_type type, bool full_discard)
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t *l2_table;
@@ -1373,23 +1373,30 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
         old_l2_entry = be64_to_cpu(l2_table[l2_index + i]);
 
         /*
-         * Make sure that a discarded area reads back as zeroes for v3 images
-         * (we cannot do it for v2 without actually writing a zero-filled
-         * buffer). We can skip the operation if the cluster is already marked
-         * as zero, or if it's unallocated and we don't have a backing file.
+         * If full_discard is false, make sure that a discarded area reads back
+         * as zeroes for v3 images (we cannot do it for v2 without actually
+         * writing a zero-filled buffer). We can skip the operation if the
+         * cluster is already marked as zero, or if it's unallocated and we
+         * don't have a backing file.
          *
          * TODO We might want to use bdrv_get_block_status(bs) here, but we're
          * holding s->lock, so that doesn't work today.
+         *
+         * If full_discard is true, the sector should not read back as zeroes,
+         * but rather fall through to the backing file.
          */
         switch (qcow2_get_cluster_type(old_l2_entry)) {
             case QCOW2_CLUSTER_UNALLOCATED:
-                if (!bs->backing_hd) {
+                if (full_discard || !bs->backing_hd) {
                     continue;
                 }
                 break;
 
             case QCOW2_CLUSTER_ZERO:
-                continue;
+                if (!full_discard) {
+                    continue;
+                }
+                break;
 
             case QCOW2_CLUSTER_NORMAL:
             case QCOW2_CLUSTER_COMPRESSED:
@@ -1401,7 +1408,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
 
         /* First remove L2 entries */
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
-        if (s->qcow_version >= 3) {
+        if (!full_discard && s->qcow_version >= 3) {
             l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
         } else {
             l2_table[l2_index + i] = cpu_to_be64(0);
@@ -1420,7 +1427,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
 }
 
 int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
-    int nb_sectors, enum qcow2_discard_type type)
+    int nb_sectors, enum qcow2_discard_type type, bool full_discard)
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t end_offset;
@@ -1443,7 +1450,7 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
 
     /* Each L2 table is handled by its own loop iteration */
     while (nb_clusters > 0) {
-        ret = discard_single_l2(bs, offset, nb_clusters, type);
+        ret = discard_single_l2(bs, offset, nb_clusters, type, full_discard);
         if (ret < 0) {
             goto fail;
         }
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 0aa9def..c5ea2cd 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -436,7 +436,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
     qcow2_discard_clusters(bs, qcow2_vm_state_offset(s),
                            align_offset(sn->vm_state_size, s->cluster_size)
                                 >> BDRV_SECTOR_BITS,
-                           QCOW2_DISCARD_NEVER);
+                           QCOW2_DISCARD_NEVER, false);
 
 #ifdef DEBUG_ALLOC
     {
diff --git a/block/qcow2.c b/block/qcow2.c
index a54d2ba..676fe1d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1865,7 +1865,7 @@ static coroutine_fn int qcow2_co_discard(BlockDriverState *bs,
 
     qemu_co_mutex_lock(&s->lock);
     ret = qcow2_discard_clusters(bs, sector_num << BDRV_SECTOR_BITS,
-        nb_sectors, QCOW2_DISCARD_REQUEST);
+        nb_sectors, QCOW2_DISCARD_REQUEST, false);
     qemu_co_mutex_unlock(&s->lock);
     return ret;
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index b49424b..2332634 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -519,7 +519,7 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 
 int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
 int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
-    int nb_sectors, enum qcow2_discard_type type);
+    int nb_sectors, enum qcow2_discard_type type, bool full_discard);
 int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors);
 
 int qcow2_expand_zero_clusters(BlockDriverState *bs);
-- 
2.0.0


* [Qemu-devel] [PATCH v8 02/14] qcow2: Implement bdrv_make_empty()
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-30 10:00   ` Kevin Wolf
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty() Max Reitz
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Implement this function by making all clusters in the image file fall
through to the backing file (by using the recently extended discard).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 676fe1d..bd7a315 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2007,6 +2007,32 @@ fail:
     return ret;
 }
 
+static int qcow2_make_empty(BlockDriverState *bs)
+{
+    int ret = 0;
+    uint64_t start_sector;
+    int sector_step = INT_MAX / BDRV_SECTOR_SIZE;
+
+    for (start_sector = 0; start_sector < bs->total_sectors;
+         start_sector += sector_step)
+    {
+        /* As this function is generally used after committing an external
+         * snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the
+         * default action for this kind of discard is to pass the discard,
+         * which will ideally result in an actually smaller image file, as
+         * is probably desired. */
+        ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE,
+                                     MIN(sector_step,
+                                         bs->total_sectors - start_sector),
+                                     QCOW2_DISCARD_SNAPSHOT, true);
+        if (ret < 0) {
+            break;
+        }
+    }
+
+    return ret;
+}
+
 static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
 {
     BDRVQcowState *s = bs->opaque;
@@ -2389,6 +2415,7 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_co_discard        = qcow2_co_discard,
     .bdrv_truncate          = qcow2_truncate,
     .bdrv_write_compressed  = qcow2_write_compressed,
+    .bdrv_make_empty        = qcow2_make_empty,
 
     .bdrv_snapshot_create   = qcow2_snapshot_create,
     .bdrv_snapshot_goto     = qcow2_snapshot_goto,
-- 
2.0.0


* [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty()
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 02/14] qcow2: Implement bdrv_make_empty() Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-30 11:33   ` Kevin Wolf
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 04/14] blockjob: Introduce block_job_complete_sync() Max Reitz
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

bdrv_make_empty() is currently only called if the current image
represents an external snapshot that has been committed to its base
image; it is therefore unlikely to have internal snapshots. In this
case, bdrv_make_empty() can be greatly sped up by creating an empty L1
table and dropping all data clusters at once by recreating the refcount
structure accordingly instead of normally discarding all clusters.

If there are snapshots, fall back to the simple implementation (discard
all clusters).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c | 389 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 374 insertions(+), 15 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index bd7a315..6cc6789 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2007,27 +2007,386 @@ fail:
     return ret;
 }
 
+/*
+ * Creates a reftable pointing to refblocks following right afterwards and an
+ * empty L1 table at the given @offset. @refblocks is the number of refblocks
+ * to create.
+ *
+ * This combination of structures (reftable+refblocks+L1) is here called a
+ * "blob".
+ */
+static int create_refcount_l1(BlockDriverState *bs, uint64_t offset,
+                              uint64_t refblocks)
+{
+    BDRVQcowState *s = bs->opaque;
+    uint64_t *reftable = NULL;
+    uint16_t *refblock = NULL;
+    uint64_t reftable_clusters;
+    uint64_t rbi;
+    uint64_t blob_start, blob_end;
+    uint64_t l2_tables, l1_clusters, l1_size2;
+    uint8_t l1_size_and_offset[12];
+    uint64_t rt_offset;
+    int ret, i;
+
+    reftable_clusters = DIV_ROUND_UP(refblocks, s->cluster_size / 8);
+    l2_tables = DIV_ROUND_UP(bs->total_sectors / s->cluster_sectors,
+                             s->cluster_size / 8);
+    l1_clusters = DIV_ROUND_UP(l2_tables, s->cluster_size / 8);
+    l1_size2 = l1_clusters << s->cluster_bits;
+
+    blob_start = offset;
+    blob_end = offset + ((reftable_clusters + refblocks + l1_clusters) <<
+                s->cluster_bits);
+
+    ret = qcow2_pre_write_overlap_check(bs, 0, blob_start,
+                                        blob_end - blob_start);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /* Create the reftable with entries pointing consecutively to the
+     * following clusters */
+    reftable = g_malloc0_n(reftable_clusters, s->cluster_size);
+
+    for (rbi = 0; rbi < refblocks; rbi++) {
+        reftable[rbi] = cpu_to_be64(offset + ((reftable_clusters + rbi) <<
+                        s->cluster_bits));
+    }
+
+    ret = bdrv_write(bs->file, offset >> BDRV_SECTOR_BITS, (uint8_t *)reftable,
+                     reftable_clusters * s->cluster_sectors);
+    if (ret < 0) {
+        goto out;
+    }
+
+    offset += reftable_clusters << s->cluster_bits;
+
+    /* Keep the reftable, as we will need it for the BDRVQcowState anyway */
+
+    /* Now, create all the refblocks */
+    refblock = g_malloc(s->cluster_size);
+
+    for (rbi = 0; rbi < refblocks; rbi++) {
+        for (i = 0; i < s->cluster_size / 2; i++) {
+            uint64_t cluster_offset = ((rbi << (s->cluster_bits - 1)) + i)
+                                      << s->cluster_bits;
+
+            /* Only 0 and 1 are possible as refcounts here */
+            refblock[i] = cpu_to_be16(!cluster_offset ||
+                                      (cluster_offset >= blob_start &&
+                                       cluster_offset <  blob_end));
+        }
+
+        ret = bdrv_write(bs->file, offset >> BDRV_SECTOR_BITS,
+                         (uint8_t *)refblock, s->cluster_sectors);
+        if (ret < 0) {
+            goto out;
+        }
+
+        offset += s->cluster_size;
+    }
+
+    g_free(refblock);
+    refblock = NULL;
+
+    /* The L1 table is very simple */
+    ret = bdrv_write_zeroes(bs->file, offset >> BDRV_SECTOR_BITS,
+                            l1_clusters * s->cluster_sectors,
+                            BDRV_REQ_MAY_UNMAP);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* Now make sure all changes are stable and clear all caches */
+    bdrv_flush(bs);
+
+    /* This is probably not really necessary, but it cannot hurt, either */
+    ret = qcow2_mark_clean(bs);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = qcow2_cache_empty(bs, s->l2_table_cache);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = qcow2_cache_empty(bs, s->refcount_block_cache);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* Modify the image header to point to the new blank L1 table. This will
+     * leak all currently existing data clusters, which is fine. */
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
+
+    assert(l1_size2 / 8 <= UINT32_MAX);
+    cpu_to_be32w((uint32_t *)&l1_size_and_offset[0], l1_size2 / 8);
+    cpu_to_be64w((uint64_t *)&l1_size_and_offset[4], offset);
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+                           l1_size_and_offset, sizeof(l1_size_and_offset));
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* Adapt the in-memory L1 table accordingly */
+    s->l1_table_offset = offset;
+    s->l1_size         = l1_size2 / 8;
+
+    s->l1_table = g_realloc(s->l1_table, l1_size2);
+    memset(s->l1_table, 0, l1_size2);
+
+    /* Modify the image header to point to the refcount table. This will fix all
+     * leaks (unless an error occurs). */
+    BLKDBG_EVENT(bs->file, BLKDBG_REFTABLE_UPDATE);
+
+    rt_offset = cpu_to_be64(blob_start);
+    ret = bdrv_pwrite_sync(bs->file,
+                           offsetof(QCowHeader, refcount_table_offset),
+                           &rt_offset, sizeof(rt_offset));
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* The image is now clean. The only thing left is to adapt the in-memory
+     * refcount table to match it. */
+    s->refcount_table_offset = blob_start;
+    s->refcount_table_size   = reftable_clusters << (s->cluster_bits - 3);
+
+    for (rbi = 0; rbi < refblocks; rbi++) {
+        be64_to_cpus(&reftable[rbi]);
+    }
+
+    g_free(s->refcount_table);
+    s->refcount_table = reftable;
+    reftable = NULL;
+
+out:
+    g_free(refblock);
+    g_free(reftable);
+    return ret;
+}
+
+/*
+ * Calculates the number of clusters required for an L1 table for an image with
+ * the given parameters plus the reftable and the refblocks required to cover
+ * themselves, the L1 table and a given number of clusters @overhead.
+ *
+ * @ts:       Total number of guest sectors the image provides
+ * @cb:       1 << @cb is the cluster size in bytes
+ * @spcb:     1 << @spcb is the number of sectors per cluster
+ * @overhead: Number of clusters which shall additionally be covered by the
+ *            refcount structures
+ */
+static uint64_t minimal_blob_size(uint64_t ts, int cb, int spcb,
+                                  uint64_t overhead)
+{
+    uint64_t cs  = UINT64_C(1) << cb;
+    uint64_t spc = UINT64_C(1) << spcb;
+
+    /* The following statement is a solution for this system of equations:
+     *
+     * Let req_cls be the number of clusters required for the reftable,
+     * all refblocks and the L1 table.
+     *
+     * rtc be the clusters required for the reftable, rbc the clusters for all
+     * refblocks (i.e., the number of refblocks), l1c the clusters for the L1
+     * table and l2c the clusters for all L2 tables (i.e., the number of L2
+     * tables).
+     *
+     * cs be the cluster size (in bytes), ts the total number of sectors, spc
+     * the number of sectors per cluster and tc the total number of clusters.
+     *
+     * overhead is a number of clusters which should additionally be covered by
+     * the refcount structures (i.e. all clusters before this blob of req_cls
+     * clusters).
+     *
+     * req_cls >= rtc + rbc + l1c
+     *   -- should be obvious
+     * rbc      = ceil((overhead + req_cls) / (cs / 2))
+     *   -- as each refblock holds cs/2 entries
+     * rtc      = ceil(rbc                  / (cs / 8))
+     *   -- as each reftable cluster holds cs/8 entries
+     * tc       = ceil(ts / spc)
+     *   -- should be obvious as well
+     * l2c      = ceil(tc  / (cs / 8))
+     *   -- as each L2 table holds cs/8 entries
+     * l1c      = ceil(l2c / (cs / 8))
+     *   -- as each L1 table cluster holds cs/8 entries
+     *
+     * The following equation yields a result which satisfies the constraint.
+     * The result may be too high, but is never too low.
+     *
+     * The original calculation (without bitshifting) was:
+     *
+     * DIV_ROUND_UP(overhead * (1 + cs / 8) +
+     *              3 * cs * cs / 16 - 5 * cs / 8 - 1 +
+     *              4 * (ts + spc * cs / 8 - 1) / spc,
+     *              cs * cs / 16 - cs / 8 - 1)
+     *
+     */
+
+    return DIV_ROUND_UP(overhead + (overhead << (cb - 3)) +
+                        (3 << (2 * cb - 4)) - (5 << (cb - 3)) - 1 +
+                        (4 * (ts + (spc << (cb - 3)) - 1) >> spcb),
+                        (1 << (2 * cb - 4)) - cs / 8 - 1);
+}
+
 static int qcow2_make_empty(BlockDriverState *bs)
 {
+    BDRVQcowState *s = bs->opaque;
     int ret = 0;
-    uint64_t start_sector;
-    int sector_step = INT_MAX / BDRV_SECTOR_SIZE;
 
-    for (start_sector = 0; start_sector < bs->total_sectors;
-         start_sector += sector_step)
-    {
-        /* As this function is generally used after committing an external
-         * snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the
-         * default action for this kind of discard is to pass the discard,
-         * which will ideally result in an actually smaller image file, as
-         * is probably desired. */
-        ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE,
-                                     MIN(sector_step,
-                                         bs->total_sectors - start_sector),
-                                     QCOW2_DISCARD_SNAPSHOT, true);
+    if (s->snapshots) {
+        uint64_t start_sector;
+        int sector_step = INT_MAX / BDRV_SECTOR_SIZE;
+
+        /* If there are snapshots, every active cluster has to be discarded */
+
+        for (start_sector = 0; start_sector < bs->total_sectors;
+             start_sector += sector_step)
+        {
+            /* As this function is generally used after committing an external
+             * snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the
+             * default action for this kind of discard is to pass the discard,
+             * which will ideally result in an actually smaller image file, as
+             * is probably desired. */
+            ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE,
+                                         MIN(sector_step,
+                                             bs->total_sectors - start_sector),
+                                         QCOW2_DISCARD_SNAPSHOT, true);
+            if (ret < 0) {
+                break;
+            }
+        }
+    } else {
+        uint64_t min_size_1, min_size_2;
+        int64_t blob_start;
+        uint64_t blob_end, real_blob_size, clusters;
+        uint64_t refblocks, reftable_clusters, l2_tables, l1_clusters;
+
+        /* If there are no snapshots, we basically want to create a new empty
+         * image. This function is however not for creating a new image and
+         * renaming it so it shadows the existing but rather for emptying an
+         * image, so do exactly that.
+         *
+         * Therefore, the image should be valid at all points in time and may
+         * at worst leak clusters.
+         *
+         * Any valid qcow2 image requires an L1 table which is large enough to
+         * cover all of the guest cluster range, therefore it is impossible to
+         * simply drop the L1 table (make its size 0) and create a minimal
+         * refcount structure.
+         *
+         * Instead, allocate a blob of data which is large enough to hold a new
+         * refcount structure (refcount table and all blocks) covering the whole
+         * image until the end of that blob, and an empty L1 table covering the
+         * whole guest cluster range. Then these structures are initialized and
+         * the image header is modified to point to them.
+         *
+         * Generally, this blob will be allocated at the end of the image (that
+         * is, its offset will be greater than its size). If that is indeed the
+         * case, the same operation is repeated, but this time the new blob
+         * starts right after the image header which will then indeed lead to a
+         * minimal image. If this is not the case, the image will be nearly
+         * minimal as well, as long as the underlying protocol supports discard.
+         *
+         * Note that this implementation never frees allocated clusters. This is
+         * because in case of success, the current refcounts are invalid anyway;
+         * and in case of failure, it would be too cumbersome to keep track of
+         * all allocated cluster ranges and free them then.
+         *
+         * min_size_1 will contain the number of clusters minimally required for
+         * a blob that starts right after the image header; min_size_2 will
+         * contain the number of clusters minimally required for the blob which
+         * can be allocated based on the existing refcount structure.
+         */
+
+        /* Repeat until a blob could be allocated which is large enough to
+         * contain all structures necessary for describing itself. Allocated
+         * clusters are not freed, even if they are not suitable, as this would
+         * result in exactly the same cluster range being returned during the
+         * retry (which is obviously not desirable). In case of success, the
+         * current refcounts do not matter anyway; and in case of failure, the
+         * clusters are only leaked (which can be easily repaired). */
+        do {
+            uint64_t fci = s->free_cluster_index;
+
+            /* TODO: Optimize, we do not need refblocks for other parts of the
+             * image than the header and this blob */
+            min_size_2 = minimal_blob_size(bs->total_sectors, s->cluster_bits,
+                                           s->cluster_bits - BDRV_SECTOR_BITS,
+                                           fci);
+
+            blob_start = qcow2_alloc_clusters(bs,
+                                              min_size_2 << s->cluster_bits);
+            if (blob_start < 0) {
+                return blob_start;
+            }
+
+            clusters          = (blob_start >> s->cluster_bits) + min_size_2;
+            refblocks         = DIV_ROUND_UP(clusters, s->cluster_size / 2);
+            reftable_clusters = DIV_ROUND_UP(refblocks, s->cluster_size / 8);
+            l2_tables         = DIV_ROUND_UP(bs->total_sectors /
+                                             s->cluster_sectors,
+                                             s->cluster_size / 8);
+            l1_clusters       = DIV_ROUND_UP(l2_tables, s->cluster_size / 8);
+
+            real_blob_size = reftable_clusters + refblocks + l1_clusters;
+        } while (min_size_2 < real_blob_size);
+
+        ret = create_refcount_l1(bs, blob_start, refblocks);
         if (ret < 0) {
-            break;
+            return ret;
+        }
+
+        /* The only overhead is the image header */
+        min_size_1 = minimal_blob_size(bs->total_sectors, s->cluster_bits,
+                                       s->cluster_bits - BDRV_SECTOR_BITS, 1);
+
+        /* If we cannot create a new blob before the current one, just discard
+         * the sectors in between and return. Even if the discard does nothing,
+         * only up to min_size_1 clusters plus the refcount blocks for those
+         * are unused. The worst case is therefore an image of double the size
+         * it needs to be, which is not too bad. */
+        if ((blob_start >> s->cluster_bits) < 1 + min_size_1) {
+            uint64_t sector, end;
+            int step = INT_MAX / BDRV_SECTOR_SIZE;
+
+            end = blob_start >> BDRV_SECTOR_BITS;
+
+            /* skip the image header */
+            for (sector = s->cluster_sectors; sector < end; sector += step) {
+                ret = bdrv_discard(bs->file, sector, MIN(step, end - sector));
+                if (ret < 0) {
+                    return ret;
+                }
+            }
+
+            blob_end = blob_start + (real_blob_size << s->cluster_bits);
+        } else {
+            clusters          = 1 + min_size_1;
+            refblocks         = DIV_ROUND_UP(clusters, s->cluster_size / 2);
+            reftable_clusters = DIV_ROUND_UP(refblocks, s->cluster_size / 8);
+            l2_tables         = DIV_ROUND_UP(bs->total_sectors /
+                                             s->cluster_sectors,
+                                             s->cluster_size / 8);
+            l1_clusters       = DIV_ROUND_UP(l2_tables, s->cluster_size / 8);
+
+            real_blob_size = reftable_clusters + refblocks + l1_clusters;
+
+            assert(min_size_1 >= real_blob_size);
+
+            ret = create_refcount_l1(bs, s->cluster_size, refblocks);
+            if (ret < 0) {
+                return ret;
+            }
+
+            blob_end = (1 + real_blob_size) << s->cluster_bits;
         }
+
+        ret = bdrv_truncate(bs->file, blob_end);
     }
 
     return ret;
-- 
2.0.0
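
As a worked example of the estimate in minimal_blob_size(), the sketch
below transcribes the formula using 64-bit constants and compares it
with a direct evaluation of the constraints listed in the comment. The
function names and the DIV_ROUND_UP_SKETCH macro are made up for this
example, and the second helper is only one possible way of checking the
result, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP_SKETCH(n, d) (((n) + (d) - 1) / (d))

/* Transcription of the estimate; ts in sectors, cb = log2(cluster size),
 * spcb = log2(sectors per cluster), overhead in clusters. */
static uint64_t minimal_blob_size_sketch(uint64_t ts, int cb, int spcb,
                                         uint64_t overhead)
{
    uint64_t cs  = UINT64_C(1) << cb;
    uint64_t spc = UINT64_C(1) << spcb;

    return DIV_ROUND_UP_SKETCH(
        overhead + (overhead << (cb - 3)) +
        (UINT64_C(3) << (2 * cb - 4)) - (UINT64_C(5) << (cb - 3)) - 1 +
        ((4 * (ts + (spc << (cb - 3)) - 1)) >> spcb),
        (UINT64_C(1) << (2 * cb - 4)) - cs / 8 - 1);
}

/* Clusters needed for reftable + refblocks + L1 table if the blob has
 * blob_clusters clusters and is preceded by overhead clusters. */
static uint64_t blob_clusters_needed_sketch(uint64_t ts, int cb, int spcb,
                                            uint64_t overhead,
                                            uint64_t blob_clusters)
{
    uint64_t cs  = UINT64_C(1) << cb;
    uint64_t spc = UINT64_C(1) << spcb;
    /* Each refblock holds cs/2 entries, each reftable cluster cs/8. */
    uint64_t rbc = DIV_ROUND_UP_SKETCH(overhead + blob_clusters, cs / 2);
    uint64_t rtc = DIV_ROUND_UP_SKETCH(rbc, cs / 8);
    /* Each L2 table and each L1 table cluster holds cs/8 entries. */
    uint64_t tc  = DIV_ROUND_UP_SKETCH(ts, spc);
    uint64_t l2c = DIV_ROUND_UP_SKETCH(tc, cs / 8);
    uint64_t l1c = DIV_ROUND_UP_SKETCH(l2c, cs / 8);

    return rtc + rbc + l1c;
}
```

For a 1 GiB image (2097152 sectors) with 64 KiB clusters (cb = 16,
spcb = 7) and an overhead of one cluster, the estimate is 4 clusters,
while the constraints are already met by 3 (one reftable cluster, one
refblock, one L1 cluster). That matches the remark in the comment that
the result may be too high but is never too low.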


* [Qemu-devel] [PATCH v8 04/14] blockjob: Introduce block_job_complete_sync()
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (2 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty() Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 05/14] blockjob: Add "ready" field Max Reitz
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Implement block_job_complete_sync() by doing exactly what
block_job_cancel_sync() does, except that it calls block_job_complete()
instead of block_job_cancel().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
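
Not part of the commit message: a minimal, self-contained sketch of the pattern
used here, i.e. one shared "finish" helper that takes a function pointer, plus
a thin adapter so that a function without an error parameter fits the same
signature. ToyJob, ToyError and all toy_* functions are hypothetical stand-ins,
not the QEMU types, and the waiting loop is left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for Error and BlockJob. */
typedef struct ToyError { const char *msg; } ToyError;

typedef struct ToyJob {
    int cancelled;
    int completed;
    int can_complete;
} ToyJob;

/* Has no error parameter, like block_job_cancel(). */
static void toy_job_cancel(ToyJob *job)
{
    job->cancelled = 1;
}

/* May report an error, like block_job_complete(). */
static void toy_job_complete(ToyJob *job, ToyError **errp)
{
    static ToyError not_ready = { "job cannot be completed yet" };

    if (!job->can_complete) {
        if (errp) {
            *errp = &not_ready;
        }
        return;
    }
    job->completed = 1;
}

/* Adapter: same signature as toy_job_complete(), error pointer ignored. */
static void toy_job_cancel_err(ToyJob *job, ToyError **errp)
{
    (void)errp;
    toy_job_cancel(job);
}

/* Shared helper; returns 0 on success and -1 if finish() reported an error. */
static int toy_job_finish_sync(ToyJob *job,
                               void (*finish)(ToyJob *, ToyError **),
                               ToyError **errp)
{
    ToyError *local_err = NULL;

    finish(job, &local_err);
    if (local_err) {
        if (errp) {
            *errp = local_err;
        }
        return -1;
    }
    return 0;
}

static int toy_job_cancel_sync(ToyJob *job)
{
    return toy_job_finish_sync(job, toy_job_cancel_err, NULL);
}

static int toy_job_complete_sync(ToyJob *job, ToyError **errp)
{
    return toy_job_finish_sync(job, toy_job_complete, errp);
}
```

The adapter keeps the helper free of function pointer casts, which is the
reason given in the comment on block_job_cancel_err() in the patch.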
 blockjob.c               | 39 ++++++++++++++++++++++++++++++++-------
 include/block/blockjob.h | 15 +++++++++++++++
 2 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 7d84ca1..2423c8a 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -152,7 +152,7 @@ void block_job_iostatus_reset(BlockJob *job)
     }
 }
 
-struct BlockCancelData {
+struct BlockFinishData {
     BlockJob *job;
     BlockDriverCompletionFunc *cb;
     void *opaque;
@@ -160,19 +160,22 @@ struct BlockCancelData {
     int ret;
 };
 
-static void block_job_cancel_cb(void *opaque, int ret)
+static void block_job_finish_cb(void *opaque, int ret)
 {
-    struct BlockCancelData *data = opaque;
+    struct BlockFinishData *data = opaque;
 
     data->cancelled = block_job_is_cancelled(data->job);
     data->ret = ret;
     data->cb(data->opaque, ret);
 }
 
-int block_job_cancel_sync(BlockJob *job)
+static int block_job_finish_sync(BlockJob *job,
+                                 void (*finish)(BlockJob *, Error **errp),
+                                 Error **errp)
 {
-    struct BlockCancelData data;
+    struct BlockFinishData data;
     BlockDriverState *bs = job->bs;
+    Error *local_err = NULL;
 
     assert(bs->job == job);
 
@@ -183,15 +186,37 @@ int block_job_cancel_sync(BlockJob *job)
     data.cb = job->cb;
     data.opaque = job->opaque;
     data.ret = -EINPROGRESS;
-    job->cb = block_job_cancel_cb;
+    job->cb = block_job_finish_cb;
     job->opaque = &data;
-    block_job_cancel(job);
+    finish(job, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return -EBUSY;
+    }
     while (data.ret == -EINPROGRESS) {
         qemu_aio_wait();
     }
     return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
 }
 
+/* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
+ * used with block_job_finish_sync() without the need for (rather nasty)
+ * function pointer casts there. */
+static void block_job_cancel_err(BlockJob *job, Error **errp)
+{
+    block_job_cancel(job);
+}
+
+int block_job_cancel_sync(BlockJob *job)
+{
+    return block_job_finish_sync(job, &block_job_cancel_err, NULL);
+}
+
+int block_job_complete_sync(BlockJob *job, Error **errp)
+{
+    return block_job_finish_sync(job, &block_job_complete, errp);
+}
+
 void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
 {
     assert(job->busy);
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index c0a7875..df289dc 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -256,6 +256,21 @@ bool block_job_is_paused(BlockJob *job);
 int block_job_cancel_sync(BlockJob *job);
 
 /**
+ * block_job_complete_sync:
+ * @job: The job to be completed.
+ * @errp: Error object which may be set by block_job_complete(); this is not
+ *        necessarily set on every error, the job return value has to be
+ *        checked as well.
+ *
+ * Synchronously complete the job.  The completion callback is called before the
+ * function returns, unless it is NULL (which is permissible when using this
+ * function).
+ *
+ * Returns the return value from the job.
+ */
+int block_job_complete_sync(BlockJob *job, Error **errp);
+
+/**
  * block_job_iostatus_reset:
  * @job: The job whose I/O status should be reset.
  *
-- 
2.0.0


* [Qemu-devel] [PATCH v8 05/14] blockjob: Add "ready" field
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (3 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 04/14] blockjob: Introduce block_job_complete_sync() Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 06/14] block/mirror: Improve progress report Max Reitz
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

When a block job signals readiness, this is currently reported only
through QMP. If qemu wants to use block jobs for internal tasks, there
needs to be another way to correctly detect when a block job may be
completed.

For this reason, introduce a bool "ready" which is set when the block
job may be completed.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
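
Not part of the commit message: a small sketch of the idea, assuming nothing
beyond what the patch describes. The flag is set before the notification is
sent, and the query helper copies it into the info structure so internal
callers can read it without QMP. ToyJob, ToyJobInfo and the toy_* functions are
hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BlockJob and BlockJobInfo. */
typedef struct ToyJob {
    long offset;
    bool ready;
    int events_sent;
} ToyJob;

typedef struct ToyJobInfo {
    long offset;
    bool ready;
} ToyJobInfo;

/* Stand-in for emitting the "job ready" event to a monitor. */
static void toy_emit_ready_event(ToyJob *job)
{
    job->events_sent++;
}

/* Mark the job as ready first, then notify listeners. */
static void toy_job_ready(ToyJob *job)
{
    job->ready = true;
    toy_emit_ready_event(job);
}

/* Snapshot of the job state, including the new flag. */
static ToyJobInfo toy_job_query(const ToyJob *job)
{
    ToyJobInfo info = { .offset = job->offset, .ready = job->ready };
    return info;
}
```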
 blockjob.c               | 7 ++++++-
 include/block/blockjob.h | 5 +++++
 qapi/block-core.json     | 4 +++-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 2423c8a..108cadd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -246,6 +246,7 @@ BlockJobInfo *block_job_query(BlockJob *job)
     info->offset    = job->offset;
     info->speed     = job->speed;
     info->io_status = job->iostatus;
+    info->ready     = job->ready;
     return info;
 }
 
@@ -274,7 +275,11 @@ QObject *qobject_from_block_job(BlockJob *job)
 
 void block_job_ready(BlockJob *job)
 {
-    QObject *data = qobject_from_block_job(job);
+    QObject *data;
+
+    job->ready = true;
+
+    data = qobject_from_block_job(job);
     monitor_protocol_event(QEVENT_BLOCK_JOB_READY, data);
     qobject_decref(data);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index df289dc..5db4c14 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -91,6 +91,11 @@ struct BlockJob {
      */
     bool busy;
 
+    /**
+     * Set to true when the job is ready to be completed.
+     */
+    bool ready;
+
     /** Status that is published by the query-block-jobs QMP API */
     BlockDeviceIoStatus iostatus;
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7215e48..864285c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -505,12 +505,14 @@
 #
 # @io-status: the status of the job (since 1.3)
 #
+# @ready: true if the job may be completed (since 2.1)
+#
 # Since: 1.1
 ##
 { 'type': 'BlockJobInfo',
   'data': {'type': 'str', 'device': 'str', 'len': 'int',
            'offset': 'int', 'busy': 'bool', 'paused': 'bool', 'speed': 'int',
-           'io-status': 'BlockDeviceIoStatus'} }
+           'io-status': 'BlockDeviceIoStatus', 'ready': 'bool'} }
 
 ##
 # @query-block-jobs:
-- 
2.0.0


* [Qemu-devel] [PATCH v8 06/14] block/mirror: Improve progress report
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (4 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 05/14] blockjob: Add "ready" field Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 07/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Instead of taking the total length of the block device as the block
job's length, use the number of dirty sectors. The progress is now the
number of sectors mirrored to the target block device. Note that the
job's length may therefore increase during operation, which is in fact
desirable.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
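
Not part of the commit message: a worked sketch of the length calculation added
to mirror_run(), with a 512-byte sector size assumed. toy_mirror_len() and
TOY_SECTOR_SIZE are hypothetical names; the formula is the one from the patch.

```c
#include <stdint.h>

#define TOY_SECTOR_SIZE 512

/* Total job length in bytes: bytes already mirrored, plus everything still
 * dirty, plus everything currently in flight. */
static int64_t toy_mirror_len(int64_t offset_bytes, int64_t dirty_sectors,
                              int64_t sectors_in_flight)
{
    return offset_bytes +
           (dirty_sectors + sectors_in_flight) * TOY_SECTOR_SIZE;
}
```

With 1 MiB already mirrored, 2048 dirty sectors and 128 sectors in flight, the
length is 1048576 + 2176 * 512 = 2162688 bytes. If the guest then dirties
another 1024 sectors, the length grows to 2686976 bytes while the offset stays
the same, which is the increase mentioned in the commit message.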
 block/mirror.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 94c8661..2fd4ea3 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -39,6 +39,7 @@ typedef struct MirrorBlockJob {
     int64_t sector_num;
     int64_t granularity;
     size_t buf_size;
+    int64_t bdev_length;
     unsigned long *cow_bitmap;
     BdrvDirtyBitmap *dirty_bitmap;
     HBitmapIter hbi;
@@ -48,6 +49,7 @@ typedef struct MirrorBlockJob {
 
     unsigned long *in_flight_bitmap;
     int in_flight;
+    int sectors_in_flight;
     int ret;
 } MirrorBlockJob;
 
@@ -81,6 +83,7 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
     trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
 
     s->in_flight--;
+    s->sectors_in_flight -= op->nb_sectors;
     iov = op->qiov.iov;
     for (i = 0; i < op->qiov.niov; i++) {
         MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
@@ -92,8 +95,11 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
     chunk_num = op->sector_num / sectors_per_chunk;
     nb_chunks = op->nb_sectors / sectors_per_chunk;
     bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
-    if (s->cow_bitmap && ret >= 0) {
-        bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
+    if (ret >= 0) {
+        if (s->cow_bitmap) {
+            bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
+        }
+        s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
     }
 
     qemu_iovec_destroy(&op->qiov);
@@ -166,7 +172,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     hbitmap_next_sector = s->sector_num;
     sector_num = s->sector_num;
     sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
-    end = s->common.len >> BDRV_SECTOR_BITS;
+    end = s->bdev_length / BDRV_SECTOR_SIZE;
 
     /* Extend the QEMUIOVector to include all adjacent blocks that will
      * be copied in this operation.
@@ -278,6 +284,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 
     /* Copy the dirty cluster.  */
     s->in_flight++;
+    s->sectors_in_flight += nb_sectors;
     trace_mirror_one_iteration(s, sector_num, nb_sectors);
     bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
                    mirror_read_complete, op);
@@ -323,13 +330,13 @@ static void coroutine_fn mirror_run(void *opaque)
         goto immediate_exit;
     }
 
-    s->common.len = bdrv_getlength(bs);
-    if (s->common.len <= 0) {
-        ret = s->common.len;
+    s->bdev_length = bdrv_getlength(bs);
+    if (s->bdev_length <= 0) {
+        ret = s->bdev_length;
         goto immediate_exit;
     }
 
-    length = DIV_ROUND_UP(s->common.len, s->granularity);
+    length = DIV_ROUND_UP(s->bdev_length, s->granularity);
     s->in_flight_bitmap = bitmap_new(length);
 
     /* If we have no backing file yet in the destination, we cannot let
@@ -349,7 +356,7 @@ static void coroutine_fn mirror_run(void *opaque)
         }
     }
 
-    end = s->common.len >> BDRV_SECTOR_BITS;
+    end = s->bdev_length / BDRV_SECTOR_SIZE;
     s->buf = qemu_blockalign(bs, s->buf_size);
     sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
     mirror_free_init(s);
@@ -389,6 +396,12 @@ static void coroutine_fn mirror_run(void *opaque)
         }
 
         cnt = bdrv_get_dirty_count(bs, s->dirty_bitmap);
+        /* s->common.offset contains the number of bytes already processed so
+         * far, cnt is the number of dirty sectors remaining and
+         * s->sectors_in_flight is the number of sectors currently being
+         * processed; together those are the current total operation length */
+        s->common.len = s->common.offset +
+                        (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
 
         /* Note that even when no rate limit is applied we need to yield
          * periodically with no pending I/O so that qemu_aio_flush() returns.
@@ -424,7 +437,6 @@ static void coroutine_fn mirror_run(void *opaque)
                  * report completion.  This way, block-job-cancel will leave
                  * the target in a consistent state.
                  */
-                s->common.offset = end * BDRV_SECTOR_SIZE;
                 if (!s->synced) {
                     block_job_ready(&s->common);
                     s->synced = true;
@@ -453,8 +465,6 @@ static void coroutine_fn mirror_run(void *opaque)
         ret = 0;
         trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
         if (!s->synced) {
-            /* Publish progress */
-            s->common.offset = (end - cnt) * BDRV_SECTOR_SIZE;
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
             if (block_job_is_cancelled(&s->common)) {
                 break;
-- 
2.0.0


* [Qemu-devel] [PATCH v8 07/14] qemu-img: Implement commit like QMP
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (5 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 06/14] block/mirror: Improve progress report Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-09 16:53   ` Eric Blake
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 08/14] qemu-img: Empty image after commit Max Reitz
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

qemu-img should use QMP commands whenever possible in order to ensure
feature completeness of both online and offline image operations. As
qemu-img itself has no access to QMP (since this would require linking
practically everything into qemu-img), imitate QMP's implementation of
block-commit by using commit_active_start() and then waiting for the
block job to finish.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
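
Not part of the commit message: a toy model of the "poll until ready, then
complete" loop in run_block_job(). There is no real event loop here; toy_poll()
simply performs one unit of work per call. All names are hypothetical, and the
resume step from the patch is omitted.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block job. */
typedef struct ToyJob {
    int remaining;      /* units of work left */
    bool ready;
    bool completed;
} ToyJob;

/* Stand-in for one blocking event loop iteration: does one unit of work and
 * marks the job ready once nothing is left. */
static void toy_poll(ToyJob *job)
{
    if (job->remaining > 0) {
        job->remaining--;
    }
    if (job->remaining == 0) {
        job->ready = true;
    }
}

/* Poll at least once, continue until the job is ready, then complete it.
 * Returns the number of poll iterations. */
static int toy_run_job(ToyJob *job)
{
    int polls = 0;

    do {
        toy_poll(job);
        polls++;
    } while (!job->ready);

    job->completed = true;
    return polls;
}
```

The do/while form means the loop body runs once even for a job that has
nothing to do, which matches the structure of the loop in the patch.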
 block/Makefile.objs |  2 +-
 qemu-img.c          | 83 +++++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 65 insertions(+), 20 deletions(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index fd88c03..2c37e80 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -9,6 +9,7 @@ block-obj-y += snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += raw-posix.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
+block-obj-y += mirror.o
 
 ifeq ($(CONFIG_POSIX),y)
 block-obj-y += nbd.o nbd-client.o sheepdog.o
@@ -22,7 +23,6 @@ endif
 
 common-obj-y += stream.o
 common-obj-y += commit.o
-common-obj-y += mirror.o
 common-obj-y += backup.o
 
 iscsi.o-cflags     := $(LIBISCSI_CFLAGS)
diff --git a/qemu-img.c b/qemu-img.c
index aa89ba2..2c73414 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -30,6 +30,7 @@
 #include "qemu/osdep.h"
 #include "sysemu/sysemu.h"
 #include "block/block_int.h"
+#include "block/blockjob.h"
 #include "block/qapi.h"
 #include <getopt.h>
 #include <glib.h>
@@ -720,12 +721,46 @@ fail:
     return ret;
 }
 
+typedef struct CommonBlockJobCBInfo {
+    BlockDriverState *bs;
+    Error **errp;
+} CommonBlockJobCBInfo;
+
+static void common_block_job_cb(void *opaque, int ret)
+{
+    CommonBlockJobCBInfo *cbi = opaque;
+
+    if (ret < 0) {
+        error_setg_errno(cbi->errp, -ret, "Block job failed");
+    }
+
+    /* Drop this block job's reference */
+    bdrv_unref(cbi->bs);
+}
+
+static void run_block_job(BlockJob *job, Error **errp)
+{
+    AioContext *aio_context = bdrv_get_aio_context(job->bs);
+
+    do {
+        aio_poll(aio_context, true);
+
+        if (!job->busy && !job->ready) {
+            block_job_resume(job);
+        }
+    } while (!job->ready);
+
+    block_job_complete_sync(job, errp);
+}
+
 static int img_commit(int argc, char **argv)
 {
     int c, ret, flags;
     const char *filename, *fmt, *cache;
-    BlockDriverState *bs;
+    BlockDriverState *bs, *base_bs;
     bool quiet = false;
+    Error *local_err = NULL;
+    CommonBlockJobCBInfo cbi;
 
     fmt = NULL;
     cache = BDRV_DEFAULT_CACHE;
@@ -766,29 +801,39 @@ static int img_commit(int argc, char **argv)
     if (!bs) {
         return 1;
     }
-    ret = bdrv_commit(bs);
-    switch(ret) {
-    case 0:
-        qprintf(quiet, "Image committed.\n");
-        break;
-    case -ENOENT:
-        error_report("No disk inserted");
-        break;
-    case -EACCES:
-        error_report("Image is read-only");
-        break;
-    case -ENOTSUP:
-        error_report("Image is already committed");
-        break;
-    default:
-        error_report("Error while committing image");
-        break;
+
+    /* This is different from QMP, which by default uses the deepest file in the
+     * backing chain (i.e., the very base); however, the traditional behavior of
+     * qemu-img commit is using the immediate backing file. */
+    base_bs = bs->backing_hd;
+    if (!base_bs) {
+        error_set(&local_err, QERR_BASE_NOT_FOUND, "NULL");
+        goto done;
+    }
+
+    cbi = (CommonBlockJobCBInfo){
+        .errp = &local_err,
+        .bs   = bs,
+    };
+
+    commit_active_start(bs, base_bs, 0, BLOCKDEV_ON_ERROR_REPORT,
+                        common_block_job_cb, &cbi, &local_err);
+    if (local_err) {
+        goto done;
     }
 
+    run_block_job(bs->job, &local_err);
+
+done:
     bdrv_unref(bs);
-    if (ret) {
+
+    if (local_err) {
+        qerror_report_err(local_err);
+        error_free(local_err);
         return 1;
     }
+
+    qprintf(quiet, "Image committed.\n");
     return 0;
 }
 
-- 
2.0.0


* [Qemu-devel] [PATCH v8 08/14] qemu-img: Empty image after commit
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (6 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 07/14] qemu-img: Implement commit like QMP Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 09/14] qemu-img: Enable progress output for commit Max Reitz
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

After the top image has been committed, it should be emptied unless
specified otherwise.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
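
Not part of the commit message: a toy sketch of the reference counting trick
used in img_commit(). An extra reference is taken before the job runs, so the
object survives the job dropping its own reference and can still be emptied
afterwards. ToyNode and the toy_* functions are hypothetical; "freed" is just a
flag so the sketch can be inspected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a BlockDriverState. */
typedef struct ToyNode {
    int refcnt;
    bool freed;
    bool emptied;
} ToyNode;

static void toy_ref(ToyNode *n)
{
    n->refcnt++;
}

static void toy_unref(ToyNode *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        n->freed = true;
    }
}

/* Stand-in for the block job finishing and dropping its reference. */
static void toy_job_finish(ToyNode *n)
{
    toy_unref(n);
}

/* Returns true if the node was emptied after the job finished. */
static bool toy_commit(ToyNode *n, bool drop)
{
    bool emptied = false;

    if (!drop) {
        toy_ref(n);             /* keep the node alive across the job */
    }

    toy_job_finish(n);

    if (!drop) {
        n->emptied = true;      /* safe: we still hold a reference */
        emptied = true;
        toy_unref(n);
    }
    return emptied;
}
```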
 qemu-img-cmds.hx |  4 ++--
 qemu-img.c       | 34 +++++++++++++++++++++++++++++++---
 qemu-img.texi    |  6 +++++-
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index d029609..b31d81c 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -22,9 +22,9 @@ STEXI
 ETEXI
 
 DEF("commit", img_commit,
-    "commit [-q] [-f fmt] [-t cache] filename")
+    "commit [-q] [-f fmt] [-t cache] [-d] filename")
 STEXI
-@item commit [-q] [-f @var{fmt}] [-t @var{cache}] @var{filename}
+@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-d] @var{filename}
 ETEXI
 
 DEF("compare", img_compare,
diff --git a/qemu-img.c b/qemu-img.c
index 2c73414..c0bb276 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -758,14 +758,14 @@ static int img_commit(int argc, char **argv)
     int c, ret, flags;
     const char *filename, *fmt, *cache;
     BlockDriverState *bs, *base_bs;
-    bool quiet = false;
+    bool quiet = false, drop = false;
     Error *local_err = NULL;
     CommonBlockJobCBInfo cbi;
 
     fmt = NULL;
     cache = BDRV_DEFAULT_CACHE;
     for(;;) {
-        c = getopt(argc, argv, "f:ht:q");
+        c = getopt(argc, argv, "f:ht:dq");
         if (c == -1) {
             break;
         }
@@ -780,6 +780,9 @@ static int img_commit(int argc, char **argv)
         case 't':
             cache = optarg;
             break;
+        case 'd':
+            drop = true;
+            break;
         case 'q':
             quiet = true;
             break;
@@ -790,7 +793,7 @@ static int img_commit(int argc, char **argv)
     }
     filename = argv[optind++];
 
-    flags = BDRV_O_RDWR;
+    flags = BDRV_O_RDWR | BDRV_O_UNMAP;
     ret = bdrv_parse_cache_flags(cache, &flags);
     if (ret < 0) {
         error_report("Invalid cache option: %s", cache);
@@ -822,7 +825,32 @@ static int img_commit(int argc, char **argv)
         goto done;
     }
 
+    /* The block job will swap base_bs and bs (which is not what we really want
+     * here, but okay) and unref base_bs (after the swap, i.e., the old top
+     * image). In order to still be able to empty that top image afterwards,
+     * increment the reference counter here preemptively. */
+    if (!drop) {
+        bdrv_ref(base_bs);
+    }
+
     run_block_job(bs->job, &local_err);
+    if (local_err) {
+        goto unref_backing;
+    }
+
+    if (!drop && base_bs->drv->bdrv_make_empty) {
+        ret = base_bs->drv->bdrv_make_empty(base_bs);
+        if (ret) {
+            error_setg_errno(&local_err, -ret, "Could not empty %s",
+                             filename);
+            goto unref_backing;
+        }
+    }
+
+unref_backing:
+    if (!drop) {
+        bdrv_unref(base_bs);
+    }
 
 done:
     bdrv_unref(bs);
diff --git a/qemu-img.texi b/qemu-img.texi
index c68b541..b48221f 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -163,7 +163,7 @@ this case. @var{backing_file} will never be modified unless you use the
 The size can also be specified using the @var{size} option with @code{-o},
 it doesn't need to be specified separately in this case.
 
-@item commit [-f @var{fmt}] [-t @var{cache}] @var{filename}
+@item commit [-f @var{fmt}] [-t @var{cache}] [-d] @var{filename}
 
 Commit the changes recorded in @var{filename} in its base image or backing file.
 If the backing file is smaller than the snapshot, then the backing file will be
@@ -172,6 +172,10 @@ the backing file, the backing file will not be truncated.  If you want the
 backing file to match the size of the smaller snapshot, you can safely truncate
 it yourself once the commit operation successfully completes.
 
+The image @var{filename} is emptied after the operation has succeeded. If you do
+not need @var{filename} afterwards and intend to drop it, you may skip emptying
+@var{filename} by specifying the @code{-d} flag.
+
 @item compare [-f @var{fmt}] [-F @var{fmt}] [-p] [-s] [-q] @var{filename1} @var{filename2}
 
 Check if two images have the same content. You can compare images with
-- 
2.0.0


* [Qemu-devel] [PATCH v8 09/14] qemu-img: Enable progress output for commit
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (7 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 08/14] qemu-img: Empty image after commit Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-09 17:28   ` Eric Blake
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 10/14] qemu-img: Specify backing file " Max Reitz
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Implement progress output for the commit command by querying the
progress of the block job.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
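
Not part of the commit message: the percentage printed in run_block_job() is
offset / len * 100. If len can be zero at that point (for example when there is
nothing to commit; this is an assumption, not something verified here), the
float division 0 / 0 yields NaN. One possible guard, with the hypothetical
helper name toy_progress_percent():

```c
#include <stdint.h>

/* Progress in percent; reports 0 while the length is still unknown or zero
 * instead of dividing by zero. */
static float toy_progress_percent(int64_t offset, int64_t len)
{
    if (len <= 0) {
        return 0.0f;
    }
    return (float)offset / (float)len * 100.0f;
}
```

Since the patch prints 100 % unconditionally after the job has completed,
reporting 0 for an empty job in the loop would not hide completion.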
 qemu-img-cmds.hx |  4 ++--
 qemu-img.c       | 24 ++++++++++++++++++++++--
 qemu-img.texi    |  2 +-
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index b31d81c..ea41d4f 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -22,9 +22,9 @@ STEXI
 ETEXI
 
 DEF("commit", img_commit,
-    "commit [-q] [-f fmt] [-t cache] [-d] filename")
+    "commit [-q] [-f fmt] [-t cache] [-d] [-p] filename")
 STEXI
-@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-d] @var{filename}
+@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-d] [-p] @var{filename}
 ETEXI
 
 DEF("compare", img_compare,
diff --git a/qemu-img.c b/qemu-img.c
index c0bb276..77abccd 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -745,12 +745,18 @@ static void run_block_job(BlockJob *job, Error **errp)
     do {
         aio_poll(aio_context, true);
 
+        qemu_progress_print((float)job->offset / job->len * 100.f, 0);
+
         if (!job->busy && !job->ready) {
             block_job_resume(job);
         }
     } while (!job->ready);
 
     block_job_complete_sync(job, errp);
+
+    /* A block job may finish instantaneously without publishing any progress,
+     * so just signal completion here */
+    qemu_progress_print(100.f, 0);
 }
 
 static int img_commit(int argc, char **argv)
@@ -758,14 +764,14 @@ static int img_commit(int argc, char **argv)
     int c, ret, flags;
     const char *filename, *fmt, *cache;
     BlockDriverState *bs, *base_bs;
-    bool quiet = false, drop = false;
+    bool progress = false, quiet = false, drop = false;
     Error *local_err = NULL;
     CommonBlockJobCBInfo cbi;
 
     fmt = NULL;
     cache = BDRV_DEFAULT_CACHE;
     for(;;) {
-        c = getopt(argc, argv, "f:ht:dq");
+        c = getopt(argc, argv, "f:ht:dpq");
         if (c == -1) {
             break;
         }
@@ -783,11 +789,20 @@ static int img_commit(int argc, char **argv)
         case 'd':
             drop = true;
             break;
+        case 'p':
+            progress = true;
+            break;
         case 'q':
             quiet = true;
             break;
         }
     }
+
+    /* Progress is not shown in Quiet mode */
+    if (quiet) {
+        progress = false;
+    }
+
     if (optind != argc - 1) {
         error_exit("Expecting one image file name");
     }
@@ -805,6 +820,9 @@ static int img_commit(int argc, char **argv)
         return 1;
     }
 
+    qemu_progress_init(progress, 1.f);
+    qemu_progress_print(0.f, 100);
+
     /* This is different from QMP, which by default uses the deepest file in the
      * backing chain (i.e., the very base); however, the traditional behavior of
      * qemu-img commit is using the immediate backing file. */
@@ -853,6 +871,8 @@ unref_backing:
     }
 
 done:
+    qemu_progress_end();
+
     bdrv_unref(bs);
 
     if (local_err) {
diff --git a/qemu-img.texi b/qemu-img.texi
index b48221f..ed59b51 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -163,7 +163,7 @@ this case. @var{backing_file} will never be modified unless you use the
 The size can also be specified using the @var{size} option with @code{-o},
 it doesn't need to be specified separately in this case.
 
-@item commit [-f @var{fmt}] [-t @var{cache}] [-d] @var{filename}
+@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-d] [-p] @var{filename}
 
 Commit the changes recorded in @var{filename} in its base image or backing file.
 If the backing file is smaller than the snapshot, then the backing file will be
-- 
2.0.0


* [Qemu-devel] [PATCH v8 10/14] qemu-img: Specify backing file for commit
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (8 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 09/14] qemu-img: Enable progress output for commit Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-09 17:40   ` Eric Blake
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 11/14] iotests: Add _filter_qemu_img_map Max Reitz
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Introduce a new parameter for qemu-img commit which may be used to
explicitly specify the backing file into which an image should be
committed if the backing chain has more than a single layer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
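
Not part of the commit message: a toy sketch of the base selection logic, i.e.
search the backing chain by name if a base was given, otherwise fall back to
the immediate backing file. ToyImage and the toy_* functions are hypothetical
and compare plain filenames only; the real lookup helper may resolve paths
differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical image with a pointer to its backing image (NULL for none). */
typedef struct ToyImage {
    const char *filename;
    struct ToyImage *backing;
} ToyImage;

/* Search the backing chain below top for an image with the given name. */
static ToyImage *toy_find_backing(ToyImage *top, const char *name)
{
    for (ToyImage *img = top->backing; img; img = img->backing) {
        if (strcmp(img->filename, name) == 0) {
            return img;
        }
    }
    return NULL;
}

/* Explicit base if given, immediate backing file otherwise.
 * Returns NULL if no suitable base exists. */
static ToyImage *toy_pick_base(ToyImage *top, const char *base)
{
    return base ? toy_find_backing(top, base) : top->backing;
}
```

The search starts at the first backing image, so naming the top image itself
as the base is rejected in this sketch.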
 qemu-img-cmds.hx |  4 ++--
 qemu-img.c       | 24 +++++++++++++++++-------
 qemu-img.texi    |  9 ++++++++-
 3 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index ea41d4f..4331949 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -22,9 +22,9 @@ STEXI
 ETEXI
 
 DEF("commit", img_commit,
-    "commit [-q] [-f fmt] [-t cache] [-d] [-p] filename")
+    "commit [-q] [-f fmt] [-t cache] [-b base] [-d] [-p] filename")
 STEXI
-@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-d] [-p] @var{filename}
+@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-b @var{base}] [-d] [-p] @var{filename}
 ETEXI
 
 DEF("compare", img_compare,
diff --git a/qemu-img.c b/qemu-img.c
index 77abccd..d372236 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -762,7 +762,7 @@ static void run_block_job(BlockJob *job, Error **errp)
 static int img_commit(int argc, char **argv)
 {
     int c, ret, flags;
-    const char *filename, *fmt, *cache;
+    const char *filename, *fmt, *cache, *base;
     BlockDriverState *bs, *base_bs;
     bool progress = false, quiet = false, drop = false;
     Error *local_err = NULL;
@@ -770,8 +770,9 @@ static int img_commit(int argc, char **argv)
 
     fmt = NULL;
     cache = BDRV_DEFAULT_CACHE;
+    base = NULL;
     for(;;) {
-        c = getopt(argc, argv, "f:ht:dpq");
+        c = getopt(argc, argv, "f:ht:b:dpq");
         if (c == -1) {
             break;
         }
@@ -786,6 +787,11 @@ static int img_commit(int argc, char **argv)
         case 't':
             cache = optarg;
             break;
+        case 'b':
+            base = optarg;
+            /* -b implies -d */
+            drop = true;
+            break;
         case 'd':
             drop = true;
             break;
@@ -823,12 +829,16 @@ static int img_commit(int argc, char **argv)
     qemu_progress_init(progress, 1.f);
     qemu_progress_print(0.f, 100);
 
-    /* This is different from QMP, which by default uses the deepest file in the
-     * backing chain (i.e., the very base); however, the traditional behavior of
-     * qemu-img commit is using the immediate backing file. */
-    base_bs = bs->backing_hd;
+    if (base) {
+        base_bs = bdrv_find_backing_image(bs, base);
+    } else {
+        /* This is different from QMP, which by default uses the deepest file in
+         * the backing chain (i.e., the very base); however, the traditional
+         * behavior of qemu-img commit is using the immediate backing file. */
+        base_bs = bs->backing_hd;
+    }
     if (!base_bs) {
-        error_set(&local_err, QERR_BASE_NOT_FOUND, "NULL");
+        error_set(&local_err, QERR_BASE_NOT_FOUND, base ?: "NULL");
         goto done;
     }
 
diff --git a/qemu-img.texi b/qemu-img.texi
index ed59b51..83b7be4 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -163,7 +163,7 @@ this case. @var{backing_file} will never be modified unless you use the
 The size can also be specified using the @var{size} option with @code{-o},
 it doesn't need to be specified separately in this case.
 
-@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-d] [-p] @var{filename}
+@item commit [-q] [-f @var{fmt}] [-t @var{cache}] [-b @var{base}] [-d] [-p] @var{filename}
 
 Commit the changes recorded in @var{filename} in its base image or backing file.
 If the backing file is smaller than the snapshot, then the backing file will be
@@ -176,6 +176,13 @@ The image @var{filename} is emptied after the operation has succeeded. If you do
 not need @var{filename} afterwards and intend to drop it, you may skip emptying
 @var{filename} by specifying the @code{-d} flag.
 
+If the backing chain of the given image file @var{filename} has more than one
+layer, the backing file into which the changes will be committed may be
+specified as @var{base} (which has to be part of @var{filename}'s backing
+chain). If @var{base} is not specified, the immediate backing file of the top
+image (which is @var{filename}) will be used. For reasons of consistency,
+explicitly specifying @var{base} will always imply @code{-d}.
+
 @item compare [-f @var{fmt}] [-F @var{fmt}] [-p] [-s] [-q] @var{filename1} @var{filename2}
 
 Check if two images have the same content. You can compare images with
-- 
2.0.0
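
For readers of the archive, the target selection described in the hunks above can be sketched in Python. This is only a model, not the qemu-img code: resolve_base() and the file names are hypothetical, and the list stands in for the backing chain with the top image first.

```python
def resolve_base(chain, base=None):
    """Pick the commit target from `chain` (file names, top image first).

    Hypothetical helper: with no `base`, fall back to the immediate backing
    file; otherwise the name has to appear somewhere below the top image.
    """
    backing = chain[1:]  # the top image itself is never a valid target
    if base is None:
        if not backing:
            raise LookupError("base not found: NULL")
        return backing[0]
    if base not in backing:
        raise LookupError("base not found: " + base)
    return base
```

With a three-image chain, calling it without a base returns the intermediate image, while naming the bottom image returns that one; a name outside the chain raises an error, mirroring the QERR_BASE_NOT_FOUND path.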

* [Qemu-devel] [PATCH v8 11/14] iotests: Add _filter_qemu_img_map
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (9 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 10/14] qemu-img: Specify backing file " Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-09 17:51   ` Eric Blake
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 12/14] iotests: Add test for backing-chain commits Max Reitz
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

As different image formats most probably map guest addresses to
different host addresses, add a filter to filter the host addresses out;
also, the image filename should be filtered.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
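As an illustration of what the filter is meant to do, here is a rough Python re-expression of it. It is a sketch under assumptions: filter_qemu_img_map() is a hypothetical name, the two str.replace() calls merely stand in for _filter_testdir and _filter_imgfmt, and Python's re module is assumed to behave like sed for simple lines such as a header row or a row of three hex columns followed by a path.

```python
import re

# Same pattern as the first sed expression: keep the first two hex-looking
# columns (guest offset and length) and drop the third one (host offset).
_HOST_OFFSET = re.compile(r'([0-9a-fx]* *[0-9a-fx]* *)[0-9a-fx]* *')


def filter_qemu_img_map(text, test_dir, imgfmt):
    """Drop host offsets and normalize file names in `qemu-img map` output."""
    out = []
    for line in text.splitlines():
        line = _HOST_OFFSET.sub(r'\1', line)              # sed s/.../\1/g
        line = re.sub(r'Mapped to *', '', line, count=1)  # sed s/Mapped to *//
        # Stand-ins for _filter_testdir and _filter_imgfmt
        line = line.replace(test_dir, 'TEST_DIR').replace(imgfmt, 'IMGFMT')
        out.append(line)
    return '\n'.join(out)
```

Feeding it a header line plus one data row leaves only the "Offset", "Length" and "File" columns, which is the shape of the reference output in the following patch.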
 tests/qemu-iotests/common.filter | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index a04df7f..8b14edb 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -170,5 +170,12 @@ _filter_qmp()
         -e 's#^{"QMP":.*}$#QMP_VERSION#'
 }
 
+# filter out offsets and file names from qemu-img map
+_filter_qemu_img_map()
+{
+    sed -e 's/\([0-9a-fx]* *[0-9a-fx]* *\)[0-9a-fx]* */\1/g' \
+        -e 's/Mapped to *//' | _filter_testdir | _filter_imgfmt
+}
+
 # make sure this script returns success
 /bin/true
-- 
2.0.0

* [Qemu-devel] [PATCH v8 12/14] iotests: Add test for backing-chain commits
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (10 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 11/14] iotests: Add _filter_qemu_img_map Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-09 18:50   ` Eric Blake
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 13/14] iotests: Add test for qcow2's bdrv_make_empty Max Reitz
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add a test for qemu-img commit on backing chains with more than two
images. This test also checks whether the top image is emptied (unless
this is prevented by specifying either -d or -b) and therefore does not
work for qed and vmdk, which requires it to be separate from 020.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
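The expectations of the four passes can be reproduced with a small Python model. This is a sketch only: the dictionaries, commit() and provider() are hypothetical and model allocation at 64k granularity, they do not call into qemu.

```python
CLUSTER = 64 * 1024  # granularity of the model; matches the 64k-aligned writes


def write(layer, pattern, offset, length):
    """Mark every cluster in [offset, offset + length) as holding `pattern`."""
    for off in range(offset, offset + length, CLUSTER):
        layer[off] = pattern


def commit(chain, base=None, keep_top=False):
    """Fold all layers above `base` into it; `chain` is ordered bottom to top.

    base=None models the default (immediate backing file of the top image).
    An explicit base, or keep_top=True (modelling -d), leaves the top as-is.
    """
    target = len(chain) - 2 if base is None else base
    for layer in chain[target + 1:]:  # lower layers first, so newer data wins
        chain[target].update(layer)
    if base is None and not keep_top:
        chain[-1].clear()


def provider(chain, image, offset):
    """Index of the layer that supplies `offset` when reading chain[image]."""
    for idx in range(image, -1, -1):
        if offset in chain[idx]:
            return idx
    return None


def fresh_chain():
    """Build [base, itmd, top] with the same writes as the test script."""
    base, itmd, top = {}, {}, {}
    write(base, 1, 0, 3 * CLUSTER)        # write -P 1 0 192k
    write(itmd, 2, CLUSTER, 2 * CLUSTER)  # write -P 2 64k 128k
    write(top, 3, 2 * CLUSTER, CLUSTER)   # write -P 3 128k 64k
    return [base, itmd, top]
```

Pass 0 corresponds to commit(fresh_chain()), passes 1 and 3 to base=1 and base=0, and pass 2 to keep_top=True; provider() then plays the role of the "File" column of qemu-img map.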
 tests/qemu-iotests/094     | 122 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/094.out | 119 +++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 242 insertions(+)
 create mode 100755 tests/qemu-iotests/094
 create mode 100644 tests/qemu-iotests/094.out

diff --git a/tests/qemu-iotests/094 b/tests/qemu-iotests/094
new file mode 100755
index 0000000..c7a613b
--- /dev/null
+++ b/tests/qemu-iotests/094
@@ -0,0 +1,122 @@
+#!/bin/bash
+#
+# Commit changes into backing chains and empty the top image if the
+# backing image is not explicitly specified
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mreitz@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    _rm_test_img "$TEST_IMG.itmd"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.pattern
+
+# Any format supporting backing files and bdrv_make_empty
+_supported_fmt qcow qcow2
+_supported_proto file
+_supported_os Linux
+
+
+# Four passes:
+#  0: Two-layer backing chain, commit to upper backing file (implicitly)
+#     (in this case, the top image will be emptied)
+#  1: Two-layer backing chain, commit to upper backing file (explicitly)
+#     (in this case, the top image will implicitly stay unchanged)
+#  2: Two-layer backing chain, commit to upper backing file (implicitly with -d)
+#     (in this case, the top image will explicitly stay unchanged)
+#  3: Two-layer backing chain, commit to lower backing file
+#     (in this case, the top image will implicitly stay unchanged)
+#
+# 020 already tests committing, so this only tests whether image chains are
+# working properly and that all images above the base are emptied; therefore,
+# no complicated patterns are necessary
+for i in 0 1 2 3; do
+
+echo
+echo "=== Test pass $i ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+TEST_IMG="$TEST_IMG.itmd" _make_test_img -b "$TEST_IMG.base" 64M
+_make_test_img -b "$TEST_IMG.itmd" 64M
+
+$QEMU_IO -c 'write -P 1 0 192k' "$TEST_IMG.base" | _filter_qemu_io
+$QEMU_IO -c 'write -P 2 64k 128k' "$TEST_IMG.itmd" | _filter_qemu_io
+$QEMU_IO -c 'write -P 3 128k 64k' "$TEST_IMG" | _filter_qemu_io
+
+if [ $i -lt 3 ]; then
+    if [ $i == 0 ]; then
+        # -b "$TEST_IMG.itmd" should be the default (that is, committing to the
+        # first backing file in the chain)
+        $QEMU_IMG commit "$TEST_IMG"
+    elif [ $i == 1 ]; then
+        # explicitly specify the commit target (this should imply -d)
+        $QEMU_IMG commit -b "$TEST_IMG.itmd" "$TEST_IMG"
+    else
+        # do not explicitly specify the commit target, but use -d to leave the
+        # top image unchanged
+        $QEMU_IMG commit -d "$TEST_IMG"
+    fi
+
+    # Bottom should be unchanged
+    $QEMU_IO -c 'read -P 1 0 192k' "$TEST_IMG.base" | _filter_qemu_io
+
+    # Intermediate should contain changes from top
+    $QEMU_IO -c 'read -P 1 0 64k' "$TEST_IMG.itmd" | _filter_qemu_io
+    $QEMU_IO -c 'read -P 2 64k 64k' "$TEST_IMG.itmd" | _filter_qemu_io
+    $QEMU_IO -c 'read -P 3 128k 64k' "$TEST_IMG.itmd" | _filter_qemu_io
+
+    # And in pass 0, the top image should be empty, whereas in both other passes
+    # it should be unchanged (which is both checked by qemu-img map)
+else
+    $QEMU_IMG commit -b "$TEST_IMG.base" "$TEST_IMG"
+
+    # Bottom should contain all changes
+    $QEMU_IO -c 'read -P 1 0 64k' "$TEST_IMG.base" | _filter_qemu_io
+    $QEMU_IO -c 'read -P 2 64k 64k' "$TEST_IMG.base" | _filter_qemu_io
+    $QEMU_IO -c 'read -P 3 128k 64k' "$TEST_IMG.base" | _filter_qemu_io
+
+    # Both top and intermediate should be unchanged
+fi
+
+$QEMU_IMG map "$TEST_IMG.base" | _filter_qemu_img_map
+$QEMU_IMG map "$TEST_IMG.itmd" | _filter_qemu_img_map
+$QEMU_IMG map "$TEST_IMG" | _filter_qemu_img_map
+
+done
+
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/094.out b/tests/qemu-iotests/094.out
new file mode 100644
index 0000000..1ee4b9b
--- /dev/null
+++ b/tests/qemu-iotests/094.out
@@ -0,0 +1,119 @@
+QA output created by 094
+
+=== Test pass 0 ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT.itmd', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.itmd' 
+wrote 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 65536
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image committed.
+read 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          File
+0               0x30000         TEST_DIR/t.IMGFMT.base
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x20000         TEST_DIR/t.IMGFMT.itmd
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x20000         TEST_DIR/t.IMGFMT.itmd
+
+=== Test pass 1 ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT.itmd', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.itmd' 
+wrote 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 65536
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image committed.
+read 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          File
+0               0x30000         TEST_DIR/t.IMGFMT.base
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x20000         TEST_DIR/t.IMGFMT.itmd
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x10000         TEST_DIR/t.IMGFMT.itmd
+0x20000         0x10000         TEST_DIR/t.IMGFMT
+
+=== Test pass 2 ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT.itmd', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.itmd' 
+wrote 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 65536
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image committed.
+read 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          File
+0               0x30000         TEST_DIR/t.IMGFMT.base
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x20000         TEST_DIR/t.IMGFMT.itmd
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x10000         TEST_DIR/t.IMGFMT.itmd
+0x20000         0x10000         TEST_DIR/t.IMGFMT
+
+=== Test pass 3 ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT.itmd', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.itmd' 
+wrote 196608/196608 bytes at offset 0
+192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 131072/131072 bytes at offset 65536
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image committed.
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 65536/65536 bytes at offset 131072
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          File
+0               0x30000         TEST_DIR/t.IMGFMT.base
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x20000         TEST_DIR/t.IMGFMT.itmd
+Offset          Length          File
+0               0x10000         TEST_DIR/t.IMGFMT.base
+0x10000         0x10000         TEST_DIR/t.IMGFMT.itmd
+0x20000         0x10000         TEST_DIR/t.IMGFMT
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 0f07440..ec23f4b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -99,3 +99,4 @@
 090 rw auto quick
 091 rw auto
 092 rw auto quick
+094 rw auto backing
-- 
2.0.0

* [Qemu-devel] [PATCH v8 13/14] iotests: Add test for qcow2's bdrv_make_empty
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (11 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 12/14] iotests: Add test for backing-chain commits Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-09 18:55   ` Eric Blake
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 14/14] iotests: Omit length/offset test in 040 and 041 Max Reitz
  2014-06-27 22:07 ` [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

Add a test for qcow2's fast bdrv_make_empty implementation on images
without internal snapshots.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/095     | 72 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/095.out | 26 +++++++++++++++++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 99 insertions(+)
 create mode 100755 tests/qemu-iotests/095
 create mode 100644 tests/qemu-iotests/095.out

diff --git a/tests/qemu-iotests/095 b/tests/qemu-iotests/095
new file mode 100755
index 0000000..85b8a42
--- /dev/null
+++ b/tests/qemu-iotests/095
@@ -0,0 +1,72 @@
+#!/bin/bash
+#
+# Test qcow2's bdrv_make_empty for images without internal snapshots
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mreitz@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    rm -f "$TEST_DIR/blkdebug.conf"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.pattern
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+
+for event in l1_update reftable_update; do
+
+echo
+echo "=== $event ==="
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+_make_test_img -b "$TEST_IMG.base" 64M
+
+cat > "$TEST_DIR/blkdebug.conf" <<EOF
+[inject-error]
+event = "$event"
+EOF
+
+$QEMU_IMG commit "blkdebug:$TEST_DIR/blkdebug.conf:$TEST_IMG"
+
+_check_test_img
+
+done
+
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/095.out b/tests/qemu-iotests/095.out
new file mode 100644
index 0000000..94399bc
--- /dev/null
+++ b/tests/qemu-iotests/095.out
@@ -0,0 +1,26 @@
+QA output created by 095
+
+=== l1_update ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+qemu-img: Could not empty blkdebug:/home/maxx/projects/qemu/tests/qemu-iotests/scratch/blkdebug.conf:/home/maxx/projects/qemu/tests/qemu-iotests/scratch/t.qcow2: Input/output error
+Leaked cluster 4 refcount=1 reference=0
+Leaked cluster 5 refcount=1 reference=0
+Leaked cluster 6 refcount=1 reference=0
+
+3 leaked clusters were found on the image.
+This means waste of disk space, but no harm to data.
+
+=== reftable_update ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 backing_file='TEST_DIR/t.IMGFMT.base' 
+qemu-img: Could not empty blkdebug:/home/maxx/projects/qemu/tests/qemu-iotests/scratch/blkdebug.conf:/home/maxx/projects/qemu/tests/qemu-iotests/scratch/t.qcow2: Input/output error
+Leaked cluster 3 refcount=1 reference=0
+Leaked cluster 4 refcount=1 reference=0
+Leaked cluster 5 refcount=1 reference=0
+
+3 leaked clusters were found on the image.
+This means waste of disk space, but no harm to data.
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index ec23f4b..16acc78 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -100,3 +100,4 @@
 091 rw auto
 092 rw auto quick
 094 rw auto backing
+095 rw auto backing quick
-- 
2.0.0

* [Qemu-devel] [PATCH v8 14/14] iotests: Omit length/offset test in 040 and 041
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (12 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 13/14] iotests: Add test for qcow2's bdrv_make_empty Max Reitz
@ 2014-06-07 18:51 ` Max Reitz
  2014-06-27 22:07 ` [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
  14 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-06-07 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi, Max Reitz

As the length of a mirror block job no longer directly depends on the
size of the block device, drop those checks from this test. Instead,
just check whether the final offset equals the block job length.

As 041 uses the wait_until_completed function from iotests.py, the same
applies there as well which in turn affects tests 030, 055 and 056. On
the other hand, a block job's length does not have to be related to the
length of the image file in the first place, so that check was
questionable anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
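The relaxed check can be sketched as a standalone Python helper. This is an illustration only: assert_job_finished() is a hypothetical name and the event is expected to be a plain dict shaped like a QMP BLOCK_JOB_COMPLETED event; the tests themselves keep using assert_qmp().

```python
def assert_job_finished(event, job_type, device):
    """Check a BLOCK_JOB_COMPLETED event without knowing the image size."""
    data = event['data']
    assert event['event'] == 'BLOCK_JOB_COMPLETED'
    assert data['type'] == job_type
    assert data['device'] == device
    assert 'error' not in data
    # The job is done once its progress has caught up with its own length,
    # whatever that length happens to be.
    assert data['offset'] == data['len']
```

The point of the design is that the helper compares the event against itself, so it keeps working when the job length stops matching the size of the block device.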
 tests/qemu-iotests/040        | 4 +---
 tests/qemu-iotests/041        | 3 +--
 tests/qemu-iotests/iotests.py | 3 +--
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index 734b6a6..ca2b82e 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -46,13 +46,11 @@ class ImageCommitTestCase(iotests.QMPTestCase):
                 if event['event'] == 'BLOCK_JOB_COMPLETED':
                     self.assert_qmp(event, 'data/type', 'commit')
                     self.assert_qmp(event, 'data/device', 'drive0')
-                    self.assert_qmp(event, 'data/offset', self.image_len)
-                    self.assert_qmp(event, 'data/len', self.image_len)
+                    self.assert_qmp(event, 'data/offset', event['data']['len'])
                     completed = True
                 elif event['event'] == 'BLOCK_JOB_READY':
                     self.assert_qmp(event, 'data/type', 'commit')
                     self.assert_qmp(event, 'data/device', 'drive0')
-                    self.assert_qmp(event, 'data/len', self.image_len)
                     self.vm.qmp('block-job-complete', device='drive0')
 
         self.assert_no_active_block_jobs()
diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index ec470b2..59b958b 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -46,8 +46,7 @@ class ImageMirroringTestCase(iotests.QMPTestCase):
         event = self.cancel_and_wait()
         self.assertEquals(event['event'], 'BLOCK_JOB_COMPLETED')
         self.assert_qmp(event, 'data/type', 'mirror')
-        self.assert_qmp(event, 'data/offset', self.image_len)
-        self.assert_qmp(event, 'data/len', self.image_len)
+        self.assert_qmp(event, 'data/offset', event['data']['len'])
 
     def complete_and_wait(self, drive='drive0', wait_ready=True):
         '''Complete a block job and wait for it to finish'''
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f6c437c..ff474d4 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -266,8 +266,7 @@ class QMPTestCase(unittest.TestCase):
                     self.assert_qmp(event, 'data/device', drive)
                     self.assert_qmp_absent(event, 'data/error')
                     if check_offset:
-                        self.assert_qmp(event, 'data/offset', self.image_len)
-                    self.assert_qmp(event, 'data/len', self.image_len)
+                        self.assert_qmp(event, 'data/offset', event['data']['len'])
                     completed = True
 
         self.assert_no_active_block_jobs()
-- 
2.0.0

* Re: [Qemu-devel] [PATCH v8 07/14] qemu-img: Implement commit like QMP
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 07/14] qemu-img: Implement commit like QMP Max Reitz
@ 2014-06-09 16:53   ` Eric Blake
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2014-06-09 16:53 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 06/07/2014 12:51 PM, Max Reitz wrote:
> qemu-img should use QMP commands whenever possible in order to ensure
> feature completeness of both online and offline image operations. As
> qemu-img itself has no access to QMP (since this would basically require
> just everything being linked into qemu-img), imitate QMP's
> implementation of block-commit by using commit_active_start() and then
> waiting for the block job to finish.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/Makefile.objs |  2 +-
>  qemu-img.c          | 83 +++++++++++++++++++++++++++++++++++++++++------------
>  2 files changed, 65 insertions(+), 20 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH v8 09/14] qemu-img: Enable progress output for commit
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 09/14] qemu-img: Enable progress output for commit Max Reitz
@ 2014-06-09 17:28   ` Eric Blake
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2014-06-09 17:28 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 06/07/2014 12:51 PM, Max Reitz wrote:
> Implement progress output for the commit command by querying the
> progress of the block job.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  qemu-img-cmds.hx |  4 ++--
>  qemu-img.c       | 24 ++++++++++++++++++++++--
>  qemu-img.texi    |  2 +-
>  3 files changed, 25 insertions(+), 5 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH v8 10/14] qemu-img: Specify backing file for commit
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 10/14] qemu-img: Specify backing file " Max Reitz
@ 2014-06-09 17:40   ` Eric Blake
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2014-06-09 17:40 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 06/07/2014 12:51 PM, Max Reitz wrote:
> Introduce a new parameter for qemu-img commit which may be used to
> explicitly specify the backing file into which an image should be
> committed if the backing chain has more than a single layer.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  qemu-img-cmds.hx |  4 ++--
>  qemu-img.c       | 24 +++++++++++++++++-------
>  qemu-img.texi    |  9 ++++++++-
>  3 files changed, 27 insertions(+), 10 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH v8 11/14] iotests: Add _filter_qemu_img_map
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 11/14] iotests: Add _filter_qemu_img_map Max Reitz
@ 2014-06-09 17:51   ` Eric Blake
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2014-06-09 17:51 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 06/07/2014 12:51 PM, Max Reitz wrote:
> As different image formats most probably map guest addresses to
> different host addresses, add a filter to filter the host addresses out;
> also, the image filename should be filtered.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/common.filter | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
> index a04df7f..8b14edb 100644
> --- a/tests/qemu-iotests/common.filter
> +++ b/tests/qemu-iotests/common.filter
> @@ -170,5 +170,12 @@ _filter_qmp()
>          -e 's#^{"QMP":.*}$#QMP_VERSION#'
>  }
>  
> +# filter out offsets and file names from qemu-img map
> +_filter_qemu_img_map()
> +{
> +    sed -e 's/\([0-9a-fx]* *[0-9a-fx]* *\)[0-9a-fx]* */\1/g' \

The 'g' modifier to the s/// is not necessary, since there are no lines
output by 'qemu-img map' that contain more than 3 hex numbers.  But it
doesn't hurt either.

> +        -e 's/Mapped to *//' | _filter_testdir | _filter_imgfmt
> +}

A cut by column number may have been shorter to write, but this does
indeed appear to do the trick for all but a perverse person that names
their backing files with substrings that include something like
'/path/to/Mapped to gotcha/'.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH v8 12/14] iotests: Add test for backing-chain commits
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 12/14] iotests: Add test for backing-chain commits Max Reitz
@ 2014-06-09 18:50   ` Eric Blake
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2014-06-09 18:50 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 06/07/2014 12:51 PM, Max Reitz wrote:
> Add a test for qemu-img commit on backing chains with more than two
> images. This test also checks whether the top image is emptied (unless
> this is prevented by specifying either -d or -b)  emptied and does

s/  emptied//

> therefore not work for qed and vmdk which requires it to be separate
> from 020.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/094     | 122 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/094.out | 119 +++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 242 insertions(+)
>  create mode 100755 tests/qemu-iotests/094
>  create mode 100644 tests/qemu-iotests/094.out
> 

The commit message can be touched up by whoever does the PULL request, so:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH v8 13/14] iotests: Add test for qcow2's bdrv_make_empty
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 13/14] iotests: Add test for qcow2's bdrv_make_empty Max Reitz
@ 2014-06-09 18:55   ` Eric Blake
  0 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2014-06-09 18:55 UTC (permalink / raw)
  To: Max Reitz, qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 06/07/2014 12:51 PM, Max Reitz wrote:
> Add a test for qcow2's fast bdrv_make_empty implementation on images
> without internal snapshots.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/095     | 72 ++++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/095.out | 26 +++++++++++++++++
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 99 insertions(+)
>  create mode 100755 tests/qemu-iotests/095
>  create mode 100644 tests/qemu-iotests/095.out
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


* Re: [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP
  2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
                   ` (13 preceding siblings ...)
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 14/14] iotests: Omit length/offset test in 040 and 041 Max Reitz
@ 2014-06-27 22:07 ` Max Reitz
  2014-06-30  9:50   ` Kevin Wolf
  14 siblings, 1 reply; 29+ messages in thread
From: Max Reitz @ 2014-06-27 22:07 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, Stefan Hajnoczi

On 07.06.2014 20:51, Max Reitz wrote:
> qemu-img should use QMP commands whenever possible in order to ensure
> feature completeness of both online and offline image operations. For
> the "commit" command, this is relatively easy, so implement it first
> (in the hope that indeed others will follow).
>
> As qemu-img does not have access to QMP (due to QMP being intertwined
> with basically everything in qemu), we cannot directly use QMP, but at
> least use the functions the corresponding QMP commands are using (which
> would be "block-commit", in this case).
>
>
> With Stefan's pull request for his dataplane series now out, I thought
> this a good opportunity to send a rebase of this series.

Ping; Hu Tao will need "minimal_blob_size()" from patch 3 for the next 
iteration of his "qemu-img: add preallocation=full" series. Sending a 
separate patch just for that function seems infeasible, as it is a static 
function which would be unused in the meantime (which throws a compiler 
warning and an error thanks to -Werror). Using __attribute__((unused)) 
just for this seems like a hack; especially considering that all patches 
of this series have been reviewed and it should therefore be ready to merge.
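
To illustrate the warning problem mentioned above: a minimal sketch, in which
the function name and body are hypothetical stand-ins and not the real
minimal_blob_size(). A static function without any caller triggers
-Wunused-function under -Wall on GCC and Clang, and -Werror turns that into a
build failure; the attribute below is the workaround described as a hack.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a static helper that is added before its
 * first caller exists. Without __attribute__((unused)), building with
 * -Wall -Werror fails because the function is defined but not used.
 */
static uint64_t __attribute__((unused))
metadata_size_placeholder(uint64_t clusters, uint64_t cluster_size)
{
    /* Placeholder arithmetic only; not the real metadata calculation. */
    return clusters * cluster_size;
}
```

The attribute would have to be removed again as soon as the first caller is
merged, which is why merging the helper together with its user is cleaner.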

In case there are some objections because you want to test it more, it 
would be fine to merge the first three patches (which suffice for the 
preallocation series and should only introduce unused codepaths) now and 
the rest later on.

Max

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP
  2014-06-27 22:07 ` [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
@ 2014-06-30  9:50   ` Kevin Wolf
  0 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2014-06-30  9:50 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

On 28.06.2014 at 00:07, Max Reitz wrote:
> On 07.06.2014 20:51, Max Reitz wrote:
> >qemu-img should use QMP commands whenever possible in order to ensure
> >feature completeness of both online and offline image operations. For
> >the "commit" command, this is relatively easy, so implement it first
> >(in the hope that indeed others will follow).
> >
> >As qemu-img does not have access to QMP (due to QMP being intertwined
> >with basically everything in qemu), we cannot directly use QMP, but at
> >least use the functions the corresponding QMP commands are using (which
> >would be "block-commit", in this case).
> >
> >
> >With Stefan's pull request for his dataplane series now out, I thought
> >this a good opportunity to send a rebase of this series.
> 
> Ping; Hu Tao will need "minimal_blob_size()" from patch 3 for the
> next iteration of his "qemu-img: add preallocation=full" series.
> Sending a separate patch just for that function seems infeasible, as it
> is a static function which would be unused in the meantime (which
> throws a compiler warning and an error thanks to -Werror). Using
> __attribute__((unused)) just for this seems like a hack; especially
> considering that all patches of this series have been reviewed and
> it should therefore be ready to merge.
> 
> In case there are some objections because you want to test it more,
> it would be fine to merge the first three patches (which suffice for
> the preallocation series and should only introduce unused codepaths)
> now and the rest later on.

This series needs rebasing from patch 5 on. I'll review patches 1-3 so
that the dependency for the other series is there.

Kevin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard Max Reitz
@ 2014-06-30 10:00   ` Kevin Wolf
  0 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2014-06-30 10:00 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

On 07.06.2014 at 20:51, Max Reitz wrote:
> Normally, discarded sectors should read back as zero. However, there are
> cases in which a sector (or rather cluster) should be discarded as if
> they were never written in the first place, that is, reading them should
> fall through to the backing file again.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
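
The distinction drawn in the commit message can be modelled roughly as
follows. This is only a toy sketch: the enum names and the three-state model
are assumptions made for illustration and do not reflect qcow2's actual L2
entry encoding or any function in the patch.

```c
#include <stdint.h>

/* Simplified, hypothetical cluster states (not the on-disk encoding). */
typedef enum {
    CLUSTER_UNALLOCATED, /* reads fall through to the backing file */
    CLUSTER_ZERO,        /* reads return zeros */
    CLUSTER_DATA         /* reads return the overlay's own data */
} ClusterState;

typedef enum { DISCARD_NORMAL, DISCARD_FULL } DiscardMode;

/* State of a cluster after it has been discarded in the given mode. */
static ClusterState discard_cluster(DiscardMode mode)
{
    return mode == DISCARD_FULL ? CLUSTER_UNALLOCATED : CLUSTER_ZERO;
}

/* Which byte a guest read would observe for a cluster in this state. */
static uint8_t read_byte(ClusterState state, uint8_t overlay_byte,
                         uint8_t backing_byte)
{
    switch (state) {
    case CLUSTER_DATA:
        return overlay_byte;
    case CLUSTER_ZERO:
        return 0;
    case CLUSTER_UNALLOCATED:
    default:
        return backing_byte;
    }
}
```

With a normal discard the read yields zero; with a "full" discard it yields
whatever the backing file holds at that position.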

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 02/14] qcow2: Implement bdrv_make_empty()
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 02/14] qcow2: Implement bdrv_make_empty() Max Reitz
@ 2014-06-30 10:00   ` Kevin Wolf
  0 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2014-06-30 10:00 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

On 07.06.2014 at 20:51, Max Reitz wrote:
> Implement this function by making all clusters in the image file fall
> through to the backing file (by using the recently extended discard).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty()
  2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty() Max Reitz
@ 2014-06-30 11:33   ` Kevin Wolf
  2014-07-01  7:11     ` Hu Tao
                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Kevin Wolf @ 2014-06-30 11:33 UTC (permalink / raw)
  To: Max Reitz; +Cc: qemu-devel, Stefan Hajnoczi

On 07.06.2014 at 20:51, Max Reitz wrote:
> bdrv_make_empty() is currently only called if the current image
> represents an external snapshot that has been committed to its base
> image; it is therefore unlikely to have internal snapshots. In this
> case, bdrv_make_empty() can be greatly sped up by creating an empty L1
> table and dropping all data clusters at once by recreating the refcount
> structure accordingly instead of normally discarding all clusters.
> 
> If there are snapshots, fall back to the simple implementation (discard
> all clusters).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

This approach looks a bit too complicated to me, and calculating the
required metadata size seems error-prone.

How about this:

1. Set the dirty flag in the header so we can mess with the L1 table
   without keeping the refcounts consistent

2. Overwrite the L1 table with zeros

3. Overwrite the first n clusters after the header with zeros
   (n = 2 + l1_clusters).

4. Update the header:
   refcount_table_offset = cluster_size
   refcount_table_clusters = 1
   l1_table_offset = 3 * cluster_size

6. bdrv_truncate to n + 1 clusters

7. Now update the first 8 bytes at cluster_size (the first new refcount
   table entry) to point to 2 * cluster_size (new refcount block)

8. Reset refcount block and L2 cache

9. Allocate n + 1 clusters (the header, too) and make sure you get
   offset 0

10. Remove the dirty flag

Surprisingly (or not) this is much like an ordinary image creation. The
main difference is that we keep the full size of the L1 table so the
image stays always valid (the spec would even allow us to temporarily
set l1_size = 0, but qcow2_open() doesn't seem to like that) and all
areas where the L1 table could be are zeroed (this includes the new
refcount table/block until the header is updated).
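
The layout arithmetic in the steps above can be sketched as follows. This is
a rough illustration, not code from the series: the struct and function names
are hypothetical, and the only format fact assumed beyond the steps is that
each L1 entry occupies 8 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: where each metadata area lands after the reset. */
typedef struct EmptyLayout {
    uint64_t refcount_table_offset; /* cluster 1 (step 4) */
    uint64_t refcount_block_offset; /* cluster 2 (step 7) */
    uint64_t l1_table_offset;       /* cluster 3 (step 4) */
    uint64_t zeroed_clusters;       /* n = 2 + l1_clusters (step 3) */
    uint64_t image_length;          /* (n + 1) clusters in bytes (step 6) */
} EmptyLayout;

/* Number of clusters needed to hold l1_size L1 entries of 8 bytes each. */
static uint64_t l1_clusters_for(uint64_t l1_size, uint64_t cluster_size)
{
    assert(cluster_size > 0);
    return (l1_size * sizeof(uint64_t) + cluster_size - 1) / cluster_size;
}

/* Compute the fixed offsets and the final file length for the empty image. */
static EmptyLayout empty_layout(uint64_t l1_size, uint64_t cluster_size)
{
    uint64_t n = 2 + l1_clusters_for(l1_size, cluster_size);
    EmptyLayout layout = {
        .refcount_table_offset = cluster_size,
        .refcount_block_offset = 2 * cluster_size,
        .l1_table_offset = 3 * cluster_size,
        .zeroed_clusters = n,
        .image_length = (n + 1) * cluster_size,
    };
    return layout;
}
```

For example, with 64 KiB clusters and an L1 table that fits into one cluster,
n is 3 and the file is truncated to four clusters: header, refcount table,
refcount block and L1 table.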


I wanted to check whether this would still give the preallocation=full
series what it needs, but a v11 doesn't seem to be on the list yet and
v10 doesn't have the dependency on this series yet.

Kevin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty()
  2014-06-30 11:33   ` Kevin Wolf
@ 2014-07-01  7:11     ` Hu Tao
  2014-07-01 12:12     ` Max Reitz
  2014-07-09 23:23     ` Max Reitz
  2 siblings, 0 replies; 29+ messages in thread
From: Hu Tao @ 2014-07-01  7:11 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Gotou, Yasunori, qemu-devel, Stefan Hajnoczi, Max Reitz

On Mon, Jun 30, 2014 at 01:33:39PM +0200, Kevin Wolf wrote:
> On 07.06.2014 at 20:51, Max Reitz wrote:
> > bdrv_make_empty() is currently only called if the current image
> > represents an external snapshot that has been committed to its base
> > image; it is therefore unlikely to have internal snapshots. In this
> > case, bdrv_make_empty() can be greatly sped up by creating an empty L1
> > table and dropping all data clusters at once by recreating the refcount
> > structure accordingly instead of normally discarding all clusters.
> > 
> > If there are snapshots, fall back to the simple implementation (discard
> > all clusters).
> > 
> > Signed-off-by: Max Reitz <mreitz@redhat.com>
> > Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> This approach looks a bit too complicated to me, and calculating the
> required metadata size seems error-prone.
> 
> How about this:
> 
> 1. Set the dirty flag in the header so we can mess with the L1 table
>    without keeping the refcounts consistent
> 
> 2. Overwrite the L1 table with zeros
> 
> 3. Overwrite the first n clusters after the header with zeros
>    (n = 2 + l1_clusters).
> 
> 4. Update the header:
>    refcount_table_offset = cluster_size
>    refcount_table_clusters = 1
>    l1_table_offset = 3 * cluster_size
> 
> 6. bdrv_truncate to n + 1 clusters
> 
> 7. Now update the first 8 bytes at cluster_size (the first new refcount
>    table entry) to point to 2 * cluster_size (new refcount block)
> 
> 8. Reset refcount block and L2 cache
> 
> 9. Allocate n + 1 clusters (the header, too) and make sure you get
>    offset 0
> 
> 10. Remove the dirty flag
> 
> Surprisingly (or not) this is much like an ordinary image creation. The
> main difference is that we keep the full size of the L1 table so the
> image stays always valid (the spec would even allow us to temporarily
> set l1_size = 0, but qcow2_open() doesn't seem to like that) and all
> areas where the L1 table could be are zeroed (this includes the new
> refcount table/block until the header is updated).

Kevin,

It seems that this approach doesn't need the calculation of the metadata
size (minimal_blob_size()), which is exactly what preallocation=full
will depend on.

> 
> 
> I wanted to check whether this would still give the preallocation=full
> series what it needs, but a v11 doesn't seem to be on the list yet and
> v10 doesn't have the dependency on this series yet.

Although I now have v11 done, I'm not sure it's ready to post since
you rejected the calculation of metadata size. But for you to check how
the series depends on this patch, I uploaded it to github at
https://github.com/taohu/qemu/commits/preallocation-v11.
(specifically, the dependency exists on commit
https://github.com/taohu/qemu/commit/308720c6b10166d60045c81a4d9fab7205c85986)

If you think it's not a problem to post v11, just tell me and I can post
it to the list.

Regards,
Hu

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty()
  2014-06-30 11:33   ` Kevin Wolf
  2014-07-01  7:11     ` Hu Tao
@ 2014-07-01 12:12     ` Max Reitz
  2014-07-09 23:23     ` Max Reitz
  2 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-07-01 12:12 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi

On 30.06.2014 13:33, Kevin Wolf wrote:
> On 07.06.2014 at 20:51, Max Reitz wrote:
>> bdrv_make_empty() is currently only called if the current image
>> represents an external snapshot that has been committed to its base
>> image; it is therefore unlikely to have internal snapshots. In this
>> case, bdrv_make_empty() can be greatly sped up by creating an empty L1
>> table and dropping all data clusters at once by recreating the refcount
>> structure accordingly instead of normally discarding all clusters.
>>
>> If there are snapshots, fall back to the simple implementation (discard
>> all clusters).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
> This approach looks a bit too complicated to me, and calculating the
> required metadata size seems error-prone.
>
> How about this:
>
> 1. Set the dirty flag in the header so we can mess with the L1 table
>     without keeping the refcounts consistent

Hm, I didn't think about this. *g*

> 2. Overwrite the L1 table with zeros
>
> 3. Overwrite the first n clusters after the header with zeros
>     (n = 2 + l1_clusters).
>
> 4. Update the header:
>     refcount_table_offset = cluster_size
>     refcount_table_clusters = 1
>     l1_table_offset = 3 * cluster_size
>
> 6. bdrv_truncate to n + 1 clusters
>
> 7. Now update the first 8 bytes at cluster_size (the first new refcount
>     table entry) to point to 2 * cluster_size (new refcount block)
>
> 8. Reset refcount block and L2 cache
>
> 9. Allocate n + 1 clusters (the header, too) and make sure you get
>     offset 0
>
> 10. Remove the dirty flag
>
> Surprisingly (or not) this is much like an ordinary image creation. The
> main difference is that we keep the full size of the L1 table so the
> image stays always valid (the spec would even allow us to temporarily
> set l1_size = 0, but qcow2_open() doesn't seem to like that)

Yes, I noticed. ;-)

> and all
> areas where the L1 table could be are zeroed (this includes the new
> refcount table/block until the header is updated).
>
>
> I wanted to check whether this would still give the preallocation=full
> series what it needs, but a v11 doesn't seem to be on the list yet and
> v10 doesn't have the dependency on this series yet.

Well, as far as I see it, the preallocation=full series will need a 
function to calculate the required image size (if it doesn't, 
preallocation=thin will). I don't really care whether this series 
introduces such a function or whether preallocation=full does.

Max

PS: I personally am reluctant to drop/change this patch, if only because 
I spent about a week getting it right. ;-)

I guess I'll just take a look into marking the image dirty and see how 
it goes.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty()
  2014-06-30 11:33   ` Kevin Wolf
  2014-07-01  7:11     ` Hu Tao
  2014-07-01 12:12     ` Max Reitz
@ 2014-07-09 23:23     ` Max Reitz
  2 siblings, 0 replies; 29+ messages in thread
From: Max Reitz @ 2014-07-09 23:23 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Stefan Hajnoczi

On 30.06.2014 13:33, Kevin Wolf wrote:
> On 07.06.2014 at 20:51, Max Reitz wrote:
>> bdrv_make_empty() is currently only called if the current image
>> represents an external snapshot that has been committed to its base
>> image; it is therefore unlikely to have internal snapshots. In this
>> case, bdrv_make_empty() can be greatly sped up by creating an empty L1
>> table and dropping all data clusters at once by recreating the refcount
>> structure accordingly instead of normally discarding all clusters.
>>
>> If there are snapshots, fall back to the simple implementation (discard
>> all clusters).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
> This approach looks a bit too complicated to me, and calculating the
> required metadata size seems error-prone.
>
> How about this:
>
> 1. Set the dirty flag in the header so we can mess with the L1 table
>     without keeping the refcounts consistent
>
> 2. Overwrite the L1 table with zeros
>
> 3. Overwrite the first n clusters after the header with zeros
>     (n = 2 + l1_clusters).
>
> 4. Update the header:
>     refcount_table_offset = cluster_size
>     refcount_table_clusters = 1
>     l1_table_offset = 3 * cluster_size
>
> 6. bdrv_truncate to n + 1 clusters
>
> 7. Now update the first 8 bytes at cluster_size (the first new refcount
>     table entry) to point to 2 * cluster_size (new refcount block)
>
> 8. Reset refcount block and L2 cache
>
> 9. Allocate n + 1 clusters (the header, too) and make sure you get
>     offset 0
>
> 10. Remove the dirty flag

Okay, after some fixing and getting it to work, I noticed what seems to
me a rather big problem: If something bad happens between steps 3
and 7 (especially between 4 and 7), the image cannot be repaired. The 
reason is that the refcount table is empty and a new refcount block 
cannot be allocated because the consistency checks correctly signal an 
overlap with the refcount table (I guess, I would have expected the 
image header instead, but well...); this is because nothing is allocated 
and the first cluster offset returned by an allocation will probably be 
zero (the image header) or $cluster_size (where the reftable resides).

So I think we absolutely have to make sure that whenever the 
refcount_table_offset is changed on disk, the reftable it points to 
already contains a valid offset. We could pull 7 before 4, but then we'd 
have to guarantee that 3 did not already overwrite the reftable (which 
it probably does). Well, maybe we could change 3 so it checks whether
the reftable is already part of that area and, if it is, overwrites its
first entry not with zero but with 2 * cluster_size (unless the reftable
itself is located at 2 * cluster_size, in which case we'd have to take
some other offset). Then we could either try to write a new reftable
anyway or just place everything behind the old reftable and ignore
the "lost" space.
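
The ordering requirement described above (the reftable must already point to
a refcount block before refcount_table_offset is switched to it) could be
checked along these lines. This is a simplified sketch over an in-memory copy
of the image; reftable_ready() and the helper names are hypothetical, and the
validity test is deliberately reduced to "non-zero, cluster-aligned and
inside the file". The only format fact relied on is that qcow2 metadata is
stored big-endian.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read a big-endian 64-bit value from a byte buffer. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Write a 64-bit value to a byte buffer in big-endian order. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/*
 * Hypothetical check: is it safe to point the header at a refcount table
 * located at reftable_offset? Only true if its first entry already
 * references a cluster-aligned refcount block that lies within the file.
 */
static bool reftable_ready(const uint8_t *image, uint64_t image_len,
                           uint64_t reftable_offset, uint64_t cluster_size)
{
    assert(cluster_size > 0);
    if (reftable_offset % cluster_size != 0 ||
        reftable_offset > image_len || image_len - reftable_offset < 8) {
        return false;
    }
    uint64_t entry = load_be64(image + reftable_offset);
    return entry != 0 && entry % cluster_size == 0 &&
           entry <= image_len && image_len - entry >= cluster_size;
}
```

In terms of the numbered steps, this corresponds to performing step 7 before
step 4: the header update would only go ahead once such a check succeeds.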

In any case, I doubt it'll be much shorter overall with these additional 
checks. The current code has 340 LOC with extremely verbose commentary; 
my new code (failing to address the problem described above) has 100 LOC 
without any comments.

So I guess the main issue is how *complicated* the code actually is; in 
my opinion, the most complicated and hardest to review piece of code in 
this patch (patch v8 3/14) is minimal_blob_size(), which, as far as I
can tell, we will need in one form or another eventually anyway.
create_refcount_l1() is pretty long, but due to the commentary should be 
well comprehensible.

In any case, I still have the code for your proposal here and I'd be 
absolutely fine with working further on it. So if you think it'll be 
worth it anyway (which I personally don't have any opinion on), I'll 
continue on it.

Max

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-07-09 23:23 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-07 18:51 [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 01/14] qcow2: Allow "full" discard Max Reitz
2014-06-30 10:00   ` Kevin Wolf
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 02/14] qcow2: Implement bdrv_make_empty() Max Reitz
2014-06-30 10:00   ` Kevin Wolf
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty() Max Reitz
2014-06-30 11:33   ` Kevin Wolf
2014-07-01  7:11     ` Hu Tao
2014-07-01 12:12     ` Max Reitz
2014-07-09 23:23     ` Max Reitz
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 04/14] blockjob: Introduce block_job_complete_sync() Max Reitz
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 05/14] blockjob: Add "ready" field Max Reitz
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 06/14] block/mirror: Improve progress report Max Reitz
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 07/14] qemu-img: Implement commit like QMP Max Reitz
2014-06-09 16:53   ` Eric Blake
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 08/14] qemu-img: Empty image after commit Max Reitz
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 09/14] qemu-img: Enable progress output for commit Max Reitz
2014-06-09 17:28   ` Eric Blake
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 10/14] qemu-img: Specify backing file " Max Reitz
2014-06-09 17:40   ` Eric Blake
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 11/14] iotests: Add _filter_qemu_img_map Max Reitz
2014-06-09 17:51   ` Eric Blake
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 12/14] iotests: Add test for backing-chain commits Max Reitz
2014-06-09 18:50   ` Eric Blake
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 13/14] iotests: Add test for qcow2's bdrv_make_empty Max Reitz
2014-06-09 18:55   ` Eric Blake
2014-06-07 18:51 ` [Qemu-devel] [PATCH v8 14/14] iotests: Omit length/offset test in 040 and 041 Max Reitz
2014-06-27 22:07 ` [Qemu-devel] [PATCH v8 00/14] qemu-img: Implement commit like QMP Max Reitz
2014-06-30  9:50   ` Kevin Wolf
