QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v9 00/34] Add subcluster allocation to qcow2
@ 2020-06-28 11:02 Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 01/34] qcow2: Make Qcow2AioTask store the full host offset Alberto Garcia
                   ` (33 more replies)
  0 siblings, 34 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Hi,

here's the new version of the patches to add subcluster allocation
support to qcow2.

Please refer to the cover letter of the first version for a full
description of the patches:

   https://lists.gnu.org/archive/html/qemu-block/2019-10/msg00983.html

The important change from v8 is patch 24. This patch fixes unaligned
offsets in ZERO_ALLOC clusters, but the previous implementation was
not working for images with extended L2 entries and was simply adding
dead code. The new implementation works properly.

If you want to test this series make sure to apply this patch first:

   https://lists.gnu.org/archive/html/qemu-block/2020-06/msg00504.html

Berto

Based-on: <20200610094600.4029-1-berto@igalia.com>

v9:
- Patch 24: Change the algorithm so it actually does something in
  images with extended L2 entries.
- Patch 31: Update expected test results after rebase
- Patch 34: Fix a couple of typos.
            Add a new test for the changes in patch 24.
            Test concurrent requests on different subclusters of the
            same cluster.

v8: https://lists.gnu.org/archive/html/qemu-block/2020-06/msg00546.html
v7: https://lists.gnu.org/archive/html/qemu-block/2020-05/msg01683.html
v6: https://lists.gnu.org/archive/html/qemu-block/2020-05/msg01583.html
v5: https://lists.gnu.org/archive/html/qemu-block/2020-05/msg00251.html
v4: https://lists.gnu.org/archive/html/qemu-block/2020-03/msg00966.html
v3: https://lists.gnu.org/archive/html/qemu-block/2019-12/msg00587.html
v2: https://lists.gnu.org/archive/html/qemu-block/2019-10/msg01642.html
v1: https://lists.gnu.org/archive/html/qemu-block/2019-10/msg00983.html

Output of git backport-diff against v8:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/34:[----] [--] 'qcow2: Make Qcow2AioTask store the full host offset'
002/34:[----] [--] 'qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()'
003/34:[----] [--] 'qcow2: Add calculate_l2_meta()'
004/34:[----] [--] 'qcow2: Split cluster_needs_cow() out of count_cow_clusters()'
005/34:[----] [--] 'qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()'
006/34:[----] [--] 'qcow2: Add get_l2_entry() and set_l2_entry()'
007/34:[----] [--] 'qcow2: Document the Extended L2 Entries feature'
008/34:[----] [--] 'qcow2: Add dummy has_subclusters() function'
009/34:[----] [--] 'qcow2: Add subcluster-related fields to BDRVQcow2State'
010/34:[----] [--] 'qcow2: Add offset_to_sc_index()'
011/34:[----] [--] 'qcow2: Add offset_into_subcluster() and size_to_subclusters()'
012/34:[----] [--] 'qcow2: Add l2_entry_size()'
013/34:[----] [--] 'qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap()'
014/34:[----] [--] 'qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()'
015/34:[----] [--] 'qcow2: Add qcow2_get_subcluster_range_type()'
016/34:[----] [--] 'qcow2: Add qcow2_cluster_is_allocated()'
017/34:[----] [--] 'qcow2: Add cluster type parameter to qcow2_get_host_offset()'
018/34:[----] [--] 'qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_*'
019/34:[----] [--] 'qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC'
020/34:[----] [--] 'qcow2: Add subcluster support to calculate_l2_meta()'
021/34:[----] [--] 'qcow2: Add subcluster support to qcow2_get_host_offset()'
022/34:[----] [--] 'qcow2: Add subcluster support to zero_in_l2_slice()'
023/34:[----] [--] 'qcow2: Add subcluster support to discard_in_l2_slice()'
024/34:[0025] [FC] 'qcow2: Add subcluster support to check_refcounts_l2()'
025/34:[----] [--] 'qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2()'
026/34:[----] [--] 'qcow2: Clear the L2 bitmap when allocating a compressed cluster'
027/34:[----] [--] 'qcow2: Add subcluster support to handle_alloc_space()'
028/34:[----] [--] 'qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()'
029/34:[----] [--] 'qcow2: Add subcluster support to qcow2_measure()'
030/34:[----] [--] 'qcow2: Add prealloc field to QCowL2Meta'
031/34:[0004] [FC] 'qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit'
032/34:[----] [--] 'qcow2: Allow preallocation and backing files if extended_l2 is set'
033/34:[----] [--] 'qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters'
034/34:[0152] [FC] 'iotests: Add tests for qcow2 images with extended L2 entries'

Alberto Garcia (34):
  qcow2: Make Qcow2AioTask store the full host offset
  qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()
  qcow2: Add calculate_l2_meta()
  qcow2: Split cluster_needs_cow() out of count_cow_clusters()
  qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()
  qcow2: Add get_l2_entry() and set_l2_entry()
  qcow2: Document the Extended L2 Entries feature
  qcow2: Add dummy has_subclusters() function
  qcow2: Add subcluster-related fields to BDRVQcow2State
  qcow2: Add offset_to_sc_index()
  qcow2: Add offset_into_subcluster() and size_to_subclusters()
  qcow2: Add l2_entry_size()
  qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap()
  qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  qcow2: Add qcow2_get_subcluster_range_type()
  qcow2: Add qcow2_cluster_is_allocated()
  qcow2: Add cluster type parameter to qcow2_get_host_offset()
  qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_*
  qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC
  qcow2: Add subcluster support to calculate_l2_meta()
  qcow2: Add subcluster support to qcow2_get_host_offset()
  qcow2: Add subcluster support to zero_in_l2_slice()
  qcow2: Add subcluster support to discard_in_l2_slice()
  qcow2: Add subcluster support to check_refcounts_l2()
  qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2()
  qcow2: Clear the L2 bitmap when allocating a compressed cluster
  qcow2: Add subcluster support to handle_alloc_space()
  qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
  qcow2: Add subcluster support to qcow2_measure()
  qcow2: Add prealloc field to QCowL2Meta
  qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit
  qcow2: Allow preallocation and backing files if extended_l2 is set
  qcow2: Assert that expand_zero_clusters_in_l1() does not support
    subclusters
  iotests: Add tests for qcow2 images with extended L2 entries

 docs/interop/qcow2.txt           |  68 ++-
 docs/qcow2-cache.txt             |  19 +-
 qapi/block-core.json             |   7 +
 block/qcow2.h                    | 211 ++++++-
 include/block/block_int.h        |   1 +
 block/qcow2-cluster.c            | 912 +++++++++++++++++++++----------
 block/qcow2-refcount.c           |  47 +-
 block/qcow2.c                    | 304 +++++++----
 block/trace-events               |   2 +-
 tests/qemu-iotests/031.out       |   8 +-
 tests/qemu-iotests/036.out       |   4 +-
 tests/qemu-iotests/049.out       | 102 ++--
 tests/qemu-iotests/060.out       |   3 +-
 tests/qemu-iotests/061           |   6 +
 tests/qemu-iotests/061.out       |  25 +-
 tests/qemu-iotests/065           |  12 +-
 tests/qemu-iotests/082.out       |  48 +-
 tests/qemu-iotests/085.out       |  38 +-
 tests/qemu-iotests/144.out       |   4 +-
 tests/qemu-iotests/182.out       |   2 +-
 tests/qemu-iotests/185.out       |   8 +-
 tests/qemu-iotests/198.out       |   2 +
 tests/qemu-iotests/206.out       |   6 +-
 tests/qemu-iotests/242.out       |   5 +
 tests/qemu-iotests/255.out       |   8 +-
 tests/qemu-iotests/271           | 901 ++++++++++++++++++++++++++++++
 tests/qemu-iotests/271.out       | 724 ++++++++++++++++++++++++
 tests/qemu-iotests/274.out       |  49 +-
 tests/qemu-iotests/280.out       |   2 +-
 tests/qemu-iotests/291.out       |   6 +-
 tests/qemu-iotests/common.filter |   1 +
 tests/qemu-iotests/group         |   1 +
 32 files changed, 2963 insertions(+), 573 deletions(-)
 create mode 100755 tests/qemu-iotests/271
 create mode 100644 tests/qemu-iotests/271.out

-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 01/34] qcow2: Make Qcow2AioTask store the full host offset
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset() Alberto Garcia
                   ` (32 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The file_cluster_offset field of Qcow2AioTask stores a cluster-aligned
host offset. In practice this is not very useful because all users(*)
of this structure need the final host offset into the cluster, which
they calculate using

   host_offset = file_cluster_offset + offset_into_cluster(s, offset)

There is no reason why Qcow2AioTask cannot store host_offset directly
and that is what this patch does.

(*) compressed clusters are the exception: in this case what
    file_cluster_offset was storing was the full compressed cluster
    descriptor (offset + size). This does not change with this patch
    but it is documented now.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c      | 69 ++++++++++++++++++++++------------------------
 block/trace-events |  2 +-
 2 files changed, 34 insertions(+), 37 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e20590c3b7..d792137af6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -74,7 +74,7 @@ typedef struct {
 
 static int coroutine_fn
 qcow2_co_preadv_compressed(BlockDriverState *bs,
-                           uint64_t file_cluster_offset,
+                           uint64_t cluster_descriptor,
                            uint64_t offset,
                            uint64_t bytes,
                            QEMUIOVector *qiov,
@@ -2103,7 +2103,7 @@ out:
 
 static coroutine_fn int
 qcow2_co_preadv_encrypted(BlockDriverState *bs,
-                           uint64_t file_cluster_offset,
+                           uint64_t host_offset,
                            uint64_t offset,
                            uint64_t bytes,
                            QEMUIOVector *qiov,
@@ -2130,16 +2130,12 @@ qcow2_co_preadv_encrypted(BlockDriverState *bs,
     }
 
     BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-    ret = bdrv_co_pread(s->data_file,
-                        file_cluster_offset + offset_into_cluster(s, offset),
-                        bytes, buf, 0);
+    ret = bdrv_co_pread(s->data_file, host_offset, bytes, buf, 0);
     if (ret < 0) {
         goto fail;
     }
 
-    if (qcow2_co_decrypt(bs,
-                         file_cluster_offset + offset_into_cluster(s, offset),
-                         offset, buf, bytes) < 0)
+    if (qcow2_co_decrypt(bs, host_offset, offset, buf, bytes) < 0)
     {
         ret = -EIO;
         goto fail;
@@ -2157,7 +2153,7 @@ typedef struct Qcow2AioTask {
 
     BlockDriverState *bs;
     QCow2ClusterType cluster_type; /* only for read */
-    uint64_t file_cluster_offset;
+    uint64_t host_offset; /* or full descriptor in compressed clusters */
     uint64_t offset;
     uint64_t bytes;
     QEMUIOVector *qiov;
@@ -2170,7 +2166,7 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
                                        AioTaskPool *pool,
                                        AioTaskFunc func,
                                        QCow2ClusterType cluster_type,
-                                       uint64_t file_cluster_offset,
+                                       uint64_t host_offset,
                                        uint64_t offset,
                                        uint64_t bytes,
                                        QEMUIOVector *qiov,
@@ -2185,7 +2181,7 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
         .bs = bs,
         .cluster_type = cluster_type,
         .qiov = qiov,
-        .file_cluster_offset = file_cluster_offset,
+        .host_offset = host_offset,
         .offset = offset,
         .bytes = bytes,
         .qiov_offset = qiov_offset,
@@ -2194,7 +2190,7 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
 
     trace_qcow2_add_task(qemu_coroutine_self(), bs, pool,
                          func == qcow2_co_preadv_task_entry ? "read" : "write",
-                         cluster_type, file_cluster_offset, offset, bytes,
+                         cluster_type, host_offset, offset, bytes,
                          qiov, qiov_offset);
 
     if (!pool) {
@@ -2208,13 +2204,12 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
 
 static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
                                              QCow2ClusterType cluster_type,
-                                             uint64_t file_cluster_offset,
+                                             uint64_t host_offset,
                                              uint64_t offset, uint64_t bytes,
                                              QEMUIOVector *qiov,
                                              size_t qiov_offset)
 {
     BDRVQcow2State *s = bs->opaque;
-    int offset_in_cluster = offset_into_cluster(s, offset);
 
     switch (cluster_type) {
     case QCOW2_CLUSTER_ZERO_PLAIN:
@@ -2230,19 +2225,17 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
                                    qiov, qiov_offset, 0);
 
     case QCOW2_CLUSTER_COMPRESSED:
-        return qcow2_co_preadv_compressed(bs, file_cluster_offset,
+        return qcow2_co_preadv_compressed(bs, host_offset,
                                           offset, bytes, qiov, qiov_offset);
 
     case QCOW2_CLUSTER_NORMAL:
-        assert(offset_into_cluster(s, file_cluster_offset) == 0);
         if (bs->encrypted) {
-            return qcow2_co_preadv_encrypted(bs, file_cluster_offset,
+            return qcow2_co_preadv_encrypted(bs, host_offset,
                                              offset, bytes, qiov, qiov_offset);
         }
 
         BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-        return bdrv_co_preadv_part(s->data_file,
-                                   file_cluster_offset + offset_in_cluster,
+        return bdrv_co_preadv_part(s->data_file, host_offset,
                                    bytes, qiov, qiov_offset, 0);
 
     default:
@@ -2258,7 +2251,7 @@ static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task)
 
     assert(!t->l2meta);
 
-    return qcow2_co_preadv_task(t->bs, t->cluster_type, t->file_cluster_offset,
+    return qcow2_co_preadv_task(t->bs, t->cluster_type, t->host_offset,
                                 t->offset, t->bytes, t->qiov, t->qiov_offset);
 }
 
@@ -2294,11 +2287,20 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
+            /*
+             * For compressed clusters the variable cluster_offset
+             * does not actually store the offset but the full
+             * descriptor. We need to leave it unchanged because
+             * that's what qcow2_co_preadv_compressed() expects.
+             */
+            uint64_t host_offset = (ret == QCOW2_CLUSTER_COMPRESSED) ?
+                cluster_offset :
+                cluster_offset + offset_into_cluster(s, offset);
             if (!aio && cur_bytes != bytes) {
                 aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
             }
             ret = qcow2_add_task(bs, aio, qcow2_co_preadv_task_entry, ret,
-                                 cluster_offset, offset, cur_bytes,
+                                 host_offset, offset, cur_bytes,
                                  qiov, qiov_offset, NULL);
             if (ret < 0) {
                 goto out;
@@ -2449,7 +2451,7 @@ static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
  *           not use it somehow after qcow2_co_pwritev_task() call
  */
 static coroutine_fn int qcow2_co_pwritev_task(BlockDriverState *bs,
-                                              uint64_t file_cluster_offset,
+                                              uint64_t host_offset,
                                               uint64_t offset, uint64_t bytes,
                                               QEMUIOVector *qiov,
                                               uint64_t qiov_offset,
@@ -2458,7 +2460,6 @@ static coroutine_fn int qcow2_co_pwritev_task(BlockDriverState *bs,
     int ret;
     BDRVQcow2State *s = bs->opaque;
     void *crypt_buf = NULL;
-    int offset_in_cluster = offset_into_cluster(s, offset);
     QEMUIOVector encrypted_qiov;
 
     if (bs->encrypted) {
@@ -2471,9 +2472,7 @@ static coroutine_fn int qcow2_co_pwritev_task(BlockDriverState *bs,
         }
         qemu_iovec_to_buf(qiov, qiov_offset, crypt_buf, bytes);
 
-        if (qcow2_co_encrypt(bs, file_cluster_offset + offset_in_cluster,
-                             offset, crypt_buf, bytes) < 0)
-        {
+        if (qcow2_co_encrypt(bs, host_offset, offset, crypt_buf, bytes) < 0) {
             ret = -EIO;
             goto out_unlocked;
         }
@@ -2497,10 +2496,8 @@ static coroutine_fn int qcow2_co_pwritev_task(BlockDriverState *bs,
      */
     if (!merge_cow(offset, bytes, qiov, qiov_offset, l2meta)) {
         BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-        trace_qcow2_writev_data(qemu_coroutine_self(),
-                                file_cluster_offset + offset_in_cluster);
-        ret = bdrv_co_pwritev_part(s->data_file,
-                                   file_cluster_offset + offset_in_cluster,
+        trace_qcow2_writev_data(qemu_coroutine_self(), host_offset);
+        ret = bdrv_co_pwritev_part(s->data_file, host_offset,
                                    bytes, qiov, qiov_offset, 0);
         if (ret < 0) {
             goto out_unlocked;
@@ -2530,7 +2527,7 @@ static coroutine_fn int qcow2_co_pwritev_task_entry(AioTask *task)
 
     assert(!t->cluster_type);
 
-    return qcow2_co_pwritev_task(t->bs, t->file_cluster_offset,
+    return qcow2_co_pwritev_task(t->bs, t->host_offset,
                                  t->offset, t->bytes, t->qiov, t->qiov_offset,
                                  t->l2meta);
 }
@@ -2585,8 +2582,8 @@ static coroutine_fn int qcow2_co_pwritev_part(
             aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
         }
         ret = qcow2_add_task(bs, aio, qcow2_co_pwritev_task_entry, 0,
-                             cluster_offset, offset, cur_bytes,
-                             qiov, qiov_offset, l2meta);
+                             cluster_offset + offset_in_cluster, offset,
+                             cur_bytes, qiov, qiov_offset, l2meta);
         l2meta = NULL; /* l2meta is consumed by qcow2_co_pwritev_task() */
         if (ret < 0) {
             goto fail_nometa;
@@ -4569,7 +4566,7 @@ qcow2_co_pwritev_compressed_part(BlockDriverState *bs,
 
 static int coroutine_fn
 qcow2_co_preadv_compressed(BlockDriverState *bs,
-                           uint64_t file_cluster_offset,
+                           uint64_t cluster_descriptor,
                            uint64_t offset,
                            uint64_t bytes,
                            QEMUIOVector *qiov,
@@ -4581,8 +4578,8 @@ qcow2_co_preadv_compressed(BlockDriverState *bs,
     uint8_t *buf, *out_buf;
     int offset_in_cluster = offset_into_cluster(s, offset);
 
-    coffset = file_cluster_offset & s->cluster_offset_mask;
-    nb_csectors = ((file_cluster_offset >> s->csize_shift) & s->csize_mask) + 1;
+    coffset = cluster_descriptor & s->cluster_offset_mask;
+    nb_csectors = ((cluster_descriptor >> s->csize_shift) & s->csize_mask) + 1;
     csize = nb_csectors * QCOW2_COMPRESSED_SECTOR_SIZE -
         (coffset & ~QCOW2_COMPRESSED_SECTOR_MASK);
 
diff --git a/block/trace-events b/block/trace-events
index dbe76a7613..e21b95f3e6 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -77,7 +77,7 @@ luring_io_uring_submit(void *s, int ret) "LuringState %p ret %d"
 luring_resubmit_short_read(void *s, void *luringcb, int nread) "LuringState %p luringcb %p nread %d"
 
 # qcow2.c
-qcow2_add_task(void *co, void *bs, void *pool, const char *action, int cluster_type, uint64_t file_cluster_offset, uint64_t offset, uint64_t bytes, void *qiov, size_t qiov_offset) "co %p bs %p pool %p: %s: cluster_type %d file_cluster_offset %" PRIu64 " offset %" PRIu64 " bytes %" PRIu64 " qiov %p qiov_offset %zu"
+qcow2_add_task(void *co, void *bs, void *pool, const char *action, int cluster_type, uint64_t host_offset, uint64_t offset, uint64_t bytes, void *qiov, size_t qiov_offset) "co %p bs %p pool %p: %s: cluster_type %d file_cluster_offset %" PRIu64 " offset %" PRIu64 " bytes %" PRIu64 " qiov %p qiov_offset %zu"
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d"
 qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
 qcow2_writev_start_part(void *co) "co %p"
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 01/34] qcow2: Make Qcow2AioTask store the full host offset Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-30 10:19   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 03/34] qcow2: Add calculate_l2_meta() Alberto Garcia
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

qcow2_get_cluster_offset() takes an (unaligned) guest offset and
returns the (aligned) offset of the corresponding cluster in the qcow2
image.

In practice none of the callers need to know where the cluster starts
so this patch makes the function calculate and return the final host
offset directly. The function is also renamed accordingly.

There is a pre-existing exception with compressed clusters: in this
case the function returns the complete cluster descriptor (containing
the offset and size of the compressed data). This does not change with
this patch but it is now documented.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  4 ++--
 block/qcow2-cluster.c | 42 +++++++++++++++++++++++-------------------
 block/qcow2.c         | 24 +++++++-----------------
 3 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 7ce2c23bdb..06475e0849 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -694,8 +694,8 @@ int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
 int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
                           uint8_t *buf, int nb_sectors, bool enc, Error **errp);
 
-int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
-                             unsigned int *bytes, uint64_t *cluster_offset);
+int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
+                          unsigned int *bytes, uint64_t *host_offset);
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
                                unsigned int *bytes, uint64_t *host_offset,
                                QCowL2Meta **m);
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4b5fc8c4a7..9ab41cb728 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -496,10 +496,15 @@ static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
 
 
 /*
- * get_cluster_offset
+ * get_host_offset
  *
- * For a given offset of the virtual disk, find the cluster type and offset in
- * the qcow2 file. The offset is stored in *cluster_offset.
+ * For a given offset of the virtual disk find the equivalent host
+ * offset in the qcow2 file and store it in *host_offset. Neither
+ * offset needs to be aligned to a cluster boundary.
+ *
+ * If the cluster is unallocated then *host_offset will be 0.
+ * If the cluster is compressed then *host_offset will contain the
+ * complete compressed cluster descriptor.
  *
  * On entry, *bytes is the maximum number of contiguous bytes starting at
  * offset that we are interested in.
@@ -511,12 +516,12 @@ static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
  * Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
  * cases.
  */
-int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
-                             unsigned int *bytes, uint64_t *cluster_offset)
+int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
+                          unsigned int *bytes, uint64_t *host_offset)
 {
     BDRVQcow2State *s = bs->opaque;
     unsigned int l2_index;
-    uint64_t l1_index, l2_offset, *l2_slice;
+    uint64_t l1_index, l2_offset, *l2_slice, l2_entry;
     int c;
     unsigned int offset_in_cluster;
     uint64_t bytes_available, bytes_needed, nb_clusters;
@@ -537,8 +542,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         bytes_needed = bytes_available;
     }
 
-    *cluster_offset = 0;
-
     /* seek to the l2 offset in the l1 table */
 
     l1_index = offset_to_l1_index(s, offset);
@@ -570,7 +573,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
     /* find the cluster offset for the given disk offset */
 
     l2_index = offset_to_l2_slice_index(s, offset);
-    *cluster_offset = be64_to_cpu(l2_slice[l2_index]);
+    l2_entry = be64_to_cpu(l2_slice[l2_index]);
 
     nb_clusters = size_to_clusters(s, bytes_needed);
     /* bytes_needed <= *bytes + offset_in_cluster, both of which are unsigned
@@ -578,7 +581,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
      * true */
     assert(nb_clusters <= INT_MAX);
 
-    type = qcow2_get_cluster_type(bs, *cluster_offset);
+    type = qcow2_get_cluster_type(bs, l2_entry);
     if (s->qcow_version < 3 && (type == QCOW2_CLUSTER_ZERO_PLAIN ||
                                 type == QCOW2_CLUSTER_ZERO_ALLOC)) {
         qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
@@ -599,42 +602,43 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         }
         /* Compressed clusters can only be processed one by one */
         c = 1;
-        *cluster_offset &= L2E_COMPRESSED_OFFSET_SIZE_MASK;
+        *host_offset = l2_entry & L2E_COMPRESSED_OFFSET_SIZE_MASK;
         break;
     case QCOW2_CLUSTER_ZERO_PLAIN:
     case QCOW2_CLUSTER_UNALLOCATED:
         /* how many empty clusters ? */
         c = count_contiguous_clusters_unallocated(bs, nb_clusters,
                                                   &l2_slice[l2_index], type);
-        *cluster_offset = 0;
+        *host_offset = 0;
         break;
     case QCOW2_CLUSTER_ZERO_ALLOC:
-    case QCOW2_CLUSTER_NORMAL:
+    case QCOW2_CLUSTER_NORMAL: {
+        uint64_t host_cluster_offset = l2_entry & L2E_OFFSET_MASK;
+        *host_offset = host_cluster_offset + offset_in_cluster;
         /* how many allocated clusters ? */
         c = count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
                                       &l2_slice[l2_index], QCOW_OFLAG_ZERO);
-        *cluster_offset &= L2E_OFFSET_MASK;
-        if (offset_into_cluster(s, *cluster_offset)) {
+        if (offset_into_cluster(s, host_cluster_offset)) {
             qcow2_signal_corruption(bs, true, -1, -1,
                                     "Cluster allocation offset %#"
                                     PRIx64 " unaligned (L2 offset: %#" PRIx64
-                                    ", L2 index: %#x)", *cluster_offset,
+                                    ", L2 index: %#x)", host_cluster_offset,
                                     l2_offset, l2_index);
             ret = -EIO;
             goto fail;
         }
-        if (has_data_file(bs) && *cluster_offset != offset - offset_in_cluster)
-        {
+        if (has_data_file(bs) && *host_offset != offset) {
             qcow2_signal_corruption(bs, true, -1, -1,
                                     "External data file host cluster offset %#"
                                     PRIx64 " does not match guest cluster "
                                     "offset: %#" PRIx64
-                                    ", L2 index: %#x)", *cluster_offset,
+                                    ", L2 index: %#x)", host_cluster_offset,
                                     offset - offset_in_cluster, l2_index);
             ret = -EIO;
             goto fail;
         }
         break;
+    }
     default:
         abort();
     }
diff --git a/block/qcow2.c b/block/qcow2.c
index d792137af6..afb00ada42 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2026,7 +2026,7 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
                                               BlockDriverState **file)
 {
     BDRVQcow2State *s = bs->opaque;
-    uint64_t cluster_offset;
+    uint64_t host_offset;
     unsigned int bytes;
     int ret, status = 0;
 
@@ -2039,7 +2039,7 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     }
 
     bytes = MIN(INT_MAX, count);
-    ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
+    ret = qcow2_get_host_offset(bs, offset, &bytes, &host_offset);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
         return ret;
@@ -2049,7 +2049,7 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
 
     if ((ret == QCOW2_CLUSTER_NORMAL || ret == QCOW2_CLUSTER_ZERO_ALLOC) &&
         !s->crypto) {
-        *map = cluster_offset | offset_into_cluster(s, offset);
+        *map = host_offset;
         *file = s->data_file->bs;
         status |= BDRV_BLOCK_OFFSET_VALID;
     }
@@ -2263,7 +2263,7 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
     BDRVQcow2State *s = bs->opaque;
     int ret = 0;
     unsigned int cur_bytes; /* number of bytes in current iteration */
-    uint64_t cluster_offset = 0;
+    uint64_t host_offset = 0;
     AioTaskPool *aio = NULL;
 
     while (bytes != 0 && aio_task_pool_status(aio) == 0) {
@@ -2275,7 +2275,7 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         }
 
         qemu_co_mutex_lock(&s->lock);
-        ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
+        ret = qcow2_get_host_offset(bs, offset, &cur_bytes, &host_offset);
         qemu_co_mutex_unlock(&s->lock);
         if (ret < 0) {
             goto out;
@@ -2287,15 +2287,6 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
-            /*
-             * For compressed clusters the variable cluster_offset
-             * does not actually store the offset but the full
-             * descriptor. We need to leave it unchanged because
-             * that's what qcow2_co_preadv_compressed() expects.
-             */
-            uint64_t host_offset = (ret == QCOW2_CLUSTER_COMPRESSED) ?
-                cluster_offset :
-                cluster_offset + offset_into_cluster(s, offset);
             if (!aio && cur_bytes != bytes) {
                 aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
             }
@@ -3860,7 +3851,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
         offset = QEMU_ALIGN_DOWN(offset, s->cluster_size);
         bytes = s->cluster_size;
         nr = s->cluster_size;
-        ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
+        ret = qcow2_get_host_offset(bs, offset, &nr, &off);
         if (ret != QCOW2_CLUSTER_UNALLOCATED &&
             ret != QCOW2_CLUSTER_ZERO_PLAIN &&
             ret != QCOW2_CLUSTER_ZERO_ALLOC) {
@@ -3931,7 +3922,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
         cur_bytes = MIN(bytes, INT_MAX);
         cur_write_flags = write_flags;
 
-        ret = qcow2_get_cluster_offset(bs, src_offset, &cur_bytes, &copy_offset);
+        ret = qcow2_get_host_offset(bs, src_offset, &cur_bytes, &copy_offset);
         if (ret < 0) {
             goto out;
         }
@@ -3963,7 +3954,6 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
 
         case QCOW2_CLUSTER_NORMAL:
             child = s->data_file;
-            copy_offset += offset_into_cluster(s, src_offset);
             break;
 
         default:
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 03/34] qcow2: Add calculate_l2_meta()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 01/34] qcow2: Make Qcow2AioTask store the full host offset Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 04/34] qcow2: Split cluster_needs_cow() out of count_cow_clusters() Alberto Garcia
                   ` (30 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

handle_alloc() creates a QCowL2Meta structure in order to update the
image metadata and perform the necessary copy-on-write operations.

This patch moves that code to a separate function so it can be used
from other places.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 77 +++++++++++++++++++++++++++++--------------
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 9ab41cb728..61ad638bdc 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1037,6 +1037,56 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
     }
 }
 
+/*
+ * For a given write request, create a new QCowL2Meta structure, add
+ * it to @m and the BDRVQcow2State.cluster_allocs list.
+ *
+ * @host_cluster_offset points to the beginning of the first cluster.
+ *
+ * @guest_offset and @bytes indicate the offset and length of the
+ * request.
+ *
+ * If @keep_old is true it means that the clusters were already
+ * allocated and will be overwritten. If false then the clusters are
+ * new and we have to decrease the reference count of the old ones.
+ */
+static void calculate_l2_meta(BlockDriverState *bs,
+                              uint64_t host_cluster_offset,
+                              uint64_t guest_offset, unsigned bytes,
+                              QCowL2Meta **m, bool keep_old)
+{
+    BDRVQcow2State *s = bs->opaque;
+    unsigned cow_start_from = 0;
+    unsigned cow_start_to = offset_into_cluster(s, guest_offset);
+    unsigned cow_end_from = cow_start_to + bytes;
+    unsigned cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+    unsigned nb_clusters = size_to_clusters(s, cow_end_from);
+    QCowL2Meta *old_m = *m;
+
+    *m = g_malloc0(sizeof(**m));
+    **m = (QCowL2Meta) {
+        .next           = old_m,
+
+        .alloc_offset   = host_cluster_offset,
+        .offset         = start_of_cluster(s, guest_offset),
+        .nb_clusters    = nb_clusters,
+
+        .keep_old_clusters = keep_old,
+
+        .cow_start = {
+            .offset     = cow_start_from,
+            .nb_bytes   = cow_start_to - cow_start_from,
+        },
+        .cow_end = {
+            .offset     = cow_end_from,
+            .nb_bytes   = cow_end_to - cow_end_from,
+        },
+    };
+
+    qemu_co_queue_init(&(*m)->dependent_requests);
+    QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
+}
+
 /*
  * Returns the number of contiguous clusters that can be used for an allocating
  * write, but require COW to be performed (this includes yet unallocated space,
@@ -1435,35 +1485,14 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
     int avail_bytes = nb_clusters << s->cluster_bits;
     int nb_bytes = MIN(requested_bytes, avail_bytes);
-    QCowL2Meta *old_m = *m;
-
-    *m = g_malloc0(sizeof(**m));
-
-    **m = (QCowL2Meta) {
-        .next           = old_m,
-
-        .alloc_offset   = alloc_cluster_offset,
-        .offset         = start_of_cluster(s, guest_offset),
-        .nb_clusters    = nb_clusters,
-
-        .keep_old_clusters  = keep_old_clusters,
-
-        .cow_start = {
-            .offset     = 0,
-            .nb_bytes   = offset_into_cluster(s, guest_offset),
-        },
-        .cow_end = {
-            .offset     = nb_bytes,
-            .nb_bytes   = avail_bytes - nb_bytes,
-        },
-    };
-    qemu_co_queue_init(&(*m)->dependent_requests);
-    QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
 
     *host_offset = alloc_cluster_offset + offset_into_cluster(s, guest_offset);
     *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);
 
+    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes,
+                      m, keep_old_clusters);
+
     return 1;
 
 fail:
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 04/34] qcow2: Split cluster_needs_cow() out of count_cow_clusters()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (2 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 03/34] qcow2: Add calculate_l2_meta() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied() Alberto Garcia
                   ` (29 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

We are going to need it in other places.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 61ad638bdc..80f9787461 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1087,6 +1087,24 @@ static void calculate_l2_meta(BlockDriverState *bs,
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
 }
 
+/* Returns true if writing to a cluster requires COW */
+static bool cluster_needs_cow(BlockDriverState *bs, uint64_t l2_entry)
+{
+    switch (qcow2_get_cluster_type(bs, l2_entry)) {
+    case QCOW2_CLUSTER_NORMAL:
+        if (l2_entry & QCOW_OFLAG_COPIED) {
+            return false;
+        }
+    case QCOW2_CLUSTER_UNALLOCATED:
+    case QCOW2_CLUSTER_COMPRESSED:
+    case QCOW2_CLUSTER_ZERO_PLAIN:
+    case QCOW2_CLUSTER_ZERO_ALLOC:
+        return true;
+    default:
+        abort();
+    }
+}
+
 /*
  * Returns the number of contiguous clusters that can be used for an allocating
  * write, but require COW to be performed (this includes yet unallocated space,
@@ -1099,25 +1117,11 @@ static int count_cow_clusters(BlockDriverState *bs, int nb_clusters,
 
     for (i = 0; i < nb_clusters; i++) {
         uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
-        QCow2ClusterType cluster_type = qcow2_get_cluster_type(bs, l2_entry);
-
-        switch(cluster_type) {
-        case QCOW2_CLUSTER_NORMAL:
-            if (l2_entry & QCOW_OFLAG_COPIED) {
-                goto out;
-            }
+        if (!cluster_needs_cow(bs, l2_entry)) {
             break;
-        case QCOW2_CLUSTER_UNALLOCATED:
-        case QCOW2_CLUSTER_COMPRESSED:
-        case QCOW2_CLUSTER_ZERO_PLAIN:
-        case QCOW2_CLUSTER_ZERO_ALLOC:
-            break;
-        default:
-            abort();
         }
     }
 
-out:
     assert(i <= nb_clusters);
     return i;
 }
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (3 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 04/34] qcow2: Split cluster_needs_cow() out of count_cow_clusters() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-30 10:38   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 06/34] qcow2: Add get_l2_entry() and set_l2_entry() Alberto Garcia
                   ` (28 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

When writing to a qcow2 file there are two functions that take a
virtual offset and return a host offset, possibly allocating new
clusters if necessary:

   - handle_copied() looks for normal data clusters that are already
     allocated and have a reference count of 1. In those clusters we
     can simply write the data and there is no need to perform any
     copy-on-write.

   - handle_alloc() looks for clusters that do need copy-on-write,
     either because they haven't been allocated yet, because their
     reference count is != 1 or because they are ZERO_ALLOC clusters.

The ZERO_ALLOC case is a bit special because those are clusters that
are already allocated and they could perfectly be dealt with in
handle_copied() (as long as copy-on-write is performed when required).

In fact, there is extra code specifically for them in handle_alloc()
that tries to reuse the existing allocation if possible and frees them
otherwise.

This patch changes the handling of ZERO_ALLOC clusters so the
semantics of these two functions are now like this:

   - handle_copied() looks for clusters that are already allocated and
     which we can overwrite (NORMAL and ZERO_ALLOC clusters with a
     reference count of 1).

   - handle_alloc() looks for clusters for which we need a new
     allocation (all other cases).

One important difference after this change is that clusters found
in handle_copied() may now require copy-on-write, but this will be
necessary anyway once we add support for subclusters.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c | 256 +++++++++++++++++++++++-------------------
 1 file changed, 141 insertions(+), 115 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 80f9787461..fce0be7a08 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1039,13 +1039,18 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
 
 /*
  * For a given write request, create a new QCowL2Meta structure, add
- * it to @m and the BDRVQcow2State.cluster_allocs list.
+ * it to @m and the BDRVQcow2State.cluster_allocs list. If the write
+ * request does not need copy-on-write or changes to the L2 metadata
+ * then this function does nothing.
  *
  * @host_cluster_offset points to the beginning of the first cluster.
  *
  * @guest_offset and @bytes indicate the offset and length of the
  * request.
  *
+ * @l2_slice contains the L2 entries of all clusters involved in this
+ * write request.
+ *
  * If @keep_old is true it means that the clusters were already
  * allocated and will be overwritten. If false then the clusters are
  * new and we have to decrease the reference count of the old ones.
@@ -1053,15 +1058,53 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
 static void calculate_l2_meta(BlockDriverState *bs,
                               uint64_t host_cluster_offset,
                               uint64_t guest_offset, unsigned bytes,
-                              QCowL2Meta **m, bool keep_old)
+                              uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
 {
     BDRVQcow2State *s = bs->opaque;
-    unsigned cow_start_from = 0;
+    int l2_index = offset_to_l2_slice_index(s, guest_offset);
+    uint64_t l2_entry;
+    unsigned cow_start_from, cow_end_to;
     unsigned cow_start_to = offset_into_cluster(s, guest_offset);
     unsigned cow_end_from = cow_start_to + bytes;
-    unsigned cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
     unsigned nb_clusters = size_to_clusters(s, cow_end_from);
     QCowL2Meta *old_m = *m;
+    QCow2ClusterType type;
+
+    assert(nb_clusters <= s->l2_slice_size - l2_index);
+
+    /* Return if there's no COW (all clusters are normal and we keep them) */
+    if (keep_old) {
+        int i;
+        for (i = 0; i < nb_clusters; i++) {
+            l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+            if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
+                break;
+            }
+        }
+        if (i == nb_clusters) {
+            return;
+        }
+    }
+
+    /* Get the L2 entry of the first cluster */
+    l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    type = qcow2_get_cluster_type(bs, l2_entry);
+
+    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
+        cow_start_from = cow_start_to;
+    } else {
+        cow_start_from = 0;
+    }
+
+    /* Get the L2 entry of the last cluster */
+    l2_entry = be64_to_cpu(l2_slice[l2_index + nb_clusters - 1]);
+    type = qcow2_get_cluster_type(bs, l2_entry);
+
+    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
+        cow_end_to = cow_end_from;
+    } else {
+        cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+    }
 
     *m = g_malloc0(sizeof(**m));
     **m = (QCowL2Meta) {
@@ -1087,18 +1130,22 @@ static void calculate_l2_meta(BlockDriverState *bs,
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
 }
 
-/* Returns true if writing to a cluster requires COW */
-static bool cluster_needs_cow(BlockDriverState *bs, uint64_t l2_entry)
+/*
+ * Returns true if writing to the cluster pointed to by @l2_entry
+ * requires a new allocation (that is, if the cluster is unallocated
+ * or has refcount > 1 and therefore cannot be written in-place).
+ */
+static bool cluster_needs_new_alloc(BlockDriverState *bs, uint64_t l2_entry)
 {
     switch (qcow2_get_cluster_type(bs, l2_entry)) {
     case QCOW2_CLUSTER_NORMAL:
+    case QCOW2_CLUSTER_ZERO_ALLOC:
         if (l2_entry & QCOW_OFLAG_COPIED) {
             return false;
         }
     case QCOW2_CLUSTER_UNALLOCATED:
     case QCOW2_CLUSTER_COMPRESSED:
     case QCOW2_CLUSTER_ZERO_PLAIN:
-    case QCOW2_CLUSTER_ZERO_ALLOC:
         return true;
     default:
         abort();
@@ -1106,20 +1153,38 @@ static bool cluster_needs_cow(BlockDriverState *bs, uint64_t l2_entry)
 }
 
 /*
- * Returns the number of contiguous clusters that can be used for an allocating
- * write, but require COW to be performed (this includes yet unallocated space,
- * which must copy from the backing file)
+ * Returns the number of contiguous clusters that can be written to
+ * using one single write request, starting from @l2_index.
+ * At most @nb_clusters are checked.
+ *
+ * If @new_alloc is true this counts clusters that are either
+ * unallocated, or allocated but with refcount > 1 (so they need to be
+ * newly allocated and COWed).
+ *
+ * If @new_alloc is false this counts clusters that are already
+ * allocated and can be overwritten in-place (this includes clusters
+ * of type QCOW2_CLUSTER_ZERO_ALLOC).
  */
-static int count_cow_clusters(BlockDriverState *bs, int nb_clusters,
-    uint64_t *l2_slice, int l2_index)
+static int count_single_write_clusters(BlockDriverState *bs, int nb_clusters,
+                                       uint64_t *l2_slice, int l2_index,
+                                       bool new_alloc)
 {
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    uint64_t expected_offset = l2_entry & L2E_OFFSET_MASK;
     int i;
 
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
-        if (!cluster_needs_cow(bs, l2_entry)) {
+        l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+        if (cluster_needs_new_alloc(bs, l2_entry) != new_alloc) {
             break;
         }
+        if (!new_alloc) {
+            if (expected_offset != (l2_entry & L2E_OFFSET_MASK)) {
+                break;
+            }
+            expected_offset += s->cluster_size;
+        }
     }
 
     assert(i <= nb_clusters);
@@ -1190,10 +1255,10 @@ static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
 }
 
 /*
- * Checks how many already allocated clusters that don't require a copy on
- * write there are at the given guest_offset (up to *bytes). If *host_offset is
- * not INV_OFFSET, only physically contiguous clusters beginning at this host
- * offset are counted.
+ * Checks how many already allocated clusters that don't require a new
+ * allocation there are at the given guest_offset (up to *bytes).
+ * If *host_offset is not INV_OFFSET, only physically contiguous clusters
+ * beginning at this host offset are counted.
  *
  * Note that guest_offset may not be cluster aligned. In this case, the
  * returned *host_offset points to exact byte referenced by guest_offset and
@@ -1202,12 +1267,12 @@ static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
  * Returns:
  *   0:     if no allocated clusters are available at the given offset.
  *          *bytes is normally unchanged. It is set to 0 if the cluster
- *          is allocated and doesn't need COW, but doesn't have the right
- *          physical offset.
+ *          is allocated and can be overwritten in-place but doesn't have
+ *          the right physical offset.
  *
- *   1:     if allocated clusters that don't require a COW are available at
- *          the requested offset. *bytes may have decreased and describes
- *          the length of the area that can be written to.
+ *   1:     if allocated clusters that can be overwritten in place are
+ *          available at the requested offset. *bytes may have decreased
+ *          and describes the length of the area that can be written to.
  *
  *  -errno: in error cases
  */
@@ -1216,7 +1281,7 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
 {
     BDRVQcow2State *s = bs->opaque;
     int l2_index;
-    uint64_t cluster_offset;
+    uint64_t l2_entry, cluster_offset;
     uint64_t *l2_slice;
     uint64_t nb_clusters;
     unsigned int keep_clusters;
@@ -1237,7 +1302,8 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
 
     l2_index = offset_to_l2_slice_index(s, guest_offset);
     nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
-    assert(nb_clusters <= INT_MAX);
+    /* Limit total byte count to BDRV_REQUEST_MAX_BYTES */
+    nb_clusters = MIN(nb_clusters, BDRV_REQUEST_MAX_BYTES >> s->cluster_bits);
 
     /* Find L2 entry for the first involved cluster */
     ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
@@ -1245,41 +1311,39 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
         return ret;
     }
 
-    cluster_offset = be64_to_cpu(l2_slice[l2_index]);
+    l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    cluster_offset = l2_entry & L2E_OFFSET_MASK;
+
+    if (!cluster_needs_new_alloc(bs, l2_entry)) {
+        if (offset_into_cluster(s, cluster_offset)) {
+            qcow2_signal_corruption(bs, true, -1, -1, "%s cluster offset "
+                                    "%#" PRIx64 " unaligned (guest offset: %#"
+                                    PRIx64 ")", l2_entry & QCOW_OFLAG_ZERO ?
+                                    "Preallocated zero" : "Data",
+                                    cluster_offset, guest_offset);
+            ret = -EIO;
+            goto out;
+        }
 
-    /* Check how many clusters are already allocated and don't need COW */
-    if (qcow2_get_cluster_type(bs, cluster_offset) == QCOW2_CLUSTER_NORMAL
-        && (cluster_offset & QCOW_OFLAG_COPIED))
-    {
         /* If a specific host_offset is required, check it */
-        bool offset_matches =
-            (cluster_offset & L2E_OFFSET_MASK) == *host_offset;
-
-        if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
-            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
-                                    "%#llx unaligned (guest offset: %#" PRIx64
-                                    ")", cluster_offset & L2E_OFFSET_MASK,
-                                    guest_offset);
-            ret = -EIO;
-            goto out;
-        }
-
-        if (*host_offset != INV_OFFSET && !offset_matches) {
+        if (*host_offset != INV_OFFSET && cluster_offset != *host_offset) {
             *bytes = 0;
             ret = 0;
             goto out;
         }
 
         /* We keep all QCOW_OFLAG_COPIED clusters */
-        keep_clusters =
-            count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-                                      &l2_slice[l2_index],
-                                      QCOW_OFLAG_COPIED | QCOW_OFLAG_ZERO);
+        keep_clusters = count_single_write_clusters(bs, nb_clusters, l2_slice,
+                                                    l2_index, false);
         assert(keep_clusters <= nb_clusters);
 
         *bytes = MIN(*bytes,
                  keep_clusters * s->cluster_size
                  - offset_into_cluster(s, guest_offset));
+        assert(*bytes != 0);
+
+        calculate_l2_meta(bs, cluster_offset, guest_offset,
+                          *bytes, l2_slice, m, true);
 
         ret = 1;
     } else {
@@ -1293,8 +1357,7 @@ out:
     /* Only return a host offset if we actually made progress. Otherwise we
      * would make requirements for handle_alloc() that it can't fulfill */
     if (ret > 0) {
-        *host_offset = (cluster_offset & L2E_OFFSET_MASK)
-                     + offset_into_cluster(s, guest_offset);
+        *host_offset = cluster_offset + offset_into_cluster(s, guest_offset);
     }
 
     return ret;
@@ -1355,9 +1418,10 @@ static int do_alloc_cluster_offset(BlockDriverState *bs, uint64_t guest_offset,
 }
 
 /*
- * Allocates new clusters for an area that either is yet unallocated or needs a
- * copy on write. If *host_offset is not INV_OFFSET, clusters are only
- * allocated if the new allocation can match the specified host offset.
+ * Allocates new clusters for an area that is either still unallocated or
+ * cannot be overwritten in-place. If *host_offset is not INV_OFFSET,
+ * clusters are only allocated if the new allocation can match the specified
+ * host offset.
  *
  * Note that guest_offset may not be cluster aligned. In this case, the
  * returned *host_offset points to exact byte referenced by guest_offset and
@@ -1380,12 +1444,10 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     BDRVQcow2State *s = bs->opaque;
     int l2_index;
     uint64_t *l2_slice;
-    uint64_t entry;
     uint64_t nb_clusters;
     int ret;
-    bool keep_old_clusters = false;
 
-    uint64_t alloc_cluster_offset = INV_OFFSET;
+    uint64_t alloc_cluster_offset;
 
     trace_qcow2_handle_alloc(qemu_coroutine_self(), guest_offset, *host_offset,
                              *bytes);
@@ -1400,10 +1462,8 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
 
     l2_index = offset_to_l2_slice_index(s, guest_offset);
     nb_clusters = MIN(nb_clusters, s->l2_slice_size - l2_index);
-    assert(nb_clusters <= INT_MAX);
-
-    /* Limit total allocation byte count to INT_MAX */
-    nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);
+    /* Limit total allocation byte count to BDRV_REQUEST_MAX_BYTES */
+    nb_clusters = MIN(nb_clusters, BDRV_REQUEST_MAX_BYTES >> s->cluster_bits);
 
     /* Find L2 entry for the first involved cluster */
     ret = get_cluster_table(bs, guest_offset, &l2_slice, &l2_index);
@@ -1411,67 +1471,32 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
         return ret;
     }
 
-    entry = be64_to_cpu(l2_slice[l2_index]);
-    nb_clusters = count_cow_clusters(bs, nb_clusters, l2_slice, l2_index);
+    nb_clusters = count_single_write_clusters(bs, nb_clusters,
+                                              l2_slice, l2_index, true);
 
     /* This function is only called when there were no non-COW clusters, so if
      * we can't find any unallocated or COW clusters either, something is
      * wrong with our code. */
     assert(nb_clusters > 0);
 
-    if (qcow2_get_cluster_type(bs, entry) == QCOW2_CLUSTER_ZERO_ALLOC &&
-        (entry & QCOW_OFLAG_COPIED) &&
-        (*host_offset == INV_OFFSET ||
-         start_of_cluster(s, *host_offset) == (entry & L2E_OFFSET_MASK)))
-    {
-        int preallocated_nb_clusters;
-
-        if (offset_into_cluster(s, entry & L2E_OFFSET_MASK)) {
-            qcow2_signal_corruption(bs, true, -1, -1, "Preallocated zero "
-                                    "cluster offset %#llx unaligned (guest "
-                                    "offset: %#" PRIx64 ")",
-                                    entry & L2E_OFFSET_MASK, guest_offset);
-            ret = -EIO;
-            goto fail;
-        }
-
-        /* Try to reuse preallocated zero clusters; contiguous normal clusters
-         * would be fine, too, but count_cow_clusters() above has limited
-         * nb_clusters already to a range of COW clusters */
-        preallocated_nb_clusters =
-            count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-                                      &l2_slice[l2_index], QCOW_OFLAG_COPIED);
-        assert(preallocated_nb_clusters > 0);
-
-        nb_clusters = preallocated_nb_clusters;
-        alloc_cluster_offset = entry & L2E_OFFSET_MASK;
-
-        /* We want to reuse these clusters, so qcow2_alloc_cluster_link_l2()
-         * should not free them. */
-        keep_old_clusters = true;
+    /* Allocate at a given offset in the image file */
+    alloc_cluster_offset = *host_offset == INV_OFFSET ? INV_OFFSET :
+        start_of_cluster(s, *host_offset);
+    ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
+                                  &nb_clusters);
+    if (ret < 0) {
+        goto out;
     }
 
-    qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
-
-    if (alloc_cluster_offset == INV_OFFSET) {
-        /* Allocate, if necessary at a given offset in the image file */
-        alloc_cluster_offset = *host_offset == INV_OFFSET ? INV_OFFSET :
-                               start_of_cluster(s, *host_offset);
-        ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
-                                      &nb_clusters);
-        if (ret < 0) {
-            goto fail;
-        }
-
-        /* Can't extend contiguous allocation */
-        if (nb_clusters == 0) {
-            *bytes = 0;
-            return 0;
-        }
-
-        assert(alloc_cluster_offset != INV_OFFSET);
+    /* Can't extend contiguous allocation */
+    if (nb_clusters == 0) {
+        *bytes = 0;
+        ret = 0;
+        goto out;
     }
 
+    assert(alloc_cluster_offset != INV_OFFSET);
+
     /*
      * Save info needed for meta data update.
      *
@@ -1494,13 +1519,14 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);
 
-    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes,
-                      m, keep_old_clusters);
+    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes, l2_slice,
+                      m, false);
 
-    return 1;
+    ret = 1;
 
-fail:
-    if (*m && (*m)->nb_clusters > 0) {
+out:
+    qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
+    if (ret < 0 && *m && (*m)->nb_clusters > 0) {
         QLIST_REMOVE(*m, next_in_flight);
     }
     return ret;
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 06/34] qcow2: Add get_l2_entry() and set_l2_entry()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (4 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 07/34] qcow2: Document the Extended L2 Entries feature Alberto Garcia
                   ` (27 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The size of an L2 entry is 64 bits, but if we want to have subclusters
we need extended L2 entries. This means that we have to access L2
tables and slices differently depending on whether an image has
extended L2 entries or not.

This patch replaces all l2_slice[] accesses with calls to
get_l2_entry() and set_l2_entry().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h          | 12 ++++++++
 block/qcow2-cluster.c  | 63 ++++++++++++++++++++++--------------------
 block/qcow2-refcount.c | 17 ++++++------
 3 files changed, 54 insertions(+), 38 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 06475e0849..eecbadc4cb 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -510,6 +510,18 @@ typedef enum QCow2MetadataOverlap {
 
 #define INV_OFFSET (-1ULL)
 
+static inline uint64_t get_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice,
+                                    int idx)
+{
+    return be64_to_cpu(l2_slice[idx]);
+}
+
+static inline void set_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice,
+                                int idx, uint64_t entry)
+{
+    l2_slice[idx] = cpu_to_be64(entry);
+}
+
 static inline bool has_data_file(BlockDriverState *bs)
 {
     BDRVQcow2State *s = bs->opaque;
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index fce0be7a08..76fd0f3cdb 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -383,12 +383,13 @@ fail:
  * cluster which may require a different handling)
  */
 static int count_contiguous_clusters(BlockDriverState *bs, int nb_clusters,
-        int cluster_size, uint64_t *l2_slice, uint64_t stop_flags)
+        int cluster_size, uint64_t *l2_slice, int l2_index, uint64_t stop_flags)
 {
+    BDRVQcow2State *s = bs->opaque;
     int i;
     QCow2ClusterType first_cluster_type;
     uint64_t mask = stop_flags | L2E_OFFSET_MASK | QCOW_OFLAG_COMPRESSED;
-    uint64_t first_entry = be64_to_cpu(l2_slice[0]);
+    uint64_t first_entry = get_l2_entry(s, l2_slice, l2_index);
     uint64_t offset = first_entry & mask;
 
     first_cluster_type = qcow2_get_cluster_type(bs, first_entry);
@@ -401,7 +402,7 @@ static int count_contiguous_clusters(BlockDriverState *bs, int nb_clusters,
            first_cluster_type == QCOW2_CLUSTER_ZERO_ALLOC);
 
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t l2_entry = be64_to_cpu(l2_slice[i]) & mask;
+        uint64_t l2_entry = get_l2_entry(s, l2_slice, l2_index + i) & mask;
         if (offset + (uint64_t) i * cluster_size != l2_entry) {
             break;
         }
@@ -417,14 +418,16 @@ static int count_contiguous_clusters(BlockDriverState *bs, int nb_clusters,
 static int count_contiguous_clusters_unallocated(BlockDriverState *bs,
                                                  int nb_clusters,
                                                  uint64_t *l2_slice,
+                                                 int l2_index,
                                                  QCow2ClusterType wanted_type)
 {
+    BDRVQcow2State *s = bs->opaque;
     int i;
 
     assert(wanted_type == QCOW2_CLUSTER_ZERO_PLAIN ||
            wanted_type == QCOW2_CLUSTER_UNALLOCATED);
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t entry = be64_to_cpu(l2_slice[i]);
+        uint64_t entry = get_l2_entry(s, l2_slice, l2_index + i);
         QCow2ClusterType type = qcow2_get_cluster_type(bs, entry);
 
         if (type != wanted_type) {
@@ -573,7 +576,7 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
     /* find the cluster offset for the given disk offset */
 
     l2_index = offset_to_l2_slice_index(s, offset);
-    l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    l2_entry = get_l2_entry(s, l2_slice, l2_index);
 
     nb_clusters = size_to_clusters(s, bytes_needed);
     /* bytes_needed <= *bytes + offset_in_cluster, both of which are unsigned
@@ -608,7 +611,7 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
     case QCOW2_CLUSTER_UNALLOCATED:
         /* how many empty clusters ? */
         c = count_contiguous_clusters_unallocated(bs, nb_clusters,
-                                                  &l2_slice[l2_index], type);
+                                                  l2_slice, l2_index, type);
         *host_offset = 0;
         break;
     case QCOW2_CLUSTER_ZERO_ALLOC:
@@ -617,7 +620,7 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
         *host_offset = host_cluster_offset + offset_in_cluster;
         /* how many allocated clusters ? */
         c = count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-                                      &l2_slice[l2_index], QCOW_OFLAG_ZERO);
+                                      l2_slice, l2_index, QCOW_OFLAG_ZERO);
         if (offset_into_cluster(s, host_cluster_offset)) {
             qcow2_signal_corruption(bs, true, -1, -1,
                                     "Cluster allocation offset %#"
@@ -769,7 +772,7 @@ int qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 
     /* Compression can't overwrite anything. Fail if the cluster was already
      * allocated. */
-    cluster_offset = be64_to_cpu(l2_slice[l2_index]);
+    cluster_offset = get_l2_entry(s, l2_slice, l2_index);
     if (cluster_offset & L2E_OFFSET_MASK) {
         qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
         return -EIO;
@@ -798,7 +801,7 @@ int qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 
     BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
     qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
-    l2_slice[l2_index] = cpu_to_be64(cluster_offset);
+    set_l2_entry(s, l2_slice, l2_index, cluster_offset);
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 
     *host_offset = cluster_offset & s->cluster_offset_mask;
@@ -991,14 +994,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
          * cluster the second one has to do RMW (which is done above by
          * perform_cow()), update l2 table with its cluster pointer and free
          * old cluster. This is what this loop does */
-        if (l2_slice[l2_index + i] != 0) {
-            old_cluster[j++] = l2_slice[l2_index + i];
+        if (get_l2_entry(s, l2_slice, l2_index + i) != 0) {
+            old_cluster[j++] = get_l2_entry(s, l2_slice, l2_index + i);
         }
 
         /* The offset must fit in the offset field of the L2 table entry */
         assert((offset & L2E_OFFSET_MASK) == offset);
 
-        l2_slice[l2_index + i] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
+        set_l2_entry(s, l2_slice, l2_index + i, offset | QCOW_OFLAG_COPIED);
      }
 
 
@@ -1012,8 +1015,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      */
     if (!m->keep_old_clusters && j != 0) {
         for (i = 0; i < j; i++) {
-            qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1,
-                                    QCOW2_DISCARD_NEVER);
+            qcow2_free_any_clusters(bs, old_cluster[i], 1, QCOW2_DISCARD_NEVER);
         }
     }
 
@@ -1076,7 +1078,7 @@ static void calculate_l2_meta(BlockDriverState *bs,
     if (keep_old) {
         int i;
         for (i = 0; i < nb_clusters; i++) {
-            l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+            l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
             if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
                 break;
             }
@@ -1087,7 +1089,7 @@ static void calculate_l2_meta(BlockDriverState *bs,
     }
 
     /* Get the L2 entry of the first cluster */
-    l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    l2_entry = get_l2_entry(s, l2_slice, l2_index);
     type = qcow2_get_cluster_type(bs, l2_entry);
 
     if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
@@ -1097,7 +1099,7 @@ static void calculate_l2_meta(BlockDriverState *bs,
     }
 
     /* Get the L2 entry of the last cluster */
-    l2_entry = be64_to_cpu(l2_slice[l2_index + nb_clusters - 1]);
+    l2_entry = get_l2_entry(s, l2_slice, l2_index + nb_clusters - 1);
     type = qcow2_get_cluster_type(bs, l2_entry);
 
     if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
@@ -1170,12 +1172,12 @@ static int count_single_write_clusters(BlockDriverState *bs, int nb_clusters,
                                        bool new_alloc)
 {
     BDRVQcow2State *s = bs->opaque;
-    uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    uint64_t l2_entry = get_l2_entry(s, l2_slice, l2_index);
     uint64_t expected_offset = l2_entry & L2E_OFFSET_MASK;
     int i;
 
     for (i = 0; i < nb_clusters; i++) {
-        l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+        l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
         if (cluster_needs_new_alloc(bs, l2_entry) != new_alloc) {
             break;
         }
@@ -1311,7 +1313,7 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
         return ret;
     }
 
-    l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    l2_entry = get_l2_entry(s, l2_slice, l2_index);
     cluster_offset = l2_entry & L2E_OFFSET_MASK;
 
     if (!cluster_needs_new_alloc(bs, l2_entry)) {
@@ -1688,7 +1690,7 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset,
     for (i = 0; i < nb_clusters; i++) {
         uint64_t old_l2_entry;
 
-        old_l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+        old_l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
 
         /*
          * If full_discard is false, make sure that a discarded area reads back
@@ -1728,9 +1730,9 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset,
         /* First remove L2 entries */
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
         if (!full_discard && s->qcow_version >= 3) {
-            l2_slice[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
+            set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
         } else {
-            l2_slice[l2_index + i] = cpu_to_be64(0);
+            set_l2_entry(s, l2_slice, l2_index + i, 0);
         }
 
         /* Then decrease the refcount */
@@ -1810,7 +1812,7 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
         uint64_t old_offset;
         QCow2ClusterType cluster_type;
 
-        old_offset = be64_to_cpu(l2_slice[l2_index + i]);
+        old_offset = get_l2_entry(s, l2_slice, l2_index + i);
 
         /*
          * Minimize L2 changes if the cluster already reads back as
@@ -1824,10 +1826,11 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
 
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
         if (cluster_type == QCOW2_CLUSTER_COMPRESSED || unmap) {
-            l2_slice[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
+            set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
             qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
         } else {
-            l2_slice[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
+            uint64_t entry = get_l2_entry(s, l2_slice, l2_index + i);
+            set_l2_entry(s, l2_slice, l2_index + i, entry | QCOW_OFLAG_ZERO);
         }
     }
 
@@ -1965,7 +1968,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
             }
 
             for (j = 0; j < s->l2_slice_size; j++) {
-                uint64_t l2_entry = be64_to_cpu(l2_slice[j]);
+                uint64_t l2_entry = get_l2_entry(s, l2_slice, j);
                 int64_t offset = l2_entry & L2E_OFFSET_MASK;
                 QCow2ClusterType cluster_type =
                     qcow2_get_cluster_type(bs, l2_entry);
@@ -1979,7 +1982,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                     if (!bs->backing) {
                         /* not backed; therefore we can simply deallocate the
                          * cluster */
-                        l2_slice[j] = 0;
+                        set_l2_entry(s, l2_slice, j, 0);
                         l2_dirty = true;
                         continue;
                     }
@@ -2045,9 +2048,9 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                 }
 
                 if (l2_refcount == 1) {
-                    l2_slice[j] = cpu_to_be64(offset | QCOW_OFLAG_COPIED);
+                    set_l2_entry(s, l2_slice, j, offset | QCOW_OFLAG_COPIED);
                 } else {
-                    l2_slice[j] = cpu_to_be64(offset);
+                    set_l2_entry(s, l2_slice, j, offset);
                 }
                 l2_dirty = true;
             }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 0457a6060d..04546838e8 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1310,7 +1310,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                     uint64_t cluster_index;
                     uint64_t offset;
 
-                    entry = be64_to_cpu(l2_slice[j]);
+                    entry = get_l2_entry(s, l2_slice, j);
                     old_entry = entry;
                     entry &= ~QCOW_OFLAG_COPIED;
                     offset = entry & L2E_OFFSET_MASK;
@@ -1384,7 +1384,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                             qcow2_cache_set_dependency(bs, s->l2_table_cache,
                                                        s->refcount_block_cache);
                         }
-                        l2_slice[j] = cpu_to_be64(entry);
+                        set_l2_entry(s, l2_slice, j, entry);
                         qcow2_cache_entry_mark_dirty(s->l2_table_cache,
                                                      l2_slice);
                     }
@@ -1617,7 +1617,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
 
     /* Do the actual checks */
     for(i = 0; i < s->l2_size; i++) {
-        l2_entry = be64_to_cpu(l2_table[i]);
+        l2_entry = get_l2_entry(s, l2_table, i);
 
         switch (qcow2_get_cluster_type(bs, l2_entry)) {
         case QCOW2_CLUSTER_COMPRESSED:
@@ -1686,7 +1686,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
                                            QCOW2_OL_INACTIVE_L2;
 
                         l2_entry = QCOW_OFLAG_ZERO;
-                        l2_table[i] = cpu_to_be64(l2_entry);
+                        set_l2_entry(s, l2_table, i, l2_entry);
                         ret = qcow2_pre_write_overlap_check(bs, ign,
                                 l2e_offset, sizeof(uint64_t), false);
                         if (ret < 0) {
@@ -1914,7 +1914,7 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
         }
 
         for (j = 0; j < s->l2_size; j++) {
-            uint64_t l2_entry = be64_to_cpu(l2_table[j]);
+            uint64_t l2_entry = get_l2_entry(s, l2_table, j);
             uint64_t data_offset = l2_entry & L2E_OFFSET_MASK;
             QCow2ClusterType cluster_type = qcow2_get_cluster_type(bs, l2_entry);
 
@@ -1937,9 +1937,10 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
                             "l2_entry=%" PRIx64 " refcount=%" PRIu64 "\n",
                             repair ? "Repairing" : "ERROR", l2_entry, refcount);
                     if (repair) {
-                        l2_table[j] = cpu_to_be64(refcount == 1
-                                    ? l2_entry |  QCOW_OFLAG_COPIED
-                                    : l2_entry & ~QCOW_OFLAG_COPIED);
+                        set_l2_entry(s, l2_table, j,
+                                     refcount == 1 ?
+                                     l2_entry |  QCOW_OFLAG_COPIED :
+                                     l2_entry & ~QCOW_OFLAG_COPIED);
                         l2_dirty++;
                     }
                 }
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 07/34] qcow2: Document the Extended L2 Entries feature
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (5 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 06/34] qcow2: Add get_l2_entry() and set_l2_entry() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 08/34] qcow2: Add dummy has_subclusters() function Alberto Garcia
                   ` (26 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Subcluster allocation in qcow2 is implemented by extending the
existing L2 table entries and adding additional information to
indicate the allocation status of each subcluster.

This patch documents the changes to the qcow2 format and how they
affect the calculation of the L2 cache size.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 docs/interop/qcow2.txt | 68 ++++++++++++++++++++++++++++++++++++++++--
 docs/qcow2-cache.txt   | 19 +++++++++++-
 2 files changed, 83 insertions(+), 4 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index cb723463f2..64e9345fb4 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -42,6 +42,9 @@ The first cluster of a qcow2 image contains the file header:
                     as the maximum cluster size and won't be able to open images
                     with larger cluster sizes.
 
+                    Note: if the image has Extended L2 Entries then cluster_bits
+                    must be at least 14 (i.e. 16384 byte clusters).
+
          24 - 31:   size
                     Virtual disk size in bytes.
 
@@ -117,7 +120,12 @@ the next fields through header_length.
                                 clusters. The compression_type field must be
                                 present and not zero.
 
-                    Bits 4-63:  Reserved (set to 0)
+                    Bit 4:      Extended L2 Entries.  If this bit is set then
+                                L2 table entries use an extended format that
+                                allows subcluster-based allocation. See the
+                                Extended L2 Entries section for more details.
+
+                    Bits 5-63:  Reserved (set to 0)
 
          80 -  87:  compatible_features
                     Bitmask of compatible features. An implementation can
@@ -498,7 +506,7 @@ cannot be relaxed without an incompatible layout change).
 Given an offset into the virtual disk, the offset into the image file can be
 obtained as follows:
 
-    l2_entries = (cluster_size / sizeof(uint64_t))
+    l2_entries = (cluster_size / sizeof(uint64_t))        [*]
 
     l2_index = (offset / cluster_size) % l2_entries
     l1_index = (offset / cluster_size) / l2_entries
@@ -508,6 +516,8 @@ obtained as follows:
 
     return cluster_offset + (offset % cluster_size)
 
+    [*] this changes if Extended L2 Entries are enabled, see next section
+
 L1 table entry:
 
     Bit  0 -  8:    Reserved (set to 0)
@@ -548,7 +558,8 @@ Standard Cluster Descriptor:
                     nor is data read from the backing file if the cluster is
                     unallocated.
 
-                    With version 2, this is always 0.
+                    With version 2 or with extended L2 entries (see the next
+                    section), this is always 0.
 
          1 -  8:    Reserved (set to 0)
 
@@ -585,6 +596,57 @@ file (except if bit 0 in the Standard Cluster Descriptor is set). If there is
 no backing file or the backing file is smaller than the image, they shall read
 zeros for all parts that are not covered by the backing file.
 
+== Extended L2 Entries ==
+
+An image uses Extended L2 Entries if bit 4 is set on the incompatible_features
+field of the header.
+
+In these images standard data clusters are divided into 32 subclusters of the
+same size. They are contiguous and start from the beginning of the cluster.
+Subclusters can be allocated independently and the L2 entry contains information
+indicating the status of each one of them. Compressed data clusters don't have
+subclusters so they are treated the same as in images without this feature.
+
+The size of an extended L2 entry is 128 bits so the number of entries per table
+is calculated using this formula:
+
+    l2_entries = (cluster_size / (2 * sizeof(uint64_t)))
+
+The first 64 bits have the same format as the standard L2 table entry described
+in the previous section, with the exception of bit 0 of the standard cluster
+descriptor.
+
+The last 64 bits contain a subcluster allocation bitmap with this format:
+
+Subcluster Allocation Bitmap (for standard clusters):
+
+    Bit  0 - 31:    Allocation status (one bit per subcluster)
+
+                    1: the subcluster is allocated. In this case the
+                       host cluster offset field must contain a valid
+                       offset.
+                    0: the subcluster is not allocated. In this case
+                       read requests shall go to the backing file or
+                       return zeros if there is no backing file data.
+
+                    Bits are assigned starting from the least significant
+                    one (i.e. bit x is used for subcluster x).
+
+        32 - 63     Subcluster reads as zeros (one bit per subcluster)
+
+                    1: the subcluster reads as zeros. In this case the
+                       allocation status bit must be unset. The host
+                       cluster offset field may or may not be set.
+                    0: no effect.
+
+                    Bits are assigned starting from the least significant
+                    one (i.e. bit x is used for subcluster x - 32).
+
+Subcluster Allocation Bitmap (for compressed clusters):
+
+    Bit  0 - 63:    Reserved (set to 0)
+                    Compressed clusters don't have subclusters,
+                    so this field is not used.
 
 == Snapshots ==
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index d57f409861..5f763aa6bb 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -1,6 +1,6 @@
 qcow2 L2/refcount cache configuration
 =====================================
-Copyright (C) 2015, 2018 Igalia, S.L.
+Copyright (C) 2015, 2018-2020 Igalia, S.L.
 Author: Alberto Garcia <berto@igalia.com>
 
 This work is licensed under the terms of the GNU GPL, version 2 or
@@ -222,3 +222,20 @@ support this functionality, and is 0 (disabled) on other platforms.
 This functionality currently relies on the MADV_DONTNEED argument for
 madvise() to actually free the memory. This is a Linux-specific feature,
 so cache-clean-interval is not supported on other systems.
+
+
+Extended L2 Entries
+-------------------
+All numbers shown in this document are valid for qcow2 images with normal
+64-bit L2 entries.
+
+Images with extended L2 entries need twice as much L2 metadata, so the L2
+cache size must be twice as large for the same disk space.
+
+   disk_size = l2_cache_size * cluster_size / 16
+
+i.e.
+
+   l2_cache_size = disk_size * 16 / cluster_size
+
+Refcount blocks are not affected by this.
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 08/34] qcow2: Add dummy has_subclusters() function
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (6 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 07/34] qcow2: Document the Extended L2 Entries feature Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 09/34] qcow2: Add subcluster-related fields to BDRVQcow2State Alberto Garcia
                   ` (25 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This function will be used by the qcow2 code to check if an image has
subclusters or not.

At the moment this simply returns false. Once all patches needed for
subcluster support are ready then QEMU will be able to create and
read images with subclusters and this function will return the actual
value.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index eecbadc4cb..2064dd3d85 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -510,6 +510,12 @@ typedef enum QCow2MetadataOverlap {
 
 #define INV_OFFSET (-1ULL)
 
+static inline bool has_subclusters(BDRVQcow2State *s)
+{
+    /* FIXME: Return false until this feature is complete */
+    return false;
+}
+
 static inline uint64_t get_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice,
                                     int idx)
 {
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 09/34] qcow2: Add subcluster-related fields to BDRVQcow2State
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (7 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 08/34] qcow2: Add dummy has_subclusters() function Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 10/34] qcow2: Add offset_to_sc_index() Alberto Garcia
                   ` (24 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This patch adds the following new fields to BDRVQcow2State:

- subclusters_per_cluster: Number of subclusters in a cluster
- subcluster_size: The size of each subcluster, in bytes
- subcluster_bits: No. of bits so 1 << subcluster_bits = subcluster_size

Images without subclusters are treated as if they had exactly one
subcluster per cluster (i.e. subcluster_size = cluster_size).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h | 5 +++++
 block/qcow2.c | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index 2064dd3d85..eee4c8de9c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -78,6 +78,8 @@
 /* The cluster reads as all zeros */
 #define QCOW_OFLAG_ZERO (1ULL << 0)
 
+#define QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER 32
+
 #define MIN_CLUSTER_BITS 9
 #define MAX_CLUSTER_BITS 21
 
@@ -295,6 +297,9 @@ typedef struct BDRVQcow2State {
     int cluster_bits;
     int cluster_size;
     int l2_slice_size;
+    int subcluster_bits;
+    int subcluster_size;
+    int subclusters_per_cluster;
     int l2_bits;
     int l2_size;
     int l1_size;
diff --git a/block/qcow2.c b/block/qcow2.c
index afb00ada42..5c175a314c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1433,6 +1433,11 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
         }
     }
 
+    s->subclusters_per_cluster =
+        has_subclusters(s) ? QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER : 1;
+    s->subcluster_size = s->cluster_size / s->subclusters_per_cluster;
+    s->subcluster_bits = ctz32(s->subcluster_size);
+
     /* Check support for various header values */
     if (header.refcount_order > 6) {
         error_setg(errp, "Reference count entry width too large; may not "
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 10/34] qcow2: Add offset_to_sc_index()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (8 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 09/34] qcow2: Add subcluster-related fields to BDRVQcow2State Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 11/34] qcow2: Add offset_into_subcluster() and size_to_subclusters() Alberto Garcia
                   ` (23 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

For a given offset, return the subcluster number within its cluster
(i.e. with 32 subclusters per cluster it returns a number between 0
and 31).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index eee4c8de9c..2503374677 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -581,6 +581,11 @@ static inline int offset_to_l2_slice_index(BDRVQcow2State *s, int64_t offset)
     return (offset >> s->cluster_bits) & (s->l2_slice_size - 1);
 }
 
+static inline int offset_to_sc_index(BDRVQcow2State *s, int64_t offset)
+{
+    return (offset >> s->subcluster_bits) & (s->subclusters_per_cluster - 1);
+}
+
 static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
 {
     return (int64_t)s->l1_vm_state_index << (s->cluster_bits + s->l2_bits);
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 11/34] qcow2: Add offset_into_subcluster() and size_to_subclusters()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (9 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 10/34] qcow2: Add offset_to_sc_index() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-01 12:23   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 12/34] qcow2: Add l2_entry_size() Alberto Garcia
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Like offset_into_cluster() and size_to_clusters(), but for
subclusters.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index 2503374677..4fe31adfd3 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -555,11 +555,21 @@ static inline int64_t offset_into_cluster(BDRVQcow2State *s, int64_t offset)
     return offset & (s->cluster_size - 1);
 }
 
+static inline int64_t offset_into_subcluster(BDRVQcow2State *s, int64_t offset)
+{
+    return offset & (s->subcluster_size - 1);
+}
+
 static inline uint64_t size_to_clusters(BDRVQcow2State *s, uint64_t size)
 {
     return (size + (s->cluster_size - 1)) >> s->cluster_bits;
 }
 
+static inline uint64_t size_to_subclusters(BDRVQcow2State *s, uint64_t size)
+{
+    return (size + (s->subcluster_size - 1)) >> s->subcluster_bits;
+}
+
 static inline int64_t size_to_l1(BDRVQcow2State *s, int64_t size)
 {
     int shift = s->cluster_bits + s->l2_bits;
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 12/34] qcow2: Add l2_entry_size()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (10 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 11/34] qcow2: Add offset_into_subcluster() and size_to_subclusters() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 13/34] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap() Alberto Garcia
                   ` (21 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

qcow2 images with subclusters have 128-bit L2 entries. The first 64
bits contain the same information as traditional images and the last
64 bits form a bitmap with the status of each individual subcluster.

Because of that we cannot assume that L2 entries are sizeof(uint64_t)
anymore. This function returns the proper value for the image.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.h          |  9 +++++++++
 block/qcow2-cluster.c  | 12 ++++++------
 block/qcow2-refcount.c | 14 ++++++++------
 block/qcow2.c          |  8 ++++----
 4 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 4fe31adfd3..46b351229a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -80,6 +80,10 @@
 
 #define QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER 32
 
+/* Size of normal and extended L2 entries */
+#define L2E_SIZE_NORMAL   (sizeof(uint64_t))
+#define L2E_SIZE_EXTENDED (sizeof(uint64_t) * 2)
+
 #define MIN_CLUSTER_BITS 9
 #define MAX_CLUSTER_BITS 21
 
@@ -521,6 +525,11 @@ static inline bool has_subclusters(BDRVQcow2State *s)
     return false;
 }
 
+static inline size_t l2_entry_size(BDRVQcow2State *s)
+{
+    return has_subclusters(s) ? L2E_SIZE_EXTENDED : L2E_SIZE_NORMAL;
+}
+
 static inline uint64_t get_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice,
                                     int idx)
 {
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 76fd0f3cdb..8b2fc550b7 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -208,7 +208,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
                    uint64_t l2_offset, uint64_t **l2_slice)
 {
     BDRVQcow2State *s = bs->opaque;
-    int start_of_slice = sizeof(uint64_t) *
+    int start_of_slice = l2_entry_size(s) *
         (offset_to_l2_index(s, offset) - offset_to_l2_slice_index(s, offset));
 
     return qcow2_cache_get(bs, s->l2_table_cache, l2_offset + start_of_slice,
@@ -281,7 +281,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index)
 
     /* allocate a new l2 entry */
 
-    l2_offset = qcow2_alloc_clusters(bs, s->l2_size * sizeof(uint64_t));
+    l2_offset = qcow2_alloc_clusters(bs, s->l2_size * l2_entry_size(s));
     if (l2_offset < 0) {
         ret = l2_offset;
         goto fail;
@@ -305,7 +305,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index)
 
     /* allocate a new entry in the l2 cache */
 
-    slice_size2 = s->l2_slice_size * sizeof(uint64_t);
+    slice_size2 = s->l2_slice_size * l2_entry_size(s);
     n_slices = s->cluster_size / slice_size2;
 
     trace_qcow2_l2_allocate_get_empty(bs, l1_index);
@@ -369,7 +369,7 @@ fail:
     }
     s->l1_table[l1_index] = old_l2_offset;
     if (l2_offset > 0) {
-        qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t),
+        qcow2_free_clusters(bs, l2_offset, s->l2_size * l2_entry_size(s),
                             QCOW2_DISCARD_ALWAYS);
     }
     return ret;
@@ -716,7 +716,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
 
         /* Then decrease the refcount of the old table */
         if (l2_offset) {
-            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t),
+            qcow2_free_clusters(bs, l2_offset, s->l2_size * l2_entry_size(s),
                                 QCOW2_DISCARD_OTHER);
         }
 
@@ -1913,7 +1913,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
     int ret;
     int i, j;
 
-    slice_size2 = s->l2_slice_size * sizeof(uint64_t);
+    slice_size2 = s->l2_slice_size * l2_entry_size(s);
     n_slices = s->cluster_size / slice_size2;
 
     if (!is_active_l1) {
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 04546838e8..770c5dbc83 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1254,7 +1254,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     l2_slice = NULL;
     l1_table = NULL;
     l1_size2 = l1_size * sizeof(uint64_t);
-    slice_size2 = s->l2_slice_size * sizeof(uint64_t);
+    slice_size2 = s->l2_slice_size * l2_entry_size(s);
     n_slices = s->cluster_size / slice_size2;
 
     s->cache_discards = true;
@@ -1605,7 +1605,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
     int i, l2_size, nb_csectors, ret;
 
     /* Read L2 table from disk */
-    l2_size = s->l2_size * sizeof(uint64_t);
+    l2_size = s->l2_size * l2_entry_size(s);
     l2_table = g_malloc(l2_size);
 
     ret = bdrv_pread(bs->file, l2_offset, l2_table, l2_size);
@@ -1680,15 +1680,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
                             fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR",
                             offset);
                     if (fix & BDRV_FIX_ERRORS) {
+                        int idx = i * (l2_entry_size(s) / sizeof(uint64_t));
                         uint64_t l2e_offset =
-                            l2_offset + (uint64_t)i * sizeof(uint64_t);
+                            l2_offset + (uint64_t)i * l2_entry_size(s);
                         int ign = active ? QCOW2_OL_ACTIVE_L2 :
                                            QCOW2_OL_INACTIVE_L2;
 
                         l2_entry = QCOW_OFLAG_ZERO;
                         set_l2_entry(s, l2_table, i, l2_entry);
                         ret = qcow2_pre_write_overlap_check(bs, ign,
-                                l2e_offset, sizeof(uint64_t), false);
+                                l2e_offset, l2_entry_size(s), false);
                         if (ret < 0) {
                             fprintf(stderr, "ERROR: Overlap check failed\n");
                             res->check_errors++;
@@ -1698,7 +1699,8 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
                         }
 
                         ret = bdrv_pwrite_sync(bs->file, l2e_offset,
-                                               &l2_table[i], sizeof(uint64_t));
+                                               &l2_table[idx],
+                                               l2_entry_size(s));
                         if (ret < 0) {
                             fprintf(stderr, "ERROR: Failed to overwrite L2 "
                                     "table entry: %s\n", strerror(-ret));
@@ -1905,7 +1907,7 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
         }
 
         ret = bdrv_pread(bs->file, l2_offset, l2_table,
-                         s->l2_size * sizeof(uint64_t));
+                         s->l2_size * l2_entry_size(s));
         if (ret < 0) {
             fprintf(stderr, "ERROR: Could not read L2 table: %s\n",
                     strerror(-ret));
diff --git a/block/qcow2.c b/block/qcow2.c
index 5c175a314c..7df55a88a8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -870,7 +870,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts,
     uint64_t max_l2_entries = DIV_ROUND_UP(virtual_disk_size, s->cluster_size);
     /* An L2 table is always one cluster in size so the max cache size
      * should be a multiple of the cluster size. */
-    uint64_t max_l2_cache = ROUND_UP(max_l2_entries * sizeof(uint64_t),
+    uint64_t max_l2_cache = ROUND_UP(max_l2_entries * l2_entry_size(s),
                                      s->cluster_size);
 
     combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
@@ -1031,7 +1031,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs,
         }
     }
 
-    r->l2_slice_size = l2_cache_entry_size / sizeof(uint64_t);
+    r->l2_slice_size = l2_cache_entry_size / l2_entry_size(s);
     r->l2_table_cache = qcow2_cache_create(bs, l2_cache_size,
                                            l2_cache_entry_size);
     r->refcount_block_cache = qcow2_cache_create(bs, refcount_cache_size,
@@ -1478,7 +1478,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
         bs->encrypted = true;
     }
 
-    s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
+    s->l2_bits = s->cluster_bits - ctz32(l2_entry_size(s));
     s->l2_size = 1 << s->l2_bits;
     /* 2^(s->refcount_order - 3) is the refcount width in bytes */
     s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
@@ -4245,7 +4245,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
          *  preallocation. All that matters is that we will not have to allocate
          *  new refcount structures for them.) */
         nb_new_l2_tables = DIV_ROUND_UP(nb_new_data_clusters,
-                                        s->cluster_size / sizeof(uint64_t));
+                                        s->cluster_size / l2_entry_size(s));
         /* The cluster range may not be aligned to L2 boundaries, so add one L2
          * table for a potential head/tail */
         nb_new_l2_tables++;
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 13/34] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (11 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 12/34] qcow2: Add l2_entry_size() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-01 12:28   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type() Alberto Garcia
                   ` (20 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Extended L2 entries are 128-bit wide: 64 bits for the entry itself and
64 bits for the subcluster allocation bitmap.

In order to support them correctly get/set_l2_entry() need to be
updated so they take the entry width into account in order to
calculate the correct offset.

This patch also adds the get/set_l2_bitmap() functions that are
used to access the bitmaps. For convenience we allow calling
get_l2_bitmap() on images without subclusters. In this case the
returned value is always 0 and has no meaning.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index 46b351229a..82b86f6cec 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -533,15 +533,36 @@ static inline size_t l2_entry_size(BDRVQcow2State *s)
 static inline uint64_t get_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice,
                                     int idx)
 {
+    idx *= l2_entry_size(s) / sizeof(uint64_t);
     return be64_to_cpu(l2_slice[idx]);
 }
 
+static inline uint64_t get_l2_bitmap(BDRVQcow2State *s, uint64_t *l2_slice,
+                                     int idx)
+{
+    if (has_subclusters(s)) {
+        idx *= l2_entry_size(s) / sizeof(uint64_t);
+        return be64_to_cpu(l2_slice[idx + 1]);
+    } else {
+        return 0; /* For convenience only; this value has no meaning. */
+    }
+}
+
 static inline void set_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice,
                                 int idx, uint64_t entry)
 {
+    idx *= l2_entry_size(s) / sizeof(uint64_t);
     l2_slice[idx] = cpu_to_be64(entry);
 }
 
+static inline void set_l2_bitmap(BDRVQcow2State *s, uint64_t *l2_slice,
+                                 int idx, uint64_t bitmap)
+{
+    assert(has_subclusters(s));
+    idx *= l2_entry_size(s) / sizeof(uint64_t);
+    l2_slice[idx + 1] = cpu_to_be64(bitmap);
+}
+
 static inline bool has_data_file(BlockDriverState *bs)
 {
     BDRVQcow2State *s = bs->opaque;
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (12 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 13/34] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-01 12:52   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 15/34] qcow2: Add qcow2_get_subcluster_range_type() Alberto Garcia
                   ` (19 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This patch adds QCow2SubclusterType, which is the subcluster-level
version of QCow2ClusterType. All QCOW2_SUBCLUSTER_* values have the
the same meaning as their QCOW2_CLUSTER_* equivalents (when they
exist). See below for details and caveats.

In images without extended L2 entries clusters are treated as having
exactly one subcluster so it is possible to replace one data type with
the other while keeping the exact same semantics.

With extended L2 entries there are new possible values, and every
subcluster in the same cluster can obviously have a different
QCow2SubclusterType so functions need to be adapted to work on the
subcluster level.

There are several things that have to be taken into account:

  a) QCOW2_SUBCLUSTER_COMPRESSED means that the whole cluster is
     compressed. We do not support compression at the subcluster
     level.

  b) There are two different values for unallocated subclusters:
     QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN which means that the whole
     cluster is unallocated, and QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC
     which means that the cluster is allocated but the subcluster is
     not. The latter can only happen in images with extended L2
     entries.

  c) QCOW2_SUBCLUSTER_INVALID is used to detect the cases where an L2
     entry has a value that violates the specification. The caller is
     responsible for handling these situations.

     To prevent compatibility problems with images that have invalid
     values but are currently being read by QEMU without causing side
     effects, QCOW2_SUBCLUSTER_INVALID is only returned for images
     with extended L2 entries.

qcow2_cluster_to_subcluster_type() is added as a separate function
from qcow2_get_subcluster_type(), but this is only temporary and both
will be merged in a subsequent patch.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h | 126 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 125 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 82b86f6cec..3aec6f452a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -80,6 +80,21 @@
 
 #define QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER 32
 
+/* The subcluster X [0..31] is allocated */
+#define QCOW_OFLAG_SUB_ALLOC(X)   (1ULL << (X))
+/* The subcluster X [0..31] reads as zeroes */
+#define QCOW_OFLAG_SUB_ZERO(X)    (QCOW_OFLAG_SUB_ALLOC(X) << 32)
+/* Subclusters [X, Y) (0 <= X <= Y <= 32) are allocated */
+#define QCOW_OFLAG_SUB_ALLOC_RANGE(X, Y) \
+    (QCOW_OFLAG_SUB_ALLOC(Y) - QCOW_OFLAG_SUB_ALLOC(X))
+/* Subclusters [X, Y) (0 <= X <= Y <= 32) read as zeroes */
+#define QCOW_OFLAG_SUB_ZERO_RANGE(X, Y) \
+    (QCOW_OFLAG_SUB_ALLOC_RANGE(X, Y) << 32)
+/* L2 entry bitmap with all allocation bits set */
+#define QCOW_L2_BITMAP_ALL_ALLOC  (QCOW_OFLAG_SUB_ALLOC_RANGE(0, 32))
+/* L2 entry bitmap with all "read as zeroes" bits set */
+#define QCOW_L2_BITMAP_ALL_ZEROES (QCOW_OFLAG_SUB_ZERO_RANGE(0, 32))
+
 /* Size of normal and extended L2 entries */
 #define L2E_SIZE_NORMAL   (sizeof(uint64_t))
 #define L2E_SIZE_EXTENDED (sizeof(uint64_t) * 2)
@@ -462,6 +477,33 @@ typedef struct QCowL2Meta
     QLIST_ENTRY(QCowL2Meta) next_in_flight;
 } QCowL2Meta;
 
+/*
+ * In images with standard L2 entries all clusters are treated as if
+ * they had one subcluster so QCow2ClusterType and QCow2SubclusterType
+ * can be mapped to each other and have the exact same meaning
+ * (QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC cannot happen in these images).
+ *
+ * In images with extended L2 entries QCow2ClusterType refers to the
+ * complete cluster and QCow2SubclusterType to each of the individual
+ * subclusters, so there are several possible combinations:
+ *
+ *     |--------------+---------------------------|
+ *     | Cluster type | Possible subcluster types |
+ *     |--------------+---------------------------|
+ *     | UNALLOCATED  |         UNALLOCATED_PLAIN |
+ *     |              |                ZERO_PLAIN |
+ *     |--------------+---------------------------|
+ *     | NORMAL       |         UNALLOCATED_ALLOC |
+ *     |              |                ZERO_ALLOC |
+ *     |              |                    NORMAL |
+ *     |--------------+---------------------------|
+ *     | COMPRESSED   |                COMPRESSED |
+ *     |--------------+---------------------------|
+ *
+ * QCOW2_SUBCLUSTER_INVALID means that the L2 entry is incorrect and
+ * the image should be marked corrupt.
+ */
+
 typedef enum QCow2ClusterType {
     QCOW2_CLUSTER_UNALLOCATED,
     QCOW2_CLUSTER_ZERO_PLAIN,
@@ -470,6 +512,16 @@ typedef enum QCow2ClusterType {
     QCOW2_CLUSTER_COMPRESSED,
 } QCow2ClusterType;
 
+typedef enum QCow2SubclusterType {
+    QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN,
+    QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC,
+    QCOW2_SUBCLUSTER_ZERO_PLAIN,
+    QCOW2_SUBCLUSTER_ZERO_ALLOC,
+    QCOW2_SUBCLUSTER_NORMAL,
+    QCOW2_SUBCLUSTER_COMPRESSED,
+    QCOW2_SUBCLUSTER_INVALID,
+} QCow2SubclusterType;
+
 typedef enum QCow2MetadataOverlap {
     QCOW2_OL_MAIN_HEADER_BITNR      = 0,
     QCOW2_OL_ACTIVE_L1_BITNR        = 1,
@@ -634,9 +686,11 @@ static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
 static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
                                                       uint64_t l2_entry)
 {
+    BDRVQcow2State *s = bs->opaque;
+
     if (l2_entry & QCOW_OFLAG_COMPRESSED) {
         return QCOW2_CLUSTER_COMPRESSED;
-    } else if (l2_entry & QCOW_OFLAG_ZERO) {
+    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {
         if (l2_entry & L2E_OFFSET_MASK) {
             return QCOW2_CLUSTER_ZERO_ALLOC;
         }
@@ -656,6 +710,76 @@ static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
     }
 }
 
+/*
+ * For an image without extended L2 entries, return the
+ * QCow2SubclusterType equivalent of a given QCow2ClusterType.
+ */
+static inline
+QCow2SubclusterType qcow2_cluster_to_subcluster_type(QCow2ClusterType type)
+{
+    switch (type) {
+    case QCOW2_CLUSTER_COMPRESSED:
+        return QCOW2_SUBCLUSTER_COMPRESSED;
+    case QCOW2_CLUSTER_ZERO_PLAIN:
+        return QCOW2_SUBCLUSTER_ZERO_PLAIN;
+    case QCOW2_CLUSTER_ZERO_ALLOC:
+        return QCOW2_SUBCLUSTER_ZERO_ALLOC;
+    case QCOW2_CLUSTER_NORMAL:
+        return QCOW2_SUBCLUSTER_NORMAL;
+    case QCOW2_CLUSTER_UNALLOCATED:
+        return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/*
+ * In an image without subsclusters @l2_bitmap is ignored and
+ * @sc_index must be 0.
+ * Return QCOW2_SUBCLUSTER_INVALID if an invalid l2 entry is detected
+ * (this checks the whole entry and bitmap, not only the bits related
+ * to subcluster @sc_index).
+ */
+static inline
+QCow2SubclusterType qcow2_get_subcluster_type(BlockDriverState *bs,
+                                              uint64_t l2_entry,
+                                              uint64_t l2_bitmap,
+                                              unsigned sc_index)
+{
+    BDRVQcow2State *s = bs->opaque;
+    QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry);
+    assert(sc_index < s->subclusters_per_cluster);
+
+    if (has_subclusters(s)) {
+        switch (type) {
+        case QCOW2_CLUSTER_COMPRESSED:
+            return QCOW2_SUBCLUSTER_COMPRESSED;
+        case QCOW2_CLUSTER_NORMAL:
+            if ((l2_bitmap >> 32) & l2_bitmap) {
+                return QCOW2_SUBCLUSTER_INVALID;
+            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
+                return QCOW2_SUBCLUSTER_ZERO_ALLOC;
+            } else if (l2_bitmap & QCOW_OFLAG_SUB_ALLOC(sc_index)) {
+                return QCOW2_SUBCLUSTER_NORMAL;
+            } else {
+                return QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC;
+            }
+        case QCOW2_CLUSTER_UNALLOCATED:
+            if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) {
+                return QCOW2_SUBCLUSTER_INVALID;
+            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
+                return QCOW2_SUBCLUSTER_ZERO_PLAIN;
+            } else {
+                return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
+            }
+        default:
+            g_assert_not_reached();
+        }
+    } else {
+        return qcow2_cluster_to_subcluster_type(type);
+    }
+}
+
 /* Check whether refcounts are eager or lazy */
 static inline bool qcow2_need_accurate_refcounts(BDRVQcow2State *s)
 {
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 15/34] qcow2: Add qcow2_get_subcluster_range_type()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (13 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-01 13:37   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 16/34] qcow2: Add qcow2_cluster_is_allocated() Alberto Garcia
                   ` (18 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

There are situations in which we want to know how many contiguous
subclusters of the same type there are in a given cluster. This can be
done by simply iterating over the subclusters and repeatedly calling
qcow2_get_subcluster_type() for each one of them.

However once we determined the type of a subcluster we can check the
rest efficiently by counting the number of adjacent ones (or zeroes)
in the bitmap. This is what this function does.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c | 51 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8b2fc550b7..32dc6e75e3 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -375,6 +375,57 @@ fail:
     return ret;
 }
 
+/*
+ * For a given L2 entry, count the number of contiguous subclusters of
+ * the same type starting from @sc_from. Compressed clusters are
+ * treated as if they were divided into subclusters of size
+ * s->subcluster_size.
+ *
+ * Return the number of contiguous subclusters and set @type to the
+ * subcluster type.
+ *
+ * If the L2 entry is invalid return -errno and set @type to
+ * QCOW2_SUBCLUSTER_INVALID.
+ */
+G_GNUC_UNUSED
+static int qcow2_get_subcluster_range_type(BlockDriverState *bs,
+                                           uint64_t l2_entry,
+                                           uint64_t l2_bitmap,
+                                           unsigned sc_from,
+                                           QCow2SubclusterType *type)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint32_t val;
+
+    *type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_from);
+
+    if (*type == QCOW2_SUBCLUSTER_INVALID) {
+        return -EINVAL;
+    } else if (!has_subclusters(s) || *type == QCOW2_SUBCLUSTER_COMPRESSED) {
+        return s->subclusters_per_cluster - sc_from;
+    }
+
+    switch (*type) {
+    case QCOW2_SUBCLUSTER_NORMAL:
+        val = l2_bitmap | QCOW_OFLAG_SUB_ALLOC_RANGE(0, sc_from);
+        return cto32(val) - sc_from;
+
+    case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+    case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        val = (l2_bitmap | QCOW_OFLAG_SUB_ZERO_RANGE(0, sc_from)) >> 32;
+        return cto32(val) - sc_from;
+
+    case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+    case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+        val = ((l2_bitmap >> 32) | l2_bitmap)
+            & ~QCOW_OFLAG_SUB_ALLOC_RANGE(0, sc_from);
+        return ctz32(val) - sc_from;
+
+    default:
+        g_assert_not_reached();
+    }
+}
+
 /*
  * Checks how many clusters in a given L2 slice are contiguous in the image
  * file. As soon as one of the flags in the bitmask stop_flags changes compared
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 16/34] qcow2: Add qcow2_cluster_is_allocated()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (14 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 15/34] qcow2: Add qcow2_get_subcluster_range_type() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-01 13:55   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 17/34] qcow2: Add cluster type parameter to qcow2_get_host_offset() Alberto Garcia
                   ` (17 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This helper function tells us if a cluster is allocated (that is,
there is an associated host offset for it).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/qcow2.h b/block/qcow2.h
index 3aec6f452a..ea647c8bb5 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -780,6 +780,12 @@ QCow2SubclusterType qcow2_get_subcluster_type(BlockDriverState *bs,
     }
 }
 
+static inline bool qcow2_cluster_is_allocated(QCow2ClusterType type)
+{
+    return (type == QCOW2_CLUSTER_COMPRESSED || type == QCOW2_CLUSTER_NORMAL ||
+            type == QCOW2_CLUSTER_ZERO_ALLOC);
+}
+
 /* Check whether refcounts are eager or lazy */
 static inline bool qcow2_need_accurate_refcounts(BDRVQcow2State *s)
 {
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 17/34] qcow2: Add cluster type parameter to qcow2_get_host_offset()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (15 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 16/34] qcow2: Add qcow2_cluster_is_allocated() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 18/34] qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_* Alberto Garcia
                   ` (16 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This function returns an integer that can be either an error code or a
cluster type (a value from the QCow2ClusterType enum).

We are going to start using subcluster types instead of cluster types
in some functions so it's better to use the exact data types instead
of integers for clarity and in order to detect errors more easily.

This patch makes qcow2_get_host_offset() return 0 on success and
puts the returned cluster type in a separate parameter. There are no
semantic changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  3 ++-
 block/qcow2-cluster.c | 11 +++++++----
 block/qcow2.c         | 37 ++++++++++++++++++++++---------------
 3 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index ea647c8bb5..74f65793bd 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -893,7 +893,8 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
                           uint8_t *buf, int nb_sectors, bool enc, Error **errp);
 
 int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
-                          unsigned int *bytes, uint64_t *host_offset);
+                          unsigned int *bytes, uint64_t *host_offset,
+                          QCow2ClusterType *cluster_type);
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
                                unsigned int *bytes, uint64_t *host_offset,
                                QCowL2Meta **m);
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 32dc6e75e3..1e5681b0c6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -565,13 +565,14 @@ static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
  *
  * On exit, *bytes is the number of bytes starting at offset that have the same
  * cluster type and (if applicable) are stored contiguously in the image file.
+ * The cluster type is stored in *cluster_type.
  * Compressed clusters are always returned one by one.
  *
- * Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
- * cases.
+ * Returns 0 on success, -errno in error cases.
  */
 int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
-                          unsigned int *bytes, uint64_t *host_offset)
+                          unsigned int *bytes, uint64_t *host_offset,
+                          QCow2ClusterType *cluster_type)
 {
     BDRVQcow2State *s = bs->opaque;
     unsigned int l2_index;
@@ -712,7 +713,9 @@ out:
     assert(bytes_available - offset_in_cluster <= UINT_MAX);
     *bytes = bytes_available - offset_in_cluster;
 
-    return type;
+    *cluster_type = type;
+
+    return 0;
 
 fail:
     qcow2_cache_put(s->l2_table_cache, (void **)&l2_slice);
diff --git a/block/qcow2.c b/block/qcow2.c
index 7df55a88a8..89e17bdaba 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2033,6 +2033,7 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     BDRVQcow2State *s = bs->opaque;
     uint64_t host_offset;
     unsigned int bytes;
+    QCow2ClusterType type;
     int ret, status = 0;
 
     qemu_co_mutex_lock(&s->lock);
@@ -2044,7 +2045,7 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     }
 
     bytes = MIN(INT_MAX, count);
-    ret = qcow2_get_host_offset(bs, offset, &bytes, &host_offset);
+    ret = qcow2_get_host_offset(bs, offset, &bytes, &host_offset, &type);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
         return ret;
@@ -2052,15 +2053,15 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
 
     *pnum = bytes;
 
-    if ((ret == QCOW2_CLUSTER_NORMAL || ret == QCOW2_CLUSTER_ZERO_ALLOC) &&
+    if ((type == QCOW2_CLUSTER_NORMAL || type == QCOW2_CLUSTER_ZERO_ALLOC) &&
         !s->crypto) {
         *map = host_offset;
         *file = s->data_file->bs;
         status |= BDRV_BLOCK_OFFSET_VALID;
     }
-    if (ret == QCOW2_CLUSTER_ZERO_PLAIN || ret == QCOW2_CLUSTER_ZERO_ALLOC) {
+    if (type == QCOW2_CLUSTER_ZERO_PLAIN || type == QCOW2_CLUSTER_ZERO_ALLOC) {
         status |= BDRV_BLOCK_ZERO;
-    } else if (ret != QCOW2_CLUSTER_UNALLOCATED) {
+    } else if (type != QCOW2_CLUSTER_UNALLOCATED) {
         status |= BDRV_BLOCK_DATA;
     }
     if (s->metadata_preallocation && (status & BDRV_BLOCK_DATA) &&
@@ -2269,6 +2270,7 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
     int ret = 0;
     unsigned int cur_bytes; /* number of bytes in current iteration */
     uint64_t host_offset = 0;
+    QCow2ClusterType type;
     AioTaskPool *aio = NULL;
 
     while (bytes != 0 && aio_task_pool_status(aio) == 0) {
@@ -2280,22 +2282,23 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         }
 
         qemu_co_mutex_lock(&s->lock);
-        ret = qcow2_get_host_offset(bs, offset, &cur_bytes, &host_offset);
+        ret = qcow2_get_host_offset(bs, offset, &cur_bytes,
+                                    &host_offset, &type);
         qemu_co_mutex_unlock(&s->lock);
         if (ret < 0) {
             goto out;
         }
 
-        if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
-            ret == QCOW2_CLUSTER_ZERO_ALLOC ||
-            (ret == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
+        if (type == QCOW2_CLUSTER_ZERO_PLAIN ||
+            type == QCOW2_CLUSTER_ZERO_ALLOC ||
+            (type == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
             if (!aio && cur_bytes != bytes) {
                 aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
             }
-            ret = qcow2_add_task(bs, aio, qcow2_co_preadv_task_entry, ret,
+            ret = qcow2_add_task(bs, aio, qcow2_co_preadv_task_entry, type,
                                  host_offset, offset, cur_bytes,
                                  qiov, qiov_offset, NULL);
             if (ret < 0) {
@@ -3841,6 +3844,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
     if (head || tail) {
         uint64_t off;
         unsigned int nr;
+        QCow2ClusterType type;
 
         assert(head + bytes <= s->cluster_size);
 
@@ -3856,10 +3860,11 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
         offset = QEMU_ALIGN_DOWN(offset, s->cluster_size);
         bytes = s->cluster_size;
         nr = s->cluster_size;
-        ret = qcow2_get_host_offset(bs, offset, &nr, &off);
-        if (ret != QCOW2_CLUSTER_UNALLOCATED &&
-            ret != QCOW2_CLUSTER_ZERO_PLAIN &&
-            ret != QCOW2_CLUSTER_ZERO_ALLOC) {
+        ret = qcow2_get_host_offset(bs, offset, &nr, &off, &type);
+        if (ret < 0 ||
+            (type != QCOW2_CLUSTER_UNALLOCATED &&
+             type != QCOW2_CLUSTER_ZERO_PLAIN &&
+             type != QCOW2_CLUSTER_ZERO_ALLOC)) {
             qemu_co_mutex_unlock(&s->lock);
             return -ENOTSUP;
         }
@@ -3923,16 +3928,18 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
 
     while (bytes != 0) {
         uint64_t copy_offset = 0;
+        QCow2ClusterType type;
         /* prepare next request */
         cur_bytes = MIN(bytes, INT_MAX);
         cur_write_flags = write_flags;
 
-        ret = qcow2_get_host_offset(bs, src_offset, &cur_bytes, &copy_offset);
+        ret = qcow2_get_host_offset(bs, src_offset, &cur_bytes,
+                                    &copy_offset, &type);
         if (ret < 0) {
             goto out;
         }
 
-        switch (ret) {
+        switch (type) {
         case QCOW2_CLUSTER_UNALLOCATED:
             if (bs->backing && bs->backing->bs) {
                 int64_t backing_length = bdrv_getlength(bs->backing->bs);
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 18/34] qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_*
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (16 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 17/34] qcow2: Add cluster type parameter to qcow2_get_host_offset() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 19/34] qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC Alberto Garcia
                   ` (15 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

In order to support extended L2 entries some functions of the qcow2
driver need to start dealing with subclusters instead of clusters.

qcow2_get_host_offset() is modified to return the subcluster type
instead of the cluster type, and all callers are updated to replace
all values of QCow2ClusterType with their QCow2SubclusterType
equivalents.

This patch only changes the data types, there are no semantic changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.h         |  2 +-
 block/qcow2-cluster.c | 10 +++----
 block/qcow2.c         | 70 ++++++++++++++++++++++---------------------
 3 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 74f65793bd..5df761edc3 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -894,7 +894,7 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
 
 int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
                           unsigned int *bytes, uint64_t *host_offset,
-                          QCow2ClusterType *cluster_type);
+                          QCow2SubclusterType *subcluster_type);
 int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
                                unsigned int *bytes, uint64_t *host_offset,
                                QCowL2Meta **m);
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 1e5681b0c6..ed7b92dbb2 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -564,15 +564,15 @@ static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
  * offset that we are interested in.
  *
  * On exit, *bytes is the number of bytes starting at offset that have the same
- * cluster type and (if applicable) are stored contiguously in the image file.
- * The cluster type is stored in *cluster_type.
- * Compressed clusters are always returned one by one.
+ * subcluster type and (if applicable) are stored contiguously in the image
+ * file. The subcluster type is stored in *subcluster_type.
+ * Compressed clusters are always processed one by one.
  *
  * Returns 0 on success, -errno in error cases.
  */
 int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
                           unsigned int *bytes, uint64_t *host_offset,
-                          QCow2ClusterType *cluster_type)
+                          QCow2SubclusterType *subcluster_type)
 {
     BDRVQcow2State *s = bs->opaque;
     unsigned int l2_index;
@@ -713,7 +713,7 @@ out:
     assert(bytes_available - offset_in_cluster <= UINT_MAX);
     *bytes = bytes_available - offset_in_cluster;
 
-    *cluster_type = type;
+    *subcluster_type = qcow2_cluster_to_subcluster_type(type);
 
     return 0;
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 89e17bdaba..2f7a2a7c7a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2033,7 +2033,7 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     BDRVQcow2State *s = bs->opaque;
     uint64_t host_offset;
     unsigned int bytes;
-    QCow2ClusterType type;
+    QCow2SubclusterType type;
     int ret, status = 0;
 
     qemu_co_mutex_lock(&s->lock);
@@ -2053,15 +2053,16 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
 
     *pnum = bytes;
 
-    if ((type == QCOW2_CLUSTER_NORMAL || type == QCOW2_CLUSTER_ZERO_ALLOC) &&
-        !s->crypto) {
+    if ((type == QCOW2_SUBCLUSTER_NORMAL ||
+         type == QCOW2_SUBCLUSTER_ZERO_ALLOC) && !s->crypto) {
         *map = host_offset;
         *file = s->data_file->bs;
         status |= BDRV_BLOCK_OFFSET_VALID;
     }
-    if (type == QCOW2_CLUSTER_ZERO_PLAIN || type == QCOW2_CLUSTER_ZERO_ALLOC) {
+    if (type == QCOW2_SUBCLUSTER_ZERO_PLAIN ||
+        type == QCOW2_SUBCLUSTER_ZERO_ALLOC) {
         status |= BDRV_BLOCK_ZERO;
-    } else if (type != QCOW2_CLUSTER_UNALLOCATED) {
+    } else if (type != QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN) {
         status |= BDRV_BLOCK_DATA;
     }
     if (s->metadata_preallocation && (status & BDRV_BLOCK_DATA) &&
@@ -2158,7 +2159,7 @@ typedef struct Qcow2AioTask {
     AioTask task;
 
     BlockDriverState *bs;
-    QCow2ClusterType cluster_type; /* only for read */
+    QCow2SubclusterType subcluster_type; /* only for read */
     uint64_t host_offset; /* or full descriptor in compressed clusters */
     uint64_t offset;
     uint64_t bytes;
@@ -2171,7 +2172,7 @@ static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task);
 static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
                                        AioTaskPool *pool,
                                        AioTaskFunc func,
-                                       QCow2ClusterType cluster_type,
+                                       QCow2SubclusterType subcluster_type,
                                        uint64_t host_offset,
                                        uint64_t offset,
                                        uint64_t bytes,
@@ -2185,7 +2186,7 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
     *task = (Qcow2AioTask) {
         .task.func = func,
         .bs = bs,
-        .cluster_type = cluster_type,
+        .subcluster_type = subcluster_type,
         .qiov = qiov,
         .host_offset = host_offset,
         .offset = offset,
@@ -2196,7 +2197,7 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
 
     trace_qcow2_add_task(qemu_coroutine_self(), bs, pool,
                          func == qcow2_co_preadv_task_entry ? "read" : "write",
-                         cluster_type, host_offset, offset, bytes,
+                         subcluster_type, host_offset, offset, bytes,
                          qiov, qiov_offset);
 
     if (!pool) {
@@ -2209,7 +2210,7 @@ static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
 }
 
 static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
-                                             QCow2ClusterType cluster_type,
+                                             QCow2SubclusterType subc_type,
                                              uint64_t host_offset,
                                              uint64_t offset, uint64_t bytes,
                                              QEMUIOVector *qiov,
@@ -2217,24 +2218,24 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
 {
     BDRVQcow2State *s = bs->opaque;
 
-    switch (cluster_type) {
-    case QCOW2_CLUSTER_ZERO_PLAIN:
-    case QCOW2_CLUSTER_ZERO_ALLOC:
+    switch (subc_type) {
+    case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+    case QCOW2_SUBCLUSTER_ZERO_ALLOC:
         /* Both zero types are handled in qcow2_co_preadv_part */
         g_assert_not_reached();
 
-    case QCOW2_CLUSTER_UNALLOCATED:
+    case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
         assert(bs->backing); /* otherwise handled in qcow2_co_preadv_part */
 
         BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
         return bdrv_co_preadv_part(bs->backing, offset, bytes,
                                    qiov, qiov_offset, 0);
 
-    case QCOW2_CLUSTER_COMPRESSED:
+    case QCOW2_SUBCLUSTER_COMPRESSED:
         return qcow2_co_preadv_compressed(bs, host_offset,
                                           offset, bytes, qiov, qiov_offset);
 
-    case QCOW2_CLUSTER_NORMAL:
+    case QCOW2_SUBCLUSTER_NORMAL:
         if (bs->encrypted) {
             return qcow2_co_preadv_encrypted(bs, host_offset,
                                              offset, bytes, qiov, qiov_offset);
@@ -2257,8 +2258,9 @@ static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task)
 
     assert(!t->l2meta);
 
-    return qcow2_co_preadv_task(t->bs, t->cluster_type, t->host_offset,
-                                t->offset, t->bytes, t->qiov, t->qiov_offset);
+    return qcow2_co_preadv_task(t->bs, t->subcluster_type,
+                                t->host_offset, t->offset, t->bytes,
+                                t->qiov, t->qiov_offset);
 }
 
 static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
@@ -2270,7 +2272,7 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
     int ret = 0;
     unsigned int cur_bytes; /* number of bytes in current iteration */
     uint64_t host_offset = 0;
-    QCow2ClusterType type;
+    QCow2SubclusterType type;
     AioTaskPool *aio = NULL;
 
     while (bytes != 0 && aio_task_pool_status(aio) == 0) {
@@ -2289,9 +2291,9 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
             goto out;
         }
 
-        if (type == QCOW2_CLUSTER_ZERO_PLAIN ||
-            type == QCOW2_CLUSTER_ZERO_ALLOC ||
-            (type == QCOW2_CLUSTER_UNALLOCATED && !bs->backing))
+        if (type == QCOW2_SUBCLUSTER_ZERO_PLAIN ||
+            type == QCOW2_SUBCLUSTER_ZERO_ALLOC ||
+            (type == QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN && !bs->backing))
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
@@ -2524,7 +2526,7 @@ static coroutine_fn int qcow2_co_pwritev_task_entry(AioTask *task)
 {
     Qcow2AioTask *t = container_of(task, Qcow2AioTask, task);
 
-    assert(!t->cluster_type);
+    assert(!t->subcluster_type);
 
     return qcow2_co_pwritev_task(t->bs, t->host_offset,
                                  t->offset, t->bytes, t->qiov, t->qiov_offset,
@@ -3844,7 +3846,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
     if (head || tail) {
         uint64_t off;
         unsigned int nr;
-        QCow2ClusterType type;
+        QCow2SubclusterType type;
 
         assert(head + bytes <= s->cluster_size);
 
@@ -3862,9 +3864,9 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
         nr = s->cluster_size;
         ret = qcow2_get_host_offset(bs, offset, &nr, &off, &type);
         if (ret < 0 ||
-            (type != QCOW2_CLUSTER_UNALLOCATED &&
-             type != QCOW2_CLUSTER_ZERO_PLAIN &&
-             type != QCOW2_CLUSTER_ZERO_ALLOC)) {
+            (type != QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN &&
+             type != QCOW2_SUBCLUSTER_ZERO_PLAIN &&
+             type != QCOW2_SUBCLUSTER_ZERO_ALLOC)) {
             qemu_co_mutex_unlock(&s->lock);
             return -ENOTSUP;
         }
@@ -3928,7 +3930,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
 
     while (bytes != 0) {
         uint64_t copy_offset = 0;
-        QCow2ClusterType type;
+        QCow2SubclusterType type;
         /* prepare next request */
         cur_bytes = MIN(bytes, INT_MAX);
         cur_write_flags = write_flags;
@@ -3940,7 +3942,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
         }
 
         switch (type) {
-        case QCOW2_CLUSTER_UNALLOCATED:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
             if (bs->backing && bs->backing->bs) {
                 int64_t backing_length = bdrv_getlength(bs->backing->bs);
                 if (src_offset >= backing_length) {
@@ -3955,16 +3957,16 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
             }
             break;
 
-        case QCOW2_CLUSTER_ZERO_PLAIN:
-        case QCOW2_CLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
             cur_write_flags |= BDRV_REQ_ZERO_WRITE;
             break;
 
-        case QCOW2_CLUSTER_COMPRESSED:
+        case QCOW2_SUBCLUSTER_COMPRESSED:
             ret = -ENOTSUP;
             goto out;
 
-        case QCOW2_CLUSTER_NORMAL:
+        case QCOW2_SUBCLUSTER_NORMAL:
             child = s->data_file;
             break;
 
@@ -4493,7 +4495,7 @@ static coroutine_fn int qcow2_co_pwritev_compressed_task_entry(AioTask *task)
 {
     Qcow2AioTask *t = container_of(task, Qcow2AioTask, task);
 
-    assert(!t->cluster_type && !t->l2meta);
+    assert(!t->subcluster_type && !t->l2meta);
 
     return qcow2_co_pwritev_compressed_task(t->bs, t->offset, t->bytes, t->qiov,
                                             t->qiov_offset);
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 19/34] qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (17 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 18/34] qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_* Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 20/34] qcow2: Add subcluster support to calculate_l2_meta() Alberto Garcia
                   ` (14 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

When dealing with subcluster types there is a new value called
QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC that has no equivalent in
QCow2ClusterType.

This patch handles that value in all places where subcluster types
are processed.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 2f7a2a7c7a..a3481cd85b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2054,7 +2054,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     *pnum = bytes;
 
     if ((type == QCOW2_SUBCLUSTER_NORMAL ||
-         type == QCOW2_SUBCLUSTER_ZERO_ALLOC) && !s->crypto) {
+         type == QCOW2_SUBCLUSTER_ZERO_ALLOC ||
+         type == QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC) && !s->crypto) {
         *map = host_offset;
         *file = s->data_file->bs;
         status |= BDRV_BLOCK_OFFSET_VALID;
@@ -2062,7 +2063,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     if (type == QCOW2_SUBCLUSTER_ZERO_PLAIN ||
         type == QCOW2_SUBCLUSTER_ZERO_ALLOC) {
         status |= BDRV_BLOCK_ZERO;
-    } else if (type != QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN) {
+    } else if (type != QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN &&
+               type != QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC) {
         status |= BDRV_BLOCK_DATA;
     }
     if (s->metadata_preallocation && (status & BDRV_BLOCK_DATA) &&
@@ -2225,6 +2227,7 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
         g_assert_not_reached();
 
     case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+    case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
         assert(bs->backing); /* otherwise handled in qcow2_co_preadv_part */
 
         BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
@@ -2293,7 +2296,8 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
 
         if (type == QCOW2_SUBCLUSTER_ZERO_PLAIN ||
             type == QCOW2_SUBCLUSTER_ZERO_ALLOC ||
-            (type == QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN && !bs->backing))
+            (type == QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN && !bs->backing) ||
+            (type == QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC && !bs->backing))
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
@@ -3865,6 +3869,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
         ret = qcow2_get_host_offset(bs, offset, &nr, &off, &type);
         if (ret < 0 ||
             (type != QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN &&
+             type != QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC &&
              type != QCOW2_SUBCLUSTER_ZERO_PLAIN &&
              type != QCOW2_SUBCLUSTER_ZERO_ALLOC)) {
             qemu_co_mutex_unlock(&s->lock);
@@ -3943,6 +3948,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
 
         switch (type) {
         case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
             if (bs->backing && bs->backing->bs) {
                 int64_t backing_length = bdrv_getlength(bs->backing->bs);
                 if (src_offset >= backing_length) {
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 20/34] qcow2: Add subcluster support to calculate_l2_meta()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (18 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 19/34] qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 11:30   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset() Alberto Garcia
                   ` (13 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

If an image has subclusters then there are more copy-on-write
scenarios that we need to consider. Let's say we have a write request
from the middle of subcluster #3 until the end of the cluster:

1) If we are writing to a newly allocated cluster then we need
   copy-on-write. The previous contents of subclusters #0 to #3 must
   be copied to the new cluster. We can optimize this process by
   skipping all leading unallocated or zero subclusters (the status of
   those skipped subclusters will be reflected in the new L2 bitmap).

2) If we are overwriting an existing cluster:

   2.1) If subcluster #3 is unallocated or has the all-zeroes bit set
        then we need copy-on-write (on subcluster #3 only).

   2.2) If subcluster #3 was already allocated then there is no need
        for any copy-on-write. However we still need to update the L2
        bitmap to reflect possible changes in the allocation status of
        subclusters #4 to #31. Because of this, this function checks
        if all the overwritten subclusters are already allocated and
        in this case it returns without creating a new QCowL2Meta
        structure.

After all these changes l2meta_cow_start() and l2meta_cow_end()
are not necessarily cluster-aligned anymore. We need to update the
calculation of old_start and old_end in handle_dependencies() to
guarantee that no two requests try to write on the same cluster.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c | 163 +++++++++++++++++++++++++++++++++---------
 1 file changed, 131 insertions(+), 32 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index ed7b92dbb2..59dd9bda29 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -387,7 +387,6 @@ fail:
  * If the L2 entry is invalid return -errno and set @type to
  * QCOW2_SUBCLUSTER_INVALID.
  */
-G_GNUC_UNUSED
 static int qcow2_get_subcluster_range_type(BlockDriverState *bs,
                                            uint64_t l2_entry,
                                            uint64_t l2_bitmap,
@@ -1110,56 +1109,148 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
  * If @keep_old is true it means that the clusters were already
  * allocated and will be overwritten. If false then the clusters are
  * new and we have to decrease the reference count of the old ones.
+ *
+ * Returns 0 on success, -errno on failure.
  */
-static void calculate_l2_meta(BlockDriverState *bs,
-                              uint64_t host_cluster_offset,
-                              uint64_t guest_offset, unsigned bytes,
-                              uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
+static int calculate_l2_meta(BlockDriverState *bs, uint64_t host_cluster_offset,
+                             uint64_t guest_offset, unsigned bytes,
+                             uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
 {
     BDRVQcow2State *s = bs->opaque;
-    int l2_index = offset_to_l2_slice_index(s, guest_offset);
-    uint64_t l2_entry;
+    int sc_index, l2_index = offset_to_l2_slice_index(s, guest_offset);
+    uint64_t l2_entry, l2_bitmap;
     unsigned cow_start_from, cow_end_to;
     unsigned cow_start_to = offset_into_cluster(s, guest_offset);
     unsigned cow_end_from = cow_start_to + bytes;
     unsigned nb_clusters = size_to_clusters(s, cow_end_from);
     QCowL2Meta *old_m = *m;
-    QCow2ClusterType type;
+    QCow2SubclusterType type;
+    int i;
+    bool skip_cow = keep_old;
 
     assert(nb_clusters <= s->l2_slice_size - l2_index);
 
-    /* Return if there's no COW (all clusters are normal and we keep them) */
-    if (keep_old) {
-        int i;
-        for (i = 0; i < nb_clusters; i++) {
-            l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
-            if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
-                break;
+    /* Check the type of all affected subclusters */
+    for (i = 0; i < nb_clusters; i++) {
+        l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
+        l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
+        if (skip_cow) {
+            unsigned write_from = MAX(cow_start_to, i << s->cluster_bits);
+            unsigned write_to = MIN(cow_end_from, (i + 1) << s->cluster_bits);
+            int first_sc = offset_to_sc_index(s, write_from);
+            int last_sc = offset_to_sc_index(s, write_to - 1);
+            int cnt = qcow2_get_subcluster_range_type(bs, l2_entry, l2_bitmap,
+                                                      first_sc, &type);
+            /* Is any of the subclusters of type != QCOW2_SUBCLUSTER_NORMAL ? */
+            if (type != QCOW2_SUBCLUSTER_NORMAL || first_sc + cnt <= last_sc) {
+                skip_cow = false;
             }
+        } else {
+            /* If we can't skip the cow we can still look for invalid entries */
+            type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, 0);
         }
-        if (i == nb_clusters) {
-            return;
+        if (type == QCOW2_SUBCLUSTER_INVALID) {
+            int l1_index = offset_to_l1_index(s, guest_offset);
+            uint64_t l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
+            qcow2_signal_corruption(bs, true, -1, -1, "Invalid cluster "
+                                    "entry found (L2 offset: %#" PRIx64
+                                    ", L2 index: %#x)",
+                                    l2_offset, l2_index + i);
+            return -EIO;
         }
     }
 
+    if (skip_cow) {
+        return 0;
+    }
+
     /* Get the L2 entry of the first cluster */
     l2_entry = get_l2_entry(s, l2_slice, l2_index);
-    type = qcow2_get_cluster_type(bs, l2_entry);
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+    sc_index = offset_to_sc_index(s, guest_offset);
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
 
-    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
-        cow_start_from = cow_start_to;
+    if (!keep_old) {
+        switch (type) {
+        case QCOW2_SUBCLUSTER_COMPRESSED:
+            cow_start_from = 0;
+            break;
+        case QCOW2_SUBCLUSTER_NORMAL:
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            if (has_subclusters(s)) {
+                /* Skip all leading zero and unallocated subclusters */
+                uint32_t alloc_bitmap = l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC;
+                cow_start_from =
+                    MIN(sc_index, ctz32(alloc_bitmap)) << s->subcluster_bits;
+            } else {
+                cow_start_from = 0;
+            }
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+            cow_start_from = sc_index << s->subcluster_bits;
+            break;
+        default:
+            g_assert_not_reached();
+        }
     } else {
-        cow_start_from = 0;
+        switch (type) {
+        case QCOW2_SUBCLUSTER_NORMAL:
+            cow_start_from = cow_start_to;
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_start_from = sc_index << s->subcluster_bits;
+            break;
+        default:
+            g_assert_not_reached();
+        }
     }
 
     /* Get the L2 entry of the last cluster */
-    l2_entry = get_l2_entry(s, l2_slice, l2_index + nb_clusters - 1);
-    type = qcow2_get_cluster_type(bs, l2_entry);
+    l2_index += nb_clusters - 1;
+    l2_entry = get_l2_entry(s, l2_slice, l2_index);
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+    sc_index = offset_to_sc_index(s, guest_offset + bytes - 1);
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
 
-    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
-        cow_end_to = cow_end_from;
+    if (!keep_old) {
+        switch (type) {
+        case QCOW2_SUBCLUSTER_COMPRESSED:
+            cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+            break;
+        case QCOW2_SUBCLUSTER_NORMAL:
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+            if (has_subclusters(s)) {
+                /* Skip all trailing zero and unallocated subclusters */
+                uint32_t alloc_bitmap = l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC;
+                cow_end_to -=
+                    MIN(s->subclusters_per_cluster - sc_index - 1,
+                        clz32(alloc_bitmap)) << s->subcluster_bits;
+            }
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+            cow_end_to = ROUND_UP(cow_end_from, s->subcluster_size);
+            break;
+        default:
+            g_assert_not_reached();
+        }
     } else {
-        cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+        switch (type) {
+        case QCOW2_SUBCLUSTER_NORMAL:
+            cow_end_to = cow_end_from;
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_end_to = ROUND_UP(cow_end_from, s->subcluster_size);
+            break;
+        default:
+            g_assert_not_reached();
+        }
     }
 
     *m = g_malloc0(sizeof(**m));
@@ -1184,6 +1275,8 @@ static void calculate_l2_meta(BlockDriverState *bs,
 
     qemu_co_queue_init(&(*m)->dependent_requests);
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
+
+    return 0;
 }
 
 /*
@@ -1272,8 +1365,8 @@ static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
 
         uint64_t start = guest_offset;
         uint64_t end = start + bytes;
-        uint64_t old_start = l2meta_cow_start(old_alloc);
-        uint64_t old_end = l2meta_cow_end(old_alloc);
+        uint64_t old_start = start_of_cluster(s, l2meta_cow_start(old_alloc));
+        uint64_t old_end = ROUND_UP(l2meta_cow_end(old_alloc), s->cluster_size);
 
         if (end <= old_start || start >= old_end) {
             /* No intersection */
@@ -1398,8 +1491,11 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
                  - offset_into_cluster(s, guest_offset));
         assert(*bytes != 0);
 
-        calculate_l2_meta(bs, cluster_offset, guest_offset,
-                          *bytes, l2_slice, m, true);
+        ret = calculate_l2_meta(bs, cluster_offset, guest_offset,
+                                *bytes, l2_slice, m, true);
+        if (ret < 0) {
+            goto out;
+        }
 
         ret = 1;
     } else {
@@ -1575,8 +1671,11 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);
 
-    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes, l2_slice,
-                      m, false);
+    ret = calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes,
+                            l2_slice, m, false);
+    if (ret < 0) {
+        goto out;
+    }
 
     ret = 1;
 
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (19 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 20/34] qcow2: Add subcluster support to calculate_l2_meta() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 12:46   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 22/34] qcow2: Add subcluster support to zero_in_l2_slice() Alberto Garcia
                   ` (12 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The logic of this function remains pretty much the same, except that
it uses count_contiguous_subclusters(), which combines the logic of
count_contiguous_clusters() / count_contiguous_clusters_unallocated()
and checks individual subclusters.

qcow2_cluster_to_subcluster_type() is not necessary as a separate
function anymore so it's inlined into its caller.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h         |  38 ++++-------
 block/qcow2-cluster.c | 150 ++++++++++++++++++++++--------------------
 2 files changed, 92 insertions(+), 96 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 5df761edc3..4fad40b96b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -710,29 +710,6 @@ static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
     }
 }
 
-/*
- * For an image without extended L2 entries, return the
- * QCow2SubclusterType equivalent of a given QCow2ClusterType.
- */
-static inline
-QCow2SubclusterType qcow2_cluster_to_subcluster_type(QCow2ClusterType type)
-{
-    switch (type) {
-    case QCOW2_CLUSTER_COMPRESSED:
-        return QCOW2_SUBCLUSTER_COMPRESSED;
-    case QCOW2_CLUSTER_ZERO_PLAIN:
-        return QCOW2_SUBCLUSTER_ZERO_PLAIN;
-    case QCOW2_CLUSTER_ZERO_ALLOC:
-        return QCOW2_SUBCLUSTER_ZERO_ALLOC;
-    case QCOW2_CLUSTER_NORMAL:
-        return QCOW2_SUBCLUSTER_NORMAL;
-    case QCOW2_CLUSTER_UNALLOCATED:
-        return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
-    default:
-        g_assert_not_reached();
-    }
-}
-
 /*
  * In an image without subsclusters @l2_bitmap is ignored and
  * @sc_index must be 0.
@@ -776,7 +753,20 @@ QCow2SubclusterType qcow2_get_subcluster_type(BlockDriverState *bs,
             g_assert_not_reached();
         }
     } else {
-        return qcow2_cluster_to_subcluster_type(type);
+        switch (type) {
+        case QCOW2_CLUSTER_COMPRESSED:
+            return QCOW2_SUBCLUSTER_COMPRESSED;
+        case QCOW2_CLUSTER_ZERO_PLAIN:
+            return QCOW2_SUBCLUSTER_ZERO_PLAIN;
+        case QCOW2_CLUSTER_ZERO_ALLOC:
+            return QCOW2_SUBCLUSTER_ZERO_ALLOC;
+        case QCOW2_CLUSTER_NORMAL:
+            return QCOW2_SUBCLUSTER_NORMAL;
+        case QCOW2_CLUSTER_UNALLOCATED:
+            return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
+        default:
+            g_assert_not_reached();
+        }
     }
 }
 
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 59dd9bda29..2f3bd3a882 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -426,66 +426,66 @@ static int qcow2_get_subcluster_range_type(BlockDriverState *bs,
 }
 
 /*
- * Checks how many clusters in a given L2 slice are contiguous in the image
- * file. As soon as one of the flags in the bitmask stop_flags changes compared
- * to the first cluster, the search is stopped and the cluster is not counted
- * as contiguous. (This allows it, for example, to stop at the first compressed
- * cluster which may require a different handling)
+ * Return the number of contiguous subclusters of the exact same type
+ * in a given L2 slice, starting from cluster @l2_index, subcluster
+ * @sc_index. Allocated subclusters are required to be contiguous in
+ * the image file.
+ * At most @nb_clusters are checked (note that this means clusters,
+ * not subclusters).
+ * Compressed clusters are always processed one by one but for the
+ * purpose of this count they are treated as if they were divided into
+ * subclusters of size s->subcluster_size.
+ * On failure return -errno and update @l2_index to point to the
+ * invalid entry.
  */
-static int count_contiguous_clusters(BlockDriverState *bs, int nb_clusters,
-        int cluster_size, uint64_t *l2_slice, int l2_index, uint64_t stop_flags)
+static int count_contiguous_subclusters(BlockDriverState *bs, int nb_clusters,
+                                        unsigned sc_index, uint64_t *l2_slice,
+                                        unsigned *l2_index)
 {
     BDRVQcow2State *s = bs->opaque;
-    int i;
-    QCow2ClusterType first_cluster_type;
-    uint64_t mask = stop_flags | L2E_OFFSET_MASK | QCOW_OFLAG_COMPRESSED;
-    uint64_t first_entry = get_l2_entry(s, l2_slice, l2_index);
-    uint64_t offset = first_entry & mask;
+    int i, count = 0;
+    bool check_offset;
+    uint64_t expected_offset;
+    QCow2SubclusterType expected_type, type;
 
-    first_cluster_type = qcow2_get_cluster_type(bs, first_entry);
-    if (first_cluster_type == QCOW2_CLUSTER_UNALLOCATED) {
-        return 0;
-    }
-
-    /* must be allocated */
-    assert(first_cluster_type == QCOW2_CLUSTER_NORMAL ||
-           first_cluster_type == QCOW2_CLUSTER_ZERO_ALLOC);
+    assert(*l2_index + nb_clusters <= s->l2_size);
 
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t l2_entry = get_l2_entry(s, l2_slice, l2_index + i) & mask;
-        if (offset + (uint64_t) i * cluster_size != l2_entry) {
+        unsigned first_sc = (i == 0) ? sc_index : 0;
+        uint64_t l2_entry = get_l2_entry(s, l2_slice, *l2_index + i);
+        uint64_t l2_bitmap = get_l2_bitmap(s, l2_slice, *l2_index + i);
+        int ret = qcow2_get_subcluster_range_type(bs, l2_entry, l2_bitmap,
+                                                  first_sc, &type);
+        if (ret < 0) {
+            *l2_index += i; /* Point to the invalid entry */
+            return -EIO;
+        }
+        if (i == 0) {
+            if (type == QCOW2_SUBCLUSTER_COMPRESSED) {
+                /* Compressed clusters are always processed one by one */
+                return ret;
+            }
+            expected_type = type;
+            expected_offset = l2_entry & L2E_OFFSET_MASK;
+            check_offset = (type == QCOW2_SUBCLUSTER_NORMAL ||
+                            type == QCOW2_SUBCLUSTER_ZERO_ALLOC ||
+                            type == QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC);
+        } else if (type != expected_type) {
             break;
+        } else if (check_offset) {
+            expected_offset += s->cluster_size;
+            if (expected_offset != (l2_entry & L2E_OFFSET_MASK)) {
+                break;
+            }
         }
-    }
-
-        return i;
-}
-
-/*
- * Checks how many consecutive unallocated clusters in a given L2
- * slice have the same cluster type.
- */
-static int count_contiguous_clusters_unallocated(BlockDriverState *bs,
-                                                 int nb_clusters,
-                                                 uint64_t *l2_slice,
-                                                 int l2_index,
-                                                 QCow2ClusterType wanted_type)
-{
-    BDRVQcow2State *s = bs->opaque;
-    int i;
-
-    assert(wanted_type == QCOW2_CLUSTER_ZERO_PLAIN ||
-           wanted_type == QCOW2_CLUSTER_UNALLOCATED);
-    for (i = 0; i < nb_clusters; i++) {
-        uint64_t entry = get_l2_entry(s, l2_slice, l2_index + i);
-        QCow2ClusterType type = qcow2_get_cluster_type(bs, entry);
-
-        if (type != wanted_type) {
+        count += ret;
+        /* Stop if there are type changes before the end of the cluster */
+        if (first_sc + ret < s->subclusters_per_cluster) {
             break;
         }
     }
 
-    return i;
+    return count;
 }
 
 static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
@@ -574,12 +574,12 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
                           QCow2SubclusterType *subcluster_type)
 {
     BDRVQcow2State *s = bs->opaque;
-    unsigned int l2_index;
-    uint64_t l1_index, l2_offset, *l2_slice, l2_entry;
-    int c;
+    unsigned int l2_index, sc_index;
+    uint64_t l1_index, l2_offset, *l2_slice, l2_entry, l2_bitmap;
+    int sc;
     unsigned int offset_in_cluster;
     uint64_t bytes_available, bytes_needed, nb_clusters;
-    QCow2ClusterType type;
+    QCow2SubclusterType type;
     int ret;
 
     offset_in_cluster = offset_into_cluster(s, offset);
@@ -600,13 +600,13 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
 
     l1_index = offset_to_l1_index(s, offset);
     if (l1_index >= s->l1_size) {
-        type = QCOW2_CLUSTER_UNALLOCATED;
+        type = QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
         goto out;
     }
 
     l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
     if (!l2_offset) {
-        type = QCOW2_CLUSTER_UNALLOCATED;
+        type = QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
         goto out;
     }
 
@@ -627,7 +627,9 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
     /* find the cluster offset for the given disk offset */
 
     l2_index = offset_to_l2_slice_index(s, offset);
+    sc_index = offset_to_sc_index(s, offset);
     l2_entry = get_l2_entry(s, l2_slice, l2_index);
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
 
     nb_clusters = size_to_clusters(s, bytes_needed);
     /* bytes_needed <= *bytes + offset_in_cluster, both of which are unsigned
@@ -635,9 +637,9 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
      * true */
     assert(nb_clusters <= INT_MAX);
 
-    type = qcow2_get_cluster_type(bs, l2_entry);
-    if (s->qcow_version < 3 && (type == QCOW2_CLUSTER_ZERO_PLAIN ||
-                                type == QCOW2_CLUSTER_ZERO_ALLOC)) {
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
+    if (s->qcow_version < 3 && (type == QCOW2_SUBCLUSTER_ZERO_PLAIN ||
+                                type == QCOW2_SUBCLUSTER_ZERO_ALLOC)) {
         qcow2_signal_corruption(bs, true, -1, -1, "Zero cluster entry found"
                                 " in pre-v3 image (L2 offset: %#" PRIx64
                                 ", L2 index: %#x)", l2_offset, l2_index);
@@ -645,7 +647,9 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
         goto fail;
     }
     switch (type) {
-    case QCOW2_CLUSTER_COMPRESSED:
+    case QCOW2_SUBCLUSTER_INVALID:
+        break; /* This is handled by count_contiguous_subclusters() below */
+    case QCOW2_SUBCLUSTER_COMPRESSED:
         if (has_data_file(bs)) {
             qcow2_signal_corruption(bs, true, -1, -1, "Compressed cluster "
                                     "entry found in image with external data "
@@ -654,24 +658,17 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
             ret = -EIO;
             goto fail;
         }
-        /* Compressed clusters can only be processed one by one */
-        c = 1;
         *host_offset = l2_entry & L2E_COMPRESSED_OFFSET_SIZE_MASK;
         break;
-    case QCOW2_CLUSTER_ZERO_PLAIN:
-    case QCOW2_CLUSTER_UNALLOCATED:
-        /* how many empty clusters ? */
-        c = count_contiguous_clusters_unallocated(bs, nb_clusters,
-                                                  l2_slice, l2_index, type);
+    case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+    case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
         *host_offset = 0;
         break;
-    case QCOW2_CLUSTER_ZERO_ALLOC:
-    case QCOW2_CLUSTER_NORMAL: {
+    case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+    case QCOW2_SUBCLUSTER_NORMAL:
+    case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC: {
         uint64_t host_cluster_offset = l2_entry & L2E_OFFSET_MASK;
         *host_offset = host_cluster_offset + offset_in_cluster;
-        /* how many allocated clusters ? */
-        c = count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-                                      l2_slice, l2_index, QCOW_OFLAG_ZERO);
         if (offset_into_cluster(s, host_cluster_offset)) {
             qcow2_signal_corruption(bs, true, -1, -1,
                                     "Cluster allocation offset %#"
@@ -697,9 +694,18 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t offset,
         abort();
     }
 
+    sc = count_contiguous_subclusters(bs, nb_clusters, sc_index,
+                                      l2_slice, &l2_index);
+    if (sc < 0) {
+        qcow2_signal_corruption(bs, true, -1, -1, "Invalid cluster entry found "
+                                " (L2 offset: %#" PRIx64 ", L2 index: %#x)",
+                                l2_offset, l2_index);
+        ret = -EIO;
+        goto fail;
+    }
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 
-    bytes_available = (int64_t)c * s->cluster_size;
+    bytes_available = ((int64_t)sc + sc_index) << s->subcluster_bits;
 
 out:
     if (bytes_available > bytes_needed) {
@@ -712,7 +718,7 @@ out:
     assert(bytes_available - offset_in_cluster <= UINT_MAX);
     *bytes = bytes_available - offset_in_cluster;
 
-    *subcluster_type = qcow2_cluster_to_subcluster_type(type);
+    *subcluster_type = type;
 
     return 0;
 
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 22/34] qcow2: Add subcluster support to zero_in_l2_slice()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (20 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 12:56   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 23/34] qcow2: Add subcluster support to discard_in_l2_slice() Alberto Garcia
                   ` (11 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The QCOW_OFLAG_ZERO bit that indicates that a cluster reads as
zeroes is only used in standard L2 entries. Extended L2 entries use
individual 'all zeroes' bits for each subcluster.

This must be taken into account when updating the L2 entry and also
when deciding that an existing entry does not need to be updated.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 2f3bd3a882..4e59bbd545 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1956,7 +1956,6 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
     int l2_index;
     int ret;
     int i;
-    bool unmap = !!(flags & BDRV_REQ_MAY_UNMAP);
 
     ret = get_cluster_table(bs, offset, &l2_slice, &l2_index);
     if (ret < 0) {
@@ -1968,28 +1967,31 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
     assert(nb_clusters <= INT_MAX);
 
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t old_offset;
-        QCow2ClusterType cluster_type;
+        uint64_t old_l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
+        uint64_t old_l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
+        QCow2ClusterType type = qcow2_get_cluster_type(bs, old_l2_entry);
+        bool unmap = (type == QCOW2_CLUSTER_COMPRESSED) ||
+            ((flags & BDRV_REQ_MAY_UNMAP) && qcow2_cluster_is_allocated(type));
+        uint64_t new_l2_entry = unmap ? 0 : old_l2_entry;
+        uint64_t new_l2_bitmap = old_l2_bitmap;
 
-        old_offset = get_l2_entry(s, l2_slice, l2_index + i);
+        if (has_subclusters(s)) {
+            new_l2_bitmap = QCOW_L2_BITMAP_ALL_ZEROES;
+        } else {
+            new_l2_entry |= QCOW_OFLAG_ZERO;
+        }
 
-        /*
-         * Minimize L2 changes if the cluster already reads back as
-         * zeroes with correct allocation.
-         */
-        cluster_type = qcow2_get_cluster_type(bs, old_offset);
-        if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN ||
-            (cluster_type == QCOW2_CLUSTER_ZERO_ALLOC && !unmap)) {
+        if (old_l2_entry == new_l2_entry && old_l2_bitmap == new_l2_bitmap) {
             continue;
         }
 
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
-        if (cluster_type == QCOW2_CLUSTER_COMPRESSED || unmap) {
-            set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
-            qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
-        } else {
-            uint64_t entry = get_l2_entry(s, l2_slice, l2_index + i);
-            set_l2_entry(s, l2_slice, l2_index + i, entry | QCOW_OFLAG_ZERO);
+        if (unmap) {
+            qcow2_free_any_clusters(bs, old_l2_entry, 1, QCOW2_DISCARD_REQUEST);
+        }
+        set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
+        if (has_subclusters(s)) {
+            set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
         }
     }
 
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 23/34] qcow2: Add subcluster support to discard_in_l2_slice()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (21 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 22/34] qcow2: Add subcluster support to zero_in_l2_slice() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 13:24   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 24/34] qcow2: Add subcluster support to check_refcounts_l2() Alberto Garcia
                   ` (10 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Two things need to be taken into account here:

1) With full_discard == true the L2 entry must be cleared completely.
   This also includes the L2 bitmap if the image has extended L2
   entries.

2) With full_discard == false we have to make the discarded cluster
   read back as zeroes. With normal L2 entries this is done with the
   QCOW_OFLAG_ZERO bit, whereas with extended L2 entries this is done
   with the individual 'all zeroes' bits for each subcluster.

   Note however that QCOW_OFLAG_ZERO is not supported in v2 qcow2
   images so, if there is a backing file, discard cannot guarantee
   that the image will read back as zeroes. If this is important for
   the caller it should forbid it as qcow2_co_pdiscard() does (see
   80f5c01183 for more details).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c | 52 +++++++++++++++++++------------------------
 1 file changed, 23 insertions(+), 29 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4e59bbd545..edfc8ea91c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1847,11 +1847,17 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset,
     assert(nb_clusters <= INT_MAX);
 
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t old_l2_entry;
-
-        old_l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
+        uint64_t old_l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
+        uint64_t old_l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
+        uint64_t new_l2_entry = old_l2_entry;
+        uint64_t new_l2_bitmap = old_l2_bitmap;
+        QCow2ClusterType cluster_type =
+            qcow2_get_cluster_type(bs, old_l2_entry);
 
         /*
+         * If full_discard is true, the cluster should not read back as zeroes,
+         * but rather fall through to the backing file.
+         *
          * If full_discard is false, make sure that a discarded area reads back
          * as zeroes for v3 images (we cannot do it for v2 without actually
          * writing a zero-filled buffer). We can skip the operation if the
@@ -1860,40 +1866,28 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset,
          *
          * TODO We might want to use bdrv_block_status(bs) here, but we're
          * holding s->lock, so that doesn't work today.
-         *
-         * If full_discard is true, the sector should not read back as zeroes,
-         * but rather fall through to the backing file.
          */
-        switch (qcow2_get_cluster_type(bs, old_l2_entry)) {
-        case QCOW2_CLUSTER_UNALLOCATED:
-            if (full_discard || !bs->backing) {
-                continue;
+        if (full_discard) {
+            new_l2_entry = new_l2_bitmap = 0;
+        } else if (bs->backing || qcow2_cluster_is_allocated(cluster_type)) {
+            if (has_subclusters(s)) {
+                new_l2_entry = 0;
+                new_l2_bitmap = QCOW_L2_BITMAP_ALL_ZEROES;
+            } else {
+                new_l2_entry = s->qcow_version >= 3 ? QCOW_OFLAG_ZERO : 0;
             }
-            break;
+        }
 
-        case QCOW2_CLUSTER_ZERO_PLAIN:
-            if (!full_discard) {
-                continue;
-            }
-            break;
-
-        case QCOW2_CLUSTER_ZERO_ALLOC:
-        case QCOW2_CLUSTER_NORMAL:
-        case QCOW2_CLUSTER_COMPRESSED:
-            break;
-
-        default:
-            abort();
+        if (old_l2_entry == new_l2_entry && old_l2_bitmap == new_l2_bitmap) {
+            continue;
         }
 
         /* First remove L2 entries */
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
-        if (!full_discard && s->qcow_version >= 3) {
-            set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO);
-        } else {
-            set_l2_entry(s, l2_slice, l2_index + i, 0);
+        set_l2_entry(s, l2_slice, l2_index + i, new_l2_entry);
+        if (has_subclusters(s)) {
+            set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
         }
-
         /* Then decrease the refcount */
         qcow2_free_any_clusters(bs, old_l2_entry, 1, type);
     }
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 24/34] qcow2: Add subcluster support to check_refcounts_l2()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (22 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 23/34] qcow2: Add subcluster support to discard_in_l2_slice() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 13:32   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 25/34] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2() Alberto Garcia
                   ` (9 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The offset field of an uncompressed cluster's L2 entry must be aligned
to the cluster size, otherwise it is invalid. If the cluster has no
data then it means that the offset points to a preallocation, so we
can clear the offset field without affecting the guest-visible data.
This is what 'qemu-img check' does when run in repair mode.

On traditional qcow2 images this can only happen when QCOW_OFLAG_ZERO
is set, and repairing such entries turns the clusters from ZERO_ALLOC
into ZERO_PLAIN.

Extended L2 entries have no ZERO_ALLOC clusters and no QCOW_OFLAG_ZERO
but the idea is the same: if none of the subclusters are allocated
then we can clear the offset field and leave the bitmap untouched.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-refcount.c     | 16 +++++++++++-----
 tests/qemu-iotests/060.out |  2 +-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 770c5dbc83..aae52607eb 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1669,12 +1669,18 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
 
             /* Correct offsets are cluster aligned */
             if (offset_into_cluster(s, offset)) {
+                bool contains_data;
                 res->corruptions++;
 
-                if (qcow2_get_cluster_type(bs, l2_entry) ==
-                    QCOW2_CLUSTER_ZERO_ALLOC)
-                {
-                    fprintf(stderr, "%s offset=%" PRIx64 ": Preallocated zero "
+                if (has_subclusters(s)) {
+                    uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i);
+                    contains_data = (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC);
+                } else {
+                    contains_data = !(l2_entry & QCOW_OFLAG_ZERO);
+                }
+
+                if (!contains_data) {
+                    fprintf(stderr, "%s offset=%" PRIx64 ": Preallocated "
                             "cluster is not properly aligned; L2 entry "
                             "corrupted.\n",
                             fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR",
@@ -1686,7 +1692,7 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
                         int ign = active ? QCOW2_OL_ACTIVE_L2 :
                                            QCOW2_OL_INACTIVE_L2;
 
-                        l2_entry = QCOW_OFLAG_ZERO;
+                        l2_entry = has_subclusters(s) ? 0 : QCOW_OFLAG_ZERO;
                         set_l2_entry(s, l2_table, i, l2_entry);
                         ret = qcow2_pre_write_overlap_check(bs, ign,
                                 l2e_offset, l2_entry_size(s), false);
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index be5f8707a3..fa3d68f0df 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -320,7 +320,7 @@ discard 65536/65536 bytes at offset 0
 qcow2: Marking image as corrupt: Preallocated zero cluster offset 0x2a00 unaligned (guest offset: 0); further corruption events will be suppressed
 write failed: Input/output error
 --- Repairing ---
-Repairing offset=2a00: Preallocated zero cluster is not properly aligned; L2 entry corrupted.
+Repairing offset=2a00: Preallocated cluster is not properly aligned; L2 entry corrupted.
 The following inconsistencies were found and repaired:
 
     0 leaked clusters
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 25/34] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (23 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 24/34] qcow2: Add subcluster support to check_refcounts_l2() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 14:01   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 26/34] qcow2: Clear the L2 bitmap when allocating a compressed cluster Alberto Garcia
                   ` (8 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The L2 bitmap needs to be updated after each write to indicate what
new subclusters are now allocated. This needs to happen even if the
cluster was already allocated and the L2 entry was otherwise valid.

In some cases however a write operation doesn't need change the L2
bitmap (because all affected subclusters were already allocated). This
is detected in calculate_l2_meta(), and qcow2_alloc_cluster_link_l2()
is never called in those cases.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index edfc8ea91c..2276cee6d6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1061,6 +1061,24 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
         assert((offset & L2E_OFFSET_MASK) == offset);
 
         set_l2_entry(s, l2_slice, l2_index + i, offset | QCOW_OFLAG_COPIED);
+
+        /* Update bitmap with the subclusters that were just written */
+        if (has_subclusters(s)) {
+            uint64_t l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
+            unsigned written_from = m->cow_start.offset;
+            unsigned written_to = m->cow_end.offset + m->cow_end.nb_bytes ?:
+                m->nb_clusters << s->cluster_bits;
+            int first_sc, last_sc;
+            /* Narrow written_from and written_to down to the current cluster */
+            written_from = MAX(written_from, i << s->cluster_bits);
+            written_to   = MIN(written_to, (i + 1) << s->cluster_bits);
+            assert(written_from < written_to);
+            first_sc = offset_to_sc_index(s, written_from);
+            last_sc  = offset_to_sc_index(s, written_to - 1);
+            l2_bitmap |= QCOW_OFLAG_SUB_ALLOC_RANGE(first_sc, last_sc + 1);
+            l2_bitmap &= ~QCOW_OFLAG_SUB_ZERO_RANGE(first_sc, last_sc + 1);
+            set_l2_bitmap(s, l2_slice, l2_index + i, l2_bitmap);
+        }
      }
 
 
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 26/34] qcow2: Clear the L2 bitmap when allocating a compressed cluster
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (24 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 25/34] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 27/34] qcow2: Add subcluster support to handle_alloc_space() Alberto Garcia
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Compressed clusters always have the bitmap part of the extended L2
entry set to 0.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 2276cee6d6..deff838fe8 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -861,6 +861,9 @@ int qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
     BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
     qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
     set_l2_entry(s, l2_slice, l2_index, cluster_offset);
+    if (has_subclusters(s)) {
+        set_l2_bitmap(s, l2_slice, l2_index, 0);
+    }
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 
     *host_offset = cluster_offset & s->cluster_offset_mask;
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 27/34] qcow2: Add subcluster support to handle_alloc_space()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (25 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 26/34] qcow2: Clear the L2 bitmap when allocating a compressed cluster Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes() Alberto Garcia
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

The bdrv_co_pwrite_zeroes() call here fills complete clusters with
zeroes, but it can happen that some subclusters are not part of the
write request or the copy-on-write. This patch makes sure that only
the affected subclusters are overwritten.

A potential improvement would be to also fill with zeroes the other
subclusters if we can guarantee that we are not overwriting existing
data. However this would waste more disk space, so we should first
evaluate if it's really worth doing.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a3481cd85b..86258fbc22 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2411,6 +2411,9 @@ static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
 
     for (m = l2meta; m != NULL; m = m->next) {
         int ret;
+        uint64_t start_offset = m->alloc_offset + m->cow_start.offset;
+        unsigned nb_bytes = m->cow_end.offset + m->cow_end.nb_bytes -
+            m->cow_start.offset;
 
         if (!m->cow_start.nb_bytes && !m->cow_end.nb_bytes) {
             continue;
@@ -2425,16 +2428,14 @@ static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
          * efficiently zero out the whole clusters
          */
 
-        ret = qcow2_pre_write_overlap_check(bs, 0, m->alloc_offset,
-                                            m->nb_clusters * s->cluster_size,
+        ret = qcow2_pre_write_overlap_check(bs, 0, start_offset, nb_bytes,
                                             true);
         if (ret < 0) {
             return ret;
         }
 
         BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_SPACE);
-        ret = bdrv_co_pwrite_zeroes(s->data_file, m->alloc_offset,
-                                    m->nb_clusters * s->cluster_size,
+        ret = bdrv_co_pwrite_zeroes(s->data_file, start_offset, nb_bytes,
                                     BDRV_REQ_NO_FALLBACK);
         if (ret < 0) {
             if (ret != -ENOTSUP && ret != -EAGAIN) {
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (26 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 27/34] qcow2: Add subcluster support to handle_alloc_space() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 14:28   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 29/34] qcow2: Add subcluster support to qcow2_measure() Alberto Garcia
                   ` (5 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This works now at the subcluster level and pwrite_zeroes_alignment is
updated accordingly.

qcow2_cluster_zeroize() is turned into qcow2_subcluster_zeroize() with
the following changes:

   - The request can now be subcluster-aligned.

   - The cluster-aligned body of the request is still zeroized using
     zero_in_l2_slice() as before.

   - The subcluster-aligned head and tail of the request are zeroized
     with the new zero_l2_subclusters() function.

There is just one thing to take into account for a possible future
improvement: compressed clusters cannot be partially zeroized so
zero_l2_subclusters() on the head or the tail can return -ENOTSUP.
This makes the caller repeat the *complete* request and write actual
zeroes to disk. This is sub-optimal because

   1) if the head area was compressed we would still be able to use
      the fast path for the body and possibly the tail.

   2) if the tail area was compressed we are writing zeroes to the
      head and the body areas, which are already zeroized.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h         |  4 +--
 block/qcow2-cluster.c | 80 +++++++++++++++++++++++++++++++++++++++----
 block/qcow2.c         | 27 ++++++++-------
 3 files changed, 90 insertions(+), 21 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 4fad40b96b..4ef4ae4ab0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -898,8 +898,8 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m);
 int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
                           uint64_t bytes, enum qcow2_discard_type type,
                           bool full_discard);
-int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
-                          uint64_t bytes, int flags);
+int qcow2_subcluster_zeroize(BlockDriverState *bs, uint64_t offset,
+                             uint64_t bytes, int flags);
 
 int qcow2_expand_zero_clusters(BlockDriverState *bs,
                                BlockDriverAmendStatusCB *status_cb,
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index deff838fe8..1641976028 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -2015,12 +2015,58 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
     return nb_clusters;
 }
 
-int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
-                          uint64_t bytes, int flags)
+static int zero_l2_subclusters(BlockDriverState *bs, uint64_t offset,
+                               unsigned nb_subclusters)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t *l2_slice;
+    uint64_t old_l2_bitmap, l2_bitmap;
+    int l2_index, ret, sc = offset_to_sc_index(s, offset);
+
+    /* For full clusters use zero_in_l2_slice() instead */
+    assert(nb_subclusters > 0 && nb_subclusters < s->subclusters_per_cluster);
+    assert(sc + nb_subclusters <= s->subclusters_per_cluster);
+
+    ret = get_cluster_table(bs, offset, &l2_slice, &l2_index);
+    if (ret < 0) {
+        return ret;
+    }
+
+    switch (qcow2_get_cluster_type(bs, get_l2_entry(s, l2_slice, l2_index))) {
+    case QCOW2_CLUSTER_COMPRESSED:
+        ret = -ENOTSUP; /* We cannot partially zeroize compressed clusters */
+        goto out;
+    case QCOW2_CLUSTER_NORMAL:
+    case QCOW2_CLUSTER_UNALLOCATED:
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    old_l2_bitmap = l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+
+    l2_bitmap |=  QCOW_OFLAG_SUB_ZERO_RANGE(sc, sc + nb_subclusters);
+    l2_bitmap &= ~QCOW_OFLAG_SUB_ALLOC_RANGE(sc, sc + nb_subclusters);
+
+    if (old_l2_bitmap != l2_bitmap) {
+        set_l2_bitmap(s, l2_slice, l2_index, l2_bitmap);
+        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
+    }
+
+    ret = 0;
+out:
+    qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
+
+    return ret;
+}
+
+int qcow2_subcluster_zeroize(BlockDriverState *bs, uint64_t offset,
+                             uint64_t bytes, int flags)
 {
     BDRVQcow2State *s = bs->opaque;
     uint64_t end_offset = offset + bytes;
     uint64_t nb_clusters;
+    unsigned head, tail;
     int64_t cleared;
     int ret;
 
@@ -2035,8 +2081,8 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
     }
 
     /* Caller must pass aligned values, except at image end */
-    assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
-    assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
+    assert(offset_into_subcluster(s, offset) == 0);
+    assert(offset_into_subcluster(s, end_offset) == 0 ||
            end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
 
     /* The zero flag is only supported by version 3 and newer */
@@ -2044,11 +2090,26 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
         return -ENOTSUP;
     }
 
-    /* Each L2 slice is handled by its own loop iteration */
-    nb_clusters = size_to_clusters(s, bytes);
+    head = MIN(end_offset, ROUND_UP(offset, s->cluster_size)) - offset;
+    offset += head;
+
+    tail = (end_offset >= bs->total_sectors << BDRV_SECTOR_BITS) ? 0 :
+        end_offset - MAX(offset, start_of_cluster(s, end_offset));
+    end_offset -= tail;
 
     s->cache_discards = true;
 
+    if (head) {
+        ret = zero_l2_subclusters(bs, offset - head,
+                                  size_to_subclusters(s, head));
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    /* Each L2 slice is handled by its own loop iteration */
+    nb_clusters = size_to_clusters(s, end_offset - offset);
+
     while (nb_clusters > 0) {
         cleared = zero_in_l2_slice(bs, offset, nb_clusters, flags);
         if (cleared < 0) {
@@ -2060,6 +2121,13 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
         offset += (cleared * s->cluster_size);
     }
 
+    if (tail) {
+        ret = zero_l2_subclusters(bs, end_offset, size_to_subclusters(s, tail));
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
     ret = 0;
 fail:
     s->cache_discards = false;
diff --git a/block/qcow2.c b/block/qcow2.c
index 86258fbc22..4edc3c72b9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1903,7 +1903,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
         /* Encryption works on a sector granularity */
         bs->bl.request_alignment = qcrypto_block_get_sector_size(s->crypto);
     }
-    bs->bl.pwrite_zeroes_alignment = s->cluster_size;
+    bs->bl.pwrite_zeroes_alignment = s->subcluster_size;
     bs->bl.pdiscard_alignment = s->cluster_size;
 }
 
@@ -3840,8 +3840,9 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
     int ret;
     BDRVQcow2State *s = bs->opaque;
 
-    uint32_t head = offset % s->cluster_size;
-    uint32_t tail = (offset + bytes) % s->cluster_size;
+    uint32_t head = offset_into_subcluster(s, offset);
+    uint32_t tail = ROUND_UP(offset + bytes, s->subcluster_size) -
+        (offset + bytes);
 
     trace_qcow2_pwrite_zeroes_start_req(qemu_coroutine_self(), offset, bytes);
     if (offset + bytes == bs->total_sectors * BDRV_SECTOR_SIZE) {
@@ -3853,20 +3854,19 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
         unsigned int nr;
         QCow2SubclusterType type;
 
-        assert(head + bytes <= s->cluster_size);
+        assert(head + bytes + tail <= s->subcluster_size);
 
         /* check whether remainder of cluster already reads as zero */
         if (!(is_zero(bs, offset - head, head) &&
-              is_zero(bs, offset + bytes,
-                      tail ? s->cluster_size - tail : 0))) {
+              is_zero(bs, offset + bytes, tail))) {
             return -ENOTSUP;
         }
 
         qemu_co_mutex_lock(&s->lock);
         /* We can have new write after previous check */
-        offset = QEMU_ALIGN_DOWN(offset, s->cluster_size);
-        bytes = s->cluster_size;
-        nr = s->cluster_size;
+        offset -= head;
+        bytes = s->subcluster_size;
+        nr = s->subcluster_size;
         ret = qcow2_get_host_offset(bs, offset, &nr, &off, &type);
         if (ret < 0 ||
             (type != QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN &&
@@ -3882,8 +3882,8 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
 
     trace_qcow2_pwrite_zeroes(qemu_coroutine_self(), offset, bytes);
 
-    /* Whatever is left can use real zero clusters */
-    ret = qcow2_cluster_zeroize(bs, offset, bytes, flags);
+    /* Whatever is left can use real zero subclusters */
+    ret = qcow2_subcluster_zeroize(bs, offset, bytes, flags);
     qemu_co_mutex_unlock(&s->lock);
 
     return ret;
@@ -4367,12 +4367,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
         uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
 
         /*
-         * Use zero clusters as much as we can. qcow2_cluster_zeroize()
+         * Use zero clusters as much as we can. qcow2_subcluster_zeroize()
          * requires a cluster-aligned start. The end may be unaligned if it is
          * at the end of the image (which it is here).
          */
         if (offset > zero_start) {
-            ret = qcow2_cluster_zeroize(bs, zero_start, offset - zero_start, 0);
+            ret = qcow2_subcluster_zeroize(bs, zero_start, offset - zero_start,
+                                           0);
             if (ret < 0) {
                 error_setg_errno(errp, -ret, "Failed to zero out new clusters");
                 goto fail;
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 29/34] qcow2: Add subcluster support to qcow2_measure()
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (27 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta Alberto Garcia
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Extended L2 entries are bigger than normal L2 entries so this has an
impact on the amount of metadata needed for a qcow2 file.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 4edc3c72b9..72bd25e774 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3233,28 +3233,31 @@ int64_t qcow2_refcount_metadata_size(int64_t clusters, size_t cluster_size,
  * @total_size: virtual disk size in bytes
  * @cluster_size: cluster size in bytes
  * @refcount_order: refcount bits power-of-2 exponent
+ * @extended_l2: true if the image has extended L2 entries
  *
  * Returns: Total number of bytes required for the fully allocated image
  * (including metadata).
  */
 static int64_t qcow2_calc_prealloc_size(int64_t total_size,
                                         size_t cluster_size,
-                                        int refcount_order)
+                                        int refcount_order,
+                                        bool extended_l2)
 {
     int64_t meta_size = 0;
     uint64_t nl1e, nl2e;
     int64_t aligned_total_size = ROUND_UP(total_size, cluster_size);
+    size_t l2e_size = extended_l2 ? L2E_SIZE_EXTENDED : L2E_SIZE_NORMAL;
 
     /* header: 1 cluster */
     meta_size += cluster_size;
 
     /* total size of L2 tables */
     nl2e = aligned_total_size / cluster_size;
-    nl2e = ROUND_UP(nl2e, cluster_size / sizeof(uint64_t));
-    meta_size += nl2e * sizeof(uint64_t);
+    nl2e = ROUND_UP(nl2e, cluster_size / l2e_size);
+    meta_size += nl2e * l2e_size;
 
     /* total size of L1 tables */
-    nl1e = nl2e * sizeof(uint64_t) / cluster_size;
+    nl1e = nl2e * l2e_size / cluster_size;
     nl1e = ROUND_UP(nl1e, cluster_size / sizeof(uint64_t));
     meta_size += nl1e * sizeof(uint64_t);
 
@@ -4845,6 +4848,8 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
     PreallocMode prealloc;
     bool has_backing_file;
     bool has_luks;
+    bool extended_l2 = false; /* Set to false until the option is added */
+    size_t l2e_size;
 
     /* Parse image creation options */
     cluster_size = qcow2_opt_get_cluster_size_del(opts, &local_err);
@@ -4910,8 +4915,9 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
     virtual_size = ROUND_UP(virtual_size, cluster_size);
 
     /* Check that virtual disk size is valid */
+    l2e_size = extended_l2 ? L2E_SIZE_EXTENDED : L2E_SIZE_NORMAL;
     l2_tables = DIV_ROUND_UP(virtual_size / cluster_size,
-                             cluster_size / sizeof(uint64_t));
+                             cluster_size / l2e_size);
     if (l2_tables * sizeof(uint64_t) > QCOW_MAX_L1_SIZE) {
         error_setg(&local_err, "The image size is too large "
                                "(try using a larger cluster size)");
@@ -4974,9 +4980,9 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
     }
 
     info = g_new0(BlockMeasureInfo, 1);
-    info->fully_allocated =
+    info->fully_allocated = luks_payload_size +
         qcow2_calc_prealloc_size(virtual_size, cluster_size,
-                                 ctz32(refcount_bits)) + luks_payload_size;
+                                 ctz32(refcount_bits), extended_l2);
 
     /*
      * Remove data clusters that are not required.  This overestimates the
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (28 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 29/34] qcow2: Add subcluster support to qcow2_measure() Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 14:50   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit Alberto Garcia
                   ` (3 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This field allows us to indicate that the L2 metadata update does not
come from a write request with actual data but from a preallocation
request.

For traditional images this does not make any difference, but for
images with extended L2 entries this means that the clusters are
allocated normally in the L2 table but individual subclusters are
marked as unallocated.

This will allow preallocating images that have a backing file.

There is one special case: when we resize an existing image we can
also request that the new clusters are preallocated. If the image
already had a backing file then we have to hide any possible stale
data and zero out the new clusters (see commit 955c7d6687 for more
details).

In this case the subclusters cannot be left as unallocated so the L2
bitmap must be updated.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.h         | 8 ++++++++
 block/qcow2-cluster.c | 2 +-
 block/qcow2.c         | 6 ++++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 4ef4ae4ab0..f3499e53bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -463,6 +463,14 @@ typedef struct QCowL2Meta
      */
     bool skip_cow;
 
+    /**
+     * Indicates that this is not a normal write request but a preallocation.
+     * If the image has extended L2 entries this means that no new individual
+     * subclusters will be marked as allocated in the L2 bitmap (but any
+     * existing contents of that bitmap will be kept).
+     */
+    bool prealloc;
+
     /**
      * The I/O vector with the data from the actual guest write request.
      * If non-NULL, this is meant to be merged together with the data
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 1641976028..c8217081f2 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1066,7 +1066,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
         set_l2_entry(s, l2_slice, l2_index + i, offset | QCOW_OFLAG_COPIED);
 
         /* Update bitmap with the subclusters that were just written */
-        if (has_subclusters(s)) {
+        if (has_subclusters(s) && !m->prealloc) {
             uint64_t l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
             unsigned written_from = m->cow_start.offset;
             unsigned written_to = m->cow_end.offset + m->cow_end.nb_bytes ?:
diff --git a/block/qcow2.c b/block/qcow2.c
index 72bd25e774..003f166024 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2086,6 +2086,7 @@ static coroutine_fn int qcow2_handle_l2meta(BlockDriverState *bs,
         QCowL2Meta *next;
 
         if (link_l2) {
+            assert(!l2meta->prealloc);
             ret = qcow2_alloc_cluster_link_l2(bs, l2meta);
             if (ret) {
                 goto out;
@@ -3131,6 +3132,7 @@ static int coroutine_fn preallocate_co(BlockDriverState *bs, uint64_t offset,
 
         while (meta) {
             QCowL2Meta *next = meta->next;
+            meta->prealloc = true;
 
             ret = qcow2_alloc_cluster_link_l2(bs, meta);
             if (ret < 0) {
@@ -4224,6 +4226,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
         int64_t clusters_allocated;
         int64_t old_file_size, last_cluster, new_file_size;
         uint64_t nb_new_data_clusters, nb_new_l2_tables;
+        bool subclusters_need_allocation = false;
 
         /* With a data file, preallocation means just allocating the metadata
          * and forwarding the truncate request to the data file */
@@ -4305,6 +4308,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
                                    BDRV_REQ_ZERO_WRITE, NULL);
             if (ret >= 0) {
                 flags &= ~BDRV_REQ_ZERO_WRITE;
+                /* Ensure that we read zeroes and not backing file data */
+                subclusters_need_allocation = true;
             }
         } else {
             ret = -1;
@@ -4343,6 +4348,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
                     .offset       = nb_clusters << s->cluster_bits,
                     .nb_bytes     = 0,
                 },
+                .prealloc     = !subclusters_need_allocation,
             };
             qemu_co_queue_init(&allocation.dependent_requests);
 
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (29 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-02 15:13   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set Alberto Garcia
                   ` (2 subsequent siblings)
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Now that the implementation of subclusters is complete we can finally
add the necessary options to create and read images with this feature,
which we call "extended L2 entries".

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 qapi/block-core.json             |   7 +++
 block/qcow2.h                    |   8 ++-
 include/block/block_int.h        |   1 +
 block/qcow2.c                    |  74 ++++++++++++++++++++--
 tests/qemu-iotests/031.out       |   8 +--
 tests/qemu-iotests/036.out       |   4 +-
 tests/qemu-iotests/049.out       | 102 +++++++++++++++----------------
 tests/qemu-iotests/060.out       |   1 +
 tests/qemu-iotests/061.out       |  20 +++---
 tests/qemu-iotests/065           |  12 ++--
 tests/qemu-iotests/082.out       |  48 ++++++++++++---
 tests/qemu-iotests/085.out       |  38 ++++++------
 tests/qemu-iotests/144.out       |   4 +-
 tests/qemu-iotests/182.out       |   2 +-
 tests/qemu-iotests/185.out       |   8 +--
 tests/qemu-iotests/198.out       |   2 +
 tests/qemu-iotests/206.out       |   4 ++
 tests/qemu-iotests/242.out       |   5 ++
 tests/qemu-iotests/255.out       |   8 +--
 tests/qemu-iotests/274.out       |  49 ++++++++-------
 tests/qemu-iotests/280.out       |   2 +-
 tests/qemu-iotests/291.out       |   6 +-
 tests/qemu-iotests/common.filter |   1 +
 23 files changed, 272 insertions(+), 142 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0e1c6a59f2..24e002ebae 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -66,6 +66,9 @@
 #                 standalone (read-only) raw image without looking at qcow2
 #                 metadata (since: 4.0)
 #
+# @extended-l2: true if the image has extended L2 entries; only valid for
+#               compat >= 1.1 (since 5.1)
+#
 # @lazy-refcounts: on or off; only valid for compat >= 1.1
 #
 # @corrupt: true if the image has been marked corrupt; only valid for
@@ -87,6 +90,7 @@
       'compat': 'str',
       '*data-file': 'str',
       '*data-file-raw': 'bool',
+      '*extended-l2': 'bool',
       '*lazy-refcounts': 'bool',
       '*corrupt': 'bool',
       'refcount-bits': 'int',
@@ -4318,6 +4322,8 @@
 # @data-file-raw: True if the external data file must stay valid as a
 #                 standalone (read-only) raw image without looking at qcow2
 #                 metadata (default: false; since: 4.0)
+# @extended-l2      True to make the image have extended L2 entries
+#                   (default: false; since 5.1)
 # @size: Size of the virtual disk in bytes
 # @version: Compatibility level (default: v3)
 # @backing-file: File name of the backing file if a backing file
@@ -4338,6 +4344,7 @@
   'data': { 'file':             'BlockdevRef',
             '*data-file':       'BlockdevRef',
             '*data-file-raw':   'bool',
+            '*extended-l2':     'bool',
             'size':             'size',
             '*version':         'BlockdevQcow2Version',
             '*backing-file':    'str',
diff --git a/block/qcow2.h b/block/qcow2.h
index f3499e53bf..065ec3df0b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -246,15 +246,18 @@ enum {
     QCOW2_INCOMPAT_CORRUPT_BITNR    = 1,
     QCOW2_INCOMPAT_DATA_FILE_BITNR  = 2,
     QCOW2_INCOMPAT_COMPRESSION_BITNR = 3,
+    QCOW2_INCOMPAT_EXTL2_BITNR      = 4,
     QCOW2_INCOMPAT_DIRTY            = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
     QCOW2_INCOMPAT_CORRUPT          = 1 << QCOW2_INCOMPAT_CORRUPT_BITNR,
     QCOW2_INCOMPAT_DATA_FILE        = 1 << QCOW2_INCOMPAT_DATA_FILE_BITNR,
     QCOW2_INCOMPAT_COMPRESSION      = 1 << QCOW2_INCOMPAT_COMPRESSION_BITNR,
+    QCOW2_INCOMPAT_EXTL2            = 1 << QCOW2_INCOMPAT_EXTL2_BITNR,
 
     QCOW2_INCOMPAT_MASK             = QCOW2_INCOMPAT_DIRTY
                                     | QCOW2_INCOMPAT_CORRUPT
                                     | QCOW2_INCOMPAT_DATA_FILE
-                                    | QCOW2_INCOMPAT_COMPRESSION,
+                                    | QCOW2_INCOMPAT_COMPRESSION
+                                    | QCOW2_INCOMPAT_EXTL2,
 };
 
 /* Compatible feature bits */
@@ -581,8 +584,7 @@ typedef enum QCow2MetadataOverlap {
 
 static inline bool has_subclusters(BDRVQcow2State *s)
 {
-    /* FIXME: Return false until this feature is complete */
-    return false;
+    return s->incompatible_features & QCOW2_INCOMPAT_EXTL2;
 }
 
 static inline size_t l2_entry_size(BDRVQcow2State *s)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 791de6a59c..36e1993788 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -58,6 +58,7 @@
 #define BLOCK_OPT_DATA_FILE         "data_file"
 #define BLOCK_OPT_DATA_FILE_RAW     "data_file_raw"
 #define BLOCK_OPT_COMPRESSION_TYPE  "compression_type"
+#define BLOCK_OPT_EXTL2             "extended_l2"
 
 #define BLOCK_PROBE_BUF_SIZE        512
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 003f166024..37bfae823c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1438,6 +1438,12 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
     s->subcluster_size = s->cluster_size / s->subclusters_per_cluster;
     s->subcluster_bits = ctz32(s->subcluster_size);
 
+    if (s->subcluster_size < (1 << MIN_CLUSTER_BITS)) {
+        error_setg(errp, "Unsupported subcluster size: %d", s->subcluster_size);
+        ret = -EINVAL;
+        goto fail;
+    }
+
     /* Check support for various header values */
     if (header.refcount_order > 6) {
         error_setg(errp, "Reference count entry width too large; may not "
@@ -2924,6 +2930,11 @@ int qcow2_update_header(BlockDriverState *bs)
                 .bit  = QCOW2_INCOMPAT_COMPRESSION_BITNR,
                 .name = "compression type",
             },
+            {
+                .type = QCOW2_FEAT_TYPE_INCOMPATIBLE,
+                .bit  = QCOW2_INCOMPAT_EXTL2_BITNR,
+                .name = "extended L2 entries",
+            },
             {
                 .type = QCOW2_FEAT_TYPE_COMPATIBLE,
                 .bit  = QCOW2_COMPAT_LAZY_REFCOUNTS_BITNR,
@@ -3271,7 +3282,8 @@ static int64_t qcow2_calc_prealloc_size(int64_t total_size,
     return meta_size + aligned_total_size;
 }
 
-static bool validate_cluster_size(size_t cluster_size, Error **errp)
+static bool validate_cluster_size(size_t cluster_size, bool extended_l2,
+                                  Error **errp)
 {
     int cluster_bits = ctz32(cluster_size);
     if (cluster_bits < MIN_CLUSTER_BITS || cluster_bits > MAX_CLUSTER_BITS ||
@@ -3281,16 +3293,28 @@ static bool validate_cluster_size(size_t cluster_size, Error **errp)
                    "%dk", 1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
         return false;
     }
+
+    if (extended_l2) {
+        unsigned min_cluster_size =
+            (1 << MIN_CLUSTER_BITS) * QCOW_EXTL2_SUBCLUSTERS_PER_CLUSTER;
+        if (cluster_size < min_cluster_size) {
+            error_setg(errp, "Extended L2 entries are only supported with "
+                       "cluster sizes of at least %u bytes", min_cluster_size);
+            return false;
+        }
+    }
+
     return true;
 }
 
-static size_t qcow2_opt_get_cluster_size_del(QemuOpts *opts, Error **errp)
+static size_t qcow2_opt_get_cluster_size_del(QemuOpts *opts, bool extended_l2,
+                                             Error **errp)
 {
     size_t cluster_size;
 
     cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
                                          DEFAULT_CLUSTER_SIZE);
-    if (!validate_cluster_size(cluster_size, errp)) {
+    if (!validate_cluster_size(cluster_size, extended_l2, errp)) {
         return 0;
     }
     return cluster_size;
@@ -3405,7 +3429,20 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         cluster_size = DEFAULT_CLUSTER_SIZE;
     }
 
-    if (!validate_cluster_size(cluster_size, errp)) {
+    if (!qcow2_opts->has_extended_l2) {
+        qcow2_opts->extended_l2 = false;
+    }
+    if (qcow2_opts->extended_l2) {
+        if (version < 3) {
+            error_setg(errp, "Extended L2 entries are only supported with "
+                       "compatibility level 1.1 and above (use version=v3 or "
+                       "greater)");
+            ret = -EINVAL;
+            goto out;
+        }
+    }
+
+    if (!validate_cluster_size(cluster_size, qcow2_opts->extended_l2, errp)) {
         ret = -EINVAL;
         goto out;
     }
@@ -3556,6 +3593,11 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
             cpu_to_be64(QCOW2_INCOMPAT_COMPRESSION);
     }
 
+    if (qcow2_opts->extended_l2) {
+        header->incompatible_features |=
+            cpu_to_be64(QCOW2_INCOMPAT_EXTL2);
+    }
+
     ret = blk_pwrite(blk, 0, header, cluster_size, 0);
     g_free(header);
     if (ret < 0) {
@@ -3736,6 +3778,7 @@ static int coroutine_fn qcow2_co_create_opts(BlockDriver *drv,
         { BLOCK_OPT_BACKING_FMT,        "backing-fmt" },
         { BLOCK_OPT_CLUSTER_SIZE,       "cluster-size" },
         { BLOCK_OPT_LAZY_REFCOUNTS,     "lazy-refcounts" },
+        { BLOCK_OPT_EXTL2,              "extended-l2" },
         { BLOCK_OPT_REFCOUNT_BITS,      "refcount-bits" },
         { BLOCK_OPT_ENCRYPT,            BLOCK_OPT_ENCRYPT_FORMAT },
         { BLOCK_OPT_COMPAT_LEVEL,       "version" },
@@ -4854,11 +4897,14 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
     PreallocMode prealloc;
     bool has_backing_file;
     bool has_luks;
-    bool extended_l2 = false; /* Set to false until the option is added */
+    bool extended_l2;
     size_t l2e_size;
 
     /* Parse image creation options */
-    cluster_size = qcow2_opt_get_cluster_size_del(opts, &local_err);
+    extended_l2 = qemu_opt_get_bool_del(opts, BLOCK_OPT_EXTL2, false);
+
+    cluster_size = qcow2_opt_get_cluster_size_del(opts, extended_l2,
+                                                  &local_err);
     if (local_err) {
         goto err;
     }
@@ -5062,6 +5108,8 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs,
             .corrupt            = s->incompatible_features &
                                   QCOW2_INCOMPAT_CORRUPT,
             .has_corrupt        = true,
+            .has_extended_l2    = true,
+            .extended_l2        = has_subclusters(s),
             .refcount_bits      = s->refcount_bits,
             .has_bitmaps        = !!bitmaps,
             .bitmaps            = bitmaps,
@@ -5491,6 +5539,14 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
                                  "is not supported");
                 return -ENOTSUP;
             }
+        } else if (!strcmp(desc->name, BLOCK_OPT_EXTL2)) {
+            bool extended_l2 = qemu_opt_get_bool(opts, BLOCK_OPT_EXTL2,
+                                                 has_subclusters(s));
+            if (extended_l2 != has_subclusters(s)) {
+                error_setg(errp, "Toggling extended L2 entries "
+                           "is not supported");
+                return -EINVAL;
+            }
         } else {
             /* if this point is reached, this probably means a new option was
              * added without having it covered here */
@@ -5751,6 +5807,12 @@ static QemuOptsList qcow2_create_opts = {
             .help = "Postpone refcount updates",
             .def_value_str = "off"
         },
+        {
+            .name = BLOCK_OPT_EXTL2,
+            .type = QEMU_OPT_BOOL,
+            .help = "Extended L2 tables",
+            .def_value_str = "off"
+        },
         {
             .name = BLOCK_OPT_REFCOUNT_BITS,
             .type = QEMU_OPT_NUMBER,
diff --git a/tests/qemu-iotests/031.out b/tests/qemu-iotests/031.out
index 4b21d6a9ba..0054c2ed97 100644
--- a/tests/qemu-iotests/031.out
+++ b/tests/qemu-iotests/031.out
@@ -117,7 +117,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 Header extension:
@@ -150,7 +150,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 Header extension:
@@ -164,7 +164,7 @@ No errors were found on the image.
 
 magic                     0x514649fb
 version                   3
-backing_file_offset       0x210
+backing_file_offset       0x240
 backing_file_size         0x17
 cluster_bits              16
 size                      67108864
@@ -188,7 +188,7 @@ data                      'host_device'
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 Header extension:
diff --git a/tests/qemu-iotests/036.out b/tests/qemu-iotests/036.out
index a9bed828e5..1fa7cad28d 100644
--- a/tests/qemu-iotests/036.out
+++ b/tests/qemu-iotests/036.out
@@ -26,7 +26,7 @@ compatible_features       []
 autoclear_features        [63]
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 
@@ -38,7 +38,7 @@ compatible_features       []
 autoclear_features        []
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 *** done
diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out
index c54ae21b86..c8356be551 100644
--- a/tests/qemu-iotests/049.out
+++ b/tests/qemu-iotests/049.out
@@ -4,90 +4,90 @@ QA output created by 049
 == 1. Traditional size parameter ==
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1024
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1024b
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1k
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1K
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1048576 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1048576 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1G
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1T
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1099511627776 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1099511627776 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1024.0
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1024.0b
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1.5k
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1.5K
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1.5M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1572864 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1572864 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1.5G
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1610612736 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1610612736 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 TEST_DIR/t.qcow2 1.5T
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1649267441664 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1649267441664 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 == 2. Specifying size via -o ==
 
 qemu-img create -f qcow2 -o size=1024 TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1024b TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1k TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1K TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1M TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1048576 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1048576 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1G TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1T TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1099511627776 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1099511627776 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1024.0 TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1024.0b TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1024 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1.5k TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1.5K TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1536 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1.5M TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1572864 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1572864 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1.5G TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1610612736 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1610612736 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o size=1.5T TEST_DIR/t.qcow2
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1649267441664 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=1649267441664 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 == 3. Invalid sizes ==
 
@@ -129,84 +129,84 @@ qemu-img: TEST_DIR/t.qcow2: The image size must be specified only once
 == Check correct interpretation of suffixes for cluster size ==
 
 qemu-img create -f qcow2 -o cluster_size=1024 TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=1024b TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=1k TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=1K TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=1M TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1048576 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1048576 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=1024.0 TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=1024.0b TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=1024 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=0.5k TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=512 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=512 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=0.5K TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=512 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=512 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o cluster_size=0.5M TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=524288 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=524288 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 == Check compat level option ==
 
 qemu-img create -f qcow2 -o compat=0.10 TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.10 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.10 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o compat=1.1 TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=1.1 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=1.1 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o compat=0.42 TEST_DIR/t.qcow2 64M
 qemu-img: TEST_DIR/t.qcow2: Invalid parameter '0.42'
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.42 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.42 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o compat=foobar TEST_DIR/t.qcow2 64M
 qemu-img: TEST_DIR/t.qcow2: Invalid parameter 'foobar'
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=foobar cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=foobar cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 == Check preallocation option ==
 
 qemu-img create -f qcow2 -o preallocation=off TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=off lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=off lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o preallocation=metadata TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=metadata lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o preallocation=1234 TEST_DIR/t.qcow2 64M
 qemu-img: TEST_DIR/t.qcow2: Invalid parameter '1234'
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=1234 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=1234 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 == Check encryption option ==
 
 qemu-img create -f qcow2 -o encryption=off TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 --object secret,id=sec0,data=123456 -o encryption=on,encrypt.key-secret=sec0 TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=on encrypt.key-secret=sec0 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 encryption=on encrypt.key-secret=sec0 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 == Check lazy_refcounts option (only with v3) ==
 
 qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=off TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=1.1 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=1.1 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=1.1 cluster_size=65536 lazy_refcounts=on refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=1.1 cluster_size=65536 lazy_refcounts=on extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o compat=0.10,lazy_refcounts=off TEST_DIR/t.qcow2 64M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.10 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.10 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 qemu-img create -f qcow2 -o compat=0.10,lazy_refcounts=on TEST_DIR/t.qcow2 64M
 qemu-img: TEST_DIR/t.qcow2: Lazy refcounts only supported with compatibility level 1.1 and above (use version=v3 or greater)
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.10 cluster_size=65536 lazy_refcounts=on refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 compat=0.10 cluster_size=65536 lazy_refcounts=on extended_l2=off refcount_bits=16 compression_type=zlib
 
 *** done
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index fa3d68f0df..bb98fa5db8 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -21,6 +21,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: true
+    extended l2: false
 qemu-io: can't open device TEST_DIR/t.IMGFMT: IMGFMT: Image is corrupt; cannot be opened read/write
 no file open, try 'help open'
 read 512/512 bytes at offset 0
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 2f03cf045c..f618259d47 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -26,7 +26,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 magic                     0x514649fb
@@ -84,7 +84,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 magic                     0x514649fb
@@ -140,7 +140,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 ERROR cluster 5 refcount=0 reference=1
@@ -195,7 +195,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 magic                     0x514649fb
@@ -264,7 +264,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 read 65536/65536 bytes at offset 44040192
@@ -326,7 +326,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 ERROR cluster 5 refcount=0 reference=1
@@ -355,7 +355,7 @@ header_length             112
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 read 131072/131072 bytes at offset 0
@@ -525,6 +525,7 @@ Format specific information:
     data file: TEST_DIR/t.IMGFMT.data
     data file raw: false
     corrupt: false
+    extended l2: false
 No errors were found on the image.
 
 === Try changing the external data file ===
@@ -546,6 +547,7 @@ Format specific information:
     data file: foo
     data file raw: false
     corrupt: false
+    extended l2: false
 
 qemu-img: Could not open 'TEST_DIR/t.IMGFMT': 'data-file' is required for this image
 image: TEST_DIR/t.IMGFMT
@@ -559,6 +561,7 @@ Format specific information:
     refcount bits: 16
     data file raw: false
     corrupt: false
+    extended l2: false
 
 === Clearing and setting data-file-raw ===
 
@@ -575,6 +578,7 @@ Format specific information:
     data file: TEST_DIR/t.IMGFMT.data
     data file raw: true
     corrupt: false
+    extended l2: false
 No errors were found on the image.
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
@@ -588,6 +592,7 @@ Format specific information:
     data file: TEST_DIR/t.IMGFMT.data
     data file raw: false
     corrupt: false
+    extended l2: false
 No errors were found on the image.
 qemu-img: data-file-raw cannot be set on existing images
 image: TEST_DIR/t.IMGFMT
@@ -602,5 +607,6 @@ Format specific information:
     data file: TEST_DIR/t.IMGFMT.data
     data file raw: false
     corrupt: false
+    extended l2: false
 No errors were found on the image.
 *** done
diff --git a/tests/qemu-iotests/065 b/tests/qemu-iotests/065
index 18dc488c7a..29a7f7ad60 100755
--- a/tests/qemu-iotests/065
+++ b/tests/qemu-iotests/065
@@ -98,20 +98,20 @@ class TestQCow3NotLazy(TestQemuImgInfo):
     img_options = 'compat=1.1,lazy_refcounts=off'
     json_compare = { 'compat': '1.1', 'lazy-refcounts': False,
                      'refcount-bits': 16, 'corrupt': False,
-                     'compression-type': 'zlib' }
+                     'compression-type': 'zlib', 'extended-l2': False }
     human_compare = [ 'compat: 1.1', 'compression type: zlib',
                       'lazy refcounts: false', 'refcount bits: 16',
-                      'corrupt: false' ]
+                      'corrupt: false', 'extended l2: false' ]
 
 class TestQCow3Lazy(TestQemuImgInfo):
     '''Testing a qcow2 version 3 image with lazy refcounts enabled'''
     img_options = 'compat=1.1,lazy_refcounts=on'
     json_compare = { 'compat': '1.1', 'lazy-refcounts': True,
                      'refcount-bits': 16, 'corrupt': False,
-                     'compression-type': 'zlib' }
+                     'compression-type': 'zlib', 'extended-l2': False }
     human_compare = [ 'compat: 1.1', 'compression type: zlib',
                       'lazy refcounts: true', 'refcount bits: 16',
-                      'corrupt: false' ]
+                      'corrupt: false', 'extended l2: false' ]
 
 class TestQCow3NotLazyQMP(TestQMP):
     '''Testing a qcow2 version 3 image with lazy refcounts disabled, opening
@@ -120,7 +120,7 @@ class TestQCow3NotLazyQMP(TestQMP):
     qemu_options = 'lazy-refcounts=on'
     compare = { 'compat': '1.1', 'lazy-refcounts': False,
                 'refcount-bits': 16, 'corrupt': False,
-                'compression-type': 'zlib' }
+                'compression-type': 'zlib', 'extended-l2': False }
 
 
 class TestQCow3LazyQMP(TestQMP):
@@ -130,7 +130,7 @@ class TestQCow3LazyQMP(TestQMP):
     qemu_options = 'lazy-refcounts=off'
     compare = { 'compat': '1.1', 'lazy-refcounts': True,
                 'refcount-bits': 16, 'corrupt': False,
-                'compression-type': 'zlib' }
+                'compression-type': 'zlib', 'extended-l2': False }
 
 TestImageInfoSpecific = None
 TestQemuImgInfo = None
diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out
index 529a1214e1..efc0328e8d 100644
--- a/tests/qemu-iotests/082.out
+++ b/tests/qemu-iotests/082.out
@@ -3,14 +3,14 @@ QA output created by 082
 === create: Options specified more than once ===
 
 Testing: create -f foo -f qcow2 TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 128 MiB (134217728 bytes)
 cluster_size: 65536
 
 Testing: create -f qcow2 -o cluster_size=4k -o lazy_refcounts=on TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=4096 lazy_refcounts=on refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=4096 lazy_refcounts=on extended_l2=off refcount_bits=16 compression_type=zlib
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 128 MiB (134217728 bytes)
@@ -21,9 +21,10 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: create -f qcow2 -o cluster_size=4k -o lazy_refcounts=on -o cluster_size=8k TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=8192 lazy_refcounts=on refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=8192 lazy_refcounts=on extended_l2=off refcount_bits=16 compression_type=zlib
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 128 MiB (134217728 bytes)
@@ -34,9 +35,10 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: create -f qcow2 -o cluster_size=4k,cluster_size=8k TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=8192 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=8192 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 128 MiB (134217728 bytes)
@@ -62,6 +64,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -86,6 +89,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -110,6 +114,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -134,6 +139,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -158,6 +164,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -182,6 +189,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -206,6 +214,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -230,6 +239,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -237,10 +247,10 @@ Supported options:
   size=<size>            - Virtual disk size
 
 Testing: create -f qcow2 -u -o backing_file=TEST_DIR/t.qcow2,,help TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2,,help cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2,,help cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 Testing: create -f qcow2 -u -o backing_file=TEST_DIR/t.qcow2,,? TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2,,? cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2,,? cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 Testing: create -f qcow2 -o backing_file=TEST_DIR/t.qcow2, -o help TEST_DIR/t.qcow2 128M
 qemu-img: Invalid option list: backing_file=TEST_DIR/t.qcow2,
@@ -269,6 +279,7 @@ Supported qcow2 options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -290,7 +301,7 @@ qemu-img: Format driver 'bochs' does not support image creation
 === convert: Options specified more than once ===
 
 Testing: create -f qcow2 TEST_DIR/t.qcow2 128M
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 Testing: convert -f foo -f qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.base
 image: TEST_DIR/t.IMGFMT.base
@@ -314,6 +325,7 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: convert -O qcow2 -o cluster_size=4k -o lazy_refcounts=on -o cluster_size=8k TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.base
 image: TEST_DIR/t.IMGFMT.base
@@ -326,6 +338,7 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: convert -O qcow2 -o cluster_size=4k,cluster_size=8k TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.base
 image: TEST_DIR/t.IMGFMT.base
@@ -353,6 +366,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -377,6 +391,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -401,6 +416,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -425,6 +441,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -449,6 +466,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -473,6 +491,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -497,6 +516,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -521,6 +541,7 @@ Supported options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   nocow=<bool (on/off)>  - Turn off copy-on-write (valid only on btrfs)
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
@@ -560,6 +581,7 @@ Supported qcow2 options:
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -605,6 +627,7 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: amend -f qcow2 -o size=130M -o lazy_refcounts=off TEST_DIR/t.qcow2
 image: TEST_DIR/t.IMGFMT
@@ -617,6 +640,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: amend -f qcow2 -o size=8M -o lazy_refcounts=on -o size=132M TEST_DIR/t.qcow2
 image: TEST_DIR/t.IMGFMT
@@ -629,6 +653,7 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Testing: amend -f qcow2 -o size=4M,size=148M TEST_DIR/t.qcow2
 image: TEST_DIR/t.IMGFMT
@@ -656,6 +681,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -681,6 +707,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -706,6 +733,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -731,6 +759,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -756,6 +785,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -781,6 +811,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -806,6 +837,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -831,6 +863,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
@@ -873,6 +906,7 @@ Creation options for 'qcow2':
   encrypt.ivgen-hash-alg=<str> - Name of IV generator hash algorithm
   encrypt.key-secret=<str> - ID of secret providing qcow AES key or LUKS passphrase
   encryption=<bool (on/off)> - Encrypt the image with format 'aes'. (Deprecated in favor of encrypt.format=aes)
+  extended_l2=<bool (on/off)> - Extended L2 tables
   lazy_refcounts=<bool (on/off)> - Postpone refcount updates
   preallocation=<str>    - Preallocation mode (allowed values: off, metadata, falloc, full)
   refcount_bits=<num>    - Width of a reference count entry in bits
diff --git a/tests/qemu-iotests/085.out b/tests/qemu-iotests/085.out
index a822ff4ef6..0431743808 100644
--- a/tests/qemu-iotests/085.out
+++ b/tests/qemu-iotests/085.out
@@ -13,7 +13,7 @@ Formatting 'TEST_DIR/t.IMGFMT.2', fmt=IMGFMT size=134217728
 === Create a single snapshot on virtio0 ===
 
 { 'execute': 'blockdev-snapshot-sync', 'arguments': { 'device': 'virtio0', 'snapshot-file':'TEST_DIR/1-snapshot-v0.IMGFMT', 'format': 'IMGFMT' } }
-Formatting 'TEST_DIR/1-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2.1 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/1-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2.1 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 
 === Invalid command - missing device and nodename ===
@@ -30,40 +30,40 @@ Formatting 'TEST_DIR/1-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file
 === Create several transactional group snapshots ===
 
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/2-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/2-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/2-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/1-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/2-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2.2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/2-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/1-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/2-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/t.qcow2.2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/3-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/3-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/3-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/2-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/3-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/2-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/3-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/2-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/3-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/2-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/4-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/4-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/4-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/3-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/4-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/3-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/4-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/3-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/4-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/3-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/5-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/5-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/5-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/4-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/5-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/4-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/5-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/4-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/5-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/4-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/6-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/6-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/6-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/5-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/6-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/5-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/6-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/5-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/6-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/5-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/7-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/7-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/7-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/6-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/7-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/6-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/7-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/6-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/7-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/6-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/8-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/8-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/8-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/7-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/8-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/7-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/8-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/7-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/8-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/7-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/9-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/9-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/9-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/8-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/9-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/8-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/9-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/8-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/9-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/8-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'transaction', 'arguments': {'actions': [ { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio0', 'snapshot-file': 'TEST_DIR/10-snapshot-v0.IMGFMT' } }, { 'type': 'blockdev-snapshot-sync', 'data' : { 'device': 'virtio1', 'snapshot-file': 'TEST_DIR/10-snapshot-v1.IMGFMT' } } ] } }
-Formatting 'TEST_DIR/10-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/9-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
-Formatting 'TEST_DIR/10-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/9-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/10-snapshot-v0.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/9-snapshot-v0.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/10-snapshot-v1.qcow2', fmt=qcow2 size=134217728 backing_file=TEST_DIR/9-snapshot-v1.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 
 === Create a couple of snapshots using blockdev-snapshot ===
diff --git a/tests/qemu-iotests/144.out b/tests/qemu-iotests/144.out
index 885a8874a5..a2627f0cc5 100644
--- a/tests/qemu-iotests/144.out
+++ b/tests/qemu-iotests/144.out
@@ -9,7 +9,7 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=536870912
 { 'execute': 'qmp_capabilities' }
 {"return": {}}
 { 'execute': 'blockdev-snapshot-sync', 'arguments': { 'device': 'virtio0', 'snapshot-file':'TEST_DIR/tmp.IMGFMT', 'format': 'IMGFMT' } }
-Formatting 'TEST_DIR/tmp.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/t.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/tmp.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/t.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 
 === Performing block-commit on active layer ===
@@ -31,6 +31,6 @@ Formatting 'TEST_DIR/tmp.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/
 === Performing Live Snapshot 2 ===
 
 { 'execute': 'blockdev-snapshot-sync', 'arguments': { 'device': 'virtio0', 'snapshot-file':'TEST_DIR/tmp2.IMGFMT', 'format': 'IMGFMT' } }
-Formatting 'TEST_DIR/tmp2.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/t.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/tmp2.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/t.qcow2 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 *** done
diff --git a/tests/qemu-iotests/182.out b/tests/qemu-iotests/182.out
index ae43654d32..e4290ec028 100644
--- a/tests/qemu-iotests/182.out
+++ b/tests/qemu-iotests/182.out
@@ -13,7 +13,7 @@ Is another process using the image [TEST_DIR/t.qcow2]?
 {'execute': 'blockdev-add', 'arguments': { 'node-name': 'node0', 'driver': 'file', 'filename': 'TEST_DIR/t.IMGFMT', 'locking': 'on' } }
 {"return": {}}
 {'execute': 'blockdev-snapshot-sync', 'arguments': { 'node-name': 'node0', 'snapshot-file': 'TEST_DIR/t.IMGFMT.overlay', 'snapshot-node-name': 'node1' } }
-Formatting 'TEST_DIR/t.qcow2.overlay', fmt=qcow2 size=197120 backing_file=TEST_DIR/t.qcow2 backing_fmt=file cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2.overlay', fmt=qcow2 size=197120 backing_file=TEST_DIR/t.qcow2 backing_fmt=file cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 {'execute': 'blockdev-add', 'arguments': { 'node-name': 'node1', 'driver': 'file', 'filename': 'TEST_DIR/t.IMGFMT', 'locking': 'on' } }
 {"return": {}}
diff --git a/tests/qemu-iotests/185.out b/tests/qemu-iotests/185.out
index ac5ab16bc8..4e0f018bf2 100644
--- a/tests/qemu-iotests/185.out
+++ b/tests/qemu-iotests/185.out
@@ -9,14 +9,14 @@ Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864
 === Creating backing chain ===
 
 { 'execute': 'blockdev-snapshot-sync', 'arguments': { 'device': 'disk', 'snapshot-file': 'TEST_DIR/t.IMGFMT.mid', 'format': 'IMGFMT', 'mode': 'absolute-paths' } }
-Formatting 'TEST_DIR/t.qcow2.mid', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.qcow2.base backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2.mid', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.qcow2.base backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 { 'execute': 'human-monitor-command', 'arguments': { 'command-line': 'qemu-io disk "write 0 4M"' } }
 wrote 4194304/4194304 bytes at offset 0
 4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 {"return": ""}
 { 'execute': 'blockdev-snapshot-sync', 'arguments': { 'device': 'disk', 'snapshot-file': 'TEST_DIR/t.IMGFMT', 'format': 'IMGFMT', 'mode': 'absolute-paths' } }
-Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.qcow2.mid backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.qcow2.mid backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"return": {}}
 
 === Start commit job and exit qemu ===
@@ -48,7 +48,7 @@ Formatting 'TEST_DIR/t.qcow2', fmt=qcow2 size=67108864 backing_file=TEST_DIR/t.q
 { 'execute': 'qmp_capabilities' }
 {"return": {}}
 { 'execute': 'drive-mirror', 'arguments': { 'device': 'disk', 'target': 'TEST_DIR/t.IMGFMT.copy', 'format': 'IMGFMT', 'sync': 'full', 'speed': 65536 } }
-Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "disk"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "disk"}}
 {"return": {}}
@@ -62,7 +62,7 @@ Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 l
 { 'execute': 'qmp_capabilities' }
 {"return": {}}
 { 'execute': 'drive-backup', 'arguments': { 'device': 'disk', 'target': 'TEST_DIR/t.IMGFMT.copy', 'format': 'IMGFMT', 'sync': 'full', 'speed': 65536 } }
-Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "disk"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "disk"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "disk"}}
diff --git a/tests/qemu-iotests/198.out b/tests/qemu-iotests/198.out
index 6280ae6eed..821a052bb1 100644
--- a/tests/qemu-iotests/198.out
+++ b/tests/qemu-iotests/198.out
@@ -73,6 +73,7 @@ Format specific information:
                 key offset: 1810432
         payload offset: 2068480
         master key iters: 1024
+    extended l2: false
 
 == checking image layer ==
 image: json:{ /* filtered */ }
@@ -117,4 +118,5 @@ Format specific information:
                 key offset: 1810432
         payload offset: 2068480
         master key iters: 1024
+    extended l2: false
 *** done
diff --git a/tests/qemu-iotests/206.out b/tests/qemu-iotests/206.out
index 1a14255a83..363c5abe35 100644
--- a/tests/qemu-iotests/206.out
+++ b/tests/qemu-iotests/206.out
@@ -22,6 +22,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 === Successful image creation (inline blockdev-add, explicit defaults) ===
 
@@ -45,6 +46,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 === Successful image creation (v3 non-default options) ===
 
@@ -68,6 +70,7 @@ Format specific information:
     lazy refcounts: true
     refcount bits: 1
     corrupt: false
+    extended l2: false
 
 === Successful image creation (v2 non-default options) ===
 
@@ -146,6 +149,7 @@ Format specific information:
         payload offset: 528384
         master key iters: XXX
     corrupt: false
+    extended l2: false
 
 === Invalid BlockdevRef ===
 
diff --git a/tests/qemu-iotests/242.out b/tests/qemu-iotests/242.out
index 091b9126ce..3759c99284 100644
--- a/tests/qemu-iotests/242.out
+++ b/tests/qemu-iotests/242.out
@@ -16,6 +16,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 No bitmap in JSON format output
 
@@ -42,6 +43,7 @@ Format specific information:
             granularity: 32768
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 The same bitmaps in JSON format:
 [
@@ -80,6 +82,7 @@ Format specific information:
             granularity: 65536
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 The same bitmaps in JSON format:
 [
@@ -123,6 +126,7 @@ Format specific information:
             granularity: 65536
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 The same bitmaps in JSON format:
 [
@@ -167,5 +171,6 @@ Format specific information:
             granularity: 16384
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 Test complete
diff --git a/tests/qemu-iotests/255.out b/tests/qemu-iotests/255.out
index a3c99fd62e..38ac1a4e95 100644
--- a/tests/qemu-iotests/255.out
+++ b/tests/qemu-iotests/255.out
@@ -3,9 +3,9 @@ Finishing a commit job with background reads
 
 === Create backing chain and start VM ===
 
-Formatting 'TEST_DIR/PID-t.qcow2.mid', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-t.qcow2.mid', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-t.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 === Start background read requests ===
 
@@ -23,9 +23,9 @@ Closing the VM while a job is being cancelled
 
 === Create images and start VM ===
 
-Formatting 'TEST_DIR/PID-src.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-src.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-dst.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-dst.qcow2', fmt=qcow2 size=134217728 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 1048576/1048576 bytes at offset 0
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
index d24ff681af..47edc3c423 100644
--- a/tests/qemu-iotests/274.out
+++ b/tests/qemu-iotests/274.out
@@ -1,9 +1,9 @@
 == Commit tests ==
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 2097152/2097152 bytes at offset 0
 2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -56,6 +56,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 read 1048576/1048576 bytes at offset 0
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -64,11 +65,11 @@ read 1048576/1048576 bytes at offset 1048576
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 === Testing HMP commit (top -> mid) ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 2097152/2097152 bytes at offset 0
 2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -86,6 +87,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 read 1048576/1048576 bytes at offset 0
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -94,11 +96,11 @@ read 1048576/1048576 bytes at offset 1048576
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 === Testing QMP active commit (top -> mid) ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 2097152/2097152 bytes at offset 0
 2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -122,6 +124,7 @@ Format specific information:
     lazy refcounts: false
     refcount bits: 16
     corrupt: false
+    extended l2: false
 
 read 1048576/1048576 bytes at offset 0
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -131,9 +134,9 @@ read 1048576/1048576 bytes at offset 1048576
 
 == Resize tests ==
 === preallocation=off ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=6442450944 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=6442450944 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=1073741824 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=1073741824 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 5368709120
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -150,9 +153,9 @@ read 65536/65536 bytes at offset 5368709120
 { "start": 1073741824, "length": 7516192768, "depth": 0, "zero": true, "data": false}]
 
 === preallocation=metadata ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=34359738368 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=34359738368 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=32212254720 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=32212254720 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 33285996544
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -174,9 +177,9 @@ read 65536/65536 bytes at offset 33285996544
 { "start": 34896609280, "length": 536870912, "depth": 0, "zero": true, "data": false, "offset": 2685075456}]
 
 === preallocation=falloc ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=10485760 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=10485760 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=5242880 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=5242880 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 9437184
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -193,9 +196,9 @@ read 65536/65536 bytes at offset 9437184
 { "start": 5242880, "length": 10485760, "depth": 0, "zero": false, "data": true, "offset": 327680}]
 
 === preallocation=full ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=16777216 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=16777216 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=8388608 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=8388608 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 11534336
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -212,9 +215,9 @@ read 65536/65536 bytes at offset 11534336
 { "start": 8388608, "length": 4194304, "depth": 0, "zero": false, "data": true, "offset": 327680}]
 
 === preallocation=off ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=393216 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=393216 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=259072 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=259072 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 259072
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -232,9 +235,9 @@ read 65536/65536 bytes at offset 259072
 { "start": 262144, "length": 262144, "depth": 0, "zero": true, "data": false}]
 
 === preallocation=off ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=409600 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=409600 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=262144 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=262144 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 344064
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -251,9 +254,9 @@ read 65536/65536 bytes at offset 344064
 { "start": 262144, "length": 262144, "depth": 0, "zero": true, "data": false}]
 
 === preallocation=off ===
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=524288 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=524288 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
-Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=262144 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=262144 backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 wrote 65536/65536 bytes at offset 446464
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/qemu-iotests/280.out b/tests/qemu-iotests/280.out
index 92e4d14079..372e9f70ad 100644
--- a/tests/qemu-iotests/280.out
+++ b/tests/qemu-iotests/280.out
@@ -1,4 +1,4 @@
-Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16 compression_type=zlib
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off extended_l2=off refcount_bits=16 compression_type=zlib
 
 === Launch VM ===
 Enabling migration QMP events on VM...
diff --git a/tests/qemu-iotests/291.out b/tests/qemu-iotests/291.out
index 08bfaaaa6b..83e3ec821c 100644
--- a/tests/qemu-iotests/291.out
+++ b/tests/qemu-iotests/291.out
@@ -22,7 +22,7 @@ data                      'qcow2'
 
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 Header extension:
@@ -60,6 +60,7 @@ Format specific information:
             granularity: 65536
     refcount bits: 16
     corrupt: false
+    extended l2: false
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 10 MiB (10485760 bytes)
@@ -84,10 +85,11 @@ Format specific information:
             granularity: 65536
     refcount bits: 16
     corrupt: false
+    extended l2: false
 Check resulting qcow2 header extensions:
 Header extension:
 magic                     0x6803f857 (Feature table)
-length                    336
+length                    384
 data                      <binary>
 
 Header extension:
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 03e4f71808..2ca66de0f0 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -146,6 +146,7 @@ _filter_img_create()
         -e "s# adapter_type=[^ ]*##g" \
         -e "s# hwversion=[^ ]*##g" \
         -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+        -e "s# extended_l2=\\(on\\|off\\)##g" \
         -e "s# block_size=[0-9]\\+##g" \
         -e "s# block_state_zero=\\(on\\|off\\)##g" \
         -e "s# log_size=[0-9]\\+##g" \
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (30 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-03  7:45   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters Alberto Garcia
  2020-06-28 11:02 ` [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries Alberto Garcia
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Traditional qcow2 images don't allow preallocation if a backing file
is set. This is because once a cluster is allocated there is no way to
tell that its data should be read from the backing file.

Extended L2 entries have individual allocation bits for each
subcluster, and therefore it is perfectly possible to have an
allocated cluster with all its subclusters unallocated.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2.c              | 7 ++++---
 tests/qemu-iotests/206.out | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 37bfae823c..1ea8d3b87e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3451,10 +3451,11 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         qcow2_opts->preallocation = PREALLOC_MODE_OFF;
     }
     if (qcow2_opts->has_backing_file &&
-        qcow2_opts->preallocation != PREALLOC_MODE_OFF)
+        qcow2_opts->preallocation != PREALLOC_MODE_OFF &&
+        !qcow2_opts->extended_l2)
     {
-        error_setg(errp, "Backing file and preallocation cannot be used at "
-                   "the same time");
+        error_setg(errp, "Backing file and preallocation can only be used at "
+                   "the same time if extended_l2 is on");
         ret = -EINVAL;
         goto out;
     }
diff --git a/tests/qemu-iotests/206.out b/tests/qemu-iotests/206.out
index 363c5abe35..a100849fcb 100644
--- a/tests/qemu-iotests/206.out
+++ b/tests/qemu-iotests/206.out
@@ -203,7 +203,7 @@ Job failed: Different refcount widths than 16 bits require compatibility level 1
 === Invalid backing file options ===
 {"execute": "blockdev-create", "arguments": {"job-id": "job0", "options": {"backing-file": "/dev/null", "driver": "qcow2", "file": "node0", "preallocation": "full", "size": 67108864}}}
 {"return": {}}
-Job failed: Backing file and preallocation cannot be used at the same time
+Job failed: Backing file and preallocation can only be used at the same time if extended_l2 is on
 {"execute": "job-dismiss", "arguments": {"id": "job0"}}
 {"return": {}}
 
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (31 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-03  7:46   ` Max Reitz
  2020-06-28 11:02 ` [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries Alberto Garcia
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

This function is only used by qcow2_expand_zero_clusters() to
downgrade a qcow2 image to a previous version. This would require
transforming all extended L2 entries into normal L2 entries but this
is not a simple task and there are no plans to implement this at the
moment.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-cluster.c      | 8 +++++++-
 tests/qemu-iotests/061     | 6 ++++++
 tests/qemu-iotests/061.out | 5 +++++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index c8217081f2..e8bb1f32f3 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -2157,6 +2157,9 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
     int ret;
     int i, j;
 
+    /* qcow2_downgrade() is not allowed in images with subclusters */
+    assert(!has_subclusters(s));
+
     slice_size2 = s->l2_slice_size * l2_entry_size(s);
     n_slices = s->cluster_size / slice_size2;
 
@@ -2225,7 +2228,8 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                 if (cluster_type == QCOW2_CLUSTER_ZERO_PLAIN) {
                     if (!bs->backing) {
                         /* not backed; therefore we can simply deallocate the
-                         * cluster */
+                         * cluster. No need to call set_l2_bitmap(), this
+                         * function doesn't support images with subclusters. */
                         set_l2_entry(s, l2_slice, j, 0);
                         l2_dirty = true;
                         continue;
@@ -2296,6 +2300,8 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
                 } else {
                     set_l2_entry(s, l2_slice, j, offset);
                 }
+                /* No need to call set_l2_bitmap() after set_l2_entry() because
+                 * this function doesn't support images with subclusters. */
                 l2_dirty = true;
             }
 
diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index 10eb243164..23add2dfe3 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -303,6 +303,12 @@ $QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
 _img_info --format-specific
 _check_test_img
 
+echo
+echo "=== Testing version downgrade with extended L2 entries ==="
+echo
+_make_test_img -o "compat=1.1,extended_l2=on" 64M
+$QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
+
 echo
 echo "=== Try changing the external data file ==="
 echo
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index f618259d47..a139d4f1fe 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -528,6 +528,11 @@ Format specific information:
     extended l2: false
 No errors were found on the image.
 
+=== Testing version downgrade with extended L2 entries ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+qemu-img: Cannot downgrade an image with incompatible features 0x10 set
+
 === Try changing the external data file ===
 
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries
  2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
                   ` (32 preceding siblings ...)
  2020-06-28 11:02 ` [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters Alberto Garcia
@ 2020-06-28 11:02 ` Alberto Garcia
  2020-07-03  9:49   ` Max Reitz
  33 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-06-28 11:02 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Alberto Garcia,
	qemu-block, Derek Su, Max Reitz

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 tests/qemu-iotests/271     | 901 +++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/271.out | 724 +++++++++++++++++++++++++++++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 1626 insertions(+)
 create mode 100755 tests/qemu-iotests/271
 create mode 100644 tests/qemu-iotests/271.out

diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
new file mode 100755
index 0000000000..5ef3ebb2bf
--- /dev/null
+++ b/tests/qemu-iotests/271
@@ -0,0 +1,901 @@
+#!/bin/bash
+#
+# Test qcow2 images with extended L2 entries
+#
+# Copyright (C) 2019-2020 Igalia, S.L.
+# Author: Alberto Garcia <berto@igalia.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=berto@igalia.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+status=1	# failure is the default!
+
+_cleanup()
+{
+        _cleanup_test_img
+        rm -f "$TEST_IMG.raw"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file nfs
+_supported_os Linux
+_unsupported_imgopts extended_l2 compat=0.10 cluster_size data_file
+
+l2_offset=$((0x40000))
+
+_verify_img()
+{
+    $QEMU_IMG compare "$TEST_IMG" "$TEST_IMG.raw" | grep -v 'Images are identical'
+    $QEMU_IMG check "$TEST_IMG" | _filter_qemu_img_check | \
+        grep -v 'No errors were found on the image'
+}
+
+# Compare the bitmap of an extended L2 entry against an expected value
+_verify_l2_bitmap()
+{
+    entry_no="$1"            # L2 entry number, starting from 0
+    expected_alloc="$alloc"  # Space-separated list of allocated subcluster indexes
+    expected_zero="$zero"    # Space-separated list of zero subcluster indexes
+
+    offset=$(($l2_offset + $entry_no * 16))
+    entry=$(peek_file_be "$TEST_IMG" $offset 8)
+    offset=$(($offset + 8))
+    bitmap=$(peek_file_be "$TEST_IMG" $offset 8)
+
+    expected_bitmap=0
+    for bit in $expected_alloc; do
+        expected_bitmap=$(($expected_bitmap | (1 << $bit)))
+    done
+    for bit in $expected_zero; do
+        expected_bitmap=$(($expected_bitmap | (1 << (32 + $bit))))
+    done
+    printf -v expected_bitmap "%llu" $expected_bitmap # Convert to unsigned
+
+    printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"
+    if [ "$bitmap" != "$expected_bitmap" ]; then
+        printf "ERROR: expecting bitmap       0x%016lx\n" "$expected_bitmap"
+    fi
+}
+
+# This should be called as _run_test c=XXX sc=XXX off=XXX len=XXX cmd=XXX
+# c:   cluster number (0 if unset)
+# sc:  subcluster number inside cluster @c (0 if unset)
+# off: offset inside subcluster @sc, in kilobytes (0 if unset)
+# len: request length, passed directly to qemu-io (e.g: 256, 4k, 1M, ...)
+# cmd: the command to pass to qemu-io, must be one of
+#      write    -> write
+#      zero     -> write -z
+#      unmap    -> write -z -u
+#      compress -> write -c
+#      discard  -> discard
+_run_test()
+{
+    unset c sc off len cmd
+    for var in "$@"; do eval "$var"; done
+    case "${cmd:-write}" in
+        zero)
+            cmd="write -q -z";;
+        unmap)
+            cmd="write -q -z -u";;
+        compress)
+            pat=$((${pat:-0} + 1))
+            cmd="write -q -c -P ${pat}";;
+        write)
+            pat=$((${pat:-0} + 1))
+            cmd="write -q -P ${pat}";;
+        discard)
+            cmd="discard -q";;
+        *)
+            echo "Unknown option $cmd"
+            exit 1;;
+    esac
+    c="${c:-0}"
+    sc="${sc:-0}"
+    off="${off:-0}"
+    offset="$(($c * 64 + $sc * 2 + $off))"
+    [ "$offset" != 0 ] && offset="${offset}k"
+    cmd="$cmd ${offset} ${len}"
+    raw_cmd=$(echo $cmd | sed s/-c//) # Raw images don't support -c
+    echo $cmd | sed 's/-P [0-9][0-9]\?/-P PATTERN/'
+    $QEMU_IO -c "$cmd" "$TEST_IMG" | _filter_qemu_io
+    $QEMU_IO -c "$raw_cmd" -f raw "$TEST_IMG.raw" | _filter_qemu_io
+    _verify_img
+    _verify_l2_bitmap "$c"
+}
+
+_reset_img()
+{
+    size="$1"
+    $QEMU_IMG create -f raw "$TEST_IMG.raw" "$size" | _filter_img_create
+    if [ "$use_backing_file" = "yes" ]; then
+        $QEMU_IMG create -f raw "$TEST_IMG.base" "$size" | _filter_img_create
+        $QEMU_IO -c "write -q -P 0xFF 0 $size" -f raw "$TEST_IMG.base" | _filter_qemu_io
+        $QEMU_IO -c "write -q -P 0xFF 0 $size" -f raw "$TEST_IMG.raw" | _filter_qemu_io
+        _make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base" "$size"
+    else
+        _make_test_img -o extended_l2=on "$size"
+    fi
+}
+
+############################################################
+############################################################
+############################################################
+
+# Test that writing to an image with subclusters produces the expected
+# results, in images with and without backing files
+for use_backing_file in yes no; do
+    echo
+    echo "### Standard write tests (backing file: $use_backing_file) ###"
+    echo
+    _reset_img 1M
+    ### Write subcluster #0 (beginning of subcluster) ###
+    alloc="0"; zero=""
+    _run_test sc=0 len=1k
+
+    ### Write subcluster #1 (middle of subcluster) ###
+    alloc="0 1"; zero=""
+    _run_test sc=1 off=1 len=512
+
+    ### Write subcluster #2 (end of subcluster) ###
+    alloc="0 1 2"; zero=""
+    _run_test sc=2 off=1 len=1k
+
+    ### Write subcluster #3 (full subcluster) ###
+    alloc="0 1 2 3"; zero=""
+    _run_test sc=3 len=2k
+
+    ### Write subclusters #4-6 (full subclusters) ###
+    alloc="$(seq 0 6)"; zero=""
+    _run_test sc=4 len=6k
+
+    ### Write subclusters #7-9 (partial subclusters) ###
+    alloc="$(seq 0 9)"; zero=""
+    _run_test sc=7 off=1 len=4k
+
+    ### Write subcluster #16 (partial subcluster) ###
+    alloc="$(seq 0 9) 16"; zero=""
+    _run_test sc=16 len=1k
+
+    ### Write subcluster #31-#33 (cluster overlap) ###
+    alloc="$(seq 0 9) 16 31"; zero=""
+    _run_test sc=31 off=1 len=4k
+    alloc="0 1" ; zero=""
+    _verify_l2_bitmap 1
+
+    ### Zero subcluster #1
+    alloc="0 $(seq 2 9) 16 31"; zero="1"
+    _run_test sc=1 len=2k cmd=zero
+
+    ### Zero cluster #0
+    alloc=""; zero="$(seq 0 31)"
+    _run_test sc=0 len=64k cmd=zero
+
+    ### Fill cluster #0 with data
+    alloc="$(seq 0 31)"; zero=""
+    _run_test sc=0 len=64k
+
+    ### Zero and unmap half of cluster #0 (this won't unmap it)
+    alloc="$(seq 16 31)"; zero="$(seq 0 15)"
+    _run_test sc=0 len=32k cmd=unmap
+
+    ### Zero and unmap cluster #0
+    alloc=""; zero="$(seq 0 31)"
+    _run_test sc=0 len=64k cmd=unmap
+
+    ### Write subcluster #1 (middle of subcluster)
+    alloc="1"; zero="0 $(seq 2 31)"
+    _run_test sc=1 off=1 len=512
+
+    ### Fill cluster #0 with data
+    alloc="$(seq 0 31)"; zero=""
+    _run_test sc=0 len=64k
+
+    ### Discard cluster #0
+    alloc=""; zero="$(seq 0 31)"
+    _run_test sc=0 len=64k cmd=discard
+
+    ### Write compressed data to cluster #0
+    alloc=""; zero=""
+    _run_test sc=0 len=64k cmd=compress
+
+    ### Write subcluster #1 (middle of subcluster)
+    alloc="$(seq 0 31)"; zero=""
+    _run_test sc=1 off=1 len=512
+done
+
+############################################################
+############################################################
+############################################################
+
+# calculate_l2_meta() checks if none of the clusters affected by a
+# write operation need COW or changes to their L2 metadata and simply
+# returns when they don't. This is a test for that optimization.
+# Here clusters #0-#3 are overwritten but only #1 and #2 need changes.
+echo
+echo '### Overwriting several clusters without COW ###'
+echo
+use_backing_file="no" _reset_img 1M
+# Write cluster #0, subclusters #12-#31
+alloc="$(seq 12 31)"; zero=""
+_run_test sc=12 len=40k
+
+# Write cluster #1, subcluster #13
+alloc="13"; zero=""
+_run_test c=1 sc=13 len=2k
+
+# Zeroize cluster #2, subcluster #14
+alloc="14"; zero=""
+_run_test c=2 sc=14 len=2k
+alloc=""; zero="14"
+_run_test c=2 sc=14 len=2k cmd=zero
+
+# Write cluster #3, subclusters #0-#16
+alloc="$(seq 0 16)"; zero=""
+_run_test c=3 sc=0 len=34k
+
+# Write from cluster #0, subcluster #12 to cluster #3, subcluster #11
+alloc="$(seq 12 31)"; zero=""
+_run_test sc=12 len=192k
+alloc="$(seq 0 31)"; zero=""
+_verify_l2_bitmap 1
+_verify_l2_bitmap 2
+
+alloc="$(seq 0 16)"; zero=""
+_verify_l2_bitmap 3
+
+############################################################
+############################################################
+############################################################
+
+# Test different patterns of writing zeroes
+for use_backing_file in yes no; do
+    echo
+    echo "### Writing zeroes 1: unallocated clusters (backing file: $use_backing_file) ###"
+    echo
+    # Note that the image size is not a multiple of the cluster size
+    _reset_img 2083k
+
+    # Cluster-aligned request from clusters #0 to #2
+    alloc=""; zero="$(seq 0 31)"
+    _run_test c=0 sc=0 len=192k cmd=zero
+    _verify_l2_bitmap 1
+    _verify_l2_bitmap 2
+
+    # Subcluster-aligned request from clusters #3 to #5
+    alloc=""; zero="$(seq 16 31)"
+    _run_test c=3 sc=16 len=128k cmd=zero
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 4
+    alloc=""; zero="$(seq 0 15)"
+    _verify_l2_bitmap 5
+
+    # Unaligned request from clusters #6 to #8
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc="15"; zero="$(seq 16 31)" # copy-on-write happening here
+    else
+        alloc=""; zero="$(seq 15 31)"
+    fi
+    _run_test c=6 sc=15 off=1 len=128k cmd=zero
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 7
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc="15"; zero="$(seq 0 14)" # copy-on-write happening here
+    else
+        alloc=""; zero="$(seq 0 15)"
+    fi
+    _verify_l2_bitmap 8
+
+    echo
+    echo "### Writing zeroes 2: allocated clusters (backing file: $use_backing_file) ###"
+    echo
+    alloc="$(seq 0 31)"; zero=""
+    _run_test c=9 sc=0 len=576k
+    _verify_l2_bitmap 10
+    _verify_l2_bitmap 11
+    _verify_l2_bitmap 12
+    _verify_l2_bitmap 13
+    _verify_l2_bitmap 14
+    _verify_l2_bitmap 15
+    _verify_l2_bitmap 16
+    _verify_l2_bitmap 17
+
+    # Cluster-aligned request from clusters #9 to #11
+    alloc=""; zero="$(seq 0 31)"
+    _run_test c=9 sc=0 len=192k cmd=zero
+    _verify_l2_bitmap 10
+    _verify_l2_bitmap 11
+
+    # Subcluster-aligned request from clusters #12 to #14
+    alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+    _run_test c=12 sc=16 len=128k cmd=zero
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 13
+    alloc="$(seq 16 31)"; zero="$(seq 0 15)"
+    _verify_l2_bitmap 14
+
+    # Unaligned request from clusters #15 to #17
+    alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+    _run_test c=15 sc=15 off=1 len=128k cmd=zero
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 16
+    alloc="$(seq 15 31)"; zero="$(seq 0 14)"
+    _verify_l2_bitmap 17
+
+    echo
+    echo "### Writing zeroes 3: compressed clusters (backing file: $use_backing_file) ###"
+    echo
+    alloc=""; zero=""
+    for c in $(seq 18 28); do
+        _run_test c=$c sc=0 len=64k cmd=compress
+    done
+
+    # Cluster-aligned request from clusters #18 to #20
+    alloc=""; zero="$(seq 0 31)"
+    _run_test c=18 sc=0 len=192k cmd=zero
+    _verify_l2_bitmap 19
+    _verify_l2_bitmap 20
+
+    # Subcluster-aligned request from clusters #21 to #23.
+    # We cannot partially zero a compressed cluster so the code
+    # returns -ENOTSUP, which means copy-on-write of the compressed
+    # data and fill the rest with actual zeroes on disk.
+    # TODO: cluster #22 should use the 'all zeroes' bits.
+    alloc="$(seq 0 31)"; zero=""
+    _run_test c=21 sc=16 len=128k cmd=zero
+    _verify_l2_bitmap 22
+    _verify_l2_bitmap 23
+
+    # Unaligned request from clusters #24 to #26
+    # In this case QEMU internally sends a 1k request followed by a
+    # subcluster-aligned 128k request. The first request decompresses
+    # cluster #24, but that's not enough to perform the second request
+    # efficiently because it partially writes to cluster #26 (which is
+    # compressed) so we hit the same problem as before.
+    alloc="$(seq 0 31)"; zero=""
+    _run_test c=24 sc=15 off=1 len=129k cmd=zero
+    _verify_l2_bitmap 25
+    _verify_l2_bitmap 26
+
+    # Unaligned request from clusters #27 to #29
+    # Similar to the previous case, but this time the tail of the
+    # request does not correspond to a compressed cluster, so it can
+    # be zeroed efficiently.
+    # Note that the very last subcluster is partially written, so if
+    # there's a backing file we need to perform cow.
+    alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+    _run_test c=27 sc=15 off=1 len=128k cmd=zero
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 28
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc="15"; zero="$(seq 0 14)" # copy-on-write happening here
+    else
+        alloc=""; zero="$(seq 0 15)"
+    fi
+    _verify_l2_bitmap 29
+
+    echo
+    echo "### Writing zeroes 4: other tests (backing file: $use_backing_file) ###"
+    echo
+    # Unaligned request in the middle of cluster #30.
+    # If there's a backing file we need to allocate and do
+    # copy-on-write on the partially zeroed subclusters.
+    # If not we can set the 'all zeroes' bit on them.
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc="15 19"; zero="$(seq 16 18)" # copy-on-write happening here
+    else
+        alloc=""; zero="$(seq 15 19)"
+    fi
+    _run_test c=30 sc=15 off=1 len=8k cmd=zero
+
+    # Fill the last cluster with zeroes, up to the end of the image
+    # (the image size is not a multiple of the cluster or subcluster size).
+    alloc=""; zero="$(seq 0 17)"
+    _run_test c=32 sc=0 len=35k cmd=zero
+done
+
+############################################################
+############################################################
+############################################################
+
+# Zero + unmap
+for use_backing_file in yes no; do
+    echo
+    echo "### Zero + unmap 1: allocated clusters (backing file: $use_backing_file) ###"
+    echo
+    # Note that the image size is not a multiple of the cluster size
+    _reset_img 2083k
+    alloc="$(seq 0 31)"; zero=""
+    _run_test c=9 sc=0 len=576k
+    _verify_l2_bitmap 10
+    _verify_l2_bitmap 11
+    _verify_l2_bitmap 12
+    _verify_l2_bitmap 13
+    _verify_l2_bitmap 14
+    _verify_l2_bitmap 15
+    _verify_l2_bitmap 16
+    _verify_l2_bitmap 17
+
+    # Cluster-aligned request from clusters #9 to #11
+    alloc=""; zero="$(seq 0 31)"
+    _run_test c=9 sc=0 len=192k cmd=unmap
+    _verify_l2_bitmap 10
+    _verify_l2_bitmap 11
+
+    # Subcluster-aligned request from clusters #12 to #14
+    alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+    _run_test c=12 sc=16 len=128k cmd=unmap
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 13
+    alloc="$(seq 16 31)"; zero="$(seq 0 15)"
+    _verify_l2_bitmap 14
+
+    # Unaligned request from clusters #15 to #17
+    alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+    _run_test c=15 sc=15 off=1 len=128k cmd=unmap
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 16
+    alloc="$(seq 15 31)"; zero="$(seq 0 14)"
+    _verify_l2_bitmap 17
+
+    echo
+    echo "### Zero + unmap 2: compressed clusters (backing file: $use_backing_file) ###"
+    echo
+    alloc=""; zero=""
+    for c in $(seq 18 28); do
+        _run_test c=$c sc=0 len=64k cmd=compress
+    done
+
+    # Cluster-aligned request from clusters #18 to #20
+    alloc=""; zero="$(seq 0 31)"
+    _run_test c=18 sc=0 len=192k cmd=unmap
+    _verify_l2_bitmap 19
+    _verify_l2_bitmap 20
+
+    # Subcluster-aligned request from clusters #21 to #23.
+    # We cannot partially zero a compressed cluster so the code
+    # returns -ENOTSUP, which means copy-on-write of the compressed
+    # data and fill the rest with actual zeroes on disk.
+    # TODO: cluster #22 should use the 'all zeroes' bits.
+    alloc="$(seq 0 31)"; zero=""
+    _run_test c=21 sc=16 len=128k cmd=unmap
+    _verify_l2_bitmap 22
+    _verify_l2_bitmap 23
+
+    # Unaligned request from clusters #24 to #26
+    # In this case QEMU internally sends a 1k request followed by a
+    # subcluster-aligned 128k request. The first request decompresses
+    # cluster #24, but that's not enough to perform the second request
+    # efficiently because it partially writes to cluster #26 (which is
+    # compressed) so we hit the same problem as before.
+    alloc="$(seq 0 31)"; zero=""
+    _run_test c=24 sc=15 off=1 len=129k cmd=unmap
+    _verify_l2_bitmap 25
+    _verify_l2_bitmap 26
+
+    # Unaligned request from clusters #27 to #29
+    # Similar to the previous case, but this time the tail of the
+    # request does not correspond to a compressed cluster, so it can
+    # be zeroed efficiently.
+    # Note that the very last subcluster is partially written, so if
+    # there's a backing file we need to perform cow.
+    alloc="$(seq 0 15)"; zero="$(seq 16 31)"
+    _run_test c=27 sc=15 off=1 len=128k cmd=unmap
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 28
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc="15"; zero="$(seq 0 14)" # copy-on-write happening here
+    else
+        alloc=""; zero="$(seq 0 15)"
+    fi
+    _verify_l2_bitmap 29
+done
+
+############################################################
+############################################################
+############################################################
+
+# Test qcow2_cluster_discard() with full and normal discards
+for use_backing_file in yes no; do
+    echo
+    echo "### Discarding clusters with non-zero bitmaps (backing file: $use_backing_file) ###"
+    echo
+    if [ "$use_backing_file" = "yes" ]; then
+        _make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base" 1M
+    else
+        _make_test_img -o extended_l2=on 1M
+    fi
+    # Write clusters #0-#2 and then discard them
+    $QEMU_IO -c 'write -q 0 128k' "$TEST_IMG"
+    $QEMU_IO -c 'discard -q 0 128k' "$TEST_IMG"
+    # 'qemu-io discard' doesn't do a full discard, it zeroizes the
+    # cluster, so both clusters have all zero bits set now
+    alloc=""; zero="$(seq 0 31)"
+    _verify_l2_bitmap 0
+    _verify_l2_bitmap 1
+    # Now mark the 2nd half of the subclusters from cluster #0 as unallocated
+    poke_file "$TEST_IMG" $(($l2_offset+8)) "\x00\x00"
+    # Discard cluster #0 again to see how the zero bits have changed
+    $QEMU_IO -c 'discard -q 0 64k' "$TEST_IMG"
+    # And do a full discard of cluster #1 by shrinking and growing the image
+    $QEMU_IMG resize --shrink "$TEST_IMG" 64k
+    $QEMU_IMG resize "$TEST_IMG" 1M
+    # A normal discard sets all 'zero' bits only if the image has a
+    # backing file, otherwise it won't touch them.
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc=""; zero="$(seq 0 31)"
+    else
+        alloc=""; zero="$(seq 0 15)"
+    fi
+    _verify_l2_bitmap 0
+    # A full discard should clear the L2 entry completely. However
+    # when growing an image with a backing file the new clusters are
+    # zeroized to hide the stale data from the backing file
+    if [ "$use_backing_file" = "yes" ]; then
+        alloc=""; zero="$(seq 0 31)"
+    else
+        alloc=""; zero=""
+    fi
+    _verify_l2_bitmap 1
+done
+
+############################################################
+############################################################
+############################################################
+
+# Test that corrupted L2 entries are detected in both read and write
+# operations
+for corruption_test_cmd in read write; do
+    echo
+    echo "### Corrupted L2 entries - $corruption_test_cmd test (allocated) ###"
+    echo
+    echo "# 'cluster is zero' bit set on the standard cluster descriptor"
+    echo
+    # We actually don't consider this a corrupted image.
+    # The 'cluster is zero' bit is unused in extended L2 entries so
+    # QEMU ignores it.
+    # TODO: maybe treat the image as corrupted and make qemu-img check fix it?
+    _make_test_img -o extended_l2=on 1M
+    $QEMU_IO -c 'write -q -P 0x11 0 2k' "$TEST_IMG"
+    poke_file "$TEST_IMG" $(($l2_offset+7)) "\x01"
+    alloc="0"; zero=""
+    _verify_l2_bitmap 0
+    $QEMU_IO -c "$corruption_test_cmd -q -P 0x11 0 1k" "$TEST_IMG"
+    if [ "$corruption_test_cmd" = "write" ]; then
+        alloc="0"; zero=""
+    fi
+    _verify_l2_bitmap 0
+
+    echo
+    echo "# Both 'subcluster is zero' and 'subcluster is allocated' bits set"
+    echo
+    _make_test_img -o extended_l2=on 1M
+    # Write from the middle of cluster #0 to the middle of cluster #2
+    $QEMU_IO -c 'write -q 32k 128k' "$TEST_IMG"
+    # Corrupt the L2 entry from cluster #1
+    poke_file_be "$TEST_IMG" $(($l2_offset+24)) 4 1
+    alloc="$(seq 0 31)"; zero="0"
+    _verify_l2_bitmap 1
+    $QEMU_IO -c "$corruption_test_cmd 0 192k" "$TEST_IMG"
+
+    echo
+    echo "### Corrupted L2 entries - $corruption_test_cmd test (unallocated) ###"
+    echo
+    echo "# 'cluster is zero' bit set on the standard cluster descriptor"
+    echo
+    # We actually don't consider this a corrupted image.
+    # The 'cluster is zero' bit is unused in extended L2 entries so
+    # QEMU ignores it.
+    # TODO: maybe treat the image as corrupted and make qemu-img check fix it?
+    _make_test_img -o extended_l2=on 1M
+    # We want to modify the (empty) L2 entry from cluster #0,
+    # but we write to #4 in order to initialize the L2 table first
+    $QEMU_IO -c 'write -q 256k 1k' "$TEST_IMG"
+    poke_file "$TEST_IMG" $(($l2_offset+7)) "\x01"
+    alloc=""; zero=""
+    _verify_l2_bitmap 0
+    $QEMU_IO -c "$corruption_test_cmd -q 0 1k" "$TEST_IMG"
+    if [ "$corruption_test_cmd" = "write" ]; then
+        alloc="0"; zero=""
+    fi
+    _verify_l2_bitmap 0
+
+    echo
+    echo "# 'subcluster is allocated' bit set"
+    echo
+    _make_test_img -o extended_l2=on 1M
+    # We want to corrupt the (empty) L2 entry from cluster #0,
+    # but we write to #4 in order to initialize the L2 table first
+    $QEMU_IO -c 'write -q 256k 1k' "$TEST_IMG"
+    poke_file "$TEST_IMG" $(($l2_offset+15)) "\x01"
+    alloc="0"; zero=""
+    _verify_l2_bitmap 0
+    $QEMU_IO -c "$corruption_test_cmd 0 1k" "$TEST_IMG"
+
+    echo
+    echo "# Both 'subcluster is zero' and 'subcluster is allocated' bits set"
+    echo
+    _make_test_img -o extended_l2=on 1M
+    # We want to corrupt the (empty) L2 entry from cluster #1,
+    # but we write to #4 in order to initialize the L2 table first
+    $QEMU_IO -c 'write -q 256k 1k' "$TEST_IMG"
+    # Corrupt the L2 entry from cluster #1
+    poke_file_be "$TEST_IMG" $(($l2_offset+24)) 8 $(((1 << 32) | 1))
+    alloc="0"; zero="0"
+    _verify_l2_bitmap 1
+    $QEMU_IO -c "$corruption_test_cmd 0 192k" "$TEST_IMG"
+
+    echo
+    echo "### Compressed cluster with subcluster bitmap != 0 - $corruption_test_cmd test ###"
+    echo
+    # We actually don't consider this a corrupted image.
+    # The bitmap in compressed clusters is unused so QEMU should just ignore it.
+    _make_test_img -o extended_l2=on 1M
+    $QEMU_IO -c 'write -q -P 11 -c 0 64k' "$TEST_IMG"
+    # Change the L2 bitmap to allocate subcluster #31 and zeroize subcluster #0
+    poke_file "$TEST_IMG" $(($l2_offset+11)) "\x01\x80"
+    alloc="31"; zero="0"
+    _verify_l2_bitmap 0
+    $QEMU_IO -c "$corruption_test_cmd -P 11 0 64k" "$TEST_IMG" | _filter_qemu_io
+    # Writing allocates a new uncompressed cluster so we get a new bitmap
+    if [ "$corruption_test_cmd" = "write" ]; then
+        alloc="$(seq 0 31)"; zero=""
+    fi
+    _verify_l2_bitmap 0
+done
+
+############################################################
+############################################################
+############################################################
+
+echo
+echo "### Detect and repair unaligned clusters ###"
+echo
+# Create a backing file and fill it with data
+$QEMU_IMG create -f raw "$TEST_IMG.base" 128k | _filter_img_create
+$QEMU_IO -c "write -q -P 0xff 0 128k" -f raw "$TEST_IMG.base" | _filter_qemu_io
+
+echo "# Corrupted L2 entry, allocated subcluster #"
+# Create a new image, allocate a cluster and write some data to it
+_make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base"
+$QEMU_IO -c 'write -q -P 1 4k 2k' "$TEST_IMG"
+# Corrupt the L2 entry by making the offset unaligned
+poke_file "$TEST_IMG" "$(($l2_offset+6))" "\x02"
+# This cannot be repaired, qemu-img check will fail to fix it
+_check_test_img -r all
+# Attempting to read the image will still show that it's corrupted
+$QEMU_IO -c 'read -q 0 2k' "$TEST_IMG"
+
+echo "# Corrupted L2 entry, no allocated subclusters #"
+# Create a new image, allocate a cluster and zeroize subcluster #2
+_make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base"
+$QEMU_IO -c 'write -q -P 1 4k 2k' "$TEST_IMG"
+$QEMU_IO -c 'write -q -z   4k 2k' "$TEST_IMG"
+# Corrupt the L2 entry by making the offset unaligned
+poke_file "$TEST_IMG" "$(($l2_offset+6))" "\x02"
+# This time none of the subclusters are allocated so we can repair the image
+_check_test_img -r all
+# And the data can be read normally
+$QEMU_IO -c 'read -q -P 0xff  0   4k' "$TEST_IMG"
+$QEMU_IO -c 'read -q -P 0x00 4k   2k' "$TEST_IMG"
+$QEMU_IO -c 'read -q -P 0xff 6k 122k' "$TEST_IMG"
+
+############################################################
+############################################################
+############################################################
+
+echo
+echo "### Image creation options ###"
+echo
+echo "# cluster_size < 16k"
+_make_test_img -o extended_l2=on,cluster_size=8k 1M
+
+echo "# backing file and preallocation=metadata"
+# For preallocation with backing files, create a backing file first
+$QEMU_IMG create -f raw "$TEST_IMG.base" 1M | _filter_img_create
+$QEMU_IO -c "write -q -P 0xff 0 1M" -f raw "$TEST_IMG.base" | _filter_qemu_io
+
+_make_test_img -o extended_l2=on,preallocation=metadata -F raw -b "$TEST_IMG.base" 512k
+$QEMU_IMG resize "$TEST_IMG" 1M
+$QEMU_IO -c 'read -P 0xff    0 512k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 512k 512k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG map "$TEST_IMG" | _filter_testdir
+
+echo "# backing file and preallocation=falloc"
+_make_test_img -o extended_l2=on,preallocation=falloc -F raw -b "$TEST_IMG.base" 512k
+$QEMU_IMG resize "$TEST_IMG" 1M
+$QEMU_IO -c 'read -P 0xff    0 512k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 512k 512k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG map "$TEST_IMG" | _filter_testdir
+
+echo "# backing file and preallocation=full"
+_make_test_img -o extended_l2=on,preallocation=full -F raw -b "$TEST_IMG.base" 512k
+$QEMU_IMG resize "$TEST_IMG" 1M
+$QEMU_IO -c 'read -P 0xff    0 512k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 512k 512k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG map "$TEST_IMG" | _filter_testdir
+
+echo
+echo "### Image resizing with preallocation and backing files ###"
+echo
+# In this case the new subclusters must have the 'all zeroes' bit set
+echo "# resize --preallocation=metadata"
+_make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base" 503k
+$QEMU_IMG resize --preallocation=metadata "$TEST_IMG" 1013k
+$QEMU_IO -c 'read -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 503k 510k' "$TEST_IMG" | _filter_qemu_io
+
+# In this case and the next one the new subclusters must be allocated
+echo "# resize --preallocation=falloc"
+_make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base" 503k
+$QEMU_IMG resize --preallocation=falloc "$TEST_IMG" 1013k
+$QEMU_IO -c 'read -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 503k 510k' "$TEST_IMG" | _filter_qemu_io
+
+echo "# resize --preallocation=full"
+_make_test_img -o extended_l2=on -F raw -b "$TEST_IMG.base" 503k
+$QEMU_IMG resize --preallocation=full "$TEST_IMG" 1013k
+$QEMU_IO -c 'read -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 503k 510k' "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "### Image resizing with preallocation without backing files ###"
+echo
+# In this case the new subclusters must have the 'all zeroes' bit set
+echo "# resize --preallocation=metadata"
+_make_test_img -o extended_l2=on 503k
+$QEMU_IO -c 'write -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG resize --preallocation=metadata "$TEST_IMG" 1013k
+$QEMU_IO -c 'read -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 503k 510k' "$TEST_IMG" | _filter_qemu_io
+
+# In this case and the next one the new subclusters must be allocated
+echo "# resize --preallocation=falloc"
+_make_test_img -o extended_l2=on 503k
+$QEMU_IO -c 'write -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG resize --preallocation=falloc "$TEST_IMG" 1013k
+$QEMU_IO -c 'read -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 503k 510k' "$TEST_IMG" | _filter_qemu_io
+
+echo "# resize --preallocation=full"
+_make_test_img -o extended_l2=on 503k
+$QEMU_IO -c 'write -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG resize --preallocation=full "$TEST_IMG" 1013k
+$QEMU_IO -c 'read -P 0xff    0 503k' "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c 'read -P 0x00 503k 510k' "$TEST_IMG" | _filter_qemu_io
+
+echo
+echo "### qemu-img measure ###"
+echo
+echo "# 512MB, extended_l2=off" # This needs one L2 table
+$QEMU_IMG measure --size 512M -O qcow2 -o extended_l2=off
+echo "# 512MB, extended_l2=on"  # This needs two L2 tables
+$QEMU_IMG measure --size 512M -O qcow2 -o extended_l2=on
+
+echo "# 16K clusters, 64GB, extended_l2=off" # This needs one full L1 table cluster
+$QEMU_IMG measure --size 64G -O qcow2 -o cluster_size=16k,extended_l2=off
+echo "# 16K clusters, 64GB, extended_l2=on"  # This needs two full L2 table clusters
+$QEMU_IMG measure --size 64G -O qcow2 -o cluster_size=16k,extended_l2=on
+
+echo "# 8k clusters" # This should fail
+$QEMU_IMG measure --size 1M -O qcow2 -o cluster_size=8k,extended_l2=on
+
+echo "# 1024 TB" # Maximum allowed size with extended_l2=on and 64K clusters
+$QEMU_IMG measure --size 1024T -O qcow2 -o extended_l2=on
+echo "# 1025 TB" # This should fail
+$QEMU_IMG measure --size 1025T -O qcow2 -o extended_l2=on
+
+echo
+echo "### qemu-img amend ###"
+echo
+_make_test_img -o extended_l2=on 1M
+$QEMU_IMG amend -o extended_l2=off "$TEST_IMG" && echo "Unexpected pass"
+
+_make_test_img -o extended_l2=off 1M
+$QEMU_IMG amend -o extended_l2=on "$TEST_IMG" && echo "Unexpected pass"
+
+echo
+echo "### Test copy-on-write on an image with snapshots ###"
+echo
+_make_test_img -o extended_l2=on 1M
+
+# For each cluster from #0 to #9 this loop zeroes subcluster #7
+# and allocates subclusters #13 and #18.
+alloc="13 18"; zero="7"
+for c in $(seq 0 9); do
+    $QEMU_IO -c "write -q -z $((64*$c+14))k 2k" \
+             -c "write -q -P $((0xd0+$c)) $((64*$c+26))k 2k" \
+             -c "write -q -P $((0xe0+$c)) $((64*$c+36))k 2k" "$TEST_IMG"
+    _verify_l2_bitmap "$c"
+done
+
+# Create a snapshot and set l2_offset to the new L2 table
+$QEMU_IMG snapshot -c snap1 "$TEST_IMG"
+l2_offset=$((0x110000))
+
+# Write different patterns to each one of the clusters
+# in order to see how copy-on-write behaves in each case.
+$QEMU_IO -c "write -q -P 0xf0 $((64*0+30))k 1k" \
+         -c "write -q -P 0xf1 $((64*1+20))k 1k" \
+         -c "write -q -P 0xf2 $((64*2+40))k 1k" \
+         -c "write -q -P 0xf3 $((64*3+26))k 1k" \
+         -c "write -q -P 0xf4 $((64*4+14))k 1k" \
+         -c "write -q -P 0xf5 $((64*5+1))k  1k" \
+         -c "write -q -z      $((64*6+30))k 3k" \
+         -c "write -q -z      $((64*7+26))k 2k" \
+         -c "write -q -z      $((64*8+26))k 1k" \
+         -c "write -q -z      $((64*9+12))k 1k" \
+         "$TEST_IMG"
+alloc="$(seq 13 18)"; zero="7" _verify_l2_bitmap 0
+alloc="$(seq 10 18)"; zero="7" _verify_l2_bitmap 1
+alloc="$(seq 13 20)"; zero="7" _verify_l2_bitmap 2
+alloc="$(seq 13 18)"; zero="7" _verify_l2_bitmap 3
+alloc="$(seq 7 18)";  zero=""  _verify_l2_bitmap 4
+alloc="$(seq 0 18)";  zero=""  _verify_l2_bitmap 5
+alloc="13 18";  zero="7 15 16" _verify_l2_bitmap 6
+alloc="18";        zero="7 13" _verify_l2_bitmap 7
+alloc="$(seq 13 18)"; zero="7" _verify_l2_bitmap 8
+alloc="13 18";      zero="6 7" _verify_l2_bitmap 9
+
+echo
+echo "### Test concurrent requests ###"
+echo
+
+_concurrent_io()
+{
+# Allocate three subclusters in the same cluster.
+# This works because handle_dependencies() checks whether the requests
+# allocate the same cluster, even if the COW regions don't overlap (in
+# this case they don't).
+cat <<EOF
+open -o driver=$IMGFMT blkdebug::$TEST_IMG
+break write_aio A
+aio_write -P 10 30k 2k
+wait_break A
+aio_write -P 11 20k 2k
+aio_write -P 12 40k 2k
+resume A
+aio_flush
+EOF
+}
+
+_concurrent_verify()
+{
+cat <<EOF
+open -o driver=$IMGFMT $TEST_IMG
+read -q -P 10 30k 2k
+read -q -P 11 20k 2k
+read -q -P 12 40k 2k
+EOF
+}
+
+_make_test_img -o extended_l2=on 1M
+_concurrent_io     | $QEMU_IO | _filter_qemu_io
+_concurrent_verify | $QEMU_IO | _filter_qemu_io
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/271.out b/tests/qemu-iotests/271.out
new file mode 100644
index 0000000000..07c1e7b46d
--- /dev/null
+++ b/tests/qemu-iotests/271.out
@@ -0,0 +1,724 @@
+QA output created by 271
+
+### Standard write tests (backing file: yes) ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=1048576
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=raw size=1048576
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+write -q -P PATTERN 0 1k
+L2 entry #0: 0x8000000000050000 0000000000000001
+write -q -P PATTERN 3k 512
+L2 entry #0: 0x8000000000050000 0000000000000003
+write -q -P PATTERN 5k 1k
+L2 entry #0: 0x8000000000050000 0000000000000007
+write -q -P PATTERN 6k 2k
+L2 entry #0: 0x8000000000050000 000000000000000f
+write -q -P PATTERN 8k 6k
+L2 entry #0: 0x8000000000050000 000000000000007f
+write -q -P PATTERN 15k 4k
+L2 entry #0: 0x8000000000050000 00000000000003ff
+write -q -P PATTERN 32k 1k
+L2 entry #0: 0x8000000000050000 00000000000103ff
+write -q -P PATTERN 63k 4k
+L2 entry #0: 0x8000000000050000 00000000800103ff
+L2 entry #1: 0x8000000000060000 0000000000000003
+write -q -z 2k 2k
+L2 entry #0: 0x8000000000050000 00000002800103fd
+write -q -z 0 64k
+L2 entry #0: 0x8000000000050000 ffffffff00000000
+write -q -P PATTERN 0 64k
+L2 entry #0: 0x8000000000050000 00000000ffffffff
+write -q -z -u 0 32k
+L2 entry #0: 0x8000000000050000 0000ffffffff0000
+write -q -z -u 0 64k
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+write -q -P PATTERN 3k 512
+L2 entry #0: 0x8000000000050000 fffffffd00000002
+write -q -P PATTERN 0 64k
+L2 entry #0: 0x8000000000050000 00000000ffffffff
+discard -q 0 64k
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+write -q -c -P PATTERN 0 64k
+L2 entry #0: 0x4000000000050000 0000000000000000
+write -q -P PATTERN 3k 512
+L2 entry #0: 0x8000000000070000 00000000ffffffff
+
+### Standard write tests (backing file: no) ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=1048576
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+write -q -P PATTERN 0 1k
+L2 entry #0: 0x8000000000050000 0000000000000001
+write -q -P PATTERN 3k 512
+L2 entry #0: 0x8000000000050000 0000000000000003
+write -q -P PATTERN 5k 1k
+L2 entry #0: 0x8000000000050000 0000000000000007
+write -q -P PATTERN 6k 2k
+L2 entry #0: 0x8000000000050000 000000000000000f
+write -q -P PATTERN 8k 6k
+L2 entry #0: 0x8000000000050000 000000000000007f
+write -q -P PATTERN 15k 4k
+L2 entry #0: 0x8000000000050000 00000000000003ff
+write -q -P PATTERN 32k 1k
+L2 entry #0: 0x8000000000050000 00000000000103ff
+write -q -P PATTERN 63k 4k
+L2 entry #0: 0x8000000000050000 00000000800103ff
+L2 entry #1: 0x8000000000060000 0000000000000003
+write -q -z 2k 2k
+L2 entry #0: 0x8000000000050000 00000002800103fd
+write -q -z 0 64k
+L2 entry #0: 0x8000000000050000 ffffffff00000000
+write -q -P PATTERN 0 64k
+L2 entry #0: 0x8000000000050000 00000000ffffffff
+write -q -z -u 0 32k
+L2 entry #0: 0x8000000000050000 0000ffffffff0000
+write -q -z -u 0 64k
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+write -q -P PATTERN 3k 512
+L2 entry #0: 0x8000000000050000 fffffffd00000002
+write -q -P PATTERN 0 64k
+L2 entry #0: 0x8000000000050000 00000000ffffffff
+discard -q 0 64k
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+write -q -c -P PATTERN 0 64k
+L2 entry #0: 0x4000000000050000 0000000000000000
+write -q -P PATTERN 3k 512
+L2 entry #0: 0x8000000000070000 00000000ffffffff
+
+### Overwriting several clusters without COW ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=1048576
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+write -q -P PATTERN 24k 40k
+L2 entry #0: 0x8000000000050000 00000000fffff000
+write -q -P PATTERN 90k 2k
+L2 entry #1: 0x8000000000060000 0000000000002000
+write -q -P PATTERN 156k 2k
+L2 entry #2: 0x8000000000070000 0000000000004000
+write -q -z 156k 2k
+L2 entry #2: 0x8000000000070000 0000400000000000
+write -q -P PATTERN 192k 34k
+L2 entry #3: 0x8000000000080000 000000000001ffff
+write -q -P PATTERN 24k 192k
+L2 entry #0: 0x8000000000050000 00000000fffff000
+L2 entry #1: 0x8000000000060000 00000000ffffffff
+L2 entry #2: 0x8000000000070000 00000000ffffffff
+L2 entry #3: 0x8000000000080000 000000000001ffff
+
+### Writing zeroes 1: unallocated clusters (backing file: yes) ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=2132992
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=raw size=2132992
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2132992 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+write -q -z 0 192k
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+L2 entry #1: 0x0000000000000000 ffffffff00000000
+L2 entry #2: 0x0000000000000000 ffffffff00000000
+write -q -z 224k 128k
+L2 entry #3: 0x0000000000000000 ffff000000000000
+L2 entry #4: 0x0000000000000000 ffffffff00000000
+L2 entry #5: 0x0000000000000000 0000ffff00000000
+write -q -z 415k 128k
+L2 entry #6: 0x8000000000050000 ffff000000008000
+L2 entry #7: 0x0000000000000000 ffffffff00000000
+L2 entry #8: 0x8000000000060000 00007fff00008000
+
+### Writing zeroes 2: allocated clusters (backing file: yes) ###
+
+write -q -P PATTERN 576k 576k
+L2 entry #9: 0x8000000000070000 00000000ffffffff
+L2 entry #10: 0x8000000000080000 00000000ffffffff
+L2 entry #11: 0x8000000000090000 00000000ffffffff
+L2 entry #12: 0x80000000000a0000 00000000ffffffff
+L2 entry #13: 0x80000000000b0000 00000000ffffffff
+L2 entry #14: 0x80000000000c0000 00000000ffffffff
+L2 entry #15: 0x80000000000d0000 00000000ffffffff
+L2 entry #16: 0x80000000000e0000 00000000ffffffff
+L2 entry #17: 0x80000000000f0000 00000000ffffffff
+write -q -z 576k 192k
+L2 entry #9: 0x8000000000070000 ffffffff00000000
+L2 entry #10: 0x8000000000080000 ffffffff00000000
+L2 entry #11: 0x8000000000090000 ffffffff00000000
+write -q -z 800k 128k
+L2 entry #12: 0x80000000000a0000 ffff00000000ffff
+L2 entry #13: 0x80000000000b0000 ffffffff00000000
+L2 entry #14: 0x80000000000c0000 0000ffffffff0000
+write -q -z 991k 128k
+L2 entry #15: 0x80000000000d0000 ffff00000000ffff
+L2 entry #16: 0x80000000000e0000 ffffffff00000000
+L2 entry #17: 0x80000000000f0000 00007fffffff8000
+
+### Writing zeroes 3: compressed clusters (backing file: yes) ###
+
+write -q -c -P PATTERN 1152k 64k
+L2 entry #18: 0x4000000000100000 0000000000000000
+write -q -c -P PATTERN 1216k 64k
+L2 entry #19: 0x4000000000110000 0000000000000000
+write -q -c -P PATTERN 1280k 64k
+L2 entry #20: 0x4000000000120000 0000000000000000
+write -q -c -P PATTERN 1344k 64k
+L2 entry #21: 0x4000000000130000 0000000000000000
+write -q -c -P PATTERN 1408k 64k
+L2 entry #22: 0x4000000000140000 0000000000000000
+write -q -c -P PATTERN 1472k 64k
+L2 entry #23: 0x4000000000150000 0000000000000000
+write -q -c -P PATTERN 1536k 64k
+L2 entry #24: 0x4000000000160000 0000000000000000
+write -q -c -P PATTERN 1600k 64k
+L2 entry #25: 0x4000000000170000 0000000000000000
+write -q -c -P PATTERN 1664k 64k
+L2 entry #26: 0x4000000000180000 0000000000000000
+write -q -c -P PATTERN 1728k 64k
+L2 entry #27: 0x4000000000190000 0000000000000000
+write -q -c -P PATTERN 1792k 64k
+L2 entry #28: 0x40000000001a0000 0000000000000000
+write -q -z 1152k 192k
+L2 entry #18: 0x0000000000000000 ffffffff00000000
+L2 entry #19: 0x0000000000000000 ffffffff00000000
+L2 entry #20: 0x0000000000000000 ffffffff00000000
+write -q -z 1376k 128k
+L2 entry #21: 0x8000000000100000 00000000ffffffff
+L2 entry #22: 0x8000000000110000 00000000ffffffff
+L2 entry #23: 0x8000000000120000 00000000ffffffff
+write -q -z 1567k 129k
+L2 entry #24: 0x8000000000130000 00000000ffffffff
+L2 entry #25: 0x8000000000140000 00000000ffffffff
+L2 entry #26: 0x8000000000150000 00000000ffffffff
+write -q -z 1759k 128k
+L2 entry #27: 0x8000000000160000 ffff00000000ffff
+L2 entry #28: 0x0000000000000000 ffffffff00000000
+L2 entry #29: 0x8000000000170000 00007fff00008000
+
+### Writing zeroes 4: other tests (backing file: yes) ###
+
+write -q -z 1951k 8k
+L2 entry #30: 0x8000000000180000 0007000000088000
+write -q -z 2048k 35k
+L2 entry #32: 0x0000000000000000 0003ffff00000000
+
+### Writing zeroes 1: unallocated clusters (backing file: no) ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=2132992
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2132992
+write -q -z 0 192k
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+L2 entry #1: 0x0000000000000000 ffffffff00000000
+L2 entry #2: 0x0000000000000000 ffffffff00000000
+write -q -z 224k 128k
+L2 entry #3: 0x0000000000000000 ffff000000000000
+L2 entry #4: 0x0000000000000000 ffffffff00000000
+L2 entry #5: 0x0000000000000000 0000ffff00000000
+write -q -z 415k 128k
+L2 entry #6: 0x0000000000000000 ffff800000000000
+L2 entry #7: 0x0000000000000000 ffffffff00000000
+L2 entry #8: 0x0000000000000000 0000ffff00000000
+
+### Writing zeroes 2: allocated clusters (backing file: no) ###
+
+write -q -P PATTERN 576k 576k
+L2 entry #9: 0x8000000000050000 00000000ffffffff
+L2 entry #10: 0x8000000000060000 00000000ffffffff
+L2 entry #11: 0x8000000000070000 00000000ffffffff
+L2 entry #12: 0x8000000000080000 00000000ffffffff
+L2 entry #13: 0x8000000000090000 00000000ffffffff
+L2 entry #14: 0x80000000000a0000 00000000ffffffff
+L2 entry #15: 0x80000000000b0000 00000000ffffffff
+L2 entry #16: 0x80000000000c0000 00000000ffffffff
+L2 entry #17: 0x80000000000d0000 00000000ffffffff
+write -q -z 576k 192k
+L2 entry #9: 0x8000000000050000 ffffffff00000000
+L2 entry #10: 0x8000000000060000 ffffffff00000000
+L2 entry #11: 0x8000000000070000 ffffffff00000000
+write -q -z 800k 128k
+L2 entry #12: 0x8000000000080000 ffff00000000ffff
+L2 entry #13: 0x8000000000090000 ffffffff00000000
+L2 entry #14: 0x80000000000a0000 0000ffffffff0000
+write -q -z 991k 128k
+L2 entry #15: 0x80000000000b0000 ffff00000000ffff
+L2 entry #16: 0x80000000000c0000 ffffffff00000000
+L2 entry #17: 0x80000000000d0000 00007fffffff8000
+
+### Writing zeroes 3: compressed clusters (backing file: no) ###
+
+write -q -c -P PATTERN 1152k 64k
+L2 entry #18: 0x40000000000e0000 0000000000000000
+write -q -c -P PATTERN 1216k 64k
+L2 entry #19: 0x40000000000f0000 0000000000000000
+write -q -c -P PATTERN 1280k 64k
+L2 entry #20: 0x4000000000100000 0000000000000000
+write -q -c -P PATTERN 1344k 64k
+L2 entry #21: 0x4000000000110000 0000000000000000
+write -q -c -P PATTERN 1408k 64k
+L2 entry #22: 0x4000000000120000 0000000000000000
+write -q -c -P PATTERN 1472k 64k
+L2 entry #23: 0x4000000000130000 0000000000000000
+write -q -c -P PATTERN 1536k 64k
+L2 entry #24: 0x4000000000140000 0000000000000000
+write -q -c -P PATTERN 1600k 64k
+L2 entry #25: 0x4000000000150000 0000000000000000
+write -q -c -P PATTERN 1664k 64k
+L2 entry #26: 0x4000000000160000 0000000000000000
+write -q -c -P PATTERN 1728k 64k
+L2 entry #27: 0x4000000000170000 0000000000000000
+write -q -c -P PATTERN 1792k 64k
+L2 entry #28: 0x4000000000180000 0000000000000000
+write -q -z 1152k 192k
+L2 entry #18: 0x0000000000000000 ffffffff00000000
+L2 entry #19: 0x0000000000000000 ffffffff00000000
+L2 entry #20: 0x0000000000000000 ffffffff00000000
+write -q -z 1376k 128k
+L2 entry #21: 0x80000000000e0000 00000000ffffffff
+L2 entry #22: 0x80000000000f0000 00000000ffffffff
+L2 entry #23: 0x8000000000100000 00000000ffffffff
+write -q -z 1567k 129k
+L2 entry #24: 0x8000000000110000 00000000ffffffff
+L2 entry #25: 0x8000000000120000 00000000ffffffff
+L2 entry #26: 0x8000000000130000 00000000ffffffff
+write -q -z 1759k 128k
+L2 entry #27: 0x8000000000140000 ffff00000000ffff
+L2 entry #28: 0x0000000000000000 ffffffff00000000
+L2 entry #29: 0x0000000000000000 0000ffff00000000
+
+### Writing zeroes 4: other tests (backing file: no) ###
+
+write -q -z 1951k 8k
+L2 entry #30: 0x0000000000000000 000f800000000000
+write -q -z 2048k 35k
+L2 entry #32: 0x0000000000000000 0003ffff00000000
+
+### Zero + unmap 1: allocated clusters (backing file: yes) ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=2132992
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=raw size=2132992
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2132992 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+write -q -P PATTERN 576k 576k
+L2 entry #9: 0x8000000000050000 00000000ffffffff
+L2 entry #10: 0x8000000000060000 00000000ffffffff
+L2 entry #11: 0x8000000000070000 00000000ffffffff
+L2 entry #12: 0x8000000000080000 00000000ffffffff
+L2 entry #13: 0x8000000000090000 00000000ffffffff
+L2 entry #14: 0x80000000000a0000 00000000ffffffff
+L2 entry #15: 0x80000000000b0000 00000000ffffffff
+L2 entry #16: 0x80000000000c0000 00000000ffffffff
+L2 entry #17: 0x80000000000d0000 00000000ffffffff
+write -q -z -u 576k 192k
+L2 entry #9: 0x0000000000000000 ffffffff00000000
+L2 entry #10: 0x0000000000000000 ffffffff00000000
+L2 entry #11: 0x0000000000000000 ffffffff00000000
+write -q -z -u 800k 128k
+L2 entry #12: 0x8000000000080000 ffff00000000ffff
+L2 entry #13: 0x0000000000000000 ffffffff00000000
+L2 entry #14: 0x80000000000a0000 0000ffffffff0000
+write -q -z -u 991k 128k
+L2 entry #15: 0x80000000000b0000 ffff00000000ffff
+L2 entry #16: 0x0000000000000000 ffffffff00000000
+L2 entry #17: 0x80000000000d0000 00007fffffff8000
+
+### Zero + unmap 2: compressed clusters (backing file: yes) ###
+
+write -q -c -P PATTERN 1152k 64k
+L2 entry #18: 0x4000000000050000 0000000000000000
+write -q -c -P PATTERN 1216k 64k
+L2 entry #19: 0x4000000000060000 0000000000000000
+write -q -c -P PATTERN 1280k 64k
+L2 entry #20: 0x4000000000070000 0000000000000000
+write -q -c -P PATTERN 1344k 64k
+L2 entry #21: 0x4000000000090000 0000000000000000
+write -q -c -P PATTERN 1408k 64k
+L2 entry #22: 0x40000000000c0000 0000000000000000
+write -q -c -P PATTERN 1472k 64k
+L2 entry #23: 0x40000000000e0000 0000000000000000
+write -q -c -P PATTERN 1536k 64k
+L2 entry #24: 0x40000000000f0000 0000000000000000
+write -q -c -P PATTERN 1600k 64k
+L2 entry #25: 0x4000000000100000 0000000000000000
+write -q -c -P PATTERN 1664k 64k
+L2 entry #26: 0x4000000000110000 0000000000000000
+write -q -c -P PATTERN 1728k 64k
+L2 entry #27: 0x4000000000120000 0000000000000000
+write -q -c -P PATTERN 1792k 64k
+L2 entry #28: 0x4000000000130000 0000000000000000
+write -q -z -u 1152k 192k
+L2 entry #18: 0x0000000000000000 ffffffff00000000
+L2 entry #19: 0x0000000000000000 ffffffff00000000
+L2 entry #20: 0x0000000000000000 ffffffff00000000
+write -q -z -u 1376k 128k
+L2 entry #21: 0x8000000000050000 00000000ffffffff
+L2 entry #22: 0x8000000000060000 00000000ffffffff
+L2 entry #23: 0x8000000000070000 00000000ffffffff
+write -q -z -u 1567k 129k
+L2 entry #24: 0x8000000000090000 00000000ffffffff
+L2 entry #25: 0x80000000000e0000 00000000ffffffff
+L2 entry #26: 0x80000000000f0000 00000000ffffffff
+write -q -z -u 1759k 128k
+L2 entry #27: 0x80000000000c0000 ffff00000000ffff
+L2 entry #28: 0x0000000000000000 ffffffff00000000
+L2 entry #29: 0x8000000000100000 00007fff00008000
+
+### Zero + unmap 1: allocated clusters (backing file: no) ###
+
+Formatting 'TEST_DIR/t.IMGFMT.raw', fmt=raw size=2132992
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2132992
+write -q -P PATTERN 576k 576k
+L2 entry #9: 0x8000000000050000 00000000ffffffff
+L2 entry #10: 0x8000000000060000 00000000ffffffff
+L2 entry #11: 0x8000000000070000 00000000ffffffff
+L2 entry #12: 0x8000000000080000 00000000ffffffff
+L2 entry #13: 0x8000000000090000 00000000ffffffff
+L2 entry #14: 0x80000000000a0000 00000000ffffffff
+L2 entry #15: 0x80000000000b0000 00000000ffffffff
+L2 entry #16: 0x80000000000c0000 00000000ffffffff
+L2 entry #17: 0x80000000000d0000 00000000ffffffff
+write -q -z -u 576k 192k
+L2 entry #9: 0x0000000000000000 ffffffff00000000
+L2 entry #10: 0x0000000000000000 ffffffff00000000
+L2 entry #11: 0x0000000000000000 ffffffff00000000
+write -q -z -u 800k 128k
+L2 entry #12: 0x8000000000080000 ffff00000000ffff
+L2 entry #13: 0x0000000000000000 ffffffff00000000
+L2 entry #14: 0x80000000000a0000 0000ffffffff0000
+write -q -z -u 991k 128k
+L2 entry #15: 0x80000000000b0000 ffff00000000ffff
+L2 entry #16: 0x0000000000000000 ffffffff00000000
+L2 entry #17: 0x80000000000d0000 00007fffffff8000
+
+### Zero + unmap 2: compressed clusters (backing file: no) ###
+
+write -q -c -P PATTERN 1152k 64k
+L2 entry #18: 0x4000000000050000 0000000000000000
+write -q -c -P PATTERN 1216k 64k
+L2 entry #19: 0x4000000000060000 0000000000000000
+write -q -c -P PATTERN 1280k 64k
+L2 entry #20: 0x4000000000070000 0000000000000000
+write -q -c -P PATTERN 1344k 64k
+L2 entry #21: 0x4000000000090000 0000000000000000
+write -q -c -P PATTERN 1408k 64k
+L2 entry #22: 0x40000000000c0000 0000000000000000
+write -q -c -P PATTERN 1472k 64k
+L2 entry #23: 0x40000000000e0000 0000000000000000
+write -q -c -P PATTERN 1536k 64k
+L2 entry #24: 0x40000000000f0000 0000000000000000
+write -q -c -P PATTERN 1600k 64k
+L2 entry #25: 0x4000000000100000 0000000000000000
+write -q -c -P PATTERN 1664k 64k
+L2 entry #26: 0x4000000000110000 0000000000000000
+write -q -c -P PATTERN 1728k 64k
+L2 entry #27: 0x4000000000120000 0000000000000000
+write -q -c -P PATTERN 1792k 64k
+L2 entry #28: 0x4000000000130000 0000000000000000
+write -q -z -u 1152k 192k
+L2 entry #18: 0x0000000000000000 ffffffff00000000
+L2 entry #19: 0x0000000000000000 ffffffff00000000
+L2 entry #20: 0x0000000000000000 ffffffff00000000
+write -q -z -u 1376k 128k
+L2 entry #21: 0x8000000000050000 00000000ffffffff
+L2 entry #22: 0x8000000000060000 00000000ffffffff
+L2 entry #23: 0x8000000000070000 00000000ffffffff
+write -q -z -u 1567k 129k
+L2 entry #24: 0x8000000000090000 00000000ffffffff
+L2 entry #25: 0x80000000000e0000 00000000ffffffff
+L2 entry #26: 0x80000000000f0000 00000000ffffffff
+write -q -z -u 1759k 128k
+L2 entry #27: 0x80000000000c0000 ffff00000000ffff
+L2 entry #28: 0x0000000000000000 ffffffff00000000
+L2 entry #29: 0x0000000000000000 0000ffff00000000
+
+### Discarding clusters with non-zero bitmaps (backing file: yes) ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+L2 entry #1: 0x0000000000000000 ffffffff00000000
+Image resized.
+Image resized.
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+L2 entry #1: 0x0000000000000000 ffffffff00000000
+
+### Discarding clusters with non-zero bitmaps (backing file: no) ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x0000000000000000 ffffffff00000000
+L2 entry #1: 0x0000000000000000 ffffffff00000000
+Image resized.
+Image resized.
+L2 entry #0: 0x0000000000000000 0000ffff00000000
+L2 entry #1: 0x0000000000000000 0000000000000000
+
+### Corrupted L2 entries - read test (allocated) ###
+
+# 'cluster is zero' bit set on the standard cluster descriptor
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x8000000000050001 0000000000000001
+L2 entry #0: 0x8000000000050001 0000000000000001
+
+# Both 'subcluster is zero' and 'subcluster is allocated' bits set
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #1: 0x8000000000060000 00000001ffffffff
+qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 0x40000, L2 index: 0x1); further corruption events will be suppressed
+read failed: Input/output error
+
+### Corrupted L2 entries - read test (unallocated) ###
+
+# 'cluster is zero' bit set on the standard cluster descriptor
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x0000000000000001 0000000000000000
+L2 entry #0: 0x0000000000000001 0000000000000000
+
+# 'subcluster is allocated' bit set
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x0000000000000000 0000000000000001
+qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 0x40000, L2 index: 0); further corruption events will be suppressed
+read failed: Input/output error
+
+# Both 'subcluster is zero' and 'subcluster is allocated' bits set
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #1: 0x0000000000000000 0000000100000001
+qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 0x40000, L2 index: 0x1); further corruption events will be suppressed
+read failed: Input/output error
+
+### Compressed cluster with subcluster bitmap != 0 - read test ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x4000000000050000 0000000180000000
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+L2 entry #0: 0x4000000000050000 0000000180000000
+
+### Corrupted L2 entries - write test (allocated) ###
+
+# 'cluster is zero' bit set on the standard cluster descriptor
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x8000000000050001 0000000000000001
+L2 entry #0: 0x8000000000050001 0000000000000001
+
+# Both 'subcluster is zero' and 'subcluster is allocated' bits set
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #1: 0x8000000000060000 00000001ffffffff
+qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 0x40000, L2 index: 0x1); further corruption events will be suppressed
+write failed: Input/output error
+
+### Corrupted L2 entries - write test (unallocated) ###
+
+# 'cluster is zero' bit set on the standard cluster descriptor
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x0000000000000001 0000000000000000
+L2 entry #0: 0x8000000000060000 0000000000000001
+
+# 'subcluster is allocated' bit set
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x0000000000000000 0000000000000001
+qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 0x40000, L2 index: 0); further corruption events will be suppressed
+write failed: Input/output error
+
+# Both 'subcluster is zero' and 'subcluster is allocated' bits set
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #1: 0x0000000000000000 0000000100000001
+qcow2: Marking image as corrupt: Invalid cluster entry found  (L2 offset: 0x40000, L2 index: 0x1); further corruption events will be suppressed
+write failed: Input/output error
+
+### Compressed cluster with subcluster bitmap != 0 - write test ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x4000000000050000 0000000180000000
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+L2 entry #0: 0x8000000000060000 00000000ffffffff
+
+### Detect and repair unaligned clusters ###
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=raw size=131072
+# Corrupted L2 entry, allocated subcluster #
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+ERROR offset=50200: Data cluster is not properly aligned; L2 entry corrupted.
+ERROR cluster 6 refcount=0 reference=1
+Rebuilding refcount structure
+ERROR offset=50200: Data cluster is not properly aligned; L2 entry corrupted.
+Repairing cluster 1 refcount=1 reference=0
+Repairing cluster 2 refcount=1 reference=0
+ERROR offset=50200: Data cluster is not properly aligned; L2 entry corrupted.
+The following inconsistencies were found and repaired:
+
+    0 leaked clusters
+    1 corruptions
+
+Double checking the fixed image now...
+
+1 errors were found on the image.
+Data may be corrupted, or further writes to the image may corrupt it.
+qcow2: Marking image as corrupt: Cluster allocation offset 0x50200 unaligned (L2 offset: 0x40000, L2 index: 0); further corruption events will be suppressed
+read failed: Input/output error
+# Corrupted L2 entry, no allocated subclusters #
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+Repairing offset=50200: Preallocated cluster is not properly aligned; L2 entry corrupted.
+Leaked cluster 5 refcount=1 reference=0
+Repairing cluster 5 refcount=1 reference=0
+The following inconsistencies were found and repaired:
+
+    1 leaked clusters
+    1 corruptions
+
+Double checking the fixed image now...
+No errors were found on the image.
+
+### Image creation options ###
+
+# cluster_size < 16k
+qemu-img: TEST_DIR/t.IMGFMT: Extended L2 entries are only supported with cluster sizes of at least 16384 bytes
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+# backing file and preallocation=metadata
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=raw size=1048576
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=524288 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw preallocation=metadata
+Image resized.
+read 524288/524288 bytes at offset 0
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 524288
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          Mapped to       File
+0               0x80000        0               TEST_DIR/t.qcow2.base
+# backing file and preallocation=falloc
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=524288 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw preallocation=falloc
+Image resized.
+read 524288/524288 bytes at offset 0
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 524288
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          Mapped to       File
+0               0x80000        0               TEST_DIR/t.qcow2.base
+# backing file and preallocation=full
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=524288 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw preallocation=full
+Image resized.
+read 524288/524288 bytes at offset 0
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 524288/524288 bytes at offset 524288
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset          Length          Mapped to       File
+0               0x80000        0               TEST_DIR/t.qcow2.base
+
+### Image resizing with preallocation and backing files ###
+
+# resize --preallocation=metadata
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=515072 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+Image resized.
+read 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 522240/522240 bytes at offset 515072
+510 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# resize --preallocation=falloc
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=515072 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+Image resized.
+read 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 522240/522240 bytes at offset 515072
+510 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# resize --preallocation=full
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=515072 backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=raw
+Image resized.
+read 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 522240/522240 bytes at offset 515072
+510 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+### Image resizing with preallocation without backing files ###
+
+# resize --preallocation=metadata
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=515072
+wrote 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image resized.
+read 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 522240/522240 bytes at offset 515072
+510 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# resize --preallocation=falloc
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=515072
+wrote 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image resized.
+read 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 522240/522240 bytes at offset 515072
+510 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# resize --preallocation=full
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=515072
+wrote 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Image resized.
+read 515072/515072 bytes at offset 0
+503 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 522240/522240 bytes at offset 515072
+510 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+### qemu-img measure ###
+
+# 512MB, extended_l2=off
+required size: 327680
+fully allocated size: 537198592
+# 512MB, extended_l2=on
+required size: 393216
+fully allocated size: 537264128
+# 16K clusters, 64GB, extended_l2=off
+required size: 42008576
+fully allocated size: 68761485312
+# 16K clusters, 64GB, extended_l2=on
+required size: 75579392
+fully allocated size: 68795056128
+# 8k clusters
+qemu-img: Extended L2 entries are only supported with cluster sizes of at least 16384 bytes
+# 1024 TB
+required size: 309285027840
+fully allocated size: 1126209191870464
+# 1025 TB
+qemu-img: The image size is too large (try using a larger cluster size)
+
+### qemu-img amend ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+qemu-img: Toggling extended L2 entries is not supported
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+qemu-img: Toggling extended L2 entries is not supported
+
+### Test copy-on-write on an image with snapshots ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+L2 entry #0: 0x8000000000050000 0000008000042000
+L2 entry #1: 0x8000000000060000 0000008000042000
+L2 entry #2: 0x8000000000070000 0000008000042000
+L2 entry #3: 0x8000000000080000 0000008000042000
+L2 entry #4: 0x8000000000090000 0000008000042000
+L2 entry #5: 0x80000000000a0000 0000008000042000
+L2 entry #6: 0x80000000000b0000 0000008000042000
+L2 entry #7: 0x80000000000c0000 0000008000042000
+L2 entry #8: 0x80000000000d0000 0000008000042000
+L2 entry #9: 0x80000000000e0000 0000008000042000
+L2 entry #0: 0x8000000000120000 000000800007e000
+L2 entry #1: 0x8000000000130000 000000800007fc00
+L2 entry #2: 0x8000000000140000 00000080001fe000
+L2 entry #3: 0x8000000000150000 000000800007e000
+L2 entry #4: 0x8000000000160000 000000000007ff80
+L2 entry #5: 0x8000000000170000 000000000007ffff
+L2 entry #6: 0x00000000000b0000 0001808000042000
+L2 entry #7: 0x00000000000c0000 0000208000040000
+L2 entry #8: 0x8000000000180000 000000800007e000
+L2 entry #9: 0x00000000000e0000 000000c000042000
+
+### Test concurrent requests ###
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576
+blkdebug: Suspended request 'A'
+blkdebug: Resuming request 'A'
+wrote 2048/2048 bytes at offset 30720
+2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 2048/2048 bytes at offset 20480
+2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 2048/2048 bytes at offset 40960
+2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index d886fa0cb3..68f6c2074a 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -284,6 +284,7 @@
 267 rw auto quick snapshot
 268 rw auto quick
 270 rw backing quick
+271 rw auto
 272 rw
 273 backing quick
 274 rw backing
-- 
2.20.1



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()
  2020-06-28 11:02 ` [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset() Alberto Garcia
@ 2020-06-30 10:19   ` Max Reitz
  2020-06-30 10:27     ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-06-30 10:19 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1647 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> qcow2_get_cluster_offset() takes an (unaligned) guest offset and
> returns the (aligned) offset of the corresponding cluster in the qcow2
> image.
> 
> In practice none of the callers need to know where the cluster starts
> so this patch makes the function calculate and return the final host
> offset directly. The function is also renamed accordingly.
> 
> There is a pre-existing exception with compressed clusters: in this
> case the function returns the complete cluster descriptor (containing
> the offset and size of the compressed data). This does not change with
> this patch but it is now documented.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2.h         |  4 ++--
>  block/qcow2-cluster.c | 42 +++++++++++++++++++++++-------------------
>  block/qcow2.c         | 24 +++++++-----------------
>  3 files changed, 32 insertions(+), 38 deletions(-)

[...]

> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 4b5fc8c4a7..9ab41cb728 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c

[...]

> @@ -537,8 +542,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>          bytes_needed = bytes_available;
>      }
>  
> -    *cluster_offset = 0;
> -

You drop this line without replacement now.  That means that
*host_offset is no longer set to 0 if the L1 entry is out of bounds or
empty (which causes this function to return QCOW2_CLUSTER_UNALLOCATED
and no error).  Was that intentional?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset()
  2020-06-30 10:19   ` Max Reitz
@ 2020-06-30 10:27     ` Alberto Garcia
  0 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-06-30 10:27 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Tue 30 Jun 2020 12:19:42 PM CEST, Max Reitz wrote:
>> @@ -537,8 +542,6 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
>>          bytes_needed = bytes_available;
>>      }
>>  
>> -    *cluster_offset = 0;
>> -
>
> You drop this line without replacement now.  That means that
> *host_offset is no longer set to 0 if the L1 entry is out of bounds or
> empty (which causes this function to return QCOW2_CLUSTER_UNALLOCATED
> and no error).  Was that intentional?

Hmm, no, it wasn't intentional.

It does not have any side effect but I should be explicitly set
it to 0. I'll fix it in the next version.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()
  2020-06-28 11:02 ` [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied() Alberto Garcia
@ 2020-06-30 10:38   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-06-30 10:38 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1943 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> When writing to a qcow2 file there are two functions that take a
> virtual offset and return a host offset, possibly allocating new
> clusters if necessary:
> 
>    - handle_copied() looks for normal data clusters that are already
>      allocated and have a reference count of 1. In those clusters we
>      can simply write the data and there is no need to perform any
>      copy-on-write.
> 
>    - handle_alloc() looks for clusters that do need copy-on-write,
>      either because they haven't been allocated yet, because their
>      reference count is != 1 or because they are ZERO_ALLOC clusters.
> 
> The ZERO_ALLOC case is a bit special because those are clusters that
> are already allocated and they could perfectly be dealt with in
> handle_copied() (as long as copy-on-write is performed when required).
> 
> In fact, there is extra code specifically for them in handle_alloc()
> that tries to reuse the existing allocation if possible and frees them
> otherwise.
> 
> This patch changes the handling of ZERO_ALLOC clusters so the
> semantics of these two functions are now like this:
> 
>    - handle_copied() looks for clusters that are already allocated and
>      which we can overwrite (NORMAL and ZERO_ALLOC clusters with a
>      reference count of 1).
> 
>    - handle_alloc() looks for clusters for which we need a new
>      allocation (all other cases).
> 
> One important difference after this change is that clusters found
> in handle_copied() may now require copy-on-write, but this will be
> necessary anyway once we add support for subclusters.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c | 256 +++++++++++++++++++++++-------------------
>  1 file changed, 141 insertions(+), 115 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 11/34] qcow2: Add offset_into_subcluster() and size_to_subclusters()
  2020-06-28 11:02 ` [PATCH v9 11/34] qcow2: Add offset_into_subcluster() and size_to_subclusters() Alberto Garcia
@ 2020-07-01 12:23   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-01 12:23 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 349 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> Like offset_into_cluster() and size_to_clusters(), but for
> subclusters.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 13/34] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap()
  2020-06-28 11:02 ` [PATCH v9 13/34] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap() Alberto Garcia
@ 2020-07-01 12:28   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-01 12:28 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 827 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> Extended L2 entries are 128-bit wide: 64 bits for the entry itself and
> 64 bits for the subcluster allocation bitmap.
> 
> In order to support them correctly get/set_l2_entry() need to be
> updated so they take the entry width into account in order to
> calculate the correct offset.
> 
> This patch also adds the get/set_l2_bitmap() functions that are
> used to access the bitmaps. For convenience we allow calling
> get_l2_bitmap() on images without subclusters. In this case the
> returned value is always 0 and has no meaning.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  2020-06-28 11:02 ` [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type() Alberto Garcia
@ 2020-07-01 12:52   ` Max Reitz
  2020-07-01 16:26     ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-01 12:52 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 5978 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> This patch adds QCow2SubclusterType, which is the subcluster-level
> version of QCow2ClusterType. All QCOW2_SUBCLUSTER_* values have the
> the same meaning as their QCOW2_CLUSTER_* equivalents (when they
> exist). See below for details and caveats.
> 
> In images without extended L2 entries clusters are treated as having
> exactly one subcluster so it is possible to replace one data type with
> the other while keeping the exact same semantics.
> 
> With extended L2 entries there are new possible values, and every
> subcluster in the same cluster can obviously have a different
> QCow2SubclusterType so functions need to be adapted to work on the
> subcluster level.
> 
> There are several things that have to be taken into account:
> 
>   a) QCOW2_SUBCLUSTER_COMPRESSED means that the whole cluster is
>      compressed. We do not support compression at the subcluster
>      level.
> 
>   b) There are two different values for unallocated subclusters:
>      QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN which means that the whole
>      cluster is unallocated, and QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC
>      which means that the cluster is allocated but the subcluster is
>      not. The latter can only happen in images with extended L2
>      entries.
> 
>   c) QCOW2_SUBCLUSTER_INVALID is used to detect the cases where an L2
>      entry has a value that violates the specification. The caller is
>      responsible for handling these situations.
> 
>      To prevent compatibility problems with images that have invalid
>      values but are currently being read by QEMU without causing side
>      effects, QCOW2_SUBCLUSTER_INVALID is only returned for images
>      with extended L2 entries.
> 
> qcow2_cluster_to_subcluster_type() is added as a separate function
> from qcow2_get_subcluster_type(), but this is only temporary and both
> will be merged in a subsequent patch.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h | 126 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 125 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 82b86f6cec..3aec6f452a 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h

[...]

> @@ -634,9 +686,11 @@ static inline int64_t qcow2_vm_state_offset(BDRVQcow2State *s)
>  static inline QCow2ClusterType qcow2_get_cluster_type(BlockDriverState *bs,
>                                                        uint64_t l2_entry)
>  {
> +    BDRVQcow2State *s = bs->opaque;
> +
>      if (l2_entry & QCOW_OFLAG_COMPRESSED) {
>          return QCOW2_CLUSTER_COMPRESSED;
> -    } else if (l2_entry & QCOW_OFLAG_ZERO) {
> +    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {

OK, so now qcow2_get_cluster_type() reports zero clusters to be normal
or unallocated clusters when there are subclusters.  Seems weird to me,
because zero clusters are invalid clusters then.  I preferred just
reporting them as zero clusters and letting the caller deal with it,
because it does mean an error in the image and so it should be reported.

So...

>          if (l2_entry & L2E_OFFSET_MASK) {
>              return QCOW2_CLUSTER_ZERO_ALLOC;
>          }

[...]

> +/*
> + * In an image without subsclusters @l2_bitmap is ignored and
> + * @sc_index must be 0.
> + * Return QCOW2_SUBCLUSTER_INVALID if an invalid l2 entry is detected
> + * (this checks the whole entry and bitmap, not only the bits related
> + * to subcluster @sc_index).
> + */
> +static inline
> +QCow2SubclusterType qcow2_get_subcluster_type(BlockDriverState *bs,
> +                                              uint64_t l2_entry,
> +                                              uint64_t l2_bitmap,
> +                                              unsigned sc_index)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry);
> +    assert(sc_index < s->subclusters_per_cluster);
> +
> +    if (has_subclusters(s)) {
> +        switch (type) {
> +        case QCOW2_CLUSTER_COMPRESSED:
> +            return QCOW2_SUBCLUSTER_COMPRESSED;
> +        case QCOW2_CLUSTER_NORMAL:
> +            if ((l2_bitmap >> 32) & l2_bitmap) {
> +                return QCOW2_SUBCLUSTER_INVALID;
> +            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
> +                return QCOW2_SUBCLUSTER_ZERO_ALLOC;
> +            } else if (l2_bitmap & QCOW_OFLAG_SUB_ALLOC(sc_index)) {
> +                return QCOW2_SUBCLUSTER_NORMAL;
> +            } else {
> +                return QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC;
> +            }
> +        case QCOW2_CLUSTER_UNALLOCATED:
> +            if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) {
> +                return QCOW2_SUBCLUSTER_INVALID;
> +            } else if (l2_bitmap & QCOW_OFLAG_SUB_ZERO(sc_index)) {
> +                return QCOW2_SUBCLUSTER_ZERO_PLAIN;
> +            } else {
> +                return QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN;
> +            }

...consequentially, this function no longer reports clusters which have
the zero flag set as invalid (which it did in v4, when I last looked at it).

I see this was a conscious choice of yours in v5.  I don’t really see
the justification for it, though.  As far as I can understand, you seem
to argue that no corruption detection is better than incomplete
corruption detection, and that a complete and thorough detection would
warrant its own series, do I understand that correctly?

Max

> +        default:
> +            g_assert_not_reached();
> +        }
> +    } else {
> +        return qcow2_cluster_to_subcluster_type(type);
> +    }
> +}
> +
>  /* Check whether refcounts are eager or lazy */
>  static inline bool qcow2_need_accurate_refcounts(BDRVQcow2State *s)
>  {
> 



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 15/34] qcow2: Add qcow2_get_subcluster_range_type()
  2020-06-28 11:02 ` [PATCH v9 15/34] qcow2: Add qcow2_get_subcluster_range_type() Alberto Garcia
@ 2020-07-01 13:37   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-01 13:37 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 779 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> There are situations in which we want to know how many contiguous
> subclusters of the same type there are in a given cluster. This can be
> done by simply iterating over the subclusters and repeatedly calling
> qcow2_get_subcluster_type() for each one of them.
> 
> However once we determined the type of a subcluster we can check the
> rest efficiently by counting the number of adjacent ones (or zeroes)
> in the bitmap. This is what this function does.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c | 51 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 16/34] qcow2: Add qcow2_cluster_is_allocated()
  2020-06-28 11:02 ` [PATCH v9 16/34] qcow2: Add qcow2_cluster_is_allocated() Alberto Garcia
@ 2020-07-01 13:55   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-01 13:55 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 381 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> This helper function tells us if a cluster is allocated (that is,
> there is an associated host offset for it).
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  2020-07-01 12:52   ` Max Reitz
@ 2020-07-01 16:26     ` Alberto Garcia
  2020-07-02  9:57       ` Max Reitz
  0 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-07-01 16:26 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Wed 01 Jul 2020 02:52:14 PM CEST, Max Reitz wrote:
>>      if (l2_entry & QCOW_OFLAG_COMPRESSED) {
>>          return QCOW2_CLUSTER_COMPRESSED;
>> -    } else if (l2_entry & QCOW_OFLAG_ZERO) {
>> +    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {
>
> OK, so now qcow2_get_cluster_type() reports zero clusters to be normal
> or unallocated clusters when there are subclusters.  Seems weird to
> me, because zero clusters are invalid clusters then.

I'm actually hesitant about this.

In extended L2 entries QCOW_OFLAG_ZERO does not have any meaning so
technically it doesn't need to be checked any more than the other
reserved bits (1 to 8).

The reason why we would want to check it is, of course, because that bit
does have a meaning in regular L2 entries.

But that bit is ignored in images with subclusters so the only reason
why we would check it is to report corruption, not because we need to
know its value.

It's true that we do check it in v2 images, although in that case the
entries are otherwise identical and there is a way to convert between
both types.

> I preferred just reporting them as zero clusters and letting the
> caller deal with it, because it does mean an error in the image and so
> it should be reported.

Another alternative would be to add QCOW2_CLUSTER_INVALID and we could
even include there other cases like unaligned offsets and things like
that. But that would also affect the code that repairs corrupted images.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  2020-07-01 16:26     ` Alberto Garcia
@ 2020-07-02  9:57       ` Max Reitz
  2020-07-02 22:00         ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-02  9:57 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1888 bytes --]

On 01.07.20 18:26, Alberto Garcia wrote:
> On Wed 01 Jul 2020 02:52:14 PM CEST, Max Reitz wrote:
>>>      if (l2_entry & QCOW_OFLAG_COMPRESSED) {
>>>          return QCOW2_CLUSTER_COMPRESSED;
>>> -    } else if (l2_entry & QCOW_OFLAG_ZERO) {
>>> +    } else if ((l2_entry & QCOW_OFLAG_ZERO) && !has_subclusters(s)) {
>>
>> OK, so now qcow2_get_cluster_type() reports zero clusters to be normal
>> or unallocated clusters when there are subclusters.  Seems weird to
>> me, because zero clusters are invalid clusters then.
> 
> I'm actually hesitant about this.
> 
> In extended L2 entries QCOW_OFLAG_ZERO does not have any meaning so
> technically it doesn't need to be checked any more than the other
> reserved bits (1 to 8).

Good point.  That convinces me.

> The reason why we would want to check it is, of course, because that bit
> does have a meaning in regular L2 entries.
> 
> But that bit is ignored in images with subclusters so the only reason
> why we would check it is to report corruption, not because we need to
> know its value.

Sure.  But isn’t that the whole point of having QCOW2_SUBCLUSTER_INVALID
in the first place?

> It's true that we do check it in v2 images, although in that case the
> entries are otherwise identical and there is a way to convert between
> both types.
> 
>> I preferred just reporting them as zero clusters and letting the
>> caller deal with it, because it does mean an error in the image and so
>> it should be reported.
> 
> Another alternative would be to add QCOW2_CLUSTER_INVALID and we could
> even include there other cases like unaligned offsets and things like
> that. But that would also affect the code that repairs corrupted images.

Interesting.  Well, and that’d be definitely too much for this series,
as you already said.

So:

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 20/34] qcow2: Add subcluster support to calculate_l2_meta()
  2020-06-28 11:02 ` [PATCH v9 20/34] qcow2: Add subcluster support to calculate_l2_meta() Alberto Garcia
@ 2020-07-02 11:30   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-02 11:30 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1819 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> If an image has subclusters then there are more copy-on-write
> scenarios that we need to consider. Let's say we have a write request
> from the middle of subcluster #3 until the end of the cluster:
> 
> 1) If we are writing to a newly allocated cluster then we need
>    copy-on-write. The previous contents of subclusters #0 to #3 must
>    be copied to the new cluster. We can optimize this process by
>    skipping all leading unallocated or zero subclusters (the status of
>    those skipped subclusters will be reflected in the new L2 bitmap).
> 
> 2) If we are overwriting an existing cluster:
> 
>    2.1) If subcluster #3 is unallocated or has the all-zeroes bit set
>         then we need copy-on-write (on subcluster #3 only).
> 
>    2.2) If subcluster #3 was already allocated then there is no need
>         for any copy-on-write. However we still need to update the L2
>         bitmap to reflect possible changes in the allocation status of
>         subclusters #4 to #31. Because of this, this function checks
>         if all the overwritten subclusters are already allocated and
>         in this case it returns without creating a new QCowL2Meta
>         structure.
> 
> After all these changes l2meta_cow_start() and l2meta_cow_end()
> are not necessarily cluster-aligned anymore. We need to update the
> calculation of old_start and old_end in handle_dependencies() to
> guarantee that no two requests try to write on the same cluster.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c | 163 +++++++++++++++++++++++++++++++++---------
>  1 file changed, 131 insertions(+), 32 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset()
  2020-06-28 11:02 ` [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset() Alberto Garcia
@ 2020-07-02 12:46   ` Max Reitz
  2020-07-02 22:04     ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-02 12:46 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 3695 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> The logic of this function remains pretty much the same, except that
> it uses count_contiguous_subclusters(), which combines the logic of
> count_contiguous_clusters() / count_contiguous_clusters_unallocated()
> and checks individual subclusters.
> 
> qcow2_cluster_to_subcluster_type() is not necessary as a separate
> function anymore so it's inlined into its caller.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h         |  38 ++++-------
>  block/qcow2-cluster.c | 150 ++++++++++++++++++++++--------------------
>  2 files changed, 92 insertions(+), 96 deletions(-)

[...]

> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 59dd9bda29..2f3bd3a882 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -426,66 +426,66 @@ static int qcow2_get_subcluster_range_type(BlockDriverState *bs,

[...]

> +static int count_contiguous_subclusters(BlockDriverState *bs, int nb_clusters,
> +                                        unsigned sc_index, uint64_t *l2_slice,
> +                                        unsigned *l2_index)
>  {
>      BDRVQcow2State *s = bs->opaque;
> -    int i;
> -    QCow2ClusterType first_cluster_type;
> -    uint64_t mask = stop_flags | L2E_OFFSET_MASK | QCOW_OFLAG_COMPRESSED;
> -    uint64_t first_entry = get_l2_entry(s, l2_slice, l2_index);
> -    uint64_t offset = first_entry & mask;
> +    int i, count = 0;
> +    bool check_offset;
> +    uint64_t expected_offset;
> +    QCow2SubclusterType expected_type, type;
>  
> -    first_cluster_type = qcow2_get_cluster_type(bs, first_entry);
> -    if (first_cluster_type == QCOW2_CLUSTER_UNALLOCATED) {
> -        return 0;
> -    }
> -
> -    /* must be allocated */
> -    assert(first_cluster_type == QCOW2_CLUSTER_NORMAL ||
> -           first_cluster_type == QCOW2_CLUSTER_ZERO_ALLOC);
> +    assert(*l2_index + nb_clusters <= s->l2_size);

Not l2_slice_size?

>  
>      for (i = 0; i < nb_clusters; i++) {
> -        uint64_t l2_entry = get_l2_entry(s, l2_slice, l2_index + i) & mask;
> -        if (offset + (uint64_t) i * cluster_size != l2_entry) {
> +        unsigned first_sc = (i == 0) ? sc_index : 0;
> +        uint64_t l2_entry = get_l2_entry(s, l2_slice, *l2_index + i);
> +        uint64_t l2_bitmap = get_l2_bitmap(s, l2_slice, *l2_index + i);
> +        int ret = qcow2_get_subcluster_range_type(bs, l2_entry, l2_bitmap,
> +                                                  first_sc, &type);
> +        if (ret < 0) {
> +            *l2_index += i; /* Point to the invalid entry */
> +            return -EIO;
> +        }
> +        if (i == 0) {
> +            if (type == QCOW2_SUBCLUSTER_COMPRESSED) {
> +                /* Compressed clusters are always processed one by one */
> +                return ret;
> +            }
> +            expected_type = type;
> +            expected_offset = l2_entry & L2E_OFFSET_MASK;
> +            check_offset = (type == QCOW2_SUBCLUSTER_NORMAL ||
> +                            type == QCOW2_SUBCLUSTER_ZERO_ALLOC ||
> +                            type == QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC);
> +        } else if (type != expected_type) {
>              break;
> +        } else if (check_offset) {

My gcc (v10.1.1) appears to be a bit daft, and so doesn’t recognize that
check_offset must always be initialized before this line is hit.

> +            expected_offset += s->cluster_size;

Same for expected_offset.

Would you mind helping it by initializing them in their definition? O:)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 22/34] qcow2: Add subcluster support to zero_in_l2_slice()
  2020-06-28 11:02 ` [PATCH v9 22/34] qcow2: Add subcluster support to zero_in_l2_slice() Alberto Garcia
@ 2020-07-02 12:56   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-02 12:56 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 656 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> The QCOW_OFLAG_ZERO bit that indicates that a cluster reads as
> zeroes is only used in standard L2 entries. Extended L2 entries use
> individual 'all zeroes' bits for each subcluster.
> 
> This must be taken into account when updating the L2 entry and also
> when deciding that an existing entry does not need to be updated.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c | 36 +++++++++++++++++++-----------------
>  1 file changed, 19 insertions(+), 17 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 23/34] qcow2: Add subcluster support to discard_in_l2_slice()
  2020-06-28 11:02 ` [PATCH v9 23/34] qcow2: Add subcluster support to discard_in_l2_slice() Alberto Garcia
@ 2020-07-02 13:24   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-02 13:24 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1145 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> Two things need to be taken into account here:
> 
> 1) With full_discard == true the L2 entry must be cleared completely.
>    This also includes the L2 bitmap if the image has extended L2
>    entries.
> 
> 2) With full_discard == false we have to make the discarded cluster
>    read back as zeroes. With normal L2 entries this is done with the
>    QCOW_OFLAG_ZERO bit, whereas with extended L2 entries this is done
>    with the individual 'all zeroes' bits for each subcluster.
> 
>    Note however that QCOW_OFLAG_ZERO is not supported in v2 qcow2
>    images so, if there is a backing file, discard cannot guarantee
>    that the image will read back as zeroes. If this is important for
>    the caller it should forbid it as qcow2_co_pdiscard() does (see
>    80f5c01183 for more details).
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c | 52 +++++++++++++++++++------------------------
>  1 file changed, 23 insertions(+), 29 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 24/34] qcow2: Add subcluster support to check_refcounts_l2()
  2020-06-28 11:02 ` [PATCH v9 24/34] qcow2: Add subcluster support to check_refcounts_l2() Alberto Garcia
@ 2020-07-02 13:32   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-02 13:32 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1038 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> The offset field of an uncompressed cluster's L2 entry must be aligned
> to the cluster size, otherwise it is invalid. If the cluster has no
> data then it means that the offset points to a preallocation, so we
> can clear the offset field without affecting the guest-visible data.
> This is what 'qemu-img check' does when run in repair mode.
> 
> On traditional qcow2 images this can only happen when QCOW_OFLAG_ZERO
> is set, and repairing such entries turns the clusters from ZERO_ALLOC
> into ZERO_PLAIN.
> 
> Extended L2 entries have no ZERO_ALLOC clusters and no QCOW_OFLAG_ZERO
> but the idea is the same: if none of the subclusters are allocated
> then we can clear the offset field and leave the bitmap untouched.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-refcount.c     | 16 +++++++++++-----
>  tests/qemu-iotests/060.out |  2 +-
>  2 files changed, 12 insertions(+), 6 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 25/34] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2()
  2020-06-28 11:02 ` [PATCH v9 25/34] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2() Alberto Garcia
@ 2020-07-02 14:01   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-02 14:01 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 756 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> The L2 bitmap needs to be updated after each write to indicate what
> new subclusters are now allocated. This needs to happen even if the
> cluster was already allocated and the L2 entry was otherwise valid.
> 
> In some cases however a write operation doesn't need change the L2
> bitmap (because all affected subclusters were already allocated). This
> is detected in calculate_l2_meta(), and qcow2_alloc_cluster_link_l2()
> is never called in those cases.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
  2020-06-28 11:02 ` [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes() Alberto Garcia
@ 2020-07-02 14:28   ` Max Reitz
  2020-07-02 22:40     ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-02 14:28 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 3679 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> This works now at the subcluster level and pwrite_zeroes_alignment is
> updated accordingly.
> 
> qcow2_cluster_zeroize() is turned into qcow2_subcluster_zeroize() with
> the following changes:
> 
>    - The request can now be subcluster-aligned.
> 
>    - The cluster-aligned body of the request is still zeroized using
>      zero_in_l2_slice() as before.
> 
>    - The subcluster-aligned head and tail of the request are zeroized
>      with the new zero_l2_subclusters() function.
> 
> There is just one thing to take into account for a possible future
> improvement: compressed clusters cannot be partially zeroized so
> zero_l2_subclusters() on the head or the tail can return -ENOTSUP.
> This makes the caller repeat the *complete* request and write actual
> zeroes to disk. This is sub-optimal because
> 
>    1) if the head area was compressed we would still be able to use
>       the fast path for the body and possibly the tail.
> 
>    2) if the tail area was compressed we are writing zeroes to the
>       head and the body areas, which are already zeroized.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h         |  4 +--
>  block/qcow2-cluster.c | 80 +++++++++++++++++++++++++++++++++++++++----
>  block/qcow2.c         | 27 ++++++++-------
>  3 files changed, 90 insertions(+), 21 deletions(-)

[...]

> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index deff838fe8..1641976028 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -2015,12 +2015,58 @@ static int zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
>      return nb_clusters;
>  }
>  
> -int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
> -                          uint64_t bytes, int flags)
> +static int zero_l2_subclusters(BlockDriverState *bs, uint64_t offset,
> +                               unsigned nb_subclusters)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    uint64_t *l2_slice;
> +    uint64_t old_l2_bitmap, l2_bitmap;
> +    int l2_index, ret, sc = offset_to_sc_index(s, offset);
> +
> +    /* For full clusters use zero_in_l2_slice() instead */
> +    assert(nb_subclusters > 0 && nb_subclusters < s->subclusters_per_cluster);
> +    assert(sc + nb_subclusters <= s->subclusters_per_cluster);

Maybe we should also assert that @offset is aligned to the subcluster size.

[...]

> diff --git a/block/qcow2.c b/block/qcow2.c
> index 86258fbc22..4edc3c72b9 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c

[...]

> @@ -4367,12 +4367,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>          uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);

Can we instead align this to just subclusters?

>  
>          /*
> -         * Use zero clusters as much as we can. qcow2_cluster_zeroize()
> +         * Use zero clusters as much as we can. qcow2_subcluster_zeroize()
>           * requires a cluster-aligned start. The end may be unaligned if it is

s/cluster/subcluster/?

Max

>           * at the end of the image (which it is here).
>           */
>          if (offset > zero_start) {
> -            ret = qcow2_cluster_zeroize(bs, zero_start, offset - zero_start, 0);
> +            ret = qcow2_subcluster_zeroize(bs, zero_start, offset - zero_start,
> +                                           0);
>              if (ret < 0) {
>                  error_setg_errno(errp, -ret, "Failed to zero out new clusters");
>                  goto fail;
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta
  2020-06-28 11:02 ` [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta Alberto Garcia
@ 2020-07-02 14:50   ` Max Reitz
  2020-07-02 14:58     ` Eric Blake
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-02 14:50 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1958 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> This field allows us to indicate that the L2 metadata update does not
> come from a write request with actual data but from a preallocation
> request.
> 
> For traditional images this does not make any difference, but for
> images with extended L2 entries this means that the clusters are
> allocated normally in the L2 table but individual subclusters are
> marked as unallocated.
> 
> This will allow preallocating images that have a backing file.
> 
> There is one special case: when we resize an existing image we can
> also request that the new clusters are preallocated. If the image
> already had a backing file then we have to hide any possible stale
> data and zero out the new clusters (see commit 955c7d6687 for more
> details).
> 
> In this case the subclusters cannot be left as unallocated so the L2
> bitmap must be updated.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.h         | 8 ++++++++
>  block/qcow2-cluster.c | 2 +-
>  block/qcow2.c         | 6 ++++++
>  3 files changed, 15 insertions(+), 1 deletion(-)

Sounds good, but I’m just not quite sure about the details on
falloc/full allocation: With .prealloc = true, writing to the
preallocated subclusters will require a COW operation.  That’s not
ideal, and avoiding those COWs may be a reason to do preallocation in
the first place.

Now, with backing files, it’s entirely correct.  You need a COW
operation, because that’s the point of having a backing file.

But without a backing file I wonder if it wouldn’t be better to set
.prealloc = false to avoid that COW.

Of course, if we did that, you couldn’t create the overlay separately
from the backing file, preallocate it, and only then attach the backing
file to the overlay.  But is that a problem?

(Or are there other problems to consider?)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta
  2020-07-02 14:50   ` Max Reitz
@ 2020-07-02 14:58     ` Eric Blake
  2020-07-02 15:09       ` Max Reitz
  0 siblings, 1 reply; 71+ messages in thread
From: Eric Blake @ 2020-07-02 14:58 UTC (permalink / raw)
  To: Max Reitz, Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On 7/2/20 9:50 AM, Max Reitz wrote:
> On 28.06.20 13:02, Alberto Garcia wrote:
>> This field allows us to indicate that the L2 metadata update does not
>> come from a write request with actual data but from a preallocation
>> request.
>>
>> For traditional images this does not make any difference, but for
>> images with extended L2 entries this means that the clusters are
>> allocated normally in the L2 table but individual subclusters are
>> marked as unallocated.
>>
>> This will allow preallocating images that have a backing file.
>>
>> There is one special case: when we resize an existing image we can
>> also request that the new clusters are preallocated. If the image
>> already had a backing file then we have to hide any possible stale
>> data and zero out the new clusters (see commit 955c7d6687 for more
>> details).
>>
>> In this case the subclusters cannot be left as unallocated so the L2
>> bitmap must be updated.
>>
>> Signed-off-by: Alberto Garcia <berto@igalia.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>>   block/qcow2.h         | 8 ++++++++
>>   block/qcow2-cluster.c | 2 +-
>>   block/qcow2.c         | 6 ++++++
>>   3 files changed, 15 insertions(+), 1 deletion(-)
> 
> Sounds good, but I’m just not quite sure about the details on
> falloc/full allocation: With .prealloc = true, writing to the
> preallocated subclusters will require a COW operation.  That’s not
> ideal, and avoiding those COWs may be a reason to do preallocation in
> the first place.

I'm not sure I follow the complaint.  If a cluster is preallocated but 
the subcluster is marked unallocated, then doing a partial write to that 
subcluster must provide the correct contents for the rest of the 
subcluster (either filling with zero, or reading from a backing file) - 
but this COW can be limited to just the portion of the subcluster, and 
is no different than the COW you have to perform without subclusters 
when doing a write to a preallocated cluster in general.

> 
> Now, with backing files, it’s entirely correct.  You need a COW
> operation, because that’s the point of having a backing file.
> 
> But without a backing file I wonder if it wouldn’t be better to set
> .prealloc = false to avoid that COW.

Without a backing file, there is no read required - writing to an 
unallocated subcluster within a preallocated cluster merely has to 
provide zeros to the rest of the write.  And depending on whether we can 
intelligently guarantee that the underlying protocol already reads as 
zeroes when preallocated, we even have an optimization where even that 
is not necessary.  We can still lump it in the "COW" terminology, in 
that our write is more complex than merely writing in place, but it 
isn't a true copy-on-write operation as there is nothing to be copied.

> 
> Of course, if we did that, you couldn’t create the overlay separately
> from the backing file, preallocate it, and only then attach the backing
> file to the overlay.  But is that a problem?
> 
> (Or are there other problems to consider?)
> 
> Max
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta
  2020-07-02 14:58     ` Eric Blake
@ 2020-07-02 15:09       ` Max Reitz
  2020-07-02 23:05         ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-02 15:09 UTC (permalink / raw)
  To: Eric Blake, Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 3434 bytes --]

On 02.07.20 16:58, Eric Blake wrote:
> On 7/2/20 9:50 AM, Max Reitz wrote:
>> On 28.06.20 13:02, Alberto Garcia wrote:
>>> This field allows us to indicate that the L2 metadata update does not
>>> come from a write request with actual data but from a preallocation
>>> request.
>>>
>>> For traditional images this does not make any difference, but for
>>> images with extended L2 entries this means that the clusters are
>>> allocated normally in the L2 table but individual subclusters are
>>> marked as unallocated.
>>>
>>> This will allow preallocating images that have a backing file.
>>>
>>> There is one special case: when we resize an existing image we can
>>> also request that the new clusters are preallocated. If the image
>>> already had a backing file then we have to hide any possible stale
>>> data and zero out the new clusters (see commit 955c7d6687 for more
>>> details).
>>>
>>> In this case the subclusters cannot be left as unallocated so the L2
>>> bitmap must be updated.
>>>
>>> Signed-off-by: Alberto Garcia <berto@igalia.com>
>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>> ---
>>>   block/qcow2.h         | 8 ++++++++
>>>   block/qcow2-cluster.c | 2 +-
>>>   block/qcow2.c         | 6 ++++++
>>>   3 files changed, 15 insertions(+), 1 deletion(-)
>>
>> Sounds good, but I’m just not quite sure about the details on
>> falloc/full allocation: With .prealloc = true, writing to the
>> preallocated subclusters will require a COW operation.  That’s not
>> ideal, and avoiding those COWs may be a reason to do preallocation in
>> the first place.
> 
> I'm not sure I follow the complaint.  If a cluster is preallocated but
> the subcluster is marked unallocated, then doing a partial write to that
> subcluster must provide the correct contents for the rest of the
> subcluster (either filling with zero, or reading from a backing file) -
> but this COW can be limited to just the portion of the subcluster, and
> is no different than the COW you have to perform without subclusters
> when doing a write to a preallocated cluster in general.

It was my impression that falloc/full preallocation would create normal
data clusters, not zero clusters, so no COW was necessary when writing
to them.

>> Now, with backing files, it’s entirely correct.  You need a COW
>> operation, because that’s the point of having a backing file.
>>
>> But without a backing file I wonder if it wouldn’t be better to set
>> .prealloc = false to avoid that COW.
> 
> Without a backing file, there is no read required - writing to an
> unallocated subcluster within a preallocated cluster merely has to
> provide zeros to the rest of the write.  And depending on whether we can
> intelligently guarantee that the underlying protocol already reads as
> zeroes when preallocated, we even have an optimization where even that
> is not necessary.  We can still lump it in the "COW" terminology, in
> that our write is more complex than merely writing in place, but it
> isn't a true copy-on-write operation as there is nothing to be copied.

The term “COW” specifically in the qcow2 driver also refers to having to
write zeroes to an area that isn’t written to by the guest as part of
the process of having to allocate a (sub)cluster.

(Of course there is no COW from a backing file if there is no backing file.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit
  2020-06-28 11:02 ` [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit Alberto Garcia
@ 2020-07-02 15:13   ` Max Reitz
  2020-07-03 12:43     ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-02 15:13 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 2555 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> Now that the implementation of subclusters is complete we can finally
> add the necessary options to create and read images with this feature,
> which we call "extended L2 entries".
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  qapi/block-core.json             |   7 +++
>  block/qcow2.h                    |   8 ++-
>  include/block/block_int.h        |   1 +
>  block/qcow2.c                    |  74 ++++++++++++++++++++--
>  tests/qemu-iotests/031.out       |   8 +--
>  tests/qemu-iotests/036.out       |   4 +-
>  tests/qemu-iotests/049.out       | 102 +++++++++++++++----------------
>  tests/qemu-iotests/060.out       |   1 +
>  tests/qemu-iotests/061.out       |  20 +++---
>  tests/qemu-iotests/065           |  12 ++--
>  tests/qemu-iotests/082.out       |  48 ++++++++++++---
>  tests/qemu-iotests/085.out       |  38 ++++++------
>  tests/qemu-iotests/144.out       |   4 +-
>  tests/qemu-iotests/182.out       |   2 +-
>  tests/qemu-iotests/185.out       |   8 +--
>  tests/qemu-iotests/198.out       |   2 +
>  tests/qemu-iotests/206.out       |   4 ++
>  tests/qemu-iotests/242.out       |   5 ++
>  tests/qemu-iotests/255.out       |   8 +--
>  tests/qemu-iotests/274.out       |  49 ++++++++-------
>  tests/qemu-iotests/280.out       |   2 +-
>  tests/qemu-iotests/291.out       |   6 +-
>  tests/qemu-iotests/common.filter |   1 +
>  23 files changed, 272 insertions(+), 142 deletions(-)

Looks OK.

Reviewed-by: Max Reitz <mreitz@redhat.com>

[...]

> diff --git a/block/qcow2.c b/block/qcow2.c
> index 003f166024..37bfae823c 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c

[...]

> @@ -5491,6 +5539,14 @@ static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
>                                   "is not supported");
>                  return -ENOTSUP;
>              }
> +        } else if (!strcmp(desc->name, BLOCK_OPT_EXTL2)) {
> +            bool extended_l2 = qemu_opt_get_bool(opts, BLOCK_OPT_EXTL2,
> +                                                 has_subclusters(s));
> +            if (extended_l2 != has_subclusters(s)) {
> +                error_setg(errp, "Toggling extended L2 entries "
> +                           "is not supported");
> +                return -EINVAL;

Just a note: I think ENOTSUP would fit better.

(Then again, Maxim’s “block/amend: refactor qcow2 amend options” removes
all of this code anyway.)


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  2020-07-02  9:57       ` Max Reitz
@ 2020-07-02 22:00         ` Alberto Garcia
  2020-07-03  7:17           ` Max Reitz
  0 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-07-02 22:00 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Thu 02 Jul 2020 11:57:46 AM CEST, Max Reitz wrote:
>> The reason why we would want to check it is, of course, because that
>> bit does have a meaning in regular L2 entries.
>> 
>> But that bit is ignored in images with subclusters so the only reason
>> why we would check it is to report corruption, not because we need to
>> know its value.
>
> Sure.  But isn’t that the whole point of having
> QCOW2_SUBCLUSTER_INVALID in the first place?

At the moment we're only returning QCOW2_SUBCLUSTER_INVALID in cases
where there is no way to interpret the entry correctly: a) the
allocation and zero bits are set for the same subcluster, and b) the
allocation bit is set but the entry has no valid offset.

It doesn't mean that we cannot use _SUBCLUSTER_INVALID for cases like
the one we're discussing, but this one is different from the other two.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset()
  2020-07-02 12:46   ` Max Reitz
@ 2020-07-02 22:04     ` Alberto Garcia
  0 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-07-02 22:04 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Thu 02 Jul 2020 02:46:27 PM CEST, Max Reitz wrote:
>> -    /* must be allocated */
>> -    assert(first_cluster_type == QCOW2_CLUSTER_NORMAL ||
>> -           first_cluster_type == QCOW2_CLUSTER_ZERO_ALLOC);
>> +    assert(*l2_index + nb_clusters <= s->l2_size);
>
> Not l2_slice_size?

Oh, indeed!

>> +        } else if (check_offset) {
>
> My gcc (v10.1.1) appears to be a bit daft, and so doesn’t recognize
> that check_offset must always be initialized before this line is hit.

Yeah I noticed that patchew complained, I'll fix that.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
  2020-07-02 14:28   ` Max Reitz
@ 2020-07-02 22:40     ` Alberto Garcia
  2020-07-03  7:18       ` Max Reitz
  0 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-07-02 22:40 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Thu 02 Jul 2020 04:28:57 PM CEST, Max Reitz wrote:
>> +    /* For full clusters use zero_in_l2_slice() instead */
>> +    assert(nb_subclusters > 0 && nb_subclusters < s->subclusters_per_cluster);
>> +    assert(sc + nb_subclusters <= s->subclusters_per_cluster);
>
> Maybe we should also assert that @offset is aligned to the subcluster
> size.

It doesn't hurt but the only caller already guarantees that already ...

>> @@ -4367,12 +4367,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>>          uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
>
> Can we instead align this to just subclusters?

I think so, good catch.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta
  2020-07-02 15:09       ` Max Reitz
@ 2020-07-02 23:05         ` Alberto Garcia
  2020-07-03  7:22           ` Max Reitz
  0 siblings, 1 reply; 71+ messages in thread
From: Alberto Garcia @ 2020-07-02 23:05 UTC (permalink / raw)
  To: Max Reitz, Eric Blake, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Thu 02 Jul 2020 05:09:47 PM CEST, Max Reitz wrote:
>> Without a backing file, there is no read required - writing to an
>> unallocated subcluster within a preallocated cluster merely has to
>> provide zeros to the rest of the write.  And depending on whether we
>> can intelligently guarantee that the underlying protocol already
>> reads as zeroes when preallocated, we even have an optimization where
>> even that is not necessary.  We can still lump it in the "COW"
>> terminology, in that our write is more complex than merely writing in
>> place, but it isn't a true copy-on-write operation as there is
>> nothing to be copied.
>
> The term “COW” specifically in the qcow2 driver also refers to having
> to write zeroes to an area that isn’t written to by the guest as part
> of the process of having to allocate a (sub)cluster.

The question is valid: if the space for the clusters is allocated but
the subclusters are not marked as such then any partial write request
will need to fill the rest with zeroes (in practice handle_alloc_space()
can do that efficiently but that's another question).

If there is a backing file then there's no other alternative because we
do need to copy the data from the backing file.

If there is no backing file perhaps we could allocate all subclusters as
well. I suppose we can detect that scenario at that point in the code (I
haven't checked) and I don't know what would happen if one later
attaches a backing file on runtime using the command-line options.

But what I would argue is that I don't see the benefit of using extended
L2 entries on an preallocated image with no backing file: other than
having twice as much L2 metadata what would be the use? The point of
subclusters is that they make allocation more efficient, but if the
image is already fully allocated then they give you nothing.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type()
  2020-07-02 22:00         ` Alberto Garcia
@ 2020-07-03  7:17           ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-03  7:17 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 968 bytes --]

On 03.07.20 00:00, Alberto Garcia wrote:
> On Thu 02 Jul 2020 11:57:46 AM CEST, Max Reitz wrote:
>>> The reason why we would want to check it is, of course, because that
>>> bit does have a meaning in regular L2 entries.
>>>
>>> But that bit is ignored in images with subclusters so the only reason
>>> why we would check it is to report corruption, not because we need to
>>> know its value.
>>
>> Sure.  But isn’t that the whole point of having
>> QCOW2_SUBCLUSTER_INVALID in the first place?
> 
> At the moment we're only returning QCOW2_SUBCLUSTER_INVALID in cases
> where there is no way to interpret the entry correctly: a) the
> allocation and zero bits are set for the same subcluster, and b) the
> allocation bit is set but the entry has no valid offset.
> 
> It doesn't mean that we cannot use _SUBCLUSTER_INVALID for cases like
> the one we're discussing, but this one is different from the other two.

OK, that makes sense.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes()
  2020-07-02 22:40     ` Alberto Garcia
@ 2020-07-03  7:18       ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-03  7:18 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 573 bytes --]

On 03.07.20 00:40, Alberto Garcia wrote:
> On Thu 02 Jul 2020 04:28:57 PM CEST, Max Reitz wrote:
>>> +    /* For full clusters use zero_in_l2_slice() instead */
>>> +    assert(nb_subclusters > 0 && nb_subclusters < s->subclusters_per_cluster);
>>> +    assert(sc + nb_subclusters <= s->subclusters_per_cluster);
>>
>> Maybe we should also assert that @offset is aligned to the subcluster
>> size.
> 
> It doesn't hurt but the only caller already guarantees that already ...

Sure, but it also guarantees the rest of these conditions, doesn’t it? :)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta
  2020-07-02 23:05         ` Alberto Garcia
@ 2020-07-03  7:22           ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-03  7:22 UTC (permalink / raw)
  To: Alberto Garcia, Eric Blake, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 2184 bytes --]

On 03.07.20 01:05, Alberto Garcia wrote:
> On Thu 02 Jul 2020 05:09:47 PM CEST, Max Reitz wrote:
>>> Without a backing file, there is no read required - writing to an
>>> unallocated subcluster within a preallocated cluster merely has to
>>> provide zeros to the rest of the write.  And depending on whether we
>>> can intelligently guarantee that the underlying protocol already
>>> reads as zeroes when preallocated, we even have an optimization where
>>> even that is not necessary.  We can still lump it in the "COW"
>>> terminology, in that our write is more complex than merely writing in
>>> place, but it isn't a true copy-on-write operation as there is
>>> nothing to be copied.
>>
>> The term “COW” specifically in the qcow2 driver also refers to having
>> to write zeroes to an area that isn’t written to by the guest as part
>> of the process of having to allocate a (sub)cluster.
> 
> The question is valid: if the space for the clusters is allocated but
> the subclusters are not marked as such then any partial write request
> will need to fill the rest with zeroes (in practice handle_alloc_space()
> can do that efficiently but that's another question).
> 
> If there is a backing file then there's no other alternative because we
> do need to copy the data from the backing file.
> 
> If there is no backing file perhaps we could allocate all subclusters as
> well. I suppose we can detect that scenario at that point in the code (I
> haven't checked) and I don't know what would happen if one later
> attaches a backing file on runtime using the command-line options.
> 
> But what I would argue is that I don't see the benefit of using extended
> L2 entries on an preallocated image with no backing file: other than
> having twice as much L2 metadata what would be the use? The point of
> subclusters is that they make allocation more efficient, but if the
> image is already fully allocated then they give you nothing.

That’s true.  I didn’t think about it this way.

Then indeed it doesn’t make sense to potentially break cases of later
adding a backing file:

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set
  2020-06-28 11:02 ` [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set Alberto Garcia
@ 2020-07-03  7:45   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-03  7:45 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 732 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> Traditional qcow2 images don't allow preallocation if a backing file
> is set. This is because once a cluster is allocated there is no way to
> tell that its data should be read from the backing file.
> 
> Extended L2 entries have individual allocation bits for each
> subcluster, and therefore it is perfectly possible to have an
> allocated cluster with all its subclusters unallocated.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2.c              | 7 ++++---
>  tests/qemu-iotests/206.out | 2 +-
>  2 files changed, 5 insertions(+), 4 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters
  2020-06-28 11:02 ` [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters Alberto Garcia
@ 2020-07-03  7:46   ` Max Reitz
  0 siblings, 0 replies; 71+ messages in thread
From: Max Reitz @ 2020-07-03  7:46 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 669 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> This function is only used by qcow2_expand_zero_clusters() to
> downgrade a qcow2 image to a previous version. This would require
> transforming all extended L2 entries into normal L2 entries but this
> is not a simple task and there are no plans to implement this at the
> moment.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-cluster.c      | 8 +++++++-
>  tests/qemu-iotests/061     | 6 ++++++
>  tests/qemu-iotests/061.out | 5 +++++
>  3 files changed, 18 insertions(+), 1 deletion(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries
  2020-06-28 11:02 ` [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries Alberto Garcia
@ 2020-07-03  9:49   ` Max Reitz
  2020-07-03 13:06     ` Alberto Garcia
  0 siblings, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-03  9:49 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 2671 bytes --]

On 28.06.20 13:02, Alberto Garcia wrote:
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  tests/qemu-iotests/271     | 901 +++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/271.out | 724 +++++++++++++++++++++++++++++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 1626 insertions(+)
>  create mode 100755 tests/qemu-iotests/271
>  create mode 100644 tests/qemu-iotests/271.out
> 
> diff --git a/tests/qemu-iotests/271 b/tests/qemu-iotests/271
> new file mode 100755
> index 0000000000..5ef3ebb2bf
> --- /dev/null
> +++ b/tests/qemu-iotests/271

[...]

> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +_supported_fmt qcow2
> +_supported_proto file nfs
> +_supported_os Linux
> +_unsupported_imgopts extended_l2 compat=0.10 cluster_size data_file

I’d also add a 'refcount_bits=1[^0-9]', because this test doesn’t pass
with refcount-bits=1 (due to taking a snapshot at one point).

> +
> +l2_offset=$((0x40000))
> +
> +_verify_img()
> +{
> +    $QEMU_IMG compare "$TEST_IMG" "$TEST_IMG.raw" | grep -v 'Images are identical'
> +    $QEMU_IMG check "$TEST_IMG" | _filter_qemu_img_check | \
> +        grep -v 'No errors were found on the image'
> +}
> +
> +# Compare the bitmap of an extended L2 entry against an expected value
> +_verify_l2_bitmap()
> +{
> +    entry_no="$1"            # L2 entry number, starting from 0
> +    expected_alloc="$alloc"  # Space-separated list of allocated subcluster indexes
> +    expected_zero="$zero"    # Space-separated list of zero subcluster indexes
> +
> +    offset=$(($l2_offset + $entry_no * 16))
> +    entry=$(peek_file_be "$TEST_IMG" $offset 8)
> +    offset=$(($offset + 8))
> +    bitmap=$(peek_file_be "$TEST_IMG" $offset 8)
> +
> +    expected_bitmap=0
> +    for bit in $expected_alloc; do
> +        expected_bitmap=$(($expected_bitmap | (1 << $bit)))
> +    done
> +    for bit in $expected_zero; do
> +        expected_bitmap=$(($expected_bitmap | (1 << (32 + $bit))))
> +    done
> +    printf -v expected_bitmap "%llu" $expected_bitmap # Convert to unsigned

Does the length modifier “ll” actually do anything?

> +
> +    printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"

Or the “l” here?

> +    if [ "$bitmap" != "$expected_bitmap" ]; then
> +        printf "ERROR: expecting bitmap       0x%016lx\n" "$expected_bitmap"

(or here)

> +    fi
> +}
Apart from those nit picks, I didn’t find anything to complain about.
My brain feels like mush now, though, after having brooded over this
test for over an hour...

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit
  2020-07-02 15:13   ` Max Reitz
@ 2020-07-03 12:43     ` Alberto Garcia
  0 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-07-03 12:43 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Thu 02 Jul 2020 05:13:02 PM CEST, Max Reitz wrote:
>> +        } else if (!strcmp(desc->name, BLOCK_OPT_EXTL2)) {
>> +            bool extended_l2 = qemu_opt_get_bool(opts, BLOCK_OPT_EXTL2,
>> +                                                 has_subclusters(s));
>> +            if (extended_l2 != has_subclusters(s)) {
>> +                error_setg(errp, "Toggling extended L2 entries "
>> +                           "is not supported");
>> +                return -EINVAL;
>
> Just a note: I think ENOTSUP would fit better.
>
> (Then again, Maxim’s “block/amend: refactor qcow2 amend options”
> removes all of this code anyway.)

I can still change that :)

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries
  2020-07-03  9:49   ` Max Reitz
@ 2020-07-03 13:06     ` Alberto Garcia
  2020-07-03 13:47       ` Max Reitz
  2020-07-06 13:57       ` Eric Blake
  0 siblings, 2 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-07-03 13:06 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Fri 03 Jul 2020 11:49:14 AM CEST, Max Reitz wrote:
>> +_supported_fmt qcow2
>> +_supported_proto file nfs
>> +_supported_os Linux
>> +_unsupported_imgopts extended_l2 compat=0.10 cluster_size data_file
>
> I’d also add a 'refcount_bits=1[^0-9]', because this test doesn’t pass
> with refcount-bits=1 (due to taking a snapshot at one point).

Ok

>> +    expected_bitmap=0
>> +    for bit in $expected_alloc; do
>> +        expected_bitmap=$(($expected_bitmap | (1 << $bit)))
>> +    done
>> +    for bit in $expected_zero; do
>> +        expected_bitmap=$(($expected_bitmap | (1 << (32 + $bit))))
>> +    done
>> +    printf -v expected_bitmap "%llu" $expected_bitmap # Convert to unsigned
>
> Does the length modifier “ll” actually do anything?
>
>> +
>> +    printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"
>
> Or the “l” here?

Actually they don't (I just tested in i386 and x86_64), I assumed that
it would require the length modifiers like in C.

I'm tempted to leave them for clarity (using 'll' in both cases),
opinions?

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries
  2020-07-03 13:06     ` Alberto Garcia
@ 2020-07-03 13:47       ` Max Reitz
  2020-07-03 15:20         ` Alberto Garcia
  2020-07-06 13:57       ` Eric Blake
  1 sibling, 1 reply; 71+ messages in thread
From: Max Reitz @ 2020-07-03 13:47 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

[-- Attachment #1.1: Type: text/plain, Size: 1185 bytes --]

On 03.07.20 15:06, Alberto Garcia wrote:
> On Fri 03 Jul 2020 11:49:14 AM CEST, Max Reitz wrote:

[...]

>>> +    expected_bitmap=0
>>> +    for bit in $expected_alloc; do
>>> +        expected_bitmap=$(($expected_bitmap | (1 << $bit)))
>>> +    done
>>> +    for bit in $expected_zero; do
>>> +        expected_bitmap=$(($expected_bitmap | (1 << (32 + $bit))))
>>> +    done
>>> +    printf -v expected_bitmap "%llu" $expected_bitmap # Convert to unsigned
>>
>> Does the length modifier “ll” actually do anything?
>>
>>> +
>>> +    printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"
>>
>> Or the “l” here?
> 
> Actually they don't (I just tested in i386 and x86_64), I assumed that
> it would require the length modifiers like in C.
> 
> I'm tempted to leave them for clarity (using 'll' in both cases),
> opinions?

I don’t mind, although at least zsh’s printf doesn’t seem to support
them all:

$ printf %lli 42
printf: %ll: invalid directive

So it doesn’t seem portable to me.  But then again I think we run all of
our tests in bash, and in any case, Eric’s the expert on shell
compatibility. :)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries
  2020-07-03 13:47       ` Max Reitz
@ 2020-07-03 15:20         ` Alberto Garcia
  0 siblings, 0 replies; 71+ messages in thread
From: Alberto Garcia @ 2020-07-03 15:20 UTC (permalink / raw)
  To: Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On Fri 03 Jul 2020 03:47:41 PM CEST, Max Reitz wrote:
>>>> +    printf -v expected_bitmap "%llu" $expected_bitmap # Convert to unsigned
>>>
>>> Does the length modifier “ll” actually do anything?
>>>
>>>> +
>>>> +    printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"
>>>
>>> Or the “l” here?
>> 
>> Actually they don't (I just tested in i386 and x86_64), I assumed
>> that it would require the length modifiers like in C.
>> 
>> I'm tempted to leave them for clarity (using 'll' in both cases),
>> opinions?
>
> I don’t mind, although at least zsh’s printf doesn’t seem to support
> them all:
>
> $ printf %lli 42
> printf: %ll: invalid directive

I went to the bash source code, and it seems that the length modifiers
are simply skipped:

   https://git.savannah.gnu.org/cgit/bash.git/tree/builtins/printf.def#n412

I'll remove them then.

Berto


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries
  2020-07-03 13:06     ` Alberto Garcia
  2020-07-03 13:47       ` Max Reitz
@ 2020-07-06 13:57       ` Eric Blake
  1 sibling, 0 replies; 71+ messages in thread
From: Eric Blake @ 2020-07-06 13:57 UTC (permalink / raw)
  To: Alberto Garcia, Max Reitz, qemu-devel
  Cc: Kevin Wolf, Derek Su, Vladimir Sementsov-Ogievskiy, qemu-block

On 7/3/20 8:06 AM, Alberto Garcia wrote:

>>> +    printf -v expected_bitmap "%llu" $expected_bitmap # Convert to unsigned
>>
>> Does the length modifier “ll” actually do anything?
>>
>>> +
>>> +    printf "L2 entry #%d: 0x%016lx %016lx\n" "$entry_no" "$entry" "$bitmap"
>>
>> Or the “l” here?
> 
> Actually they don't (I just tested in i386 and x86_64), I assumed that
> it would require the length modifiers like in C.
> 
> I'm tempted to leave them for clarity (using 'll' in both cases),
> opinions?

POSIX doesn't require support for modifiers; %d and %lld are the same 
even in 32-bit bash.  If this is a #!/bin/bash script, then use it for 
clarity; if it is #!/bin/sh, omit it for portability.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, back to index

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-28 11:02 [PATCH v9 00/34] Add subcluster allocation to qcow2 Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 01/34] qcow2: Make Qcow2AioTask store the full host offset Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 02/34] qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset() Alberto Garcia
2020-06-30 10:19   ` Max Reitz
2020-06-30 10:27     ` Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 03/34] qcow2: Add calculate_l2_meta() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 04/34] qcow2: Split cluster_needs_cow() out of count_cow_clusters() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 05/34] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied() Alberto Garcia
2020-06-30 10:38   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 06/34] qcow2: Add get_l2_entry() and set_l2_entry() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 07/34] qcow2: Document the Extended L2 Entries feature Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 08/34] qcow2: Add dummy has_subclusters() function Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 09/34] qcow2: Add subcluster-related fields to BDRVQcow2State Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 10/34] qcow2: Add offset_to_sc_index() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 11/34] qcow2: Add offset_into_subcluster() and size_to_subclusters() Alberto Garcia
2020-07-01 12:23   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 12/34] qcow2: Add l2_entry_size() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 13/34] qcow2: Update get/set_l2_entry() and add get/set_l2_bitmap() Alberto Garcia
2020-07-01 12:28   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 14/34] qcow2: Add QCow2SubclusterType and qcow2_get_subcluster_type() Alberto Garcia
2020-07-01 12:52   ` Max Reitz
2020-07-01 16:26     ` Alberto Garcia
2020-07-02  9:57       ` Max Reitz
2020-07-02 22:00         ` Alberto Garcia
2020-07-03  7:17           ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 15/34] qcow2: Add qcow2_get_subcluster_range_type() Alberto Garcia
2020-07-01 13:37   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 16/34] qcow2: Add qcow2_cluster_is_allocated() Alberto Garcia
2020-07-01 13:55   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 17/34] qcow2: Add cluster type parameter to qcow2_get_host_offset() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 18/34] qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_* Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 19/34] qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 20/34] qcow2: Add subcluster support to calculate_l2_meta() Alberto Garcia
2020-07-02 11:30   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 21/34] qcow2: Add subcluster support to qcow2_get_host_offset() Alberto Garcia
2020-07-02 12:46   ` Max Reitz
2020-07-02 22:04     ` Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 22/34] qcow2: Add subcluster support to zero_in_l2_slice() Alberto Garcia
2020-07-02 12:56   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 23/34] qcow2: Add subcluster support to discard_in_l2_slice() Alberto Garcia
2020-07-02 13:24   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 24/34] qcow2: Add subcluster support to check_refcounts_l2() Alberto Garcia
2020-07-02 13:32   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 25/34] qcow2: Update L2 bitmap in qcow2_alloc_cluster_link_l2() Alberto Garcia
2020-07-02 14:01   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 26/34] qcow2: Clear the L2 bitmap when allocating a compressed cluster Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 27/34] qcow2: Add subcluster support to handle_alloc_space() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 28/34] qcow2: Add subcluster support to qcow2_co_pwrite_zeroes() Alberto Garcia
2020-07-02 14:28   ` Max Reitz
2020-07-02 22:40     ` Alberto Garcia
2020-07-03  7:18       ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 29/34] qcow2: Add subcluster support to qcow2_measure() Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 30/34] qcow2: Add prealloc field to QCowL2Meta Alberto Garcia
2020-07-02 14:50   ` Max Reitz
2020-07-02 14:58     ` Eric Blake
2020-07-02 15:09       ` Max Reitz
2020-07-02 23:05         ` Alberto Garcia
2020-07-03  7:22           ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 31/34] qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit Alberto Garcia
2020-07-02 15:13   ` Max Reitz
2020-07-03 12:43     ` Alberto Garcia
2020-06-28 11:02 ` [PATCH v9 32/34] qcow2: Allow preallocation and backing files if extended_l2 is set Alberto Garcia
2020-07-03  7:45   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 33/34] qcow2: Assert that expand_zero_clusters_in_l1() does not support subclusters Alberto Garcia
2020-07-03  7:46   ` Max Reitz
2020-06-28 11:02 ` [PATCH v9 34/34] iotests: Add tests for qcow2 images with extended L2 entries Alberto Garcia
2020-07-03  9:49   ` Max Reitz
2020-07-03 13:06     ` Alberto Garcia
2020-07-03 13:47       ` Max Reitz
2020-07-03 15:20         ` Alberto Garcia
2020-07-06 13:57       ` Eric Blake

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git