All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata
@ 2013-07-03 14:34 Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed Paolo Bonzini
                   ` (17 more replies)
  0 siblings, 18 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

This series adds a subcommand to "map" that can dump file metadata.
Metadata that is dumped includes:

- whether blocks are allocated in bs->file and, if so, where

- whether blocks are zero

- whether data is read from bs or bs->backing_hd

Metadata is dumped for an entire chain of images.  One example usage is
(for non-compressed, non-encrypted images) to transform the metadata
into a Linux device-mapper setup, and make a qcow2 image available (for
read only) as a block device.  Another possible usage is to determine
the used areas of a file, and convert it in place to another format.
Alternatively, the richer data can be used by block jobs to quickly
determine if a block is zero without reading it.

The patches implement a new operation, bdrv_co_get_block_status, that
supersedes bdrv_co_is_allocated.  The bdrv_is_allocated API is still
available of course, and implemented on top of the new operation.

Patches 1-3 touch cow, which uses bdrv_co_is_allocated during its reads
and writes.  I optimized it a bit because it was unbearably slow during
testing.  With these patches (also tested with blkverify), booting of
a cow guest image is not particularly slow.

Patches 4 to 6 clean up the bdrv_is_allocated and bdrv_is_allocated_above
implementation, eliminating some code duplication.

Patches 7 and 8 tweak bdrv_has_zero_init and its usage in qemu-img in
a way that helps when implementing the new API.

Patches 9 to 11 implement bdrv_get_block_status and change the formats
to report all the available information.

Patch 12 adds the "map" subcommand to qemu-img.

Finally, patches 13 to 17 tweak the implementation to extract more
information from protocols, and combine this information with format
metadata whenever possible.

Paolo

Paolo Bonzini (17):
  cow: make reads go at a decent speed
  cow: make writes go at a less indecent speed
  cow: do not call bdrv_co_is_allocated
  block: make bdrv_co_is_allocated static
  block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above
    distinction
  block: expect errors from bdrv_co_is_allocated
  qemu-img: always probe the input image for allocated sectors
  block: make bdrv_has_zero_init return false for copy-on-write-images
  block: introduce bdrv_get_block_status API
  block: define get_block_status return value
  block: return get_block_status data and flags for formats
  qemu-img: add a "map" subcommand
  block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
  raw-posix: return get_block_status data and flags
  raw-posix: detect XFS unwritten extents
  block: add default get_block_status implementation for protocols
  block: look for zero blocks in bs->file

 block.c                   | 134 +++++++++++++--------------
 block/commit.c            |   6 +-
 block/cow.c               |  90 +++++++++++++------
 block/mirror.c            |   4 +-
 block/qcow.c              |  13 ++-
 block/qcow2.c             |  24 +++--
 block/qed.c               |  39 ++++++--
 block/raw-posix.c         |  24 +++--
 block/raw.c               |   6 +-
 block/sheepdog.c          |  14 +--
 block/stream.c            |  10 +--
 block/vdi.c               |  17 +++-
 block/vmdk.c              |  20 ++++-
 block/vvfat.c             |  15 ++--
 include/block/block.h     |  31 +++++--
 include/block/block_int.h |   2 +-
 qemu-img-cmds.hx          |   6 ++
 qemu-img.c                | 225 +++++++++++++++++++++++++++++++++++++++++-----
 18 files changed, 497 insertions(+), 183 deletions(-)

-- 
1.8.2.1

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-04  2:20   ` Fam Zheng
  2013-07-05  9:09   ` Stefan Hajnoczi
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed Paolo Bonzini
                   ` (16 subsequent siblings)
  17 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Do not do two reads for each sector; load each sector of the bitmap
and use bitmap operations to process it.

Writes are still dog slow!

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/cow.c | 54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/block/cow.c b/block/cow.c
index 1cc2e89..204451e 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -126,18 +126,31 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
     return 0;
 }
 
-static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
+#define BITS_PER_BITMAP_SECTOR (512 * 8)
+
+/* Cannot use bitmap.c on big-endian machines.  */
+static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
 {
-    uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
-    uint8_t bitmap;
-    int ret;
+    return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
+}
 
-    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
-    if (ret < 0) {
-       return ret;
+static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
+{
+    int streak_value = value ? 0xFF : 0;
+    int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
+    int bitnum = start;
+    while (bitnum < last) {
+        if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
+            bitnum += 8;
+            continue;
+        }
+        if (cow_test_bit(bitnum, bitmap) == value) {
+            bitnum++;
+            continue;
+        }
+        break;
     }
-
-    return !!(bitmap & (1 << (bitnum % 8)));
+    return MIN(bitnum, last) - start;
 }
 
 /* Return true if first block has been changed (ie. current version is
@@ -146,23 +159,20 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
 static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *num_same)
 {
+    int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
+    uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
+    uint8_t bitmap[512];
+    int ret;
     int changed;
 
-    if (nb_sectors == 0) {
-	*num_same = nb_sectors;
-	return 0;
-    }
-
-    changed = is_bit_set(bs, sector_num);
-    if (changed < 0) {
-        return 0; /* XXX: how to return I/O errors? */
-    }
-
-    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
-	if (is_bit_set(bs, sector_num + *num_same) != changed)
-	    break;
+    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
+    if (ret < 0) {
+        return ret;
     }
 
+    bitnum &= BITS_PER_BITMAP_SECTOR - 1;
+    changed = cow_test_bit(bitnum, bitmap);
+    *num_same = cow_find_streak(bitmap, changed, bitnum, nb_sectors);
     return changed;
 }
 
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-04  2:40   ` Fam Zheng
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 03/17] cow: do not call bdrv_co_is_allocated Paolo Bonzini
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Only sync once per write, rather than once per sector.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/cow.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/block/cow.c b/block/cow.c
index 204451e..133e596 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -106,7 +106,7 @@ static int cow_open(BlockDriverState *bs, QDict *options, int flags)
  * XXX(hch): right now these functions are extremely inefficient.
  * We should just read the whole bitmap we'll need in one go instead.
  */
-static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
+static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum, bool *first)
 {
     uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
     uint8_t bitmap;
@@ -117,9 +117,21 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
        return ret;
     }
 
+    if (bitmap & (1 << (bitnum % 8))) {
+        return 0;
+    }
+
+    if (*first) {
+        ret = bdrv_flush(bs->file);
+        if (ret < 0) {
+           return ret;
+        }
+        *first = false;
+    }
+
     bitmap |= (1 << (bitnum % 8));
 
-    ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
+    ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
     if (ret < 0) {
        return ret;
     }
@@ -181,9 +193,10 @@ static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
 {
     int error = 0;
     int i;
+    bool first = true;
 
     for (i = 0; i < nb_sectors; i++) {
-        error = cow_set_bit(bs, sector_num + i);
+        error = cow_set_bit(bs, sector_num + i, &first);
         if (error) {
             break;
         }
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 03/17] cow: do not call bdrv_co_is_allocated
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 04/17] block: make bdrv_co_is_allocated static Paolo Bonzini
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

As we change bdrv_is_allocated to gather more information from bs and
bs->file, it will become a bit slower.  It is still appropriate for online
jobs, but not for reads/writes.  Call the internal function instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/cow.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/cow.c b/block/cow.c
index 133e596..9d26592 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -46,6 +46,9 @@ typedef struct BDRVCowState {
     int64_t cow_sectors_offset;
 } BDRVCowState;
 
+static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, int *num_same);
+
 static int cow_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
     const struct cow_header_v2 *cow_header = (const void *)buf;
@@ -212,7 +215,7 @@ static int coroutine_fn cow_read(BlockDriverState *bs, int64_t sector_num,
     int ret, n;
 
     while (nb_sectors > 0) {
-        if (bdrv_co_is_allocated(bs, sector_num, nb_sectors, &n)) {
+        if (cow_co_is_allocated(bs, sector_num, nb_sectors, &n)) {
             ret = bdrv_pread(bs->file,
                         s->cow_sectors_offset + sector_num * 512,
                         buf, n * 512);
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 04/17] block: make bdrv_co_is_allocated static
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 03/17] cow: do not call bdrv_co_is_allocated Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 05/17] block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction Paolo Bonzini
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

bdrv_is_allocated can detect coroutine context and go through a fast
path, similar to other block layer functions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c               | 24 +++++++++++++++---------
 block/raw.c           |  2 +-
 block/stream.c        |  4 ++--
 include/block/block.h |  2 --
 4 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index 6c493ad..8ca36e1 100644
--- a/block.c
+++ b/block.c
@@ -2508,7 +2508,7 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
     if (flags & BDRV_REQ_COPY_ON_READ) {
         int pnum;
 
-        ret = bdrv_co_is_allocated(bs, sector_num, nb_sectors, &pnum);
+        ret = bdrv_is_allocated(bs, sector_num, nb_sectors, &pnum);
         if (ret < 0) {
             goto out;
         }
@@ -2953,8 +2953,9 @@ typedef struct BdrvCoIsAllocatedData {
  * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
  * beyond the end of the disk image it will be clamped.
  */
-int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t sector_num,
-                                      int nb_sectors, int *pnum)
+static int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs,
+                                             int64_t sector_num,
+                                             int nb_sectors, int *pnum)
 {
     int64_t n;
 
@@ -3004,10 +3005,15 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
         .done = false,
     };
 
-    co = qemu_coroutine_create(bdrv_is_allocated_co_entry);
-    qemu_coroutine_enter(co, &data);
-    while (!data.done) {
-        qemu_aio_wait();
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_is_allocated_co_entry(&data);
+    } else {
+        co = qemu_coroutine_create(bdrv_is_allocated_co_entry);
+        qemu_coroutine_enter(co, &data);
+        while (!data.done) {
+            qemu_aio_wait();
+        }
     }
     return data.ret;
 }
@@ -3035,8 +3041,8 @@ int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
     intermediate = top;
     while (intermediate && intermediate != base) {
         int pnum_inter;
-        ret = bdrv_co_is_allocated(intermediate, sector_num, nb_sectors,
-                                   &pnum_inter);
+        ret = bdrv_is_allocated(intermediate, sector_num, nb_sectors,
+                                &pnum_inter);
         if (ret < 0) {
             return ret;
         } else if (ret) {
diff --git a/block/raw.c b/block/raw.c
index ce10422..df4a868 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -39,7 +39,7 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
                                             int64_t sector_num,
                                             int nb_sectors, int *pnum)
 {
-    return bdrv_co_is_allocated(bs->file, sector_num, nb_sectors, pnum);
+    return bdrv_is_allocated(bs->file, sector_num, nb_sectors, pnum);
 }
 
 static int64_t raw_getlength(BlockDriverState *bs)
diff --git a/block/stream.c b/block/stream.c
index 7fe9e48..fb1f9c3 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -115,8 +115,8 @@ wait:
             break;
         }
 
-        ret = bdrv_co_is_allocated(bs, sector_num,
-                                   STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
+        ret = bdrv_is_allocated(bs, sector_num,
+                                STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
         if (ret == 1) {
             /* Allocated in the top, no need to copy.  */
             copy = false;
diff --git a/include/block/block.h b/include/block/block.h
index dd8eca1..47a336a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -178,8 +178,6 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
  */
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
     int nb_sectors);
-int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs, int64_t sector_num,
-    int nb_sectors, int *pnum);
 int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
                                             BlockDriverState *base,
                                             int64_t sector_num,
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 05/17] block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (3 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 04/17] block: make bdrv_co_is_allocated static Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated Paolo Bonzini
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Now that bdrv_is_allocated detects coroutine context, the two can
use the same code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c               | 46 ++++------------------------------------------
 block/commit.c        |  6 +++---
 block/mirror.c        |  4 ++--
 block/stream.c        |  4 ++--
 include/block/block.h |  4 ----
 5 files changed, 11 insertions(+), 53 deletions(-)

diff --git a/block.c b/block.c
index 8ca36e1..3852130 100644
--- a/block.c
+++ b/block.c
@@ -3030,10 +3030,10 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
  *  allocated/unallocated state.
  *
  */
-int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
-                                            BlockDriverState *base,
-                                            int64_t sector_num,
-                                            int nb_sectors, int *pnum)
+int coroutine_fn bdrv_is_allocated_above(BlockDriverState *top,
+                                         BlockDriverState *base,
+                                         int64_t sector_num,
+                                         int nb_sectors, int *pnum)
 {
     BlockDriverState *intermediate;
     int ret, n = nb_sectors;
@@ -3069,44 +3069,6 @@ int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
     return 0;
 }
 
-/* Coroutine wrapper for bdrv_is_allocated_above() */
-static void coroutine_fn bdrv_is_allocated_above_co_entry(void *opaque)
-{
-    BdrvCoIsAllocatedData *data = opaque;
-    BlockDriverState *top = data->bs;
-    BlockDriverState *base = data->base;
-
-    data->ret = bdrv_co_is_allocated_above(top, base, data->sector_num,
-                                           data->nb_sectors, data->pnum);
-    data->done = true;
-}
-
-/*
- * Synchronous wrapper around bdrv_co_is_allocated_above().
- *
- * See bdrv_co_is_allocated_above() for details.
- */
-int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
-                            int64_t sector_num, int nb_sectors, int *pnum)
-{
-    Coroutine *co;
-    BdrvCoIsAllocatedData data = {
-        .bs = top,
-        .base = base,
-        .sector_num = sector_num,
-        .nb_sectors = nb_sectors,
-        .pnum = pnum,
-        .done = false,
-    };
-
-    co = qemu_coroutine_create(bdrv_is_allocated_above_co_entry);
-    qemu_coroutine_enter(co, &data);
-    while (!data.done) {
-        qemu_aio_wait();
-    }
-    return data.ret;
-}
-
 const char *bdrv_get_encrypted_filename(BlockDriverState *bs)
 {
     if (bs->backing_hd && bs->backing_hd->encrypted)
diff --git a/block/commit.c b/block/commit.c
index 2227fc2..37572f0 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -108,9 +108,9 @@ wait:
             break;
         }
         /* Copy if allocated above the base */
-        ret = bdrv_co_is_allocated_above(top, base, sector_num,
-                                         COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
-                                         &n);
+        ret = bdrv_is_allocated_above(top, base, sector_num,
+                                      COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
+                                      &n);
         copy = (ret == 1);
         trace_commit_one_iteration(s, sector_num, n, ret);
         if (copy) {
diff --git a/block/mirror.c b/block/mirror.c
index bed4a7e..0b3c2f6 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -338,8 +338,8 @@ static void coroutine_fn mirror_run(void *opaque)
         base = s->mode == MIRROR_SYNC_MODE_FULL ? NULL : bs->backing_hd;
         for (sector_num = 0; sector_num < end; ) {
             int64_t next = (sector_num | (sectors_per_chunk - 1)) + 1;
-            ret = bdrv_co_is_allocated_above(bs, base,
-                                             sector_num, next - sector_num, &n);
+            ret = bdrv_is_allocated_above(bs, base,
+                                          sector_num, next - sector_num, &n);
 
             if (ret < 0) {
                 goto immediate_exit;
diff --git a/block/stream.c b/block/stream.c
index fb1f9c3..802acae 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -123,8 +123,8 @@ wait:
         } else {
             /* Copy if allocated in the intermediate images.  Limit to the
              * known-unallocated area [sector_num, sector_num+n).  */
-            ret = bdrv_co_is_allocated_above(bs->backing_hd, base,
-                                             sector_num, n, &n);
+            ret = bdrv_is_allocated_above(bs->backing_hd, base,
+	                                   sector_num, n, &n);
 
             /* Finish early if end of backing file has been reached */
             if (ret == 0 && n == 0) {
diff --git a/include/block/block.h b/include/block/block.h
index 47a336a..6c50bb1 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -178,10 +178,6 @@ int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
  */
 int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
     int nb_sectors);
-int coroutine_fn bdrv_co_is_allocated_above(BlockDriverState *top,
-                                            BlockDriverState *base,
-                                            int64_t sector_num,
-                                            int nb_sectors, int *pnum);
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     const char *backing_file);
 int bdrv_get_backing_file_depth(BlockDriverState *bs);
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (4 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 05/17] block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-05  9:19   ` Stefan Hajnoczi
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 07/17] qemu-img: always probe the input image for allocated sectors Paolo Bonzini
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, qemu-stable, stefanha

Some bdrv_is_allocated callers do not expect errors, but the fallback
in qcow2.c might make other callers trip on assertion failures or
infinite loops.

Fix the callers to always look for errors.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/qcow2.c  | 4 +---
 block/stream.c | 2 +-
 qemu-img.c     | 4 ++++
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 0eceefe..e2b4202 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -648,13 +648,11 @@ static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs,
     int ret;
 
     *pnum = nb_sectors;
-    /* FIXME We can get errors here, but the bdrv_co_is_allocated interface
-     * can't pass them on today */
     qemu_co_mutex_lock(&s->lock);
     ret = qcow2_get_cluster_offset(bs, sector_num << 9, pnum, &cluster_offset);
     qemu_co_mutex_unlock(&s->lock);
     if (ret < 0) {
-        *pnum = 0;
+        return ret;
     }
 
     return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO);
diff --git a/block/stream.c b/block/stream.c
index 802acae..28b702d 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -120,7 +120,7 @@ wait:
         if (ret == 1) {
             /* Allocated in the top, no need to copy.  */
             copy = false;
-        } else {
+        } else if (ret >= 0) {
             /* Copy if allocated in the intermediate images.  Limit to the
              * known-unallocated area [sector_num, sector_num+n).  */
             ret = bdrv_is_allocated_above(bs->backing_hd, base,
diff --git a/qemu-img.c b/qemu-img.c
index f8c97d3..857e2ca 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2073,6 +2073,10 @@ static int img_rebase(int argc, char **argv)
 
             /* If the cluster is allocated, we don't need to take action */
             ret = bdrv_is_allocated(bs, sector, n, &n);
+            if (ret < 0) {
+                error_report("error while reading from file");
+                goto out;
+            }
             if (ret) {
                 continue;
             }
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 07/17] qemu-img: always probe the input image for allocated sectors
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (5 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 08/17] block: make bdrv_has_zero_init return false for copy-on-write-images Paolo Bonzini
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

qemu-img convert is assuming "that sectors which are unallocated in the input
image are present in both the output's and input's base images", but it is
only doing this if the output image is zero initialized.  And checking if
the output image is zero initialized does not make much sense if the
output image is copy-on-write.  Always do the test.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu-img.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 857e2ca..eb2561f 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1476,21 +1476,19 @@ static int img_convert(int argc, char **argv)
                 n = bs_offset + bs_sectors - sector_num;
             }
 
-            if (has_zero_init) {
-                /* If the output image is being created as a copy on write image,
-                   assume that sectors which are unallocated in the input image
-                   are present in both the output's and input's base images (no
-                   need to copy them). */
-                if (out_baseimg) {
-                    if (!bdrv_is_allocated(bs[bs_i], sector_num - bs_offset,
-                                           n, &n1)) {
-                        sector_num += n1;
-                        continue;
-                    }
-                    /* The next 'n1' sectors are allocated in the input image. Copy
-                       only those as they may be followed by unallocated sectors. */
-                    n = n1;
+            /* If the output image is being created as a copy on write image,
+               assume that sectors which are unallocated in the input image
+               are present in both the output's and input's base images (no
+               need to copy them). */
+            if (out_baseimg) {
+                if (!bdrv_is_allocated(bs[bs_i], sector_num - bs_offset,
+                                       n, &n1)) {
+                    sector_num += n1;
+                    continue;
                 }
+                /* The next 'n1' sectors are allocated in the input image. Copy
+                   only those as they may be followed by unallocated sectors. */
+                n = n1;
             } else {
                 n1 = n;
             }
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 08/17] block: make bdrv_has_zero_init return false for copy-on-write-images
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (6 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 07/17] qemu-img: always probe the input image for allocated sectors Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 09/17] block: introduce bdrv_get_block_status API Paolo Bonzini
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

This helps implementing is_allocated on top of get_block_status.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c    | 5 +++++
 qemu-img.c | 9 +--------
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index 3852130..ab7d0bc 100644
--- a/block.c
+++ b/block.c
@@ -2920,6 +2920,11 @@ int bdrv_has_zero_init(BlockDriverState *bs)
 {
     assert(bs->drv);
 
+    /* If BS is a copy on write image, it is initialized to
+       the contents of the base image, which may not be zeroes.  */
+    if (bs->backing_hd) {
+        return 0;
+    }
     if (bs->drv->bdrv_has_zero_init) {
         return bs->drv->bdrv_has_zero_init(bs);
     }
diff --git a/qemu-img.c b/qemu-img.c
index eb2561f..6461932 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1504,14 +1504,7 @@ static int img_convert(int argc, char **argv)
                should add a specific call to have the info to go faster */
             buf1 = buf;
             while (n > 0) {
-                /* If the output image is being created as a copy on write image,
-                   copy all sectors even the ones containing only NUL bytes,
-                   because they may differ from the sectors in the base image.
-
-                   If the output is to a host device, we also write out
-                   sectors that are entirely 0, since whatever data was
-                   already there is garbage, not 0s. */
-                if (!has_zero_init || out_baseimg ||
+                if (!has_zero_init ||
                     is_allocated_sectors_min(buf1, n, &n1, min_sparse)) {
                     ret = bdrv_write(out_bs, sector_num, buf1, n1);
                     if (ret < 0) {
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 09/17] block: introduce bdrv_get_block_status API
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (7 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 08/17] block: make bdrv_has_zero_init return false for copy-on-write-images Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 10/17] block: define get_block_status return value Paolo Bonzini
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.

This also touches all block drivers with a mostly mechanical rename.  The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c                   | 46 ++++++++++++++++++++++++++--------------------
 block/cow.c               |  8 +++++++-
 block/qcow.c              |  4 ++--
 block/qcow2.c             |  4 ++--
 block/qed.c               |  4 ++--
 block/raw-posix.c         |  4 ++--
 block/raw.c               |  6 +++---
 block/sheepdog.c          | 12 ++++++------
 block/vdi.c               |  4 ++--
 block/vmdk.c              |  4 ++--
 block/vvfat.c             |  4 ++--
 include/block/block.h     |  2 ++
 include/block/block_int.h |  2 +-
 13 files changed, 59 insertions(+), 45 deletions(-)

diff --git a/block.c b/block.c
index ab7d0bc..aa1a5f7 100644
--- a/block.c
+++ b/block.c
@@ -2933,15 +2933,15 @@ int bdrv_has_zero_init(BlockDriverState *bs)
     return 0;
 }
 
-typedef struct BdrvCoIsAllocatedData {
+typedef struct BdrvCoGetBlockStatusData {
     BlockDriverState *bs;
     BlockDriverState *base;
     int64_t sector_num;
     int nb_sectors;
     int *pnum;
-    int ret;
+    int64_t ret;
     bool done;
-} BdrvCoIsAllocatedData;
+} BdrvCoGetBlockStatusData;
 
 /*
  * Returns true iff the specified sector is present in the disk image. Drivers
@@ -2958,9 +2958,9 @@ typedef struct BdrvCoIsAllocatedData {
  * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
  * beyond the end of the disk image it will be clamped.
  */
-static int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs,
-                                             int64_t sector_num,
-                                             int nb_sectors, int *pnum)
+static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
+                                                     int64_t sector_num,
+                                                     int nb_sectors, int *pnum)
 {
     int64_t n;
 
@@ -2974,35 +2974,35 @@ static int coroutine_fn bdrv_co_is_allocated(BlockDriverState *bs,
         nb_sectors = n;
     }
 
-    if (!bs->drv->bdrv_co_is_allocated) {
+    if (!bs->drv->bdrv_co_get_block_status) {
         *pnum = nb_sectors;
         return 1;
     }
 
-    return bs->drv->bdrv_co_is_allocated(bs, sector_num, nb_sectors, pnum);
+    return bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
 }
 
-/* Coroutine wrapper for bdrv_is_allocated() */
-static void coroutine_fn bdrv_is_allocated_co_entry(void *opaque)
+/* Coroutine wrapper for bdrv_get_block_status() */
+static void coroutine_fn bdrv_get_block_status_co_entry(void *opaque)
 {
-    BdrvCoIsAllocatedData *data = opaque;
+    BdrvCoGetBlockStatusData *data = opaque;
     BlockDriverState *bs = data->bs;
 
-    data->ret = bdrv_co_is_allocated(bs, data->sector_num, data->nb_sectors,
-                                     data->pnum);
+    data->ret = bdrv_co_get_block_status(bs, data->sector_num, data->nb_sectors,
+                                         data->pnum);
     data->done = true;
 }
 
 /*
- * Synchronous wrapper around bdrv_co_is_allocated().
+ * Synchronous wrapper around bdrv_co_get_block_status().
  *
- * See bdrv_co_is_allocated() for details.
+ * See bdrv_co_get_block_status() for details.
  */
-int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                      int *pnum)
+int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
+                              int nb_sectors, int *pnum)
 {
     Coroutine *co;
-    BdrvCoIsAllocatedData data = {
+    BdrvCoGetBlockStatusData data = {
         .bs = bs,
         .sector_num = sector_num,
         .nb_sectors = nb_sectors,
@@ -3012,9 +3012,9 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
 
     if (qemu_in_coroutine()) {
         /* Fast-path if already in coroutine context */
-        bdrv_is_allocated_co_entry(&data);
+        bdrv_get_block_status_co_entry(&data);
     } else {
-        co = qemu_coroutine_create(bdrv_is_allocated_co_entry);
+        co = qemu_coroutine_create(bdrv_get_block_status_co_entry);
         qemu_coroutine_enter(co, &data);
         while (!data.done) {
             qemu_aio_wait();
@@ -3023,6 +3023,12 @@ int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
     return data.ret;
 }
 
+int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
+                                   int nb_sectors, int *pnum)
+{
+    return bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
+}
+
 /*
  * Given an image chain: ... -> [BASE] -> [INTER1] -> [INTER2] -> [TOP]
  *
diff --git a/block/cow.c b/block/cow.c
index 9d26592..efe5ead 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -191,6 +191,12 @@ static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
     return changed;
 }
 
+static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, int *num_same)
+{
+    return cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
+}
+
 static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
         int nb_sectors)
 {
@@ -370,7 +376,7 @@ static BlockDriver bdrv_cow = {
 
     .bdrv_read              = cow_co_read,
     .bdrv_write             = cow_co_write,
-    .bdrv_co_is_allocated   = cow_co_is_allocated,
+    .bdrv_co_get_block_status   = cow_co_get_block_status,
 
     .create_options = cow_create_options,
 };
diff --git a/block/qcow.c b/block/qcow.c
index 5239bd6..acd1aeb 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -395,7 +395,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
     return cluster_offset;
 }
 
-static int coroutine_fn qcow_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum)
 {
     BDRVQcowState *s = bs->opaque;
@@ -896,7 +896,7 @@ static BlockDriver bdrv_qcow = {
 
     .bdrv_co_readv          = qcow_co_readv,
     .bdrv_co_writev         = qcow_co_writev,
-    .bdrv_co_is_allocated   = qcow_co_is_allocated,
+    .bdrv_co_get_block_status   = qcow_co_get_block_status,
 
     .bdrv_set_key           = qcow_set_key,
     .bdrv_make_empty        = qcow_make_empty,
diff --git a/block/qcow2.c b/block/qcow2.c
index e2b4202..d35a134 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -640,7 +640,7 @@ static int qcow2_reopen_prepare(BDRVReopenState *state,
     return 0;
 }
 
-static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum)
 {
     BDRVQcowState *s = bs->opaque;
@@ -1784,7 +1784,7 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_reopen_prepare  = qcow2_reopen_prepare,
     .bdrv_create        = qcow2_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
-    .bdrv_co_is_allocated = qcow2_co_is_allocated,
+    .bdrv_co_get_block_status = qcow2_co_get_block_status,
     .bdrv_set_key       = qcow2_set_key,
     .bdrv_make_empty    = qcow2_make_empty,
 
diff --git a/block/qed.c b/block/qed.c
index f767b05..b0978ba 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -667,7 +667,7 @@ static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t l
     }
 }
 
-static int coroutine_fn bdrv_qed_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
                                                  int64_t sector_num,
                                                  int nb_sectors, int *pnum)
 {
@@ -1575,7 +1575,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_reopen_prepare      = bdrv_qed_reopen_prepare,
     .bdrv_create              = bdrv_qed_create,
     .bdrv_has_zero_init       = bdrv_has_zero_init_1,
-    .bdrv_co_is_allocated     = bdrv_qed_co_is_allocated,
+    .bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
     .bdrv_make_empty          = bdrv_qed_make_empty,
     .bdrv_aio_readv           = bdrv_qed_aio_readv,
     .bdrv_aio_writev          = bdrv_qed_aio_writev,
diff --git a/block/raw-posix.c b/block/raw-posix.c
index ba721d3..dbc65b0 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1084,7 +1084,7 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
  * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
  * beyond the end of the disk image it will be clamped.
  */
-static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
                                             int64_t sector_num,
                                             int nb_sectors, int *pnum)
 {
@@ -1200,7 +1200,7 @@ static BlockDriver bdrv_file = {
     .bdrv_close = raw_close,
     .bdrv_create = raw_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
-    .bdrv_co_is_allocated = raw_co_is_allocated,
+    .bdrv_co_get_block_status = raw_co_get_block_status,
 
     .bdrv_aio_readv = raw_aio_readv,
     .bdrv_aio_writev = raw_aio_writev,
diff --git a/block/raw.c b/block/raw.c
index df4a868..1c7380d 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -35,11 +35,11 @@ static void raw_close(BlockDriverState *bs)
 {
 }
 
-static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
                                             int64_t sector_num,
                                             int nb_sectors, int *pnum)
 {
-    return bdrv_is_allocated(bs->file, sector_num, nb_sectors, pnum);
+    return bdrv_get_block_status(bs->file, sector_num, nb_sectors, pnum);
 }
 
 static int64_t raw_getlength(BlockDriverState *bs)
@@ -127,7 +127,7 @@ static BlockDriver bdrv_raw = {
 
     .bdrv_co_readv          = raw_co_readv,
     .bdrv_co_writev         = raw_co_writev,
-    .bdrv_co_is_allocated   = raw_co_is_allocated,
+    .bdrv_co_get_block_status   = raw_co_get_block_status,
     .bdrv_co_discard        = raw_co_discard,
 
     .bdrv_probe         = raw_probe,
diff --git a/block/sheepdog.c b/block/sheepdog.c
index b397b5b..e7b9c22 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -2287,9 +2287,9 @@ static coroutine_fn int sd_co_discard(BlockDriverState *bs, int64_t sector_num,
     return acb->ret;
 }
 
-static coroutine_fn int
-sd_co_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                   int *pnum)
+static coroutine_fn int64_t
+sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
+                       int *pnum)
 {
     BDRVSheepdogState *s = bs->opaque;
     SheepdogInode *inode = &s->inode;
@@ -2354,7 +2354,7 @@ static BlockDriver bdrv_sheepdog = {
     .bdrv_co_writev = sd_co_writev,
     .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
     .bdrv_co_discard = sd_co_discard,
-    .bdrv_co_is_allocated = sd_co_is_allocated,
+    .bdrv_co_get_block_status = sd_co_get_block_status,
 
     .bdrv_snapshot_create   = sd_snapshot_create,
     .bdrv_snapshot_goto     = sd_snapshot_goto,
@@ -2381,7 +2381,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
     .bdrv_co_writev = sd_co_writev,
     .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
     .bdrv_co_discard = sd_co_discard,
-    .bdrv_co_is_allocated = sd_co_is_allocated,
+    .bdrv_co_get_block_status = sd_co_get_block_status,
 
     .bdrv_snapshot_create   = sd_snapshot_create,
     .bdrv_snapshot_goto     = sd_snapshot_goto,
@@ -2409,7 +2409,7 @@ static BlockDriver bdrv_sheepdog_unix = {
     .bdrv_co_writev = sd_co_writev,
     .bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
     .bdrv_co_discard = sd_co_discard,
-    .bdrv_co_is_allocated = sd_co_is_allocated,
+    .bdrv_co_get_block_status = sd_co_get_block_status,
 
     .bdrv_snapshot_create   = sd_snapshot_create,
     .bdrv_snapshot_goto     = sd_snapshot_goto,
diff --git a/block/vdi.c b/block/vdi.c
index 8a91525..7ab2567 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -470,7 +470,7 @@ static int vdi_reopen_prepare(BDRVReopenState *state,
     return 0;
 }
 
-static int coroutine_fn vdi_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum)
 {
     /* TODO: Check for too large sector_num (in bdrv_is_allocated or here). */
@@ -780,7 +780,7 @@ static BlockDriver bdrv_vdi = {
     .bdrv_reopen_prepare = vdi_reopen_prepare,
     .bdrv_create = vdi_create,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,
-    .bdrv_co_is_allocated = vdi_co_is_allocated,
+    .bdrv_co_get_block_status = vdi_co_get_block_status,
     .bdrv_make_empty = vdi_make_empty,
 
     .bdrv_read = vdi_co_read,
diff --git a/block/vmdk.c b/block/vmdk.c
index a28fb5e..e24f6c7 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -998,7 +998,7 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
     return NULL;
 }
 
-static int coroutine_fn vmdk_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum)
 {
     BDRVVmdkState *s = bs->opaque;
@@ -1773,7 +1773,7 @@ static BlockDriver bdrv_vmdk = {
     .bdrv_close     = vmdk_close,
     .bdrv_create    = vmdk_create,
     .bdrv_co_flush_to_disk  = vmdk_co_flush,
-    .bdrv_co_is_allocated   = vmdk_co_is_allocated,
+    .bdrv_co_get_block_status   = vmdk_co_get_block_status,
     .bdrv_get_allocated_file_size  = vmdk_get_allocated_file_size,
 
     .create_options = vmdk_create_options,
diff --git a/block/vvfat.c b/block/vvfat.c
index 87b0279..510a559 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2874,7 +2874,7 @@ static coroutine_fn int vvfat_co_write(BlockDriverState *bs, int64_t sector_num,
     return ret;
 }
 
-static int coroutine_fn vvfat_co_is_allocated(BlockDriverState *bs,
+static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
 	int64_t sector_num, int nb_sectors, int* n)
 {
     BDRVVVFATState* s = bs->opaque;
@@ -2981,7 +2981,7 @@ static BlockDriver bdrv_vvfat = {
 
     .bdrv_read              = vvfat_co_read,
     .bdrv_write             = vvfat_co_write,
-    .bdrv_co_is_allocated   = vvfat_co_is_allocated,
+    .bdrv_co_get_block_status = vvfat_co_get_block_status,
 };
 
 static void bdrv_vvfat_init(void)
diff --git a/include/block/block.h b/include/block/block.h
index 6c50bb1..2b50b51 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -268,6 +268,8 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_co_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors);
 int bdrv_has_zero_init_1(BlockDriverState *bs);
 int bdrv_has_zero_init(BlockDriverState *bs);
+int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
+                              int nb_sectors, int *pnum);
 int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
                       int *pnum);
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index c6ac871..8f2d2c7 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -134,7 +134,7 @@ struct BlockDriver {
         int64_t sector_num, int nb_sectors);
     int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors);
-    int coroutine_fn (*bdrv_co_is_allocated)(BlockDriverState *bs,
+    int64_t coroutine_fn (*bdrv_co_get_block_status)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *pnum);
 
     /*
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 10/17] block: define get_block_status return value
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (8 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 09/17] block: introduce bdrv_get_block_status API Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 21:04   ` Peter Lieven
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats Paolo Bonzini
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Define the return value of get_block_status.  Bits 0, 1, 2 and 8-62
are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-7
are left for future extensions.

The return code is compatible with the old is_allocated API: returning
just 0 or 1 (aka BDRV_BLOCK_DATA) will not cause any behavioral change
in clients of is_allocated.  We will return more precise information
in the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c               |  7 +++++--
 include/block/block.h | 23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index aa1a5f7..8931cac 100644
--- a/block.c
+++ b/block.c
@@ -2976,7 +2976,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     if (!bs->drv->bdrv_co_get_block_status) {
         *pnum = nb_sectors;
-        return 1;
+        return BDRV_BLOCK_DATA;
     }
 
     return bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
@@ -3026,7 +3026,10 @@ int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
 int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
                                    int nb_sectors, int *pnum)
 {
-    return bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
+    int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
+    return
+        (ret & BDRV_BLOCK_DATA) ||
+        ((ret & BDRV_BLOCK_ZERO) && !bdrv_has_zero_init(bs));
 }
 
 /*
diff --git a/include/block/block.h b/include/block/block.h
index 2b50b51..9e44bdd 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -81,6 +81,29 @@ typedef struct BlockDevOps {
 #define BDRV_SECTOR_SIZE   (1ULL << BDRV_SECTOR_BITS)
 #define BDRV_SECTOR_MASK   ~(BDRV_SECTOR_SIZE - 1)
 
+/* BDRV_BLOCK_DATA: data is read from bs->file or another file
+ * BDRV_BLOCK_ZERO: sectors read as zero
+ * BDRV_BLOCK_OFFSET_VALID: contents of sector available in bs->file at offset
+ *
+ * DATA == 0 && ZERO == 0 means that data is read from backing_hd if present.
+ *
+ * DATA ZERO OFFSET_VALID
+ *  t    t        t       sectors read as zero, bs->file is zero at offset
+ *  t    f        t       sectors read as valid from bs->file at offset
+ *  f    t        t       sectors preallocated, read as zero, bs->file not
+ *                        necessarily zero at offset
+ *  f    f        t       sectors preallocated but read from backing_hd,
+ *                        bs->file contains garbage at offset
+ *  t    t        f       sectors preallocated, read as zero, unknown offset
+ *  t    f        f       sectors read from unknown file or offset
+ *  f    t        f       not allocated or unknown offset, read as zero
+ *  f    f        f       not allocated or unknown offset, read from backing_hd
+ */
+#define BDRV_BLOCK_DATA         1
+#define BDRV_BLOCK_ZERO         2
+#define BDRV_BLOCK_OFFSET_VALID 4
+#define BDRV_BLOCK_OFFSET_MASK  BDRV_SECTOR_MASK
+
 typedef enum {
     BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
 } BlockErrorAction;
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (9 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 10/17] block: define get_block_status return value Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-04  3:22   ` Fam Zheng
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand Paolo Bonzini
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/cow.c      |  8 +++++++-
 block/qcow.c     |  9 ++++++++-
 block/qcow2.c    | 16 ++++++++++++++--
 block/qed.c      | 35 ++++++++++++++++++++++++++++-------
 block/sheepdog.c |  2 +-
 block/vdi.c      | 13 ++++++++++++-
 block/vmdk.c     | 16 +++++++++++++++-
 block/vvfat.c    | 11 ++++++-----
 8 files changed, 91 insertions(+), 19 deletions(-)

diff --git a/block/cow.c b/block/cow.c
index efe5ead..f6982d4 100644
--- a/block/cow.c
+++ b/block/cow.c
@@ -194,7 +194,13 @@ static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
 static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, int *num_same)
 {
-    return cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
+    BDRVCowState *s = bs->opaque;
+    int ret = cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
+    int64_t offset = s->cow_sectors_offset + (sector_num << BDRV_SECTOR_BITS);
+    if (ret < 0) {
+        return ret;
+    }
+    return (ret ? BDRV_BLOCK_DATA : 0) | offset | BDRV_BLOCK_OFFSET_VALID;
 }
 
 static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
diff --git a/block/qcow.c b/block/qcow.c
index acd1aeb..9aebcc3 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -410,7 +410,14 @@ static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
     if (n > nb_sectors)
         n = nb_sectors;
     *pnum = n;
-    return (cluster_offset != 0);
+    if (!cluster_offset) {
+	return 0;
+    }
+    if ((cluster_offset & QCOW_OFLAG_COMPRESSED) || s->crypt_method) {
+        return BDRV_BLOCK_DATA;
+    }
+    cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
+    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
 }
 
 static int decompress_buffer(uint8_t *out_buf, int out_buf_size,
diff --git a/block/qcow2.c b/block/qcow2.c
index d35a134..cac7ee8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -645,7 +645,8 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
 {
     BDRVQcowState *s = bs->opaque;
     uint64_t cluster_offset;
-    int ret;
+    int index_in_cluster, ret;
+    int64_t status = 0;
 
     *pnum = nb_sectors;
     qemu_co_mutex_lock(&s->lock);
@@ -655,7 +656,18 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
         return ret;
     }
 
-    return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO);
+    if (cluster_offset != 0 && ret != QCOW2_CLUSTER_COMPRESSED &&
+        !s->crypt_method) {
+        index_in_cluster = sector_num & (s->cluster_sectors - 1);
+        cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
+        status |= BDRV_BLOCK_OFFSET_VALID | cluster_offset;
+    }
+    if (ret == QCOW2_CLUSTER_ZERO) {
+        status |= BDRV_BLOCK_ZERO;
+    } else if (ret != QCOW2_CLUSTER_UNALLOCATED) {
+        status |= BDRV_BLOCK_DATA;
+    }
+    return status;
 }
 
 /* handle reading after the end of the backing file */
diff --git a/block/qed.c b/block/qed.c
index b0978ba..cf4fbed 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -652,16 +652,36 @@ static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options)
 }
 
 typedef struct {
+    BlockDriverState *bs;
     Coroutine *co;
-    int is_allocated;
+    uint64_t pos;
+    int64_t status;
     int *pnum;
 } QEDIsAllocatedCB;
 
 static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
 {
     QEDIsAllocatedCB *cb = opaque;
+    BDRVQEDState *s = cb->bs->opaque;
     *cb->pnum = len / BDRV_SECTOR_SIZE;
-    cb->is_allocated = (ret == QED_CLUSTER_FOUND || ret == QED_CLUSTER_ZERO);
+    switch (ret) {
+    case QED_CLUSTER_FOUND:
+       offset |= qed_offset_into_cluster(s, cb->pos);
+	cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
+	break;
+    case QED_CLUSTER_ZERO:
+	cb->status = BDRV_BLOCK_ZERO;
+	break;
+    case QED_CLUSTER_L2:
+    case QED_CLUSTER_L1:
+	cb->status = 0;
+	break;
+    default:
+	assert(ret < 0);
+	cb->status = ret;
+        break;
+    }
+
     if (cb->co) {
         qemu_coroutine_enter(cb->co, NULL);
     }
@@ -672,25 +692,26 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
                                                  int nb_sectors, int *pnum)
 {
     BDRVQEDState *s = bs->opaque;
-    uint64_t pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
     size_t len = (size_t)nb_sectors * BDRV_SECTOR_SIZE;
     QEDIsAllocatedCB cb = {
-        .is_allocated = -1,
+        .bs = bs,
+        .pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
+        .status = BDRV_BLOCK_OFFSET_MASK,
         .pnum = pnum,
     };
     QEDRequest request = { .l2_table = NULL };
 
-    qed_find_cluster(s, &request, pos, len, qed_is_allocated_cb, &cb);
+    qed_find_cluster(s, &request, cb.pos, len, qed_is_allocated_cb, &cb);
 
     /* Now sleep if the callback wasn't invoked immediately */
-    while (cb.is_allocated == -1) {
+    while (cb.status == BDRV_BLOCK_OFFSET_MASK) {
         cb.co = qemu_coroutine_self();
         qemu_coroutine_yield();
     }
 
     qed_unref_l2_cache_entry(request.l2_table);
 
-    return cb.is_allocated;
+    return cb.status;
 }
 
 static int bdrv_qed_make_empty(BlockDriverState *bs)
diff --git a/block/sheepdog.c b/block/sheepdog.c
index e7b9c22..3dca26a 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -2297,7 +2297,7 @@ sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
                   end = DIV_ROUND_UP((sector_num + nb_sectors) *
                                      BDRV_SECTOR_SIZE, SD_DATA_OBJ_SIZE);
     unsigned long idx;
-    int ret = 1;
+    int64_t ret = BDRV_BLOCK_DATA;
 
     for (idx = start; idx < end; idx++) {
         if (inode->data_vdi_id[idx] == 0) {
diff --git a/block/vdi.c b/block/vdi.c
index 7ab2567..e96b5b3 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -479,12 +479,23 @@ static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
     size_t sector_in_block = sector_num % s->block_sectors;
     int n_sectors = s->block_sectors - sector_in_block;
     uint32_t bmap_entry = le32_to_cpu(s->bmap[bmap_index]);
+    uint64_t offset;
+    int result;
+
     logout("%p, %" PRId64 ", %d, %p\n", bs, sector_num, nb_sectors, pnum);
     if (n_sectors > nb_sectors) {
         n_sectors = nb_sectors;
     }
     *pnum = n_sectors;
-    return VDI_IS_ALLOCATED(bmap_entry);
+    result = VDI_IS_ALLOCATED(bmap_entry);
+    if (!result) {
+	return 0;
+    }
+
+    offset = s->header.offset_data +
+                              (uint64_t)bmap_entry * s->block_size +
+                              sector_in_block * SECTOR_SIZE;
+    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
 }
 
 static int vdi_co_read(BlockDriverState *bs,
diff --git a/block/vmdk.c b/block/vmdk.c
index e24f6c7..f829620 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1015,7 +1015,21 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
                             sector_num * 512, 0, &offset);
     qemu_co_mutex_unlock(&s->lock);
 
-    ret = (ret == VMDK_OK || ret == VMDK_ZEROED);
+    switch (ret) {
+    case VMDK_ERROR:
+	ret = -EIO;
+	break;
+    case VMDK_UNALLOC:
+	ret = 0;
+	break;
+    case VMDK_ZEROED:
+	ret = BDRV_BLOCK_ZERO;
+	break;
+    case VMDK_OK:
+	/* TODO: might return offset if the extents are in bs->file.  */
+	ret = BDRV_BLOCK_DATA;
+	break;
+    }
 
     index_in_cluster = sector_num % extent->cluster_sectors;
     n = extent->cluster_sectors - index_in_cluster;
diff --git a/block/vvfat.c b/block/vvfat.c
index 510a559..415fba3 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2879,11 +2879,12 @@ static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
 {
     BDRVVVFATState* s = bs->opaque;
     *n = s->sector_count - sector_num;
-    if (*n > nb_sectors)
-	*n = nb_sectors;
-    else if (*n < 0)
-	return 0;
-    return 1;
+    if (*n > nb_sectors) {
+        *n = nb_sectors;
+    } else if (*n < 0) {
+        return 0;
+    }
+    return BDRV_BLOCK_DATA;
 }
 
 static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (10 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-04  5:34   ` Fam Zheng
                     ` (2 more replies)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 13/17] block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO Paolo Bonzini
                   ` (5 subsequent siblings)
  17 siblings, 3 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

This command dumps the metadata of an entire chain, in either tabular or JSON
format.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qemu-img-cmds.hx |   6 ++
 qemu-img.c       | 186 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 192 insertions(+)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 4ca7e95..acc3a1c 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -45,6 +45,12 @@ STEXI
 @item info [-f @var{fmt}] [--output=@var{ofmt}] [--backing-chain] @var{filename}
 ETEXI
 
+DEF("map", img_map,
+    "map [-f fmt] [--output=ofmt] filename")
+STEXI
+@item map [-f @var{fmt}] [--output=@var{ofmt}] filename
+ETEXI
+
 DEF("snapshot", img_snapshot,
     "snapshot [-q] [-l | -a snapshot | -c snapshot | -d snapshot] filename")
 STEXI
diff --git a/qemu-img.c b/qemu-img.c
index 6461932..97392a5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1768,6 +1768,192 @@ static int img_info(int argc, char **argv)
     return 0;
 }
 
+
+typedef struct MapEntry {
+    int flags;
+    int depth;
+    int64_t start;
+    int64_t length;
+    int64_t offset;
+} MapEntry;
+
+static void dump_map_entry(OutputFormat output_format, MapEntry *e,
+                           MapEntry *next)
+{
+    switch (output_format) {
+    case OFORMAT_HUMAN:
+        if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) {
+            printf("%lld %lld %d %lld\n",
+                   (long long) e->start, (long long) e->length,
+                   e->depth, (long long) e->offset);
+        }
+        /* This format ignores the distinction between 0, ZERO and ZERO|DATA.
+         * Modify the flags here to allow more coalescing.
+         */
+        if (next &&
+            (next->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) != BDRV_BLOCK_DATA) {
+            next->flags &= ~BDRV_BLOCK_DATA;
+            next->flags |= BDRV_BLOCK_ZERO;
+        }
+        break;
+    case OFORMAT_JSON:
+        printf("%s{ 'depth': %d, 'start': %lld, 'length': %lld, "
+               "'zero': %s, 'data': %s",
+               (e->start == 0 ? "[" : ",\n"),
+               e->depth, (long long) e->start, (long long) e->length,
+               (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
+               (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
+        if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
+            printf(", 'offset': %lld", (long long) e->offset);
+        }
+	putchar('}');
+
+        if (!next) {
+            printf("]\n");
+        }
+        break;
+    }
+}
+
+static int64_t get_block_status(BlockDriverState *bs, int64_t sector_num,
+                                int *nb_sectors, int *depth)
+{
+    int64_t ret;
+
+    /* As an optimization, we could cache the current range of unallocated
+     * clusters in each file of the chain, and avoid querying the same
+     * range repeatedly.
+     */
+
+    *depth = 0;
+    for (;;) {
+        int orig_nb_sectors = *nb_sectors;
+
+        if (bs == NULL) {
+            return 0;
+        }
+        ret = bdrv_get_block_status(bs, sector_num, *nb_sectors, nb_sectors);
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (ret & (BDRV_BLOCK_ZERO|BDRV_BLOCK_DATA)) {
+            return ret;
+        }
+        if (!*nb_sectors) {
+            /* Beyond the end of this image (while presumably within the
+             * range of the BlockDriverState above this one).  The extra
+             * data is read as zeroes.
+             */
+            *nb_sectors = orig_nb_sectors;
+            return BDRV_BLOCK_ZERO;
+        }
+
+        bs = bs->backing_hd;
+        (*depth)++;
+    }
+}
+
+static int img_map(int argc, char **argv)
+{
+    int c;
+    OutputFormat output_format = OFORMAT_HUMAN;
+    BlockDriverState *bs;
+    const char *filename, *fmt, *output;
+    int64_t length;
+    MapEntry curr = { .length = 0 }, next;
+
+    fmt = NULL;
+    output = NULL;
+    for(;;) {
+        int option_index = 0;
+        static const struct option long_options[] = {
+            {"help", no_argument, 0, 'h'},
+            {"format", required_argument, 0, 'f'},
+            {"output", required_argument, 0, OPTION_OUTPUT},
+            {0, 0, 0, 0}
+        };
+        c = getopt_long(argc, argv, "f:h",
+                        long_options, &option_index);
+        if (c == -1) {
+            break;
+        }
+        switch(c) {
+        case '?':
+        case 'h':
+            help();
+            break;
+        case 'f':
+            fmt = optarg;
+            break;
+        case OPTION_OUTPUT:
+            output = optarg;
+            break;
+        }
+    }
+    if (optind >= argc) {
+        help();
+    }
+    filename = argv[optind++];
+
+    if (output && !strcmp(output, "json")) {
+        output_format = OFORMAT_JSON;
+    } else if (output && !strcmp(output, "human")) {
+        output_format = OFORMAT_HUMAN;
+    } else if (output) {
+        error_report("--output must be used with human or json as argument.");
+        return 1;
+    }
+
+    bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS, true, false);
+    if (!bs) {
+        return 1;
+    }
+
+    length = bdrv_getlength(bs);
+    while (curr.start + curr.length < length) {
+        int64_t nsectors_left, ret;
+        int64_t sector_num;
+        int n, depth, flags;
+
+        /* Probe up to 1 G at a time.  */
+        sector_num = (curr.start + curr.length) >> BDRV_SECTOR_BITS;
+        nsectors_left = DIV_ROUND_UP(length, BDRV_SECTOR_SIZE) - sector_num;
+        n = MAX(1 << (30 - BDRV_SECTOR_BITS), nsectors_left);
+        ret = get_block_status(bs, sector_num, &n, &depth);
+
+        if (ret < 0) {
+            error_report("Could not read file metadata: %s", strerror(-ret));
+            return 1;
+        }
+        if ((ret & BDRV_BLOCK_DATA) &&
+            !(ret & BDRV_BLOCK_OFFSET_VALID)) {
+            error_report("File contains external or compressed clusters.");
+            return 1;
+        }
+
+        flags = ret & ~BDRV_BLOCK_OFFSET_MASK;
+        ret &= BDRV_BLOCK_OFFSET_MASK;
+        if (curr.length == 0 || curr.flags != flags || curr.depth != depth ||
+            ((curr.flags & BDRV_BLOCK_OFFSET_VALID) &&
+	     curr.offset + curr.length != ret)) {
+            next.flags = flags;
+            next.depth = depth;
+            next.start = sector_num << BDRV_SECTOR_BITS;
+            next.offset = ret;
+            next.length = 0;
+            if (curr.length > 0) {
+                dump_map_entry(output_format, &curr, &next);
+            }
+            curr = next;
+        }
+        curr.length += n << BDRV_SECTOR_BITS;
+    }
+
+    dump_map_entry(output_format, &curr, NULL);
+    return 0;
+}
+
 #define SNAPSHOT_LIST   1
 #define SNAPSHOT_CREATE 2
 #define SNAPSHOT_APPLY  3
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 13/17] block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (11 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 14/17] raw-posix: return get_block_status data and flags Paolo Bonzini
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Alternatively, this could use a "discard zeroes data" flag returned
by bdrv_get_info.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 8931cac..cd371cd 100644
--- a/block.c
+++ b/block.c
@@ -2963,6 +2963,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
                                                      int nb_sectors, int *pnum)
 {
     int64_t n;
+    int64_t ret;
 
     if (sector_num >= bs->total_sectors) {
         *pnum = 0;
@@ -2979,7 +2980,12 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
         return BDRV_BLOCK_DATA;
     }
 
-    return bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
+    ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
+    if (!(ret & BDRV_BLOCK_DATA) && bdrv_has_zero_init(bs)) {
+        ret |= BDRV_BLOCK_ZERO;
+    }
+
+    return ret;
 }
 
 /* Coroutine wrapper for bdrv_get_block_status() */
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 14/17] raw-posix: return get_block_status data and flags
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (12 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 13/17] block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 15/17] raw-posix: detect XFS unwritten extents Paolo Bonzini
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/raw-posix.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index dbc65b0..d011cfd 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1089,7 +1089,7 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
                                             int nb_sectors, int *pnum)
 {
     off_t start, data, hole;
-    int ret;
+    int64_t ret;
 
     ret = fd_open(bs);
     if (ret < 0) {
@@ -1097,6 +1097,7 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
     }
 
     start = sector_num * BDRV_SECTOR_SIZE;
+    ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
 
 #ifdef CONFIG_FIEMAP
 
@@ -1114,7 +1115,7 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
     if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
         /* Assume everything is allocated.  */
         *pnum = nb_sectors;
-        return 1;
+        return ret;
     }
 
     if (f.fm.fm_mapped_extents == 0) {
@@ -1141,7 +1142,7 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
 
         /* Most likely EINVAL.  Assume everything is allocated.  */
         *pnum = nb_sectors;
-        return 1;
+        return ret;
     }
 
     if (hole > start) {
@@ -1154,19 +1155,21 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
         }
     }
 #else
-    *pnum = nb_sectors;
-    return 1;
+    data = 0;
+    hole = start + nb_sectors * BDRV_SECTOR_SIZE;
 #endif
 
     if (data <= start) {
         /* On a data extent, compute sectors to the end of the extent.  */
         *pnum = MIN(nb_sectors, (hole - start) / BDRV_SECTOR_SIZE);
-        return 1;
     } else {
         /* On a hole, compute sectors to the beginning of the next extent.  */
         *pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
-        return 0;
+        ret &= ~BDRV_BLOCK_DATA;
+        ret |= BDRV_BLOCK_ZERO;
     }
+
+    return ret;
 }
 
 static coroutine_fn BlockDriverAIOCB *raw_aio_discard(BlockDriverState *bs,
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 15/17] raw-posix: detect XFS unwritten extents
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (13 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 14/17] raw-posix: return get_block_status data and flags Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols Paolo Bonzini
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/raw-posix.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index d011cfd..1b41ea3 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1128,6 +1128,9 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
     } else {
         data = f.fe.fe_logical;
         hole = f.fe.fe_logical + f.fe.fe_length;
+        if (f.fe.fe_flags & FIEMAP_EXTENT_UNWRITTEN) {
+            ret |= BDRV_BLOCK_ZERO;
+        }
     }
 
 #elif defined SEEK_HOLE && defined SEEK_DATA
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (14 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 15/17] raw-posix: detect XFS unwritten extents Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-16  6:47   ` Peter Lieven
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 17/17] block: look for zero blocks in bs->file Paolo Bonzini
  2013-07-16  3:41 ` [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Stefan Hajnoczi
  17 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Protocols return raw data, so you can assume the offsets to pass
through unchanged.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index cd371cd..ff8ced7 100644
--- a/block.c
+++ b/block.c
@@ -2977,7 +2977,11 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
 
     if (!bs->drv->bdrv_co_get_block_status) {
         *pnum = nb_sectors;
-        return BDRV_BLOCK_DATA;
+        ret = BDRV_BLOCK_DATA;
+        if (bs->drv->protocol_name) {
+            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num * BDRV_SECTOR_SIZE);
+        }
+        return ret;
     }
 
     ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [Qemu-devel] [PATCH 17/17] block: look for zero blocks in bs->file
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (15 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols Paolo Bonzini
@ 2013-07-03 14:34 ` Paolo Bonzini
  2013-07-16  3:41 ` [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Stefan Hajnoczi
  17 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-03 14:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pl, stefanha

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index ff8ced7..80f8dee 100644
--- a/block.c
+++ b/block.c
@@ -2963,7 +2963,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
                                                      int nb_sectors, int *pnum)
 {
     int64_t n;
-    int64_t ret;
+    int64_t ret, ret2;
 
     if (sector_num >= bs->total_sectors) {
         *pnum = 0;
@@ -2989,6 +2989,14 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
         ret |= BDRV_BLOCK_ZERO;
     }
 
+    if (bs->file &&
+        (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
+        (ret & BDRV_BLOCK_OFFSET_VALID)) {
+        ret2 = bdrv_co_get_block_status(bs->file, ret >> BDRV_SECTOR_BITS,
+                                        *pnum, pnum);
+        ret |= (ret2 & BDRV_BLOCK_ZERO);
+    }
+
     return ret;
 }
 
-- 
1.8.2.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 10/17] block: define get_block_status return value
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 10/17] block: define get_block_status return value Paolo Bonzini
@ 2013-07-03 21:04   ` Peter Lieven
  2013-07-04  8:13     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Lieven @ 2013-07-03 21:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha


Am 03.07.2013 um 16:34 schrieb Paolo Bonzini <pbonzini@redhat.com>:

> Define the return value of get_block_status.  Bits 0, 1, 2 and 8-62
> are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-7
> are left for future extensions.

Is Bit 8 not also reserved for future use? BDRV_SECTOR_BITS is 9.

Can you explain which information is exactly returned in Bits 9-62?

Peter

> 
> The return code is compatible with the old is_allocated API: returning
> just 0 or 1 (aka BDRV_BLOCK_DATA) will not cause any behavioral change
> in clients of is_allocated.  We will return more precise information
> in the next patches.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block.c               |  7 +++++--
> include/block/block.h | 23 +++++++++++++++++++++++
> 2 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index aa1a5f7..8931cac 100644
> --- a/block.c
> +++ b/block.c
> @@ -2976,7 +2976,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
> 
>     if (!bs->drv->bdrv_co_get_block_status) {
>         *pnum = nb_sectors;
> -        return 1;
> +        return BDRV_BLOCK_DATA;
>     }
> 
>     return bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
> @@ -3026,7 +3026,10 @@ int64_t bdrv_get_block_status(BlockDriverState *bs, int64_t sector_num,
> int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
>                                    int nb_sectors, int *pnum)
> {
> -    return bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
> +    int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
> +    return
> +        (ret & BDRV_BLOCK_DATA) ||
> +        ((ret & BDRV_BLOCK_ZERO) && !bdrv_has_zero_init(bs));
> }
> 
> /*
> diff --git a/include/block/block.h b/include/block/block.h
> index 2b50b51..9e44bdd 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -81,6 +81,29 @@ typedef struct BlockDevOps {
> #define BDRV_SECTOR_SIZE   (1ULL << BDRV_SECTOR_BITS)
> #define BDRV_SECTOR_MASK   ~(BDRV_SECTOR_SIZE - 1)
> 
> +/* BDRV_BLOCK_DATA: data is read from bs->file or another file
> + * BDRV_BLOCK_ZERO: sectors read as zero
> + * BDRV_BLOCK_OFFSET_VALID: contents of sector available in bs->file at offset
> + *
> + * DATA == 0 && ZERO == 0 means that data is read from backing_hd if present.
> + *
> + * DATA ZERO OFFSET_VALID
> + *  t    t        t       sectors read as zero, bs->file is zero at offset
> + *  t    f        t       sectors read as valid from bs->file at offset
> + *  f    t        t       sectors preallocated, read as zero, bs->file not
> + *                        necessarily zero at offset
> + *  f    f        t       sectors preallocated but read from backing_hd,
> + *                        bs->file contains garbage at offset
> + *  t    t        f       sectors preallocated, read as zero, unknown offset
> + *  t    f        f       sectors read from unknown file or offset
> + *  f    t        f       not allocated or unknown offset, read as zero
> + *  f    f        f       not allocated or unknown offset, read from backing_hd
> + */
> +#define BDRV_BLOCK_DATA         1
> +#define BDRV_BLOCK_ZERO         2
> +#define BDRV_BLOCK_OFFSET_VALID 4
> +#define BDRV_BLOCK_OFFSET_MASK  BDRV_SECTOR_MASK
> +
> typedef enum {
>     BDRV_ACTION_REPORT, BDRV_ACTION_IGNORE, BDRV_ACTION_STOP
> } BlockErrorAction;
> -- 
> 1.8.2.1
> 
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed Paolo Bonzini
@ 2013-07-04  2:20   ` Fam Zheng
  2013-07-04  8:08     ` Paolo Bonzini
  2013-07-05  9:09   ` Stefan Hajnoczi
  1 sibling, 1 reply; 48+ messages in thread
From: Fam Zheng @ 2013-07-04  2:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, 07/03 16:34, Paolo Bonzini wrote:
> Do not do two reads for each sector; load each sector of the bitmap
> and use bitmap operations to process it.
> 
> Writes are still dog slow!
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/cow.c | 54 ++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 32 insertions(+), 22 deletions(-)
> 
> diff --git a/block/cow.c b/block/cow.c
> index 1cc2e89..204451e 100644
> --- a/block/cow.c
> +++ b/block/cow.c
> @@ -126,18 +126,31 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
>      return 0;
>  }
>  
> -static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
> +#define BITS_PER_BITMAP_SECTOR (512 * 8)
> +
> +/* Cannot use bitmap.c on big-endian machines.  */
> +static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
>  {
> -    uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
> -    uint8_t bitmap;
> -    int ret;
> +    return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
> +}
>  
> -    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
> -    if (ret < 0) {
> -       return ret;
> +static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
I think type bool is better for 'value' as you don't booleanize it.  And
also int64_t for start?
> +{
> +    int streak_value = value ? 0xFF : 0;
> +    int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
> +    int bitnum = start;
> +    while (bitnum < last) {
> +        if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
> +            bitnum += 8;
> +            continue;
> +        }
> +        if (cow_test_bit(bitnum, bitmap) == value) {
> +            bitnum++;
> +            continue;
> +        }
> +        break;
>      }
> -
> -    return !!(bitmap & (1 << (bitnum % 8)));
> +    return MIN(bitnum, last) - start;
>  }
>  
>  /* Return true if first block has been changed (ie. current version is
> @@ -146,23 +159,20 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
>  static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
>          int64_t sector_num, int nb_sectors, int *num_same)
>  {
> +    int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
> +    uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
> +    uint8_t bitmap[512];
> +    int ret;
>      int changed;
>  
> -    if (nb_sectors == 0) {
> -	*num_same = nb_sectors;
> -	return 0;
> -    }
> -
> -    changed = is_bit_set(bs, sector_num);
> -    if (changed < 0) {
> -        return 0; /* XXX: how to return I/O errors? */
> -    }
> -
> -    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
> -	if (is_bit_set(bs, sector_num + *num_same) != changed)
> -	    break;
> +    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
> +    if (ret < 0) {
> +        return ret;
>      }
>  
> +    bitnum &= BITS_PER_BITMAP_SECTOR - 1;
> +    changed = cow_test_bit(bitnum, bitmap);
> +    *num_same = cow_find_streak(bitmap, changed, bitnum, nb_sectors);
>      return changed;
>  }
>  
> -- 
> 1.8.2.1
> 
> 
> 

-- 
Fam

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed Paolo Bonzini
@ 2013-07-04  2:40   ` Fam Zheng
  2013-07-04  8:11     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Fam Zheng @ 2013-07-04  2:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, 07/03 16:34, Paolo Bonzini wrote:
> Only sync once per write, rather than once per sector.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/cow.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/block/cow.c b/block/cow.c
> index 204451e..133e596 100644
> --- a/block/cow.c
> +++ b/block/cow.c
> @@ -106,7 +106,7 @@ static int cow_open(BlockDriverState *bs, QDict *options, int flags)
>   * XXX(hch): right now these functions are extremely inefficient.
>   * We should just read the whole bitmap we'll need in one go instead.
>   */
> -static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
> +static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum, bool *first)
Why flush _before first_ write, rather than (more intuitively) flush
_after last_ write? And personally I think "bool sync" makes a better
signature than "bool *first", although it's not that critical as
cow_update_bitmap is the only caller.
>  {
>      uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
>      uint8_t bitmap;
> @@ -117,9 +117,21 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
>         return ret;
>      }
>  
> +    if (bitmap & (1 << (bitnum % 8))) {
> +        return 0;
> +    }
> +
> +    if (*first) {
> +        ret = bdrv_flush(bs->file);
> +        if (ret < 0) {
> +           return ret;
> +        }
> +        *first = false;
> +    }
> +
>      bitmap |= (1 << (bitnum % 8));
>  
> -    ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
> +    ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
>      if (ret < 0) {
>         return ret;
>      }
> @@ -181,9 +193,10 @@ static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
>  {
>      int error = 0;
>      int i;
> +    bool first = true;
>  
>      for (i = 0; i < nb_sectors; i++) {
> -        error = cow_set_bit(bs, sector_num + i);
> +        error = cow_set_bit(bs, sector_num + i, &first);
>          if (error) {
>              break;
>          }
> -- 
> 1.8.2.1
> 
> 
> 

-- 
Fam

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats Paolo Bonzini
@ 2013-07-04  3:22   ` Fam Zheng
  2013-07-04  8:14     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Fam Zheng @ 2013-07-04  3:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, 07/03 16:34, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/cow.c      |  8 +++++++-
>  block/qcow.c     |  9 ++++++++-
>  block/qcow2.c    | 16 ++++++++++++++--
>  block/qed.c      | 35 ++++++++++++++++++++++++++++-------
>  block/sheepdog.c |  2 +-
>  block/vdi.c      | 13 ++++++++++++-
>  block/vmdk.c     | 16 +++++++++++++++-
>  block/vvfat.c    | 11 ++++++-----
>  8 files changed, 91 insertions(+), 19 deletions(-)
> 
> diff --git a/block/cow.c b/block/cow.c
> index efe5ead..f6982d4 100644
> --- a/block/cow.c
> +++ b/block/cow.c
> @@ -194,7 +194,13 @@ static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
>  static int64_t coroutine_fn cow_co_get_block_status(BlockDriverState *bs,
>          int64_t sector_num, int nb_sectors, int *num_same)
>  {
> -    return cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
> +    BDRVCowState *s = bs->opaque;
> +    int ret = cow_co_is_allocated(bs, sector_num, nb_sectors, num_same);
> +    int64_t offset = s->cow_sectors_offset + (sector_num << BDRV_SECTOR_BITS);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    return (ret ? BDRV_BLOCK_DATA : 0) | offset | BDRV_BLOCK_OFFSET_VALID;
>  }
>  
>  static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
> diff --git a/block/qcow.c b/block/qcow.c
> index acd1aeb..9aebcc3 100644
> --- a/block/qcow.c
> +++ b/block/qcow.c
> @@ -410,7 +410,14 @@ static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
>      if (n > nb_sectors)
>          n = nb_sectors;
>      *pnum = n;
> -    return (cluster_offset != 0);
> +    if (!cluster_offset) {
> +	return 0;
> +    }
> +    if ((cluster_offset & QCOW_OFLAG_COMPRESSED) || s->crypt_method) {
> +        return BDRV_BLOCK_DATA;
> +    }
> +    cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
> +    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
>  }
>  
>  static int decompress_buffer(uint8_t *out_buf, int out_buf_size,
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d35a134..cac7ee8 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -645,7 +645,8 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
>  {
>      BDRVQcowState *s = bs->opaque;
>      uint64_t cluster_offset;
> -    int ret;
> +    int index_in_cluster, ret;
> +    int64_t status = 0;
>  
>      *pnum = nb_sectors;
>      qemu_co_mutex_lock(&s->lock);
> @@ -655,7 +656,18 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
>          return ret;
>      }
>  
> -    return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO);
> +    if (cluster_offset != 0 && ret != QCOW2_CLUSTER_COMPRESSED &&
> +        !s->crypt_method) {
> +        index_in_cluster = sector_num & (s->cluster_sectors - 1);
> +        cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
> +        status |= BDRV_BLOCK_OFFSET_VALID | cluster_offset;
> +    }
> +    if (ret == QCOW2_CLUSTER_ZERO) {
> +        status |= BDRV_BLOCK_ZERO;
> +    } else if (ret != QCOW2_CLUSTER_UNALLOCATED) {
> +        status |= BDRV_BLOCK_DATA;
> +    }
> +    return status;
>  }
>  
>  /* handle reading after the end of the backing file */
> diff --git a/block/qed.c b/block/qed.c
> index b0978ba..cf4fbed 100644
> --- a/block/qed.c
> +++ b/block/qed.c
> @@ -652,16 +652,36 @@ static int bdrv_qed_create(const char *filename, QEMUOptionParameter *options)
>  }
>  
>  typedef struct {
> +    BlockDriverState *bs;
>      Coroutine *co;
> -    int is_allocated;
> +    uint64_t pos;
> +    int64_t status;
>      int *pnum;
>  } QEDIsAllocatedCB;
>  
>  static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
>  {
>      QEDIsAllocatedCB *cb = opaque;
> +    BDRVQEDState *s = cb->bs->opaque;
>      *cb->pnum = len / BDRV_SECTOR_SIZE;
> -    cb->is_allocated = (ret == QED_CLUSTER_FOUND || ret == QED_CLUSTER_ZERO);
> +    switch (ret) {
> +    case QED_CLUSTER_FOUND:
> +       offset |= qed_offset_into_cluster(s, cb->pos);
> +	cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
> +	break;
> +    case QED_CLUSTER_ZERO:
> +	cb->status = BDRV_BLOCK_ZERO;
> +	break;
> +    case QED_CLUSTER_L2:
> +    case QED_CLUSTER_L1:
> +	cb->status = 0;
> +	break;
> +    default:
> +	assert(ret < 0);
> +	cb->status = ret;
> +        break;
> +    }
> +
>      if (cb->co) {
>          qemu_coroutine_enter(cb->co, NULL);
>      }
> @@ -672,25 +692,26 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
>                                                   int nb_sectors, int *pnum)
>  {
>      BDRVQEDState *s = bs->opaque;
> -    uint64_t pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE;
>      size_t len = (size_t)nb_sectors * BDRV_SECTOR_SIZE;
>      QEDIsAllocatedCB cb = {
> -        .is_allocated = -1,
> +        .bs = bs,
> +        .pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
> +        .status = BDRV_BLOCK_OFFSET_MASK,
>          .pnum = pnum,
>      };
>      QEDRequest request = { .l2_table = NULL };
>  
> -    qed_find_cluster(s, &request, pos, len, qed_is_allocated_cb, &cb);
> +    qed_find_cluster(s, &request, cb.pos, len, qed_is_allocated_cb, &cb);
>  
>      /* Now sleep if the callback wasn't invoked immediately */
> -    while (cb.is_allocated == -1) {
> +    while (cb.status == BDRV_BLOCK_OFFSET_MASK) {
>          cb.co = qemu_coroutine_self();
>          qemu_coroutine_yield();
>      }
>  
>      qed_unref_l2_cache_entry(request.l2_table);
>  
> -    return cb.is_allocated;
> +    return cb.status;
>  }
>  
>  static int bdrv_qed_make_empty(BlockDriverState *bs)
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index e7b9c22..3dca26a 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -2297,7 +2297,7 @@ sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
>                    end = DIV_ROUND_UP((sector_num + nb_sectors) *
>                                       BDRV_SECTOR_SIZE, SD_DATA_OBJ_SIZE);
>      unsigned long idx;
> -    int ret = 1;
> +    int64_t ret = BDRV_BLOCK_DATA;
>  
>      for (idx = start; idx < end; idx++) {
>          if (inode->data_vdi_id[idx] == 0) {
> diff --git a/block/vdi.c b/block/vdi.c
> index 7ab2567..e96b5b3 100644
> --- a/block/vdi.c
> +++ b/block/vdi.c
> @@ -479,12 +479,23 @@ static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
>      size_t sector_in_block = sector_num % s->block_sectors;
>      int n_sectors = s->block_sectors - sector_in_block;
>      uint32_t bmap_entry = le32_to_cpu(s->bmap[bmap_index]);
> +    uint64_t offset;
> +    int result;
> +
>      logout("%p, %" PRId64 ", %d, %p\n", bs, sector_num, nb_sectors, pnum);
>      if (n_sectors > nb_sectors) {
>          n_sectors = nb_sectors;
>      }
>      *pnum = n_sectors;
> -    return VDI_IS_ALLOCATED(bmap_entry);
> +    result = VDI_IS_ALLOCATED(bmap_entry);
> +    if (!result) {
> +	return 0;
> +    }
> +
> +    offset = s->header.offset_data +
> +                              (uint64_t)bmap_entry * s->block_size +
> +                              sector_in_block * SECTOR_SIZE;
> +    return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
>  }
>  
>  static int vdi_co_read(BlockDriverState *bs,
> diff --git a/block/vmdk.c b/block/vmdk.c
> index e24f6c7..f829620 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -1015,7 +1015,21 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
>                              sector_num * 512, 0, &offset);
>      qemu_co_mutex_unlock(&s->lock);
>  
> -    ret = (ret == VMDK_OK || ret == VMDK_ZEROED);
> +    switch (ret) {
> +    case VMDK_ERROR:
> +	ret = -EIO;
> +	break;
> +    case VMDK_UNALLOC:
> +	ret = 0;
> +	break;
> +    case VMDK_ZEROED:
> +	ret = BDRV_BLOCK_ZERO;
> +	break;
> +    case VMDK_OK:
> +	/* TODO: might return offset if the extents are in bs->file.  */
> +	ret = BDRV_BLOCK_DATA;
if (extent->file == bs->file) {
    ret |= BDRV_BLOCK_OFFSET_VALID | offset;
}
> +	break;
> +    }
>  
>      index_in_cluster = sector_num % extent->cluster_sectors;
>      n = extent->cluster_sectors - index_in_cluster;
> diff --git a/block/vvfat.c b/block/vvfat.c
> index 510a559..415fba3 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -2879,11 +2879,12 @@ static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
>  {
>      BDRVVVFATState* s = bs->opaque;
>      *n = s->sector_count - sector_num;
> -    if (*n > nb_sectors)
> -	*n = nb_sectors;
> -    else if (*n < 0)
> -	return 0;
> -    return 1;
> +    if (*n > nb_sectors) {
> +        *n = nb_sectors;
> +    } else if (*n < 0) {
> +        return 0;
> +    }
> +    return BDRV_BLOCK_DATA;
>  }
>  
>  static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
> -- 
> 1.8.2.1
> 
> 
> 

-- 
Fam

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand Paolo Bonzini
@ 2013-07-04  5:34   ` Fam Zheng
  2013-07-04  8:16     ` Paolo Bonzini
  2013-07-16  3:31   ` Stefan Hajnoczi
  2013-07-18 20:04   ` Eric Blake
  2 siblings, 1 reply; 48+ messages in thread
From: Fam Zheng @ 2013-07-04  5:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, 07/03 16:34, Paolo Bonzini wrote:
> This command dumps the metadata of an entire chain, in either tabular or JSON
> format.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qemu-img-cmds.hx |   6 ++
>  qemu-img.c       | 186 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 192 insertions(+)
> 
> diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
> index 4ca7e95..acc3a1c 100644
> --- a/qemu-img-cmds.hx
> +++ b/qemu-img-cmds.hx
> @@ -45,6 +45,12 @@ STEXI
>  @item info [-f @var{fmt}] [--output=@var{ofmt}] [--backing-chain] @var{filename}
>  ETEXI
>  
> +DEF("map", img_map,
> +    "map [-f fmt] [--output=ofmt] filename")
> +STEXI
> +@item map [-f @var{fmt}] [--output=@var{ofmt}] filename
> +ETEXI
> +
>  DEF("snapshot", img_snapshot,
>      "snapshot [-q] [-l | -a snapshot | -c snapshot | -d snapshot] filename")
>  STEXI
> diff --git a/qemu-img.c b/qemu-img.c
> index 6461932..97392a5 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1768,6 +1768,192 @@ static int img_info(int argc, char **argv)
>      return 0;
>  }
>  
> +
> +typedef struct MapEntry {
> +    int flags;
> +    int depth;
> +    int64_t start;
> +    int64_t length;
> +    int64_t offset;
> +} MapEntry;
> +
> +static void dump_map_entry(OutputFormat output_format, MapEntry *e,
> +                           MapEntry *next)
> +{
> +    switch (output_format) {
> +    case OFORMAT_HUMAN:
> +        if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) {
> +            printf("%lld %lld %d %lld\n",
> +                   (long long) e->start, (long long) e->length,
> +                   e->depth, (long long) e->offset);
> +        }
Why %lld and explicit cast, not using PRId64?

Is BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO distinguishable here for the
user? By offset?
> +        /* This format ignores the distinction between 0, ZERO and ZERO|DATA.
> +         * Modify the flags here to allow more coalescing.
> +         */
> +        if (next &&
> +            (next->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) != BDRV_BLOCK_DATA) {
> +            next->flags &= ~BDRV_BLOCK_DATA;
> +            next->flags |= BDRV_BLOCK_ZERO;
> +        }
> +        break;
> +    case OFORMAT_JSON:
> +        printf("%s{ 'depth': %d, 'start': %lld, 'length': %lld, "
> +               "'zero': %s, 'data': %s",
> +               (e->start == 0 ? "[" : ",\n"),
> +               e->depth, (long long) e->start, (long long) e->length,
> +               (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
> +               (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
> +        if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
> +            printf(", 'offset': %lld", (long long) e->offset);
> +        }
> +	putchar('}');
No indentation.
> +
> +        if (!next) {
> +            printf("]\n");
> +        }
> +        break;
> +    }
> +}
> +
> +static int64_t get_block_status(BlockDriverState *bs, int64_t sector_num,
> +                                int *nb_sectors, int *depth)
> +{
> +    int64_t ret;
> +
> +    /* As an optimization, we could cache the current range of unallocated
> +     * clusters in each file of the chain, and avoid querying the same
> +     * range repeatedly.
> +     */
> +
> +    *depth = 0;
> +    for (;;) {
> +        int orig_nb_sectors = *nb_sectors;
> +
> +        if (bs == NULL) {
> +            return 0;
> +        }
> +        ret = bdrv_get_block_status(bs, sector_num, *nb_sectors, nb_sectors);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +
> +        if (ret & (BDRV_BLOCK_ZERO|BDRV_BLOCK_DATA)) {
> +            return ret;
> +        }
> +        if (!*nb_sectors) {
> +            /* Beyond the end of this image (while presumably within the
> +             * range of the BlockDriverState above this one).  The extra
> +             * data is read as zeroes.
> +             */
> +            *nb_sectors = orig_nb_sectors;
> +            return BDRV_BLOCK_ZERO;
> +        }
> +
> +        bs = bs->backing_hd;
> +        (*depth)++;
> +    }
> +}
> +
> +static int img_map(int argc, char **argv)
> +{
> +    int c;
> +    OutputFormat output_format = OFORMAT_HUMAN;
> +    BlockDriverState *bs;
> +    const char *filename, *fmt, *output;
> +    int64_t length;
> +    MapEntry curr = { .length = 0 }, next;
> +
> +    fmt = NULL;
> +    output = NULL;
> +    for(;;) {
> +        int option_index = 0;
> +        static const struct option long_options[] = {
> +            {"help", no_argument, 0, 'h'},
> +            {"format", required_argument, 0, 'f'},
> +            {"output", required_argument, 0, OPTION_OUTPUT},
> +            {0, 0, 0, 0}
> +        };
> +        c = getopt_long(argc, argv, "f:h",
> +                        long_options, &option_index);
> +        if (c == -1) {
> +            break;
> +        }
> +        switch(c) {
> +        case '?':
> +        case 'h':
> +            help();
> +            break;
> +        case 'f':
> +            fmt = optarg;
> +            break;
> +        case OPTION_OUTPUT:
> +            output = optarg;
> +            break;
> +        }
> +    }
> +    if (optind >= argc) {
> +        help();
> +    }
> +    filename = argv[optind++];
> +
> +    if (output && !strcmp(output, "json")) {
> +        output_format = OFORMAT_JSON;
> +    } else if (output && !strcmp(output, "human")) {
> +        output_format = OFORMAT_HUMAN;
> +    } else if (output) {
> +        error_report("--output must be used with human or json as argument.");
> +        return 1;
> +    }
> +
> +    bs = bdrv_new_open(filename, fmt, BDRV_O_FLAGS, true, false);
> +    if (!bs) {
> +        return 1;
> +    }
> +
> +    length = bdrv_getlength(bs);
> +    while (curr.start + curr.length < length) {
> +        int64_t nsectors_left, ret;
> +        int64_t sector_num;
> +        int n, depth, flags;
> +
> +        /* Probe up to 1 G at a time.  */
> +        sector_num = (curr.start + curr.length) >> BDRV_SECTOR_BITS;
> +        nsectors_left = DIV_ROUND_UP(length, BDRV_SECTOR_SIZE) - sector_num;
> +        n = MAX(1 << (30 - BDRV_SECTOR_BITS), nsectors_left);
It's MAX or MIN?
> +        ret = get_block_status(bs, sector_num, &n, &depth);
> +
> +        if (ret < 0) {
> +            error_report("Could not read file metadata: %s", strerror(-ret));
> +            return 1;
> +        }
> +        if ((ret & BDRV_BLOCK_DATA) &&
> +            !(ret & BDRV_BLOCK_OFFSET_VALID)) {
> +            error_report("File contains external or compressed clusters.");
> +            return 1;
> +        }
> +
> +        flags = ret & ~BDRV_BLOCK_OFFSET_MASK;
> +        ret &= BDRV_BLOCK_OFFSET_MASK;
> +        if (curr.length == 0 || curr.flags != flags || curr.depth != depth ||
> +            ((curr.flags & BDRV_BLOCK_OFFSET_VALID) &&
> +	     curr.offset + curr.length != ret)) {
A hard tab here.
> +            next.flags = flags;
> +            next.depth = depth;
> +            next.start = sector_num << BDRV_SECTOR_BITS;
> +            next.offset = ret;
> +            next.length = 0;
> +            if (curr.length > 0) {
> +                dump_map_entry(output_format, &curr, &next);
> +            }
> +            curr = next;
> +        }
> +        curr.length += n << BDRV_SECTOR_BITS;
> +    }
> +
> +    dump_map_entry(output_format, &curr, NULL);
> +    return 0;
> +}
> +
>  #define SNAPSHOT_LIST   1
>  #define SNAPSHOT_CREATE 2
>  #define SNAPSHOT_APPLY  3
> -- 
> 1.8.2.1
> 
> 
> 

-- 
Fam

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed
  2013-07-04  2:20   ` Fam Zheng
@ 2013-07-04  8:08     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-04  8:08 UTC (permalink / raw)
  To: famz; +Cc: kwolf, pl, qemu-devel, stefanha

Il 04/07/2013 04:20, Fam Zheng ha scritto:
> On Wed, 07/03 16:34, Paolo Bonzini wrote:
>> Do not do two reads for each sector; load each sector of the bitmap
>> and use bitmap operations to process it.
>>
>> Writes are still dog slow!
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  block/cow.c | 54 ++++++++++++++++++++++++++++++++----------------------
>>  1 file changed, 32 insertions(+), 22 deletions(-)
>>
>> diff --git a/block/cow.c b/block/cow.c
>> index 1cc2e89..204451e 100644
>> --- a/block/cow.c
>> +++ b/block/cow.c
>> @@ -126,18 +126,31 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
>>      return 0;
>>  }
>>  
>> -static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
>> +#define BITS_PER_BITMAP_SECTOR (512 * 8)
>> +
>> +/* Cannot use bitmap.c on big-endian machines.  */
>> +static int cow_test_bit(int64_t bitnum, const uint8_t *bitmap)
>>  {
>> -    uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
>> -    uint8_t bitmap;
>> -    int ret;
>> +    return (bitmap[bitnum / 8] & (1 << (bitnum & 7))) != 0;
>> +}
>>  
>> -    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
>> -    if (ret < 0) {
>> -       return ret;
>> +static int cow_find_streak(const uint8_t *bitmap, int value, int start, int nb_sectors)
> I think type bool is better for 'value' as you don't booleanize it.  And
> also int64_t for start?

start is always between 0 and BITS_PER_BITMAP_SECTOR.

"value" here is a bit value, so 0 or 1 rather than true or false.  I
prefer to keep it as "int", but it can be changed.

Paolo

>> +{
>> +    int streak_value = value ? 0xFF : 0;
>> +    int last = MIN(start + nb_sectors, BITS_PER_BITMAP_SECTOR);
>> +    int bitnum = start;
>> +    while (bitnum < last) {
>> +        if ((bitnum & 7) == 0 && bitmap[bitnum / 8] == streak_value) {
>> +            bitnum += 8;
>> +            continue;
>> +        }
>> +        if (cow_test_bit(bitnum, bitmap) == value) {
>> +            bitnum++;
>> +            continue;
>> +        }
>> +        break;
>>      }
>> -
>> -    return !!(bitmap & (1 << (bitnum % 8)));
>> +    return MIN(bitnum, last) - start;
>>  }
>>  
>>  /* Return true if first block has been changed (ie. current version is
>> @@ -146,23 +159,20 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
>>  static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
>>          int64_t sector_num, int nb_sectors, int *num_same)
>>  {
>> +    int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
>> +    uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
>> +    uint8_t bitmap[512];
>> +    int ret;
>>      int changed;
>>  
>> -    if (nb_sectors == 0) {
>> -	*num_same = nb_sectors;
>> -	return 0;
>> -    }
>> -
>> -    changed = is_bit_set(bs, sector_num);
>> -    if (changed < 0) {
>> -        return 0; /* XXX: how to return I/O errors? */
>> -    }
>> -
>> -    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
>> -	if (is_bit_set(bs, sector_num + *num_same) != changed)
>> -	    break;
>> +    ret = bdrv_pread(bs->file, offset, &bitmap, sizeof(bitmap));
>> +    if (ret < 0) {
>> +        return ret;
>>      }
>>  
>> +    bitnum &= BITS_PER_BITMAP_SECTOR - 1;
>> +    changed = cow_test_bit(bitnum, bitmap);
>> +    *num_same = cow_find_streak(bitmap, changed, bitnum, nb_sectors);
>>      return changed;
>>  }
>>  
>> -- 
>> 1.8.2.1
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed
  2013-07-04  2:40   ` Fam Zheng
@ 2013-07-04  8:11     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-04  8:11 UTC (permalink / raw)
  To: famz; +Cc: kwolf, pl, qemu-devel, stefanha

Il 04/07/2013 04:40, Fam Zheng ha scritto:
> On Wed, 07/03 16:34, Paolo Bonzini wrote:
>> Only sync once per write, rather than once per sector.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  block/cow.c | 19 ++++++++++++++++---
>>  1 file changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/cow.c b/block/cow.c
>> index 204451e..133e596 100644
>> --- a/block/cow.c
>> +++ b/block/cow.c
>> @@ -106,7 +106,7 @@ static int cow_open(BlockDriverState *bs, QDict *options, int flags)
>>   * XXX(hch): right now these functions are extremely inefficient.
>>   * We should just read the whole bitmap we'll need in one go instead.
>>   */
>> -static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
>> +static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum, bool *first)
> Why flush _before first_ write, rather than (more intuitively) flush
> _after last_ write?

Because you have to flush the data before you start writing the
metadata.  Flushing the metadata can be done when the guest issues a flush.

This ensures that, in case of a power loss, the metadata will never
refer to data that hasn't been written.

Paolo

 And personally I think "bool sync" makes a better
> signature than "bool *first", although it's not that critical as
> cow_update_bitmap is the only caller.
>>  {
>>      uint64_t offset = sizeof(struct cow_header_v2) + bitnum / 8;
>>      uint8_t bitmap;
>> @@ -117,9 +117,21 @@ static inline int cow_set_bit(BlockDriverState *bs, int64_t bitnum)
>>         return ret;
>>      }
>>  
>> +    if (bitmap & (1 << (bitnum % 8))) {
>> +        return 0;
>> +    }
>> +
>> +    if (*first) {
>> +        ret = bdrv_flush(bs->file);
>> +        if (ret < 0) {
>> +           return ret;
>> +        }
>> +        *first = false;
>> +    }
>> +
>>      bitmap |= (1 << (bitnum % 8));
>>  
>> -    ret = bdrv_pwrite_sync(bs->file, offset, &bitmap, sizeof(bitmap));
>> +    ret = bdrv_pwrite(bs->file, offset, &bitmap, sizeof(bitmap));
>>      if (ret < 0) {
>>         return ret;
>>      }
>> @@ -181,9 +193,10 @@ static int cow_update_bitmap(BlockDriverState *bs, int64_t sector_num,
>>  {
>>      int error = 0;
>>      int i;
>> +    bool first = true;
>>  
>>      for (i = 0; i < nb_sectors; i++) {
>> -        error = cow_set_bit(bs, sector_num + i);
>> +        error = cow_set_bit(bs, sector_num + i, &first);
>>          if (error) {
>>              break;
>>          }
>> -- 
>> 1.8.2.1
>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 10/17] block: define get_block_status return value
  2013-07-03 21:04   ` Peter Lieven
@ 2013-07-04  8:13     ` Paolo Bonzini
  2013-07-04 21:10       ` Peter Lieven
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-04  8:13 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, qemu-devel, stefanha

Il 03/07/2013 23:04, Peter Lieven ha scritto:
>> > Define the return value of get_block_status.  Bits 0, 1, 2 and 8-62
>> > are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-7
>> > are left for future extensions.
> Is Bit 8 not also reserved for future use? BDRV_SECTOR_BITS is 9.

Right.

> Can you explain which information is exactly returned in Bits 9-62?

Bits 9-62 are the offset at which the data is stored in bs->file, they
are valid if bit 2 (BDRV_BLOCK_OFFSET_VALID) is 1.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats
  2013-07-04  3:22   ` Fam Zheng
@ 2013-07-04  8:14     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-04  8:14 UTC (permalink / raw)
  To: famz; +Cc: kwolf, pl, qemu-devel, stefanha

Il 04/07/2013 05:22, Fam Zheng ha scritto:
>> > +    case VMDK_OK:
>> > +	/* TODO: might return offset if the extents are in bs->file.  */
>> > +	ret = BDRV_BLOCK_DATA;
> if (extent->file == bs->file) {
>     ret |= BDRV_BLOCK_OFFSET_VALID | offset;
> }

Thanks. :)

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-04  5:34   ` Fam Zheng
@ 2013-07-04  8:16     ` Paolo Bonzini
  2013-07-04  8:36       ` Fam Zheng
  0 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-04  8:16 UTC (permalink / raw)
  To: famz; +Cc: kwolf, pl, qemu-devel, stefanha

Il 04/07/2013 07:34, Fam Zheng ha scritto:
>> > +        if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) {
>> > +            printf("%lld %lld %d %lld\n",
>> > +                   (long long) e->start, (long long) e->length,
>> > +                   e->depth, (long long) e->offset);
>> > +        }
> Why %lld and explicit cast, not using PRId64?

Will fix.

> Is BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO distinguishable here for the
> user? By offset?

I'm not sure I understand the question.

Zero blocks are always omitted in the "human" format.  Only non-zero
blocks are listed.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-04  8:16     ` Paolo Bonzini
@ 2013-07-04  8:36       ` Fam Zheng
  0 siblings, 0 replies; 48+ messages in thread
From: Fam Zheng @ 2013-07-04  8:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Thu, 07/04 10:16, Paolo Bonzini wrote:
> Il 04/07/2013 07:34, Fam Zheng ha scritto:
> >> > +        if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) {
> >> > +            printf("%lld %lld %d %lld\n",
> >> > +                   (long long) e->start, (long long) e->length,
> >> > +                   e->depth, (long long) e->offset);
> >> > +        }
> > Why %lld and explicit cast, not using PRId64?
> 
> Will fix.
> 
> > Is BDRV_BLOCK_DATA and BDRV_BLOCK_ZERO distinguishable here for the
> > user? By offset?
> 
> I'm not sure I understand the question.
> 
> Zero blocks are always omitted in the "human" format.  Only non-zero
> blocks are listed.
I missed this.

-- 
Fam

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 10/17] block: define get_block_status return value
  2013-07-04  8:13     ` Paolo Bonzini
@ 2013-07-04 21:10       ` Peter Lieven
  2013-07-05  0:49         ` Fam Zheng
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Lieven @ 2013-07-04 21:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha


Am 04.07.2013 um 10:13 schrieb Paolo Bonzini <pbonzini@redhat.com>:

> Il 03/07/2013 23:04, Peter Lieven ha scritto:
>>>> Define the return value of get_block_status.  Bits 0, 1, 2 and 8-62
>>>> are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-7
>>>> are left for future extensions.
>> Is Bit 8 not also reserved for future use? BDRV_SECTOR_BITS is 9.
> 
> Right.
> 
>> Can you explain which information is exactly returned in Bits 9-62?
> 
> Bits 9-62 are the offset at which the data is stored in bs->file, they
> are valid if bit 2 (BDRV_BLOCK_OFFSET_VALID) is 1.

Ok, so this is if bs->file is not linear?

If we return the offset into bs->file this would only make sense if the data
at that position is raw and not encoded otherwise and if *pnum is limited
to the size of the extend at that position, right?

I currently do not understand for what operation this info is needed.

Maybe you could add some info to the commit or the function including
your comment from above.

Thanks,
Peter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 10/17] block: define get_block_status return value
  2013-07-04 21:10       ` Peter Lieven
@ 2013-07-05  0:49         ` Fam Zheng
  0 siblings, 0 replies; 48+ messages in thread
From: Fam Zheng @ 2013-07-05  0:49 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, Paolo Bonzini, qemu-devel, stefanha

On Thu, 07/04 23:10, Peter Lieven wrote:
> 
> Am 04.07.2013 um 10:13 schrieb Paolo Bonzini <pbonzini@redhat.com>:
> 
> > Il 03/07/2013 23:04, Peter Lieven ha scritto:
> >>>> Define the return value of get_block_status.  Bits 0, 1, 2 and 8-62
> >>>> are valid; bit 63 (the sign bit) is reserved for errors.  Bits 3-7
> >>>> are left for future extensions.
> >> Is Bit 8 not also reserved for future use? BDRV_SECTOR_BITS is 9.
> > 
> > Right.
> > 
> >> Can you explain which information is exactly returned in Bits 9-62?
> > 
> > Bits 9-62 are the offset at which the data is stored in bs->file, they
> > are valid if bit 2 (BDRV_BLOCK_OFFSET_VALID) is 1.
> 
> Ok, so this is if bs->file is not linear?
> 
> If we return the offset into bs->file this would only make sense if the data
> at that position is raw and not encoded otherwise and if *pnum is limited
> to the size of the extend at that position, right?
Exactly.
> 
> I currently do not understand for what operation this info is needed.
Quoted from the cover letter:
  > One example usage is
  > (for non-compressed, non-encrypted images) to transform the metadata
  > into a Linux device-mapper setup, and make a qcow2 image available (for
  > read only) as a block device.  Another possible usage is to determine
  > the used areas of a file, and convert it in place to another format.

Thanks.

Fam

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed Paolo Bonzini
  2013-07-04  2:20   ` Fam Zheng
@ 2013-07-05  9:09   ` Stefan Hajnoczi
  1 sibling, 0 replies; 48+ messages in thread
From: Stefan Hajnoczi @ 2013-07-05  9:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, Jul 03, 2013 at 04:34:15PM +0200, Paolo Bonzini wrote:
> @@ -146,23 +159,20 @@ static inline int is_bit_set(BlockDriverState *bs, int64_t bitnum)
>  static int coroutine_fn cow_co_is_allocated(BlockDriverState *bs,
>          int64_t sector_num, int nb_sectors, int *num_same)
>  {
> +    int64_t bitnum = sector_num + sizeof(struct cow_header_v2) * 8;
> +    uint64_t offset = (bitnum / 8) & -BDRV_SECTOR_SIZE;
> +    uint8_t bitmap[512];

Please use BDRV_SECTOR_SIZE, you used it one line above but then
switched to literal 512.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated Paolo Bonzini
@ 2013-07-05  9:19   ` Stefan Hajnoczi
  2013-07-05 10:28     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Stefan Hajnoczi @ 2013-07-05  9:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha, qemu-stable

On Wed, Jul 03, 2013 at 04:34:20PM +0200, Paolo Bonzini wrote:
> diff --git a/qemu-img.c b/qemu-img.c
> index f8c97d3..857e2ca 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -2073,6 +2073,10 @@ static int img_rebase(int argc, char **argv)
>  
>              /* If the cluster is allocated, we don't need to take action */
>              ret = bdrv_is_allocated(bs, sector, n, &n);
> +            if (ret < 0) {
> +                error_report("error while reading from file");
> +                goto out;
> +            }

We should print the errno valid and saying "while reading from file" is
a little misleading:

"error while checking cluster allocation status: %d", ret

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated
  2013-07-05  9:19   ` Stefan Hajnoczi
@ 2013-07-05 10:28     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-05 10:28 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, pl, qemu-devel, stefanha, qemu-stable

Il 05/07/2013 11:19, Stefan Hajnoczi ha scritto:
> >              /* If the cluster is allocated, we don't need to take action */
> >              ret = bdrv_is_allocated(bs, sector, n, &n);
> > +            if (ret < 0) {
> > +                error_report("error while reading from file");
> > +                goto out;
> > +            }
> 
> We should print the errno valid and saying "while reading from file" is
> a little misleading:
> 
> "error while checking cluster allocation status: %d", ret

I think this is too specific.  In the end, errors from bdrv_is_allocated
are almost always due to a problem reading the metadata from the file.
I will change it to "error reading image metadata", or something like that.

(And also use strerror).

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand Paolo Bonzini
  2013-07-04  5:34   ` Fam Zheng
@ 2013-07-16  3:31   ` Stefan Hajnoczi
  2013-07-16  6:26     ` Paolo Bonzini
  2013-07-18 20:04   ` Eric Blake
  2 siblings, 1 reply; 48+ messages in thread
From: Stefan Hajnoczi @ 2013-07-16  3:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, Jul 03, 2013 at 04:34:26PM +0200, Paolo Bonzini wrote:
> +        if ((ret & BDRV_BLOCK_DATA) &&
> +            !(ret & BDRV_BLOCK_OFFSET_VALID)) {
> +            error_report("File contains external or compressed clusters.");

What does "external clusters" mean?  Did you mean encrypted?

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata
  2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
                   ` (16 preceding siblings ...)
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 17/17] block: look for zero blocks in bs->file Paolo Bonzini
@ 2013-07-16  3:41 ` Stefan Hajnoczi
  2013-07-16  6:49   ` Peter Lieven
  17 siblings, 1 reply; 48+ messages in thread
From: Stefan Hajnoczi @ 2013-07-16  3:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

On Wed, Jul 03, 2013 at 04:34:14PM +0200, Paolo Bonzini wrote:
> This series adds a subcommand to "map" that can dump file metadata.
> Metadata that is dumped includes:
> 
> - whether blocks are allocated in bs->file and, if so, where
> 
> - whether blocks are zero
> 
> - whether data is read from bs or bs->backing_hd
> 
> Metadata is dumped for an entire chain of images.  One example usage is
> (for non-compressed, non-encrypted images) to transform the metadata
> into a Linux device-mapper setup, and make a qcow2 image available (for
> read only) as a block device.  Another possible usage is to determine
> the used areas of a file, and convert it in place to another format.
> Alternatively, the richer data can be used by block jobs to quickly
> determine if a block is zero without reading it.
> 
> The patches implement a new operation, bdrv_co_get_block_status, that
> supersedes bdrv_co_is_allocated.  The bdrv_is_allocated API is still
> available of course, and implemented on top of the new operation.
> 
> Patches 1-3 touch cow, which uses bdrv_co_is_allocated during its reads
> and writes.  I optimized it a bit because it was unbearably slow during
> testing.  With these patches (also tested with blkverify), booting of
> a cow guest image is not particularly slow.
> 
> Patches 4 to 6 clean up the bdrv_is_allocated and bdrv_is_allocated_above
> implementation, eliminating some code duplication.
> 
> Patches 7 and 8 tweak bdrv_has_zero_init and its usage in qemu-img in
> a way that helps when implementing the new API.
> 
> Patches 9 to 11 implement bdrv_get_block_status and change the formats
> to report all the available information.
> 
> Patch 12 adds the "map" subcommand to qemu-img.
> 
> Finally, patches 13 to 17 tweak the implementation to extract more
> information from protocols, and combine this information with format
> metadata whenever possible.
> 
> Paolo
> 
> Paolo Bonzini (17):
>   cow: make reads go at a decent speed
>   cow: make writes go at a less indecent speed
>   cow: do not call bdrv_co_is_allocated
>   block: make bdrv_co_is_allocated static
>   block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above
>     distinction
>   block: expect errors from bdrv_co_is_allocated
>   qemu-img: always probe the input image for allocated sectors
>   block: make bdrv_has_zero_init return false for copy-on-write-images
>   block: introduce bdrv_get_block_status API
>   block: define get_block_status return value
>   block: return get_block_status data and flags for formats
>   qemu-img: add a "map" subcommand
>   block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
>   raw-posix: return get_block_status data and flags
>   raw-posix: detect XFS unwritten extents
>   block: add default get_block_status implementation for protocols
>   block: look for zero blocks in bs->file
> 
>  block.c                   | 134 +++++++++++++--------------
>  block/commit.c            |   6 +-
>  block/cow.c               |  90 +++++++++++++------
>  block/mirror.c            |   4 +-
>  block/qcow.c              |  13 ++-
>  block/qcow2.c             |  24 +++--
>  block/qed.c               |  39 ++++++--
>  block/raw-posix.c         |  24 +++--
>  block/raw.c               |   6 +-
>  block/sheepdog.c          |  14 +--
>  block/stream.c            |  10 +--
>  block/vdi.c               |  17 +++-
>  block/vmdk.c              |  20 ++++-
>  block/vvfat.c             |  15 ++--
>  include/block/block.h     |  31 +++++--
>  include/block/block_int.h |   2 +-
>  qemu-img-cmds.hx          |   6 ++
>  qemu-img.c                | 225 +++++++++++++++++++++++++++++++++++++++++-----
>  18 files changed, 497 insertions(+), 183 deletions(-)

Makes sense to me.  I have left a few small comments but nothing major.

Stefan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-16  3:31   ` Stefan Hajnoczi
@ 2013-07-16  6:26     ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-16  6:26 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, pl, qemu-devel, stefanha

Il 16/07/2013 05:31, Stefan Hajnoczi ha scritto:
> On Wed, Jul 03, 2013 at 04:34:26PM +0200, Paolo Bonzini wrote:
>> > +        if ((ret & BDRV_BLOCK_DATA) &&
>> > +            !(ret & BDRV_BLOCK_OFFSET_VALID)) {
>> > +            error_report("File contains external or compressed clusters.");
> What does "external clusters" mean?  Did you mean encrypted?

It means clusters that come from a separate file, as in vmdk extents.
But yes, these are encrypted too.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols Paolo Bonzini
@ 2013-07-16  6:47   ` Peter Lieven
  2013-07-16  7:19     ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Lieven @ 2013-07-16  6:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha

On 03.07.2013 16:34, Paolo Bonzini wrote:
> Protocols return raw data, so you can assume the offsets to pass
> through unchanged.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index cd371cd..ff8ced7 100644
> --- a/block.c
> +++ b/block.c
> @@ -2977,7 +2977,11 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
>   
>       if (!bs->drv->bdrv_co_get_block_status) {
>           *pnum = nb_sectors;
> -        return BDRV_BLOCK_DATA;
> +        ret = BDRV_BLOCK_DATA;
> +        if (bs->drv->protocol_name) {
> +            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num * BDRV_SECTOR_SIZE);
> +        }
> +        return ret;
>       }
>   
>       ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
I am curious if this is right. Doesn't this mean we say that at offset
sector_num * BDRV_SECTOR_SIZE are nb_sectors of linear data? This is
something we do not know for sure.

Peter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata
  2013-07-16  3:41 ` [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Stefan Hajnoczi
@ 2013-07-16  6:49   ` Peter Lieven
  2013-07-18  3:54     ` Stefan Hajnoczi
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Lieven @ 2013-07-16  6:49 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, Paolo Bonzini, qemu-devel, stefanha

On 16.07.2013 05:41, Stefan Hajnoczi wrote:
> On Wed, Jul 03, 2013 at 04:34:14PM +0200, Paolo Bonzini wrote:
>> This series adds a subcommand to "map" that can dump file metadata.
>> Metadata that is dumped includes:
>>
>> - whether blocks are allocated in bs->file and, if so, where
>>
>> - whether blocks are zero
>>
>> - whether data is read from bs or bs->backing_hd
>>
>> Metadata is dumped for an entire chain of images.  One example usage is
>> (for non-compressed, non-encrypted images) to transform the metadata
>> into a Linux device-mapper setup, and make a qcow2 image available (for
>> read only) as a block device.  Another possible usage is to determine
>> the used areas of a file, and convert it in place to another format.
>> Alternatively, the richer data can be used by block jobs to quickly
>> determine if a block is zero without reading it.
>>
>> The patches implement a new operation, bdrv_co_get_block_status, that
>> supersedes bdrv_co_is_allocated.  The bdrv_is_allocated API is still
>> available of course, and implemented on top of the new operation.
>>
>> Patches 1-3 touch cow, which uses bdrv_co_is_allocated during its reads
>> and writes.  I optimized it a bit because it was unbearably slow during
>> testing.  With these patches (also tested with blkverify), booting of
>> a cow guest image is not particularly slow.
>>
>> Patches 4 to 6 clean up the bdrv_is_allocated and bdrv_is_allocated_above
>> implementation, eliminating some code duplication.
>>
>> Patches 7 and 8 tweak bdrv_has_zero_init and its usage in qemu-img in
>> a way that helps when implementing the new API.
>>
>> Patches 9 to 11 implement bdrv_get_block_status and change the formats
>> to report all the available information.
>>
>> Patch 12 adds the "map" subcommand to qemu-img.
>>
>> Finally, patches 13 to 17 tweak the implementation to extract more
>> information from protocols, and combine this information with format
>> metadata whenever possible.
>>
>> Paolo
>>
>> Paolo Bonzini (17):
>>    cow: make reads go at a decent speed
>>    cow: make writes go at a less indecent speed
>>    cow: do not call bdrv_co_is_allocated
>>    block: make bdrv_co_is_allocated static
>>    block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above
>>      distinction
>>    block: expect errors from bdrv_co_is_allocated
>>    qemu-img: always probe the input image for allocated sectors
>>    block: make bdrv_has_zero_init return false for copy-on-write-images
>>    block: introduce bdrv_get_block_status API
>>    block: define get_block_status return value
>>    block: return get_block_status data and flags for formats
>>    qemu-img: add a "map" subcommand
>>    block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
>>    raw-posix: return get_block_status data and flags
>>    raw-posix: detect XFS unwritten extents
>>    block: add default get_block_status implementation for protocols
>>    block: look for zero blocks in bs->file
>>
>>   block.c                   | 134 +++++++++++++--------------
>>   block/commit.c            |   6 +-
>>   block/cow.c               |  90 +++++++++++++------
>>   block/mirror.c            |   4 +-
>>   block/qcow.c              |  13 ++-
>>   block/qcow2.c             |  24 +++--
>>   block/qed.c               |  39 ++++++--
>>   block/raw-posix.c         |  24 +++--
>>   block/raw.c               |   6 +-
>>   block/sheepdog.c          |  14 +--
>>   block/stream.c            |  10 +--
>>   block/vdi.c               |  17 +++-
>>   block/vmdk.c              |  20 ++++-
>>   block/vvfat.c             |  15 ++--
>>   include/block/block.h     |  31 +++++--
>>   include/block/block_int.h |   2 +-
>>   qemu-img-cmds.hx          |   6 ++
>>   qemu-img.c                | 225 +++++++++++++++++++++++++++++++++++++++++-----
>>   18 files changed, 497 insertions(+), 183 deletions(-)
> Makes sense to me.  I have left a few small comments but nothing major.
>
> Stefan
Paolo, could you respin it once with all the small fixes. I would go over it again
after that if you like.

Peter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-16  6:47   ` Peter Lieven
@ 2013-07-16  7:19     ` Paolo Bonzini
  2013-07-16  7:54       ` Peter Lieven
  2013-07-17 10:26       ` Peter Lieven
  0 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-16  7:19 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, qemu-devel, stefanha

Il 16/07/2013 08:47, Peter Lieven ha scritto:
>>
>> @@ -2977,7 +2977,11 @@ static int64_t coroutine_fn
>> bdrv_co_get_block_status(BlockDriverState *bs,
>>         if (!bs->drv->bdrv_co_get_block_status) {
>>           *pnum = nb_sectors;
>> -        return BDRV_BLOCK_DATA;
>> +        ret = BDRV_BLOCK_DATA;
>> +        if (bs->drv->protocol_name) {
>> +            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num *
>> BDRV_SECTOR_SIZE);
>> +        }
>> +        return ret;
>>       }
>>         ret = bs->drv->bdrv_co_get_block_status(bs, sector_num,
>> nb_sectors, pnum);
> I am curious if this is right. Doesn't this mean we say that at offset
> sector_num * BDRV_SECTOR_SIZE are nb_sectors of linear data? This is
> something we do not know for sure.

Only for protocols.  In this case, we do know, or format=raw wouldn't
work.  This is not propagated up to the actual BDS for the image, except
for format=raw.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-16  7:19     ` Paolo Bonzini
@ 2013-07-16  7:54       ` Peter Lieven
  2013-07-16  9:37         ` Paolo Bonzini
  2013-07-17 10:26       ` Peter Lieven
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Lieven @ 2013-07-16  7:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha

On 16.07.2013 09:19, Paolo Bonzini wrote:
> Il 16/07/2013 08:47, Peter Lieven ha scritto:
>>> @@ -2977,7 +2977,11 @@ static int64_t coroutine_fn
>>> bdrv_co_get_block_status(BlockDriverState *bs,
>>>          if (!bs->drv->bdrv_co_get_block_status) {
>>>            *pnum = nb_sectors;
>>> -        return BDRV_BLOCK_DATA;
>>> +        ret = BDRV_BLOCK_DATA;
>>> +        if (bs->drv->protocol_name) {
>>> +            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num *
>>> BDRV_SECTOR_SIZE);
>>> +        }
>>> +        return ret;
>>>        }
>>>          ret = bs->drv->bdrv_co_get_block_status(bs, sector_num,
>>> nb_sectors, pnum);
>> I am curious if this is right. Doesn't this mean we say that at offset
>> sector_num * BDRV_SECTOR_SIZE are nb_sectors of linear data? This is
>> something we do not know for sure.
> Only for protocols.  In this case, we do know, or format=raw wouldn't
> work.  This is not propagated up to the actual BDS for the image, except
> for format=raw.
If bs->drv->protocol_name is only != NULL if format=raw, I am
fine with this.

Peter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-16  7:54       ` Peter Lieven
@ 2013-07-16  9:37         ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-16  9:37 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, qemu-devel, stefanha

Il 16/07/2013 09:54, Peter Lieven ha scritto:
> On 16.07.2013 09:19, Paolo Bonzini wrote:
>> Il 16/07/2013 08:47, Peter Lieven ha scritto:
>>>> @@ -2977,7 +2977,11 @@ static int64_t coroutine_fn
>>>> bdrv_co_get_block_status(BlockDriverState *bs,
>>>>          if (!bs->drv->bdrv_co_get_block_status) {
>>>>            *pnum = nb_sectors;
>>>> -        return BDRV_BLOCK_DATA;
>>>> +        ret = BDRV_BLOCK_DATA;
>>>> +        if (bs->drv->protocol_name) {
>>>> +            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num *
>>>> BDRV_SECTOR_SIZE);
>>>> +        }
>>>> +        return ret;
>>>>        }
>>>>          ret = bs->drv->bdrv_co_get_block_status(bs, sector_num,
>>>> nb_sectors, pnum);
>>> I am curious if this is right. Doesn't this mean we say that at offset
>>> sector_num * BDRV_SECTOR_SIZE are nb_sectors of linear data? This is
>>> something we do not know for sure.
>> Only for protocols.  In this case, we do know, or format=raw wouldn't
>> work.  This is not propagated up to the actual BDS for the image, except
>> for format=raw.
> If bs->drv->protocol_name is only != NULL if format=raw, I am
> fine with this.

No, bs->drv->protocol_name is only != NULL if it is a protocol (like
file or iscsi) rather than a format (like raw or qcow2). :)

But the user will never call bdrv_co_get_block_status on a protocol
(bs->file, roughly), only on a format (bs); and this information is only
passed from bs->file to bs for format=raw.  Other formats compute the
offsets themselves.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-16  7:19     ` Paolo Bonzini
  2013-07-16  7:54       ` Peter Lieven
@ 2013-07-17 10:26       ` Peter Lieven
  2013-07-17 10:33         ` Paolo Bonzini
  1 sibling, 1 reply; 48+ messages in thread
From: Peter Lieven @ 2013-07-17 10:26 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha


Am 16.07.2013 um 09:19 schrieb Paolo Bonzini <pbonzini@redhat.com>:

> Il 16/07/2013 08:47, Peter Lieven ha scritto:
>>> 
>>> @@ -2977,7 +2977,11 @@ static int64_t coroutine_fn
>>> bdrv_co_get_block_status(BlockDriverState *bs,
>>>        if (!bs->drv->bdrv_co_get_block_status) {
>>>          *pnum = nb_sectors;
>>> -        return BDRV_BLOCK_DATA;
>>> +        ret = BDRV_BLOCK_DATA;
>>> +        if (bs->drv->protocol_name) {
>>> +            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num *
>>> BDRV_SECTOR_SIZE);
>>> +        }
>>> +        return ret;
>>>      }
>>>        ret = bs->drv->bdrv_co_get_block_status(bs, sector_num,
>>> nb_sectors, pnum);
>> I am curious if this is right. Doesn't this mean we say that at offset
>> sector_num * BDRV_SECTOR_SIZE are nb_sectors of linear data? This is
>> something we do not know for sure.
> 
> Only for protocols.  In this case, we do know, or format=raw wouldn't
> work.  This is not propagated up to the actual BDS for the image, except
> for format=raw.

Wouldn't it be better to add this general tweak in to raw_co_is_allocated
rather than at the block level?

What you write above is true, but you will never know what will happen in the
future. For raw this tweak is right, but for anything developed in the future
it might not be and in the end nobody remembers to fix it at this position.

I was just thinking of the has_zero_init = 1 thing. At the time it was written
the code was correct and then came more and more extensions like
extends on a block device or iscsi and nobody remembered.

Peter

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols
  2013-07-17 10:26       ` Peter Lieven
@ 2013-07-17 10:33         ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-17 10:33 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, qemu-devel, stefanha

Il 17/07/2013 12:26, Peter Lieven ha scritto:
> 
> Am 16.07.2013 um 09:19 schrieb Paolo Bonzini <pbonzini@redhat.com>:
> 
>> Il 16/07/2013 08:47, Peter Lieven ha scritto:
>>>>
>>>> @@ -2977,7 +2977,11 @@ static int64_t coroutine_fn
>>>> bdrv_co_get_block_status(BlockDriverState *bs,
>>>>        if (!bs->drv->bdrv_co_get_block_status) {
>>>>          *pnum = nb_sectors;
>>>> -        return BDRV_BLOCK_DATA;
>>>> +        ret = BDRV_BLOCK_DATA;
>>>> +        if (bs->drv->protocol_name) {
>>>> +            ret |= BDRV_BLOCK_OFFSET_VALID | (sector_num *
>>>> BDRV_SECTOR_SIZE);
>>>> +        }
>>>> +        return ret;
>>>>      }
>>>>        ret = bs->drv->bdrv_co_get_block_status(bs, sector_num,
>>>> nb_sectors, pnum);
>>> I am curious if this is right. Doesn't this mean we say that at offset
>>> sector_num * BDRV_SECTOR_SIZE are nb_sectors of linear data? This is
>>> something we do not know for sure.
>>
>> Only for protocols.  In this case, we do know, or format=raw wouldn't
>> work.  This is not propagated up to the actual BDS for the image, except
>> for format=raw.
> 
> Wouldn't it be better to add this general tweak in to raw_co_is_allocated
> rather than at the block level?

No, this is not a "general tweak".  It is a default implementation (it
is protected by "if (!bs->drv->bdrv_co_get_block_status)".  Other
implementations can do the same, for example raw-posix.c also returns
BDRV_BLOCK_OFFSET_VALID | (sector_num * BDRV_SECTOR_SIZE).

Each layer must return complete data.  If a higher layer wants to query
bs->file, it must obviously filters those flags that still make sense.

In the case of format=raw, all flags make sense.  For other layers, some
flags may not make sense and have to be stripped.

> What you write above is true, but you will never know what will happen in the
> future. For raw this tweak is right, but for anything developed in the future
> it might not be and in the end nobody remembers to fix it at this position.

Does the above explain it better?

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata
  2013-07-16  6:49   ` Peter Lieven
@ 2013-07-18  3:54     ` Stefan Hajnoczi
  0 siblings, 0 replies; 48+ messages in thread
From: Stefan Hajnoczi @ 2013-07-18  3:54 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, Stefan Hajnoczi, qemu-devel, Paolo Bonzini

On Tue, Jul 16, 2013 at 08:49:13AM +0200, Peter Lieven wrote:
> On 16.07.2013 05:41, Stefan Hajnoczi wrote:
> >On Wed, Jul 03, 2013 at 04:34:14PM +0200, Paolo Bonzini wrote:
> >>This series adds a subcommand to "map" that can dump file metadata.
> >>Metadata that is dumped includes:
> >>
> >>- whether blocks are allocated in bs->file and, if so, where
> >>
> >>- whether blocks are zero
> >>
> >>- whether data is read from bs or bs->backing_hd
> >>
> >>Metadata is dumped for an entire chain of images.  One example usage is
> >>(for non-compressed, non-encrypted images) to transform the metadata
> >>into a Linux device-mapper setup, and make a qcow2 image available (for
> >>read only) as a block device.  Another possible usage is to determine
> >>the used areas of a file, and convert it in place to another format.
> >>Alternatively, the richer data can be used by block jobs to quickly
> >>determine if a block is zero without reading it.
> >>
> >>The patches implement a new operation, bdrv_co_get_block_status, that
> >>supersedes bdrv_co_is_allocated.  The bdrv_is_allocated API is still
> >>available of course, and implemented on top of the new operation.
> >>
> >>Patches 1-3 touch cow, which uses bdrv_co_is_allocated during its reads
> >>and writes.  I optimized it a bit because it was unbearably slow during
> >>testing.  With these patches (also tested with blkverify), booting of
> >>a cow guest image is not particularly slow.
> >>
> >>Patches 4 to 6 clean up the bdrv_is_allocated and bdrv_is_allocated_above
> >>implementation, eliminating some code duplication.
> >>
> >>Patches 7 and 8 tweak bdrv_has_zero_init and its usage in qemu-img in
> >>a way that helps when implementing the new API.
> >>
> >>Patches 9 to 11 implement bdrv_get_block_status and change the formats
> >>to report all the available information.
> >>
> >>Patch 12 adds the "map" subcommand to qemu-img.
> >>
> >>Finally, patches 13 to 17 tweak the implementation to extract more
> >>information from protocols, and combine this information with format
> >>metadata whenever possible.
> >>
> >>Paolo
> >>
> >>Paolo Bonzini (17):
> >>   cow: make reads go at a decent speed
> >>   cow: make writes go at a less indecent speed
> >>   cow: do not call bdrv_co_is_allocated
> >>   block: make bdrv_co_is_allocated static
> >>   block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above
> >>     distinction
> >>   block: expect errors from bdrv_co_is_allocated
> >>   qemu-img: always probe the input image for allocated sectors
> >>   block: make bdrv_has_zero_init return false for copy-on-write-images
> >>   block: introduce bdrv_get_block_status API
> >>   block: define get_block_status return value
> >>   block: return get_block_status data and flags for formats
> >>   qemu-img: add a "map" subcommand
> >>   block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO
> >>   raw-posix: return get_block_status data and flags
> >>   raw-posix: detect XFS unwritten extents
> >>   block: add default get_block_status implementation for protocols
> >>   block: look for zero blocks in bs->file
> >>
> >>  block.c                   | 134 +++++++++++++--------------
> >>  block/commit.c            |   6 +-
> >>  block/cow.c               |  90 +++++++++++++------
> >>  block/mirror.c            |   4 +-
> >>  block/qcow.c              |  13 ++-
> >>  block/qcow2.c             |  24 +++--
> >>  block/qed.c               |  39 ++++++--
> >>  block/raw-posix.c         |  24 +++--
> >>  block/raw.c               |   6 +-
> >>  block/sheepdog.c          |  14 +--
> >>  block/stream.c            |  10 +--
> >>  block/vdi.c               |  17 +++-
> >>  block/vmdk.c              |  20 ++++-
> >>  block/vvfat.c             |  15 ++--
> >>  include/block/block.h     |  31 +++++--
> >>  include/block/block_int.h |   2 +-
> >>  qemu-img-cmds.hx          |   6 ++
> >>  qemu-img.c                | 225 +++++++++++++++++++++++++++++++++++++++++-----
> >>  18 files changed, 497 insertions(+), 183 deletions(-)
> >Makes sense to me.  I have left a few small comments but nothing major.
> >
> >Stefan
> Paolo, could you respin it once with all the small fixes. I would go over it again
> after that if you like.

A respin is necessary.  I didn't mean to imply that the patches will be
merged as-is.

Stefan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-03 14:34 ` [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand Paolo Bonzini
  2013-07-04  5:34   ` Fam Zheng
  2013-07-16  3:31   ` Stefan Hajnoczi
@ 2013-07-18 20:04   ` Eric Blake
  2013-07-19  4:48     ` Stefan Hajnoczi
  2 siblings, 1 reply; 48+ messages in thread
From: Eric Blake @ 2013-07-18 20:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, pl, qemu-devel, stefanha

[-- Attachment #1: Type: text/plain, Size: 1144 bytes --]

On 07/03/2013 08:34 AM, Paolo Bonzini wrote:
> This command dumps the metadata of an entire chain, in either tabular or JSON
> format.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  qemu-img-cmds.hx |   6 ++
>  qemu-img.c       | 186 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 192 insertions(+)
> 

> +    case OFORMAT_JSON:
> +        printf("%s{ 'depth': %d, 'start': %lld, 'length': %lld, "
> +               "'zero': %s, 'data': %s",
> +               (e->start == 0 ? "[" : ",\n"),
> +               e->depth, (long long) e->start, (long long) e->length,
> +               (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
> +               (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
> +        if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
> +            printf(", 'offset': %lld", (long long) e->offset);
> +        }
> +	putchar('}');

Can we please get this format documented in qapi-schema.json, even if we
aren't using qapi to generate it yet?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-18 20:04   ` Eric Blake
@ 2013-07-19  4:48     ` Stefan Hajnoczi
  2013-07-19  5:54       ` Paolo Bonzini
  0 siblings, 1 reply; 48+ messages in thread
From: Stefan Hajnoczi @ 2013-07-19  4:48 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, Paolo Bonzini, pl, qemu-devel, stefanha

On Thu, Jul 18, 2013 at 02:04:31PM -0600, Eric Blake wrote:
> On 07/03/2013 08:34 AM, Paolo Bonzini wrote:
> > This command dumps the metadata of an entire chain, in either tabular or JSON
> > format.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  qemu-img-cmds.hx |   6 ++
> >  qemu-img.c       | 186 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 192 insertions(+)
> > 
> 
> > +    case OFORMAT_JSON:
> > +        printf("%s{ 'depth': %d, 'start': %lld, 'length': %lld, "
> > +               "'zero': %s, 'data': %s",
> > +               (e->start == 0 ? "[" : ",\n"),
> > +               e->depth, (long long) e->start, (long long) e->length,
> > +               (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
> > +               (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
> > +        if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
> > +            printf(", 'offset': %lld", (long long) e->offset);
> > +        }
> > +	putchar('}');
> 
> Can we please get this format documented in qapi-schema.json, even if we
> aren't using qapi to generate it yet?

Paolo: Please send a follow-up patch documenting the json schema, I've
already merged this series.

I was although thinking about qemu-iotests for qemu-img map, but it's
tricky since the allocation is an internal detail of the image format.
Perhaps a test case using raw?

Stefan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand
  2013-07-19  4:48     ` Stefan Hajnoczi
@ 2013-07-19  5:54       ` Paolo Bonzini
  0 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2013-07-19  5:54 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: kwolf, pl, qemu-devel, stefanha

Il 19/07/2013 06:48, Stefan Hajnoczi ha scritto:
> On Thu, Jul 18, 2013 at 02:04:31PM -0600, Eric Blake wrote:
>> On 07/03/2013 08:34 AM, Paolo Bonzini wrote:
>>> This command dumps the metadata of an entire chain, in either tabular or JSON
>>> format.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>  qemu-img-cmds.hx |   6 ++
>>>  qemu-img.c       | 186 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 192 insertions(+)
>>>
>>
>>> +    case OFORMAT_JSON:
>>> +        printf("%s{ 'depth': %d, 'start': %lld, 'length': %lld, "
>>> +               "'zero': %s, 'data': %s",
>>> +               (e->start == 0 ? "[" : ",\n"),
>>> +               e->depth, (long long) e->start, (long long) e->length,
>>> +               (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
>>> +               (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
>>> +        if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
>>> +            printf(", 'offset': %lld", (long long) e->offset);
>>> +        }
>>> +	putchar('}');
>>
>> Can we please get this format documented in qapi-schema.json, even if we
>> aren't using qapi to generate it yet?
> 
> Paolo: Please send a follow-up patch documenting the json schema, I've
> already merged this series.
> 
> I was although thinking about qemu-iotests for qemu-img map, but it's
> tricky since the allocation is an internal detail of the image format.
> Perhaps a test case using raw?

If you have interesting cases with sparse images, even raw depends on
how the underlying file system splits extents.

I can do a simple test that ignores the 'offset' field.

Paolo

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2013-07-19  5:54 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-03 14:34 [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 01/17] cow: make reads go at a decent speed Paolo Bonzini
2013-07-04  2:20   ` Fam Zheng
2013-07-04  8:08     ` Paolo Bonzini
2013-07-05  9:09   ` Stefan Hajnoczi
2013-07-03 14:34 ` [Qemu-devel] [PATCH 02/17] cow: make writes go at a less indecent speed Paolo Bonzini
2013-07-04  2:40   ` Fam Zheng
2013-07-04  8:11     ` Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 03/17] cow: do not call bdrv_co_is_allocated Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 04/17] block: make bdrv_co_is_allocated static Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 05/17] block: remove bdrv_is_allocated_above/bdrv_co_is_allocated_above distinction Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 06/17] block: expect errors from bdrv_co_is_allocated Paolo Bonzini
2013-07-05  9:19   ` Stefan Hajnoczi
2013-07-05 10:28     ` Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 07/17] qemu-img: always probe the input image for allocated sectors Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 08/17] block: make bdrv_has_zero_init return false for copy-on-write-images Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 09/17] block: introduce bdrv_get_block_status API Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 10/17] block: define get_block_status return value Paolo Bonzini
2013-07-03 21:04   ` Peter Lieven
2013-07-04  8:13     ` Paolo Bonzini
2013-07-04 21:10       ` Peter Lieven
2013-07-05  0:49         ` Fam Zheng
2013-07-03 14:34 ` [Qemu-devel] [PATCH 11/17] block: return get_block_status data and flags for formats Paolo Bonzini
2013-07-04  3:22   ` Fam Zheng
2013-07-04  8:14     ` Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 12/17] qemu-img: add a "map" subcommand Paolo Bonzini
2013-07-04  5:34   ` Fam Zheng
2013-07-04  8:16     ` Paolo Bonzini
2013-07-04  8:36       ` Fam Zheng
2013-07-16  3:31   ` Stefan Hajnoczi
2013-07-16  6:26     ` Paolo Bonzini
2013-07-18 20:04   ` Eric Blake
2013-07-19  4:48     ` Stefan Hajnoczi
2013-07-19  5:54       ` Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 13/17] block: use bdrv_has_zero_init to return BDRV_BLOCK_ZERO Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 14/17] raw-posix: return get_block_status data and flags Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 15/17] raw-posix: detect XFS unwritten extents Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 16/17] block: add default get_block_status implementation for protocols Paolo Bonzini
2013-07-16  6:47   ` Peter Lieven
2013-07-16  7:19     ` Paolo Bonzini
2013-07-16  7:54       ` Peter Lieven
2013-07-16  9:37         ` Paolo Bonzini
2013-07-17 10:26       ` Peter Lieven
2013-07-17 10:33         ` Paolo Bonzini
2013-07-03 14:34 ` [Qemu-devel] [PATCH 17/17] block: look for zero blocks in bs->file Paolo Bonzini
2013-07-16  3:41 ` [Qemu-devel] [PATCH 00/17] Add qemu-img subcommand to dump file metadata Stefan Hajnoczi
2013-07-16  6:49   ` Peter Lieven
2013-07-18  3:54     ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.