All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format
@ 2012-06-13 14:36 Dong Xu Wang
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 2/6] block: make some functions public Dong Xu Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-13 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

Introduce a new file format:add-cow. The usage can be found at this patch.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 docs/specs/add-cow.txt |   87 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 87 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/add-cow.txt

diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..e077fc2
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,87 @@
+== General ==
+
+Raw file format does not support backing_file and copy on write feature.
+The add-cow image format makes it possible to use backing files with raw
+image by keeping a separate .add-cow metadata file.  Once all sectors
+have been written into the raw image it is safe to discard the .add-cow
+and backing files, then we can use the raw image directly.
+
+While using add-cow, procedures may like this:
+(ubuntu.img is a disk image which has been installed OS.)
+    1)  Create a raw image with the same size of ubuntu.img
+            qemu-img create -f raw test.raw 8G
+    2)  Create an add-cow image which will store dirty bitmap
+            qemu-img create -f add-cow test.add-cow \
+                -o backing_file=ubuntu.img,image_file=test.raw
+    3)  Run qemu with add-cow image
+            qemu -drive if=virtio,file=test.add-cow
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------+-----------------+
+ |     Header    |   Reserved  |    COW bitmap   |
+ +---------------+-------------+-----------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+
+    Byte    0 -  7:     magic
+                        add-cow magic string ("ADD_COW\xff")
+
+            8 -  11:    version
+                        Version number (only valid value is 1 now)
+
+            12 - 15:    backing_filename_offset
+                        Offset in the add-cow file at which the backing file name
+                        is stored (NB: The string is not null terminated). 0 if the
+                        image doesn't have a backing file.
+
+            16 - 19:    backing_filename_size
+                        Length of the backing file name in bytes. Undefined if the
+                        image doesn't have a backing file.
+
+            20 - 23:    image_filename_offset
+                        Offset in the add-cow file at which the image_file name
+                        is stored (NB: The string is not null terminated).
+
+            24 - 27:    image_filename_size
+                        Length of the image_file name in bytes.
+
+            28 - 35:    features
+                        Currently only 2 feature bits are used:
+                        Feature bits:
+                            The image uses a backing file:
+                            * ADD_COW_F_BACKING_FILE = 0x01.
+                            The backing file's format is raw:
+                            * ADD_COW_F_BACKING_FORMAT_NO_PROBE = 0x02.
+
+== Reserved ==
+
+    Byte    36 - 4095:  Reserved field:
+                        It is used to make sure COW bitmap field starts at the
+                        4096th byte, backing_file name and image_file name will
+                        be stored here.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
+backing_file and image_file. The bitmap will track whether the sector in
+backing_file is dirty or not.
+
+Each bit in the bitmap indicates one cluster's status. One cluster includes 128
+sectors, then each bit indicates 512 * 128 = 64k bytes, So the size of bitmap is
+calculated according to virtual size of image_file. In each byte, bit 0 to 7
+will track the 1st to 7th cluster in sequence, bit orders in one byte look like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, indicates the sector has not been allocated in image_file, data
+should be loaded from backing_file while reading; if the bit is 1,  indicates the
+related sector has been dirty, should be loaded from image_file while reading.
+Writing to a sector causes the corresponding bit to be set to 1.
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 2/6] block: make some functions public
  2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
@ 2012-06-13 14:36 ` Dong Xu Wang
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 3/6] add-cow file format Dong Xu Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-13 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

In add-cow file format, we will use path_has_protocol and we will read
a NUL-terminated string from image , qed_read_string has done the samething,
so make the two functions public, then we will reuse them directly.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 block.c     |   17 ++++++++++++++++-
 block.h     |    3 +++
 block/qed.c |   29 +----------------------------
 3 files changed, 20 insertions(+), 29 deletions(-)

diff --git a/block.c b/block.c
index 7547051..3cf5fe9 100644
--- a/block.c
+++ b/block.c
@@ -196,7 +196,7 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
 }
 
 /* check if the path starts with "<protocol>:" */
-static int path_has_protocol(const char *path)
+int path_has_protocol(const char *path)
 {
     const char *p;
 
@@ -1606,6 +1606,21 @@ int bdrv_read(BlockDriverState *bs, int64_t sector_num,
     return bdrv_rw_co(bs, sector_num, buf, nb_sectors, false);
 }
 
+int bdrv_read_string(BlockDriverState *file, uint64_t offset, size_t n,
+                           char *buf, size_t buflen)
+{
+    int ret;
+    if (n >= buflen) {
+        return -EINVAL;
+    }
+    ret = bdrv_pread(file, offset, buf, n);
+    if (ret < 0) {
+        return ret;
+    }
+    buf[n] = '\0';
+    return 0;
+}
+
 #define BITS_PER_LONG  (sizeof(unsigned long) * 8)
 
 static void set_dirty_bitmap(BlockDriverState *bs, int64_t sector_num,
diff --git a/block.h b/block.h
index 7408acc..fd4f8cf 100644
--- a/block.h
+++ b/block.h
@@ -149,6 +149,8 @@ int bdrv_pwrite(BlockDriverState *bs, int64_t offset,
                 const void *buf, int count);
 int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
     const void *buf, int count);
+int bdrv_read_string(BlockDriverState *file, uint64_t offset, size_t n,
+                           char *buf, size_t buflen);
 int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
     int nb_sectors, QEMUIOVector *qiov);
 int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
@@ -321,6 +323,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size);
 int path_is_absolute(const char *path);
+int path_has_protocol(const char *path);
 void path_combine(char *dest, int dest_size,
                   const char *base_path,
                   const char *filename);
diff --git a/block/qed.c b/block/qed.c
index 30a31f9..b300584 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -217,33 +217,6 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size,
 }
 
 /**
- * Read a string of known length from the image file
- *
- * @file:       Image file
- * @offset:     File offset to start of string, in bytes
- * @n:          String length in bytes
- * @buf:        Destination buffer
- * @buflen:     Destination buffer length in bytes
- * @ret:        0 on success, -errno on failure
- *
- * The string is NUL-terminated.
- */
-static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n,
-                           char *buf, size_t buflen)
-{
-    int ret;
-    if (n >= buflen) {
-        return -EINVAL;
-    }
-    ret = bdrv_pread(file, offset, buf, n);
-    if (ret < 0) {
-        return ret;
-    }
-    buf[n] = '\0';
-    return 0;
-}
-
-/**
  * Allocate new clusters
  *
  * @s:          QED state
@@ -437,7 +410,7 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags)
             return -EINVAL;
         }
 
-        ret = qed_read_string(bs->file, s->header.backing_filename_offset,
+        ret = bdrv_read_string(bs->file, s->header.backing_filename_offset,
                               s->header.backing_filename_size, bs->backing_file,
                               sizeof(bs->backing_file));
         if (ret < 0) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 3/6] add-cow file format
  2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 2/6] block: make some functions public Dong Xu Wang
@ 2012-06-13 14:36 ` Dong Xu Wang
  2012-06-14 11:13   ` Paolo Bonzini
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert Dong Xu Wang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-13 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

This is the implementation code for add-cow file format. Because image_file
might be very huge, then we can't read entire bitmap into memory, we must use
a cache. Since qcow-cache.c has implemted cache code, we can create our cache
code based on it.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 block/Makefile.objs   |    1 +
 block/add-cow-cache.c |  218 ++++++++++++++++++++++
 block/add-cow.c       |  488 +++++++++++++++++++++++++++++++++++++++++++++++++
 block/add-cow.h       |   95 ++++++++++
 block_int.h           |    1 +
 5 files changed, 803 insertions(+), 0 deletions(-)
 create mode 100644 block/add-cow-cache.c
 create mode 100644 block/add-cow.c
 create mode 100644 block/add-cow.h

diff --git a/block/Makefile.objs b/block/Makefile.objs
index b5754d3..357a3b1 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
+block-obj-y += add-cow.o add-cow-cache.o
 block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-obj-y += stream.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o
diff --git a/block/add-cow-cache.c b/block/add-cow-cache.c
new file mode 100644
index 0000000..b4b1546
--- /dev/null
+++ b/block/add-cow-cache.c
@@ -0,0 +1,218 @@
+/*
+ * Cache For QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ *  Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This file is based on qcow2-cache.c, see its copyrights below:
+ *
+ * L2/refcount table cache for the QCOW2 format
+ *
+ * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block_int.h"
+#include "qemu-common.h"
+#include "add-cow.h"
+
+AddCowCache *add_cow_cache_create(BlockDriverState *bs, int num_tables,
+    bool writethrough)
+{
+    AddCowCache *c;
+    int i;
+
+    c = g_malloc0(sizeof(*c));
+    c->size = num_tables;
+    c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
+    c->writethrough = writethrough;
+    c->entry_size = ADD_COW_CACHE_ENTRY_SIZE;
+
+    for (i = 0; i < c->size; i++) {
+        c->entries[i].table = qemu_blockalign(bs, c->entry_size);
+        c->entries[i].offset = -1;
+    }
+
+    return c;
+}
+
+int add_cow_cache_destroy(BlockDriverState *bs, AddCowCache *c)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        qemu_vfree(c->entries[i].table);
+    }
+
+    g_free(c->entries);
+    g_free(c);
+
+    return 0;
+}
+
+static int add_cow_cache_entry_flush(BlockDriverState *bs,
+    AddCowCache *c,
+    int i)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int ret = 0;
+
+    if (!c->entries[i].dirty || -1 == c->entries[i].offset) {
+        return ret;
+    }
+
+    ret = bdrv_pwrite(bs->file,
+        ADD_COW_BITMAP_POS + c->entries[i].offset,
+        c->entries[i].table,
+        MIN(c->entry_size, s->bitmap_size - c->entries[i].offset));
+    if (ret < 0) {
+        return ret;
+    }
+
+    c->entries[i].dirty = false;
+
+    return 0;
+}
+
+int add_cow_cache_flush(BlockDriverState *bs, AddCowCache *c)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int result = 0;
+    int ret;
+    int i;
+
+    ret = bdrv_flush(s->image_hd);
+    if (ret < 0) {
+        return result;
+    }
+
+    for (i = 0; i < c->size; i++) {
+        ret = add_cow_cache_entry_flush(bs, c, i);
+        if (ret < 0 && result != -ENOSPC) {
+            result = ret;
+        }
+    }
+
+    if (result == 0) {
+        ret = bdrv_flush(bs->file);
+        if (ret < 0) {
+            result = ret;
+        }
+    }
+
+    return result;
+}
+
+static int add_cow_cache_find_entry_to_replace(AddCowCache *c)
+{
+    int i;
+    int min_count = INT_MAX;
+    int min_index = -1;
+
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].cache_hits < min_count) {
+            min_index = i;
+            min_count = c->entries[i].cache_hits;
+        }
+
+        c->entries[i].cache_hits /= 2;
+    }
+
+    if (min_index == -1) {
+        abort();
+    }
+    return min_index;
+}
+
+static int add_cow_cache_do_get(BlockDriverState *bs, AddCowCache *c,
+    uint64_t offset, void **table)
+{
+    int i, ret;
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].offset == offset) {
+            goto found;
+        }
+    }
+
+    i = add_cow_cache_find_entry_to_replace(c);
+    if (i < 0) {
+        return i;
+    }
+
+    ret = add_cow_cache_entry_flush(bs, c, i);
+    if (ret < 0) {
+        return ret;
+    }
+    c->entries[i].offset = -1;
+    ret = bdrv_pread(bs->file, ADD_COW_BITMAP_POS + offset,
+        c->entries[i].table, c->entry_size);
+    if (ret < 0) {
+        return ret;
+    }
+
+    c->entries[i].cache_hits = 32;
+    c->entries[i].offset = offset;
+
+found:
+    c->entries[i].cache_hits++;
+    *table = c->entries[i].table;
+
+    return 0;
+}
+
+int add_cow_cache_get(BlockDriverState *bs, AddCowCache *c, uint64_t sector_num,
+    void **table)
+{
+    /* each byte in bitmap indicates 8 * SECTORS_PER_CLUSTER clusters */
+    uint64_t offset = offset_in_bitmap(sector_num) & (~(c->entry_size - 1));
+    return add_cow_cache_do_get(bs, c, offset, table);
+}
+
+void add_cow_cache_entry_mark_dirty(AddCowCache *c, void *table)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].table == table) {
+            goto found;
+        }
+    }
+    abort();
+
+found:
+    c->entries[i].dirty = true;
+}
+
+bool add_cow_cache_set_writethrough(BlockDriverState *bs, AddCowCache *c,
+    bool enable)
+{
+    bool old = c->writethrough;
+
+    if (!old && enable) {
+        add_cow_cache_flush(bs, c);
+    }
+
+    c->writethrough = enable;
+    return old;
+}
diff --git a/block/add-cow.c b/block/add-cow.c
new file mode 100644
index 0000000..f8bc299
--- /dev/null
+++ b/block/add-cow.c
@@ -0,0 +1,488 @@
+/*
+ * QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ *  Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "block_int.h"
+#include "module.h"
+#include "add-cow.h"
+
+static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename)
+{
+    const AddCowHeader *header = (const AddCowHeader *)buf;
+
+    if (le64_to_cpu(header->magic) == ADD_COW_MAGIC &&
+        le32_to_cpu(header->version) == ADD_COW_VERSION) {
+        return 100;
+    } else {
+        return 0;
+    }
+}
+
+static int add_cow_create(const char *filename, QEMUOptionParameter *options)
+{
+    AddCowHeader header;
+    int64_t image_len = 0;
+    const char *backing_filename = NULL;
+    const char *backing_fmt = NULL;
+    const char *image_filename = NULL;
+    int ret;
+    BlockDriverState *bs, *image_bs = NULL;
+    BlockDriver *drv = bdrv_find_format("add-cow");
+
+    while (options && options->name) {
+        if (!strcmp(options->name, BLOCK_OPT_SIZE)) {
+            image_len = options->value.n;
+        } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
+            backing_filename = options->value.s;
+        } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) {
+            backing_fmt = options->value.s;
+        } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FILE)) {
+            image_filename = options->value.s;
+        }
+        options++;
+    }
+    memset(&header, 0, sizeof(header));
+    header.magic    = cpu_to_le64(ADD_COW_MAGIC);
+    header.version  = cpu_to_le32(ADD_COW_VERSION);
+    header.features = cpu_to_le64(ADD_COW_F_BACKING_FILE);
+
+    if (backing_filename) {
+        header.features |= ADD_COW_F_BACKING_FILE;
+        header.backing_filename_offset = sizeof(header);
+        header.backing_filename_size = strlen(backing_filename);
+
+        if (backing_fmt && strcmp(backing_fmt, "raw") == 0) {
+            header.features |= ADD_COW_F_BACKING_FORMAT_NO_PROBE;
+            pstrcpy(bs->backing_format, sizeof(bs->backing_format), "raw");
+        }
+    } else {
+        error_report("backing_file should be given.");
+        return -EINVAL;
+    }
+
+    if (image_filename) {
+        header.image_filename_offset =
+            sizeof(header) + header.backing_filename_size;
+        header.image_filename_size = strlen(image_filename);
+    } else {
+        error_report("image_file should be given.");
+        return -EINVAL;
+    }
+
+    if (header.image_filename_offset + header.image_filename_size > ADD_COW_BITMAP_POS) {
+        error_report("image_file name or backing_file name too long.");
+        return -ENOSPC;
+    }
+
+    ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR
+            | BDRV_O_CACHE_WB);
+    if (ret < 0) {
+        return ret;
+    }
+    bdrv_delete(image_bs);
+
+    ret = bdrv_create_file(filename, NULL);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_pwrite(bs, 0, &header, sizeof(header));
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    ret = bdrv_pwrite(bs,
+        header.backing_filename_offset,
+        backing_filename,
+        header.backing_filename_size);
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    ret = bdrv_pwrite(bs,
+        header.image_filename_offset,
+        image_filename,
+        header.image_filename_size);
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv);
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    ret = bdrv_truncate(bs, image_len);
+    bdrv_delete(bs);
+    return ret;
+}
+
+static int add_cow_open(BlockDriverState *bs, int flags)
+{
+    char                image_filename[ADD_COW_FILE_LEN];
+    char                tmp_name[ADD_COW_FILE_LEN];
+    BlockDriver         *image_drv = NULL;
+    int                 ret;
+    bool                writethrough;
+    int                 sector_per_byte;
+    BDRVAddCowState     *s = bs->opaque;
+    uint64_t            features;
+
+    ret = bdrv_pread(bs->file, 0, &s->header, sizeof(s->header));
+    if (ret != sizeof(s->header)) {
+        goto fail;
+    }
+
+    if (le64_to_cpu(s->header.magic) != ADD_COW_MAGIC) {
+        ret = -EINVAL;
+        goto fail;
+    }
+    if (le32_to_cpu(s->header.version) != ADD_COW_VERSION) {
+        char version[64];
+        snprintf(version, sizeof(version), "ADD-COW version %d",
+            le32_to_cpu(s->header.version));
+        qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+            bs->device_name, "add-cow", version);
+        ret = -ENOTSUP;
+        goto fail;
+    }
+    features = le64_to_cpu(s->header.features);
+    if (features & ~ADD_COW_FEATURE_MASK) {
+        char buf[64];
+        snprintf(buf, sizeof(buf), "%" PRIx64,
+            s->header.features & ~ADD_COW_FEATURE_MASK);
+        qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+            bs->device_name, "add-cow", buf);
+        return -ENOTSUP;
+    }
+    if ((features & ADD_COW_F_BACKING_FILE) == 0) {
+        /* Now, only with backing_file is supported */
+        ret = -ENOTSUP;
+        goto fail;
+    }
+    ret = bdrv_read_string(bs->file, s->header.backing_filename_offset,
+                      s->header.backing_filename_size, bs->backing_file,
+                      sizeof(bs->backing_file));
+    if (ret < 0) {
+        goto fail;
+    }
+    ret = bdrv_read_string(bs->file, s->header.image_filename_offset,
+                      s->header.image_filename_size, tmp_name,
+                      sizeof(tmp_name));
+    if (ret < 0) {
+        goto fail;
+    }
+    s->image_hd = bdrv_new("");
+    if (path_has_protocol(image_filename)) {
+        pstrcpy(image_filename, sizeof(image_filename), tmp_name);
+    } else {
+        path_combine(image_filename, sizeof(image_filename),
+                     bs->filename, tmp_name);
+    }
+    ret = bdrv_open(s->image_hd, image_filename, flags, image_drv);
+    if (ret < 0) {
+        bdrv_delete(s->image_hd);
+        goto fail;
+    }
+    bs->total_sectors = bdrv_getlength(s->image_hd) >> 9;
+    s->cluster_size = ADD_COW_CLUSTER_SIZE;
+    sector_per_byte = SECTORS_PER_CLUSTER * 8;
+    s->bitmap_size =
+        (bs->total_sectors + sector_per_byte - 1) / sector_per_byte;
+    writethrough = ((flags & BDRV_O_CACHE_WB) == 0);
+    s->bitmap_cache =
+        add_cow_cache_create(bs, ADD_COW_CACHE_SIZE, writethrough);
+
+    qemu_co_mutex_init(&s->lock);
+    return 0;
+fail:
+    if (s->bitmap_cache) {
+        add_cow_cache_destroy(bs, s->bitmap_cache);
+    }
+    return ret;
+}
+
+static void add_cow_close(BlockDriverState *bs)
+{
+    BDRVAddCowState *s = bs->opaque;
+    add_cow_cache_destroy(bs, s->bitmap_cache);
+    bdrv_delete(s->image_hd);
+}
+
+static bool is_allocated(BlockDriverState *bs, int64_t sector_num)
+{
+    BDRVAddCowState *s  = bs->opaque;
+    int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER;
+    uint8_t *table      = NULL;
+    int ret = add_cow_cache_get(bs, s->bitmap_cache,
+        sector_num, (void **)&table);
+
+    if (ret < 0) {
+        return ret;
+    }
+    return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE]
+        & (1 << (cluster_num % 8));
+}
+
+static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, int *num_same)
+{
+    int changed;
+
+    if (nb_sectors == 0) {
+        *num_same = 0;
+        return 0;
+    }
+    changed = is_allocated(bs, sector_num);
+
+    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
+        if (is_allocated(bs, sector_num + *num_same) != changed) {
+            break;
+        }
+    }
+    return changed;
+}
+
+static coroutine_fn int add_cow_co_readv(BlockDriverState *bs,
+    int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov)
+{
+    BDRVAddCowState *s  = bs->opaque;
+    int cur_nr_sectors;
+    uint64_t bytes_done = 0;
+    QEMUIOVector hd_qiov;
+    int n, ret = 0;
+
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    qemu_co_mutex_lock(&s->lock);
+    while (remaining_sectors != 0) {
+        cur_nr_sectors = remaining_sectors;
+        if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors, &n)) {
+            cur_nr_sectors = n;
+            qemu_iovec_reset(&hd_qiov);
+            qemu_iovec_copy(&hd_qiov, qiov, bytes_done,
+                            cur_nr_sectors * BDRV_SECTOR_SIZE);
+            qemu_co_mutex_unlock(&s->lock);
+            ret = bdrv_co_readv(s->image_hd, sector_num, n, &hd_qiov);
+            qemu_co_mutex_lock(&s->lock);
+            if (ret < 0) {
+                goto fail;
+            }
+        } else {
+            cur_nr_sectors = n;
+            if (bs->backing_hd) {
+                qemu_iovec_reset(&hd_qiov);
+                qemu_iovec_copy(&hd_qiov, qiov, bytes_done,
+                            cur_nr_sectors * BDRV_SECTOR_SIZE);
+                qemu_co_mutex_unlock(&s->lock);
+                ret = bdrv_co_readv(bs->backing_hd, sector_num,
+                                    n, &hd_qiov);
+                qemu_co_mutex_lock(&s->lock);
+                if (ret < 0) {
+                    goto fail;
+                }
+            } else {
+                qemu_iovec_memset(&hd_qiov, 0,
+                    BDRV_SECTOR_SIZE * cur_nr_sectors);
+            }
+        }
+        remaining_sectors -= cur_nr_sectors;
+        sector_num += cur_nr_sectors;
+        bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE;
+    }
+fail:
+    qemu_co_mutex_unlock(&s->lock);
+    qemu_iovec_destroy(&hd_qiov);
+    return ret;
+}
+
+static int coroutine_fn copy_sectors(BlockDriverState *bs,
+                                     int n_start, int n_end)
+{
+    BDRVAddCowState *s = bs->opaque;
+    QEMUIOVector qiov;
+    struct iovec iov;
+    int n, ret;
+
+    n = n_end - n_start;
+    if (n <= 0) {
+        return 0;
+    }
+
+    iov.iov_len = n * BDRV_SECTOR_SIZE;
+    iov.iov_base = qemu_blockalign(bs, iov.iov_len);
+
+    qemu_iovec_init_external(&qiov, &iov, 1);
+
+    ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov);
+    if (ret < 0) {
+        goto out;
+    }
+    ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = 0;
+out:
+    qemu_vfree(iov.iov_base);
+    return ret;
+}
+
+static coroutine_fn int add_cow_co_writev(BlockDriverState *bs,
+        int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int ret = 0, i;
+    QEMUIOVector hd_qiov;
+    uint8_t *table;
+    bool changed = false;
+
+    qemu_co_mutex_lock(&s->lock);
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    ret = bdrv_co_writev(s->image_hd,
+                     sector_num,
+                     remaining_sectors, qiov);
+
+    if (ret < 0) {
+        goto fail;
+    }
+    /* Copy content of unmodified sectors */
+    if (!is_cluster_head(sector_num) && !is_allocated(bs, sector_num)) {
+        ret = copy_sectors(bs, sector_num & ~(SECTORS_PER_CLUSTER - 1),
+            sector_num);
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    if (!is_cluster_tail(sector_num + remaining_sectors - 1)
+        && !is_allocated(bs, sector_num + remaining_sectors - 1)) {
+        ret = copy_sectors(bs, sector_num + remaining_sectors,
+            ((sector_num + remaining_sectors) | (SECTORS_PER_CLUSTER - 1)) + 1);
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    for (i = sector_num / SECTORS_PER_CLUSTER;
+        i <= (sector_num + remaining_sectors - 1) / SECTORS_PER_CLUSTER;
+        i++) {
+        ret = add_cow_cache_get(bs, s->bitmap_cache,
+            i * SECTORS_PER_CLUSTER, (void **)&table);
+        if (ret < 0) {
+            return ret;
+        }
+        if ((table[i / 8] & (1 << (i % 8))) == 0) {
+            table[i / 8] |= (1 << (i % 8));
+            changed = true;
+            add_cow_cache_entry_mark_dirty(s->bitmap_cache, table);
+        }
+
+    }
+    ret = 0;
+fail:
+    if (changed) {
+        ret = add_cow_cache_flush(bs, s->bitmap_cache);
+    }
+    qemu_co_mutex_unlock(&s->lock);
+    qemu_iovec_destroy(&hd_qiov);
+    return ret;
+}
+
+static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int sector_per_byte = SECTORS_PER_CLUSTER * 8;
+    int ret;
+    int64_t bitmap_size =
+        (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte;
+
+    ret = bdrv_truncate(bs->file,
+        ADD_COW_BITMAP_POS + bitmap_size);
+    if (ret < 0) {
+        return ret;
+    }
+    bdrv_truncate(s->image_hd, size);
+    return 0;
+}
+
+static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int ret;
+
+    qemu_co_mutex_lock(&s->lock);
+    ret = add_cow_cache_flush(bs, s->bitmap_cache);
+    qemu_co_mutex_unlock(&s->lock);
+    if (ret < 0) {
+        return ret;
+    }
+
+    return bdrv_co_flush(bs->file);
+}
+
+static QEMUOptionParameter add_cow_create_options[] = {
+    {
+        .name = BLOCK_OPT_SIZE,
+        .type = OPT_SIZE,
+        .help = "Virtual disk size"
+    },
+    {
+        .name = BLOCK_OPT_BACKING_FILE,
+        .type = OPT_STRING,
+        .help = "File name of a base image"
+    },
+    {
+        .name = BLOCK_OPT_IMAGE_FILE,
+        .type = OPT_STRING,
+        .help = "File name of a image file"
+    },
+    {
+        .name = BLOCK_OPT_BACKING_FMT,
+        .type = OPT_STRING,
+        .help = "Image format of the base image"
+    },
+    { NULL }
+};
+
+static BlockDriver bdrv_add_cow = {
+    .format_name                = "add-cow",
+    .instance_size              = sizeof(BDRVAddCowState),
+    .bdrv_probe                 = add_cow_probe,
+    .bdrv_open                  = add_cow_open,
+    .bdrv_close                 = add_cow_close,
+    .bdrv_create                = add_cow_create,
+    .bdrv_co_readv              = add_cow_co_readv,
+    .bdrv_co_writev             = add_cow_co_writev,
+    .bdrv_truncate              = bdrv_add_cow_truncate,
+    .bdrv_co_is_allocated       = add_cow_is_allocated,
+
+    .create_options             = add_cow_create_options,
+    .bdrv_co_flush_to_disk      = add_cow_co_flush,
+};
+
+static void bdrv_add_cow_init(void)
+{
+    bdrv_register(&bdrv_add_cow);
+}
+
+block_init(bdrv_add_cow_init);
diff --git a/block/add-cow.h b/block/add-cow.h
new file mode 100644
index 0000000..f83f8e7
--- /dev/null
+++ b/block/add-cow.h
@@ -0,0 +1,95 @@
+/*
+ * QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ *  Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#ifndef BLOCK_ADD_COW_H
+#define BLOCK_ADD_COW_H
+
+enum {
+    ADD_COW_F_BACKING_FILE      = 0x01,
+    ADD_COW_F_BACKING_FORMAT_NO_PROBE = 0x02,
+    ADD_COW_FEATURE_MASK        = ADD_COW_F_BACKING_FILE |
+                       ADD_COW_F_BACKING_FORMAT_NO_PROBE,
+
+    ADD_COW_MAGIC = (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \
+                    ((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \
+                    ((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \
+                    ((uint64_t)'W' << 8) | 0xFF),
+    ADD_COW_VERSION             = 1,
+    ADD_COW_FILE_LEN            = 1024,
+    ADD_COW_CACHE_SIZE          = 16,
+    ADD_COW_CACHE_ENTRY_SIZE    = 65536,
+    ADD_COW_CLUSTER_SIZE        = 65536,
+    SECTORS_PER_CLUSTER         = (ADD_COW_CLUSTER_SIZE / BDRV_SECTOR_SIZE),
+    ADD_COW_BITMAP_POS          = 4096
+};
+
+typedef struct AddCowHeader {
+    uint64_t        magic;
+    uint32_t        version;
+    uint32_t        backing_filename_offset;
+    uint32_t        backing_filename_size;
+    uint32_t        image_filename_offset;
+    uint32_t        image_filename_size;
+    uint64_t        features;
+} QEMU_PACKED AddCowHeader;
+
+typedef struct AddCowCachedTable {
+    void    *table;
+    int64_t offset;
+    bool    dirty;
+    int     cache_hits;
+} AddCowCachedTable;
+
+typedef struct AddCowCache {
+    AddCowCachedTable       *entries;
+    int                     entry_size;
+    int                     size;
+    bool                    writethrough;
+} AddCowCache;
+
+typedef struct BDRVAddCowState {
+    BlockDriverState    *image_hd;
+    CoMutex             lock;
+    int                 cluster_size;
+    AddCowCache         *bitmap_cache;
+    uint64_t            bitmap_size;
+    AddCowHeader        header;
+} BDRVAddCowState;
+
+/* Convert sector_num to offset in bitmap */
+static inline int64_t offset_in_bitmap(int64_t sector_num)
+{
+    int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER;
+    return cluster_num / 8;
+}
+
+static inline bool is_cluster_head(int64_t sector_num)
+{
+    return sector_num % SECTORS_PER_CLUSTER == 0;
+}
+
+static inline bool is_cluster_tail(int64_t sector_num)
+{
+    return (sector_num + 1) % SECTORS_PER_CLUSTER == 0;
+}
+
+AddCowCache *add_cow_cache_create(BlockDriverState *bs, int num_tables,
+    bool writethrough);
+int add_cow_cache_destroy(BlockDriverState *bs, AddCowCache *c);
+void add_cow_cache_entry_mark_dirty(AddCowCache *c, void *table);
+int add_cow_cache_get(BlockDriverState *bs, AddCowCache *c, uint64_t offset,
+    void **table);
+int add_cow_cache_flush(BlockDriverState *bs, AddCowCache *c);
+bool add_cow_cache_set_writethrough(BlockDriverState *bs, AddCowCache *c,
+    bool enable);
+#endif
diff --git a/block_int.h b/block_int.h
index 3d4abc6..a0be58c 100644
--- a/block_int.h
+++ b/block_int.h
@@ -51,6 +51,7 @@
 #define BLOCK_OPT_PREALLOC      "preallocation"
 #define BLOCK_OPT_SUBFMT        "subformat"
 #define BLOCK_OPT_COMPAT_LEVEL  "compat"
+#define BLOCK_OPT_IMAGE_FILE    "image_file"
 
 typedef struct BdrvTrackedRequest BdrvTrackedRequest;
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 2/6] block: make some functions public Dong Xu Wang
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 3/6] add-cow file format Dong Xu Wang
@ 2012-06-13 14:36 ` Dong Xu Wang
  2012-06-14 10:51   ` Kevin Wolf
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev Dong Xu Wang
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-13 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

add-cow file can't live alone, must together will image_file and backing_file.
If we implement qemu-img convert operation for add-cow file format, we must
create image_file and backing_file manually, that will be confused for users,
so we just ignore add-cow format while doing converting.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 qemu-img.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index c8a70ff..0bce61c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -756,6 +756,12 @@ static int img_convert(int argc, char **argv)
         }
     }
 
+    if (strcmp(out_fmt, "add-cow") == 0) {
+        error_report("Convert operation does not support %s format.", out_fmt);
+        ret = -1;
+        goto out;
+    }
+
     /* Find driver and parse its options */
     drv = bdrv_find_format(out_fmt);
     if (!drv) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
                   ` (2 preceding siblings ...)
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert Dong Xu Wang
@ 2012-06-13 14:36 ` Dong Xu Wang
  2012-06-14 10:59   ` Kevin Wolf
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 6/6] add-cow: support qemu-iotests Dong Xu Wang
  2012-06-13 15:10 ` [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Eric Blake
  5 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-13 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

add-cow will let raw file support snapshot_blkdev indirectly.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 blockdev.c              |   31 +++++++++++++++++++++++++++----
 docs/live-block-ops.txt |   10 +++++++++-
 2 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 622ecba..2d89e5e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -783,15 +783,38 @@ void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
 
         /* create new image w/backing file */
         if (mode != NEW_IMAGE_MODE_EXISTING) {
-            ret = bdrv_img_create(new_image_file, format,
+            if (strcmp(format, "add-cow")) {
+                ret = bdrv_img_create(new_image_file, format,
                                   states->old_bs->filename,
                                   states->old_bs->drv->format_name,
                                   NULL, -1, flags);
-            if (ret) {
-                error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
-                goto delete_and_fail;
+            } else {
+                char image_file[1024];
+                char option[1024];
+                uint64_t size;
+
+                bdrv_get_geometry(states->old_bs, &size);
+                size *= BDRV_SECTOR_SIZE;
+
+                sprintf(image_file, "%s.raw", new_image_file);
+
+                ret = bdrv_img_create(image_file, "raw", NULL,
+                                      NULL, NULL, size, flags);
+                if (ret) {
+                    error_set(errp, QERR_UNDEFINED_ERROR);
+                    return;
+                }
+                sprintf(option, "image_file=%s.raw", new_image_file);
+                ret = bdrv_img_create(new_image_file, format,
+                                      states->old_bs->filename,
+                                      states->old_bs->drv->format_name,
+                                      option, -1, flags);
             }
         }
+        if (ret) {
+            error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
+            goto delete_and_fail;
+        }
 
         /* We will manually add the backing_hd field to the bs later */
         states->new_bs = bdrv_new("");
diff --git a/docs/live-block-ops.txt b/docs/live-block-ops.txt
index a257087..c97344b 100644
--- a/docs/live-block-ops.txt
+++ b/docs/live-block-ops.txt
@@ -2,7 +2,8 @@ LIVE BLOCK OPERATIONS
 =====================
 
 High level description of live block operations. Note these are not
-supported for use with the raw format at the moment.
+supported for use with the raw format at the moment, but we can use
+add-cow as metadata to suport raw format.
 
 Snapshot live merge
 ===================
@@ -56,3 +57,10 @@ into that image. Example:
 (qemu) block_stream ide0-hd0
 
 
+
+Raw is not supported, but we can use add-cow in the 1st step:
+
+(qemu) snapshot_blkdev ide0-hd0 /new-path/disk.img add-cow
+
+It will create a raw file named disk.img.raw, with the same virtual size of
+ide0-hd0 first, and then create disk.img.
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [Qemu-devel] [PATCH 6/6] add-cow: support qemu-iotests
  2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
                   ` (3 preceding siblings ...)
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev Dong Xu Wang
@ 2012-06-13 14:36 ` Dong Xu Wang
  2012-06-13 15:10 ` [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Eric Blake
  5 siblings, 0 replies; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-13 14:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

add qemu-iotests support for add-cow file format. add-cow need not support
convert operation, so not all tests cases related to backing_file are involved.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 tests/qemu-iotests/017       |    2 +-
 tests/qemu-iotests/020       |    2 +-
 tests/qemu-iotests/check     |    4 ++--
 tests/qemu-iotests/common    |    6 ++++++
 tests/qemu-iotests/common.rc |   19 +++++++++++++++++++
 5 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/017 b/tests/qemu-iotests/017
index 66951eb..d31432f 100755
--- a/tests/qemu-iotests/017
+++ b/tests/qemu-iotests/017
@@ -40,7 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.pattern
 
 # Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 2fb0ff8..3dbb495 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -42,7 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.pattern
 
 # Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 432732c..cab7bbc 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -243,7 +243,7 @@ do
 		echo " - no qualified output"
 		err=true
 	    else
-		if diff -w $seq.out $tmp.out >/dev/null 2>&1
+               if diff -w -I "^Formatting" $seq.out $tmp.out >/dev/null 2>&1
 		then
 		    echo ""
 		    if $err
@@ -255,7 +255,7 @@ do
 		else
 		    echo " - output mismatch (see $seq.out.bad)"
 		    mv $tmp.out $seq.out.bad
-		    $diff -w $seq.out $seq.out.bad
+                   $diff -w -I "^Formatting" $seq.out $seq.out.bad
 		    err=true
 		fi
 	    fi
diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index eeb70cb..c80d18c 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -127,6 +127,7 @@ common options
 check options
     -raw                test raw (default)
     -cow                test cow
+    -add-cow            test add-cow
     -qcow               test qcow
     -qcow2              test qcow2
     -qed                test qed
@@ -162,6 +163,11 @@ testlist options
 	    xpand=false
 	    ;;
 
+    -add-cow)
+        IMGFMT=add-cow
+        xpand=false
+       ;;
+
 	-qcow)
 	    IMGFMT=qcow
 	    xpand=false
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index e535874..93595c3 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -87,6 +87,18 @@ _make_test_img()
     fi
     if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
         optstr=$(_optstr_add "$optstr" "cluster_size=$CLUSTER_SIZE")
+    elif [ "$IMGFMT" = "add-cow" ]; then
+        local BACKING="$TEST_IMG"".qcow2"
+        local IMG="$TEST_IMG"".raw"
+        if [ "$1" = "-b" ]; then
+            IMG="$IMG"".b"
+            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+            extra_img_options="-o image_file=$IMG $extra_img_options"
+        else
+            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+            $QEMU_IMG create -f qcow2 $BACKING $image_size>/dev/null
+            extra_img_options="-o backing_file=$BACKING,image_file=$IMG"
+        fi
     fi
 
     if [ -n "$optstr" ]; then
@@ -114,6 +126,13 @@ _cleanup_test_img()
             rm -f $TEST_DIR/t.$IMGFMT
             rm -f $TEST_DIR/t.$IMGFMT.orig
             rm -f $TEST_DIR/t.$IMGFMT.base
+            if [ "$IMGFMT" = "add-cow" ]; then
+                rm -f $TEST_DIR/t.$IMGFMT.qcow2
+                rm -f $TEST_DIR/t.$IMGFMT.raw
+                rm -f $TEST_DIR/t.$IMGFMT.raw.b
+                rm -f $TEST_DIR/t.$IMGFMT.ct.qcow2
+                rm -f $TEST_DIR/t.$IMGFMT.ct.raw
+            fi
             ;;
 
         rbd)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format
  2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
                   ` (4 preceding siblings ...)
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 6/6] add-cow: support qemu-iotests Dong Xu Wang
@ 2012-06-13 15:10 ` Eric Blake
  2012-06-14  3:06   ` Dong Xu Wang
  5 siblings, 1 reply; 26+ messages in thread
From: Eric Blake @ 2012-06-13 15:10 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: kwolf, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7990 bytes --]

On 06/13/2012 08:36 AM, Dong Xu Wang wrote:
> Introduce a new file format:add-cow. The usage can be found at this patch.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |   87 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 87 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..e077fc2
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,87 @@
> +== General ==
> +
> +Raw file format does not support backing_file and copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file.  Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +While using add-cow, procedures may like this:
> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G

Make sure we also support a raw file larger than the backing file.

Does it matter whether the raw file starts life sparse?

> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow

How does this interact with live snapshots/live disk mirroring?  Is this
something where I have to call 'block-stream' to pull data into the new
raw file on-demand?  I take it that test.add-cow is required until the
block-stream completes?  Is there a way, while qemu is still running,
but after the block-stream is complete, to reassociate the drive with
the actual raw file instead of the add-cow, or does the add-cow have to
remain around as long as the qemu process is still running?

> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.

Okay, but different than network byte order, which means we can't use
htonl() and friends to convert host bytes into the proper format.
<endian.h> is not (yet) standardized, although there is talk of adding
it to the next version of POSIX, in which case htole32() and friends
would be guaranteed.

> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff")
> +
> +            8 -  11:    version
> +                        Version number (only valid value is 1 now)
> +
> +            12 - 15:    backing_filename_offset
> +                        Offset in the add-cow file at which the backing file name
> +                        is stored (NB: The string is not null terminated). 0 if the
> +                        image doesn't have a backing file.

Mention that if this is not 0, then it must be between 36 and 4094 (a
file name must be at least 1 byte).  What are the semantics if the
filename is relative?

> +
> +            16 - 19:    backing_filename_size
> +                        Length of the backing file name in bytes. Undefined if the
> +                        image doesn't have a backing file.

Better to require 0 if backing_filename_offset is 0, than to leave this
field undefined; also if backing_filename_offset is non-zero, then this
must be non-zero.  Must be less than 4096-36 to fit in the reserved part
of the header.

> +
> +            20 - 23:    image_filename_offset
> +                        Offset in the add-cow file at which the image_file name
> +                        is stored (NB: The string is not null terminated).

Mention that this must be between 36 and 4094 (a file name must be at
least 1 byte). What are the semantics if the filename is relative?

> +
> +            24 - 27:    image_filename_size
> +                        Length of the image_file name in bytes.

If backing_filename_offset is non-zero, then this must be non-zero.
Must be less than 4096-36 to fit in the reserved part of the header.

May image_filename and backing_filename overlap (possible if one is a
suffix of the other)?  Are there any constraints to prevent infinite
loops, such as forbidding backing_filename and image_filename from
resolving either to the same file or to the add-cow file?

> +
> +            28 - 35:    features
> +                        Currently only 2 feature bits are used:
> +                        Feature bits:
> +                            The image uses a backing file:
> +                            * ADD_COW_F_BACKING_FILE = 0x01.
> +                            The backing file's format is raw:
> +                            * ADD_COW_F_BACKING_FORMAT_NO_PROBE = 0x02.

Should this follow the qcow2v3 proposal of splitting into mandatory vs.
optional feature bits?

I agree that ADD_COW_F_BACKING_FORMAT_NO_PROBE is sufficient to avoid
security implications, but do we want the extra flexibility of
specifying the backing format file format rather than just requiring
probes on all but raw?

> +
> +== Reserved ==
> +
> +    Byte    36 - 4095:  Reserved field:
> +                        It is used to make sure COW bitmap field starts at the
> +                        4096th byte, backing_file name and image_file name will
> +                        be stored here.

Do we want to keep a fixed-size header, or should we be planning on the
possibility of future extensions requiring enough other header
extensions that a variable-sized header would be wiser?  That is, I'm
fine with requiring that the header be a multiple of 4k, but maybe it
would make sense to have a mandatory header field that states how many
header pages are present before the COW bitmap begins.  In the first
round of implementation, this header field can be required to be 1 (that
is, for now, we require exactly 4k header), but having the field would
let us change in the future to a design with an 8k header to hold more
metadata as needed.

> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
> +backing_file and image_file. The bitmap will track whether the sector in
> +backing_file is dirty or not.
> +
> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
> +sectors, then each bit indicates 512 * 128 = 64k bytes, So the size of bitmap is
> +calculated according to virtual size of image_file. In each byte, bit 0 to 7
> +will track the 1st to 7th cluster in sequence, bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image_file, data
> +should be loaded from backing_file while reading; if the bit is 1,  indicates the
> +related sector has been dirty, should be loaded from image_file while reading.
> +Writing to a sector causes the corresponding bit to be set to 1.

So basically an add-cow image is thin as long as at least one bit is 0,
and the add-cow wrapper can only be discarded when all bits are 1.

How do you handle the case where the raw image is not an even multiple
of cluster bytes?  That is, do bits that correspond to bytes beyond the
raw file size have to be in a certain state?

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format
  2012-06-13 15:10 ` [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Eric Blake
@ 2012-06-14  3:06   ` Dong Xu Wang
  2012-06-14 10:47     ` Kevin Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-14  3:06 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, qemu-devel

On Wed, Jun 13, 2012 at 11:10 PM, Eric Blake <eblake@redhat.com> wrote:
> On 06/13/2012 08:36 AM, Dong Xu Wang wrote:
>> Introduce a new file format:add-cow. The usage can be found at this patch.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>>  docs/specs/add-cow.txt |   87 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 87 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..e077fc2
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,87 @@
>> +== General ==
>> +
>> +Raw file format does not support backing_file and copy on write feature.
>> +The add-cow image format makes it possible to use backing files with raw
>> +image by keeping a separate .add-cow metadata file.  Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +While using add-cow, procedures may like this:
>> +(ubuntu.img is a disk image which has been installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>
> Make sure we also support a raw file larger than the backing file.
>

Okay, in this version, I just truncate raw file with the same size of
backing_file,
I will correct this in v11.

> Does it matter whether the raw file starts life sparse?

No, it can work with sparse raw file.

>
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>
> How does this interact with live snapshots/live disk mirroring?  Is this
> something where I have to call 'block-stream' to pull data into the new
> raw file on-demand?  I take it that test.add-cow is required until the
> block-stream completes?  Is there a way, while qemu is still running,
> but after the block-stream is complete, to reassociate the drive with
> the actual raw file instead of the add-cow, or does the add-cow have to
> remain around as long as the qemu process is still running?
>

I did not touch snapshots/live disk mirroring code much, so have to call
"block_stream".

Now, add-cow has to remain until qemu process is still running, I did not
implement a way to use raw file directly after the block-stream is complete.

>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------+-----------------+
>> + |     Header    |   Reserved  |    COW bitmap   |
>> + +---------------+-------------+-----------------+
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>
> Okay, but different than network byte order, which means we can't use
> htonl() and friends to convert host bytes into the proper format.
> <endian.h> is not (yet) standardized, although there is talk of adding
> it to the next version of POSIX, in which case htole32() and friends
> would be guaranteed.
>
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +
>> +    Byte    0 -  7:     magic
>> +                        add-cow magic string ("ADD_COW\xff")
>> +
>> +            8 -  11:    version
>> +                        Version number (only valid value is 1 now)
>> +
>> +            12 - 15:    backing_filename_offset
>> +                        Offset in the add-cow file at which the backing file name
>> +                        is stored (NB: The string is not null terminated). 0 if the
>> +                        image doesn't have a backing file.
>
> Mention that if this is not 0, then it must be between 36 and 4094 (a
> file name must be at least 1 byte).  What are the semantics if the
> filename is relative?

relative filename is ok, I tested it just now.

>
>> +
>> +            16 - 19:    backing_filename_size
>> +                        Length of the backing file name in bytes. Undefined if the
>> +                        image doesn't have a backing file.
>
> Better to require 0 if backing_filename_offset is 0, than to leave this
> field undefined; also if backing_filename_offset is non-zero, then this
> must be non-zero.  Must be less than 4096-36 to fit in the reserved part
> of the header.
>

Okay.

>> +
>> +            20 - 23:    image_filename_offset
>> +                        Offset in the add-cow file at which the image_file name
>> +                        is stored (NB: The string is not null terminated).
>
> Mention that this must be between 36 and 4094 (a file name must be at
> least 1 byte). What are the semantics if the filename is relative?

relative filename is ok, I tested it just now.

>
>> +
>> +            24 - 27:    image_filename_size
>> +                        Length of the image_file name in bytes.
>
> If backing_filename_offset is non-zero, then this must be non-zero.
> Must be less than 4096-36 to fit in the reserved part of the header.
>

Yes,

> May image_filename and backing_filename overlap (possible if one is a
> suffix of the other)?  Are there any constraints to prevent infinite
> loops, such as forbidding backing_filename and image_filename from
> resolving either to the same file or to the add-cow file?
>

Sorry, I should add the code that judge if image_file is valid. will fix.

>> +
>> +            28 - 35:    features
>> +                        Currently only 2 feature bits are used:
>> +                        Feature bits:
>> +                            The image uses a backing file:
>> +                            * ADD_COW_F_BACKING_FILE = 0x01.
>> +                            The backing file's format is raw:
>> +                            * ADD_COW_F_BACKING_FORMAT_NO_PROBE = 0x02.
>
> Should this follow the qcow2v3 proposal of splitting into mandatory vs.
> optional feature bits?
>
> I agree that ADD_COW_F_BACKING_FORMAT_NO_PROBE is sufficient to avoid
> security implications, but do we want the extra flexibility of
> specifying the backing format file format rather than just requiring
> probes on all but raw?

Kevin, or Stefan, can you give some comments for this? thanks.

>
>> +
>> +== Reserved ==
>> +
>> +    Byte    36 - 4095:  Reserved field:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        4096th byte, backing_file name and image_file name will
>> +                        be stored here.
>
> Do we want to keep a fixed-size header, or should we be planning on the
> possibility of future extensions requiring enough other header
> extensions that a variable-sized header would be wiser?  That is, I'm
> fine with requiring that the header be a multiple of 4k, but maybe it
> would make sense to have a mandatory header field that states how many
> header pages are present before the COW bitmap begins.  In the first
> round of implementation, this header field can be required to be 1 (that
> is, for now, we require exactly 4k header), but having the field would
> let us change in the future to a design with an 8k header to hold more
> metadata as needed.
>

Okay.

>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at the 4096th byte, stores a bitmap related to
>> +backing_file and image_file. The bitmap will track whether the sector in
>> +backing_file is dirty or not.
>> +
>> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
>> +sectors, then each bit indicates 512 * 128 = 64k bytes, So the size of bitmap is
>> +calculated according to virtual size of image_file. In each byte, bit 0 to 7
>> +will track the 1st to 7th cluster in sequence, bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image_file, data
>> +should be loaded from backing_file while reading; if the bit is 1,  indicates the
>> +related sector has been dirty, should be loaded from image_file while reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>
> So basically an add-cow image is thin as long as at least one bit is 0,
> and the add-cow wrapper can only be discarded when all bits are 1.
>
> How do you handle the case where the raw image is not an even multiple
> of cluster bytes?  That is, do bits that correspond to bytes beyond the
> raw file size have to be in a certain state?
>

Now, bits correspond to bytes beyond the raw file size are set to 0.

Really thanks for your reviewing, Eric.

> --
> Eric Blake   eblake@redhat.com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format
  2012-06-14  3:06   ` Dong Xu Wang
@ 2012-06-14 10:47     ` Kevin Wolf
  2012-06-18  2:08       ` Dong Xu Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-06-14 10:47 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Eric Blake, qemu-devel

Am 14.06.2012 05:06, schrieb Dong Xu Wang:
> On Wed, Jun 13, 2012 at 11:10 PM, Eric Blake <eblake@redhat.com> wrote:
>> On 06/13/2012 08:36 AM, Dong Xu Wang wrote:
>>> Introduce a new file format:add-cow. The usage can be found at this patch.
>>>
>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>> ---
>>>  docs/specs/add-cow.txt |   87 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 files changed, 87 insertions(+), 0 deletions(-)
>>>  create mode 100644 docs/specs/add-cow.txt

>>> +
>>> +== Header ==
>>> +
>>> +The Header is included in the first bytes:
>>> +
>>> +    Byte    0 -  7:     magic
>>> +                        add-cow magic string ("ADD_COW\xff")
>>> +
>>> +            8 -  11:    version
>>> +                        Version number (only valid value is 1 now)
>>> +
>>> +            12 - 15:    backing_filename_offset
>>> +                        Offset in the add-cow file at which the backing file name
>>> +                        is stored (NB: The string is not null terminated). 0 if the
>>> +                        image doesn't have a backing file.
>>
>> Mention that if this is not 0, then it must be between 36 and 4094 (a
>> file name must be at least 1 byte).  What are the semantics if the
>> filename is relative?
> 
> relative filename is ok, I tested it just now.

I believe Eric wanted to know what a relative path means, i.e. that it's
relative to the image file rather than relative to the working directory.

>>> +
>>> +            16 - 19:    backing_filename_size
>>> +                        Length of the backing file name in bytes. Undefined if the
>>> +                        image doesn't have a backing file.
>>
>> Better to require 0 if backing_filename_offset is 0, than to leave this
>> field undefined; also if backing_filename_offset is non-zero, then this
>> must be non-zero.  Must be less than 4096-36 to fit in the reserved part
>> of the header.
>>
> 
> Okay.

Does an add-cow image without a backing file even make sense?

>>> +            28 - 35:    features
>>> +                        Currently only 2 feature bits are used:
>>> +                        Feature bits:
>>> +                            The image uses a backing file:
>>> +                            * ADD_COW_F_BACKING_FILE = 0x01.
>>> +                            The backing file's format is raw:
>>> +                            * ADD_COW_F_BACKING_FORMAT_NO_PROBE = 0x02.
>>
>> Should this follow the qcow2v3 proposal of splitting into mandatory vs.
>> optional feature bits?
>>
>> I agree that ADD_COW_F_BACKING_FORMAT_NO_PROBE is sufficient to avoid
>> security implications, but do we want the extra flexibility of
>> specifying the backing format file format rather than just requiring
>> probes on all but raw?
> 
> Kevin, or Stefan, can you give some comments for this? thanks.

I tend to agree that a format name is better than relying on probing.

Also, I think we need the same thing for image_file. add-cow is not only
useful for raw images, but also for other image format types for which
we don't support backing files.

>>> +== Reserved ==
>>> +
>>> +    Byte    36 - 4095:  Reserved field:
>>> +                        It is used to make sure COW bitmap field starts at the
>>> +                        4096th byte, backing_file name and image_file name will
>>> +                        be stored here.
>>
>> Do we want to keep a fixed-size header, or should we be planning on the
>> possibility of future extensions requiring enough other header
>> extensions that a variable-sized header would be wiser?  That is, I'm
>> fine with requiring that the header be a multiple of 4k, but maybe it
>> would make sense to have a mandatory header field that states how many
>> header pages are present before the COW bitmap begins.  In the first
>> round of implementation, this header field can be required to be 1 (that
>> is, for now, we require exactly 4k header), but having the field would
>> let us change in the future to a design with an 8k header to hold more
>> metadata as needed.

I have the impression that this simple add-cow hack is starting to get
seriously overengineered... :-)

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert Dong Xu Wang
@ 2012-06-14 10:51   ` Kevin Wolf
  2012-06-14 14:06     ` Dong Xu Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-06-14 10:51 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: qemu-devel

Am 13.06.2012 16:36, schrieb Dong Xu Wang:
> add-cow file can't live alone, must together will image_file and backing_file.
> If we implement qemu-img convert operation for add-cow file format, we must
> create image_file and backing_file manually, that will be confused for users,
> so we just ignore add-cow format while doing converting.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

NACK.

This stupid "let's drop the feature, it might confuse users" attitude is
known from Gnome, but not from qemu.

There's no technical reason to forbid it and anyone who manages to
create a valid add-cow image will also be able to specify the very same
options to a convert command. Also, having image format specific code in
qemu-img is evil.

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev Dong Xu Wang
@ 2012-06-14 10:59   ` Kevin Wolf
  2012-06-14 11:18     ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-06-14 10:59 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Paolo Bonzini, qemu-devel

Am 13.06.2012 16:36, schrieb Dong Xu Wang:
> add-cow will let raw file support snapshot_blkdev indirectly.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Paolo, what do you think about this magic?

I think I can see its use especially for HMP because it's quite
convenient, but on the other hand it assumes a fixed image path for the
new raw image. This isn't going to work for block devices, for example.

If we don't do it this way, we need to allow passing image creation
options to the snapshotting command so that you can pass a precreated
image file.

Kevin

> ---
>  blockdev.c              |   31 +++++++++++++++++++++++++++----
>  docs/live-block-ops.txt |   10 +++++++++-
>  2 files changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 622ecba..2d89e5e 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -783,15 +783,38 @@ void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
>  
>          /* create new image w/backing file */
>          if (mode != NEW_IMAGE_MODE_EXISTING) {
> -            ret = bdrv_img_create(new_image_file, format,
> +            if (strcmp(format, "add-cow")) {
> +                ret = bdrv_img_create(new_image_file, format,
>                                    states->old_bs->filename,
>                                    states->old_bs->drv->format_name,
>                                    NULL, -1, flags);
> -            if (ret) {
> -                error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
> -                goto delete_and_fail;
> +            } else {
> +                char image_file[1024];
> +                char option[1024];
> +                uint64_t size;
> +
> +                bdrv_get_geometry(states->old_bs, &size);
> +                size *= BDRV_SECTOR_SIZE;
> +
> +                sprintf(image_file, "%s.raw", new_image_file);
> +
> +                ret = bdrv_img_create(image_file, "raw", NULL,
> +                                      NULL, NULL, size, flags);
> +                if (ret) {
> +                    error_set(errp, QERR_UNDEFINED_ERROR);
> +                    return;
> +                }
> +                sprintf(option, "image_file=%s.raw", new_image_file);
> +                ret = bdrv_img_create(new_image_file, format,
> +                                      states->old_bs->filename,
> +                                      states->old_bs->drv->format_name,
> +                                      option, -1, flags);
>              }
>          }
> +        if (ret) {
> +            error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
> +            goto delete_and_fail;
> +        }
>  
>          /* We will manually add the backing_hd field to the bs later */
>          states->new_bs = bdrv_new("");
> diff --git a/docs/live-block-ops.txt b/docs/live-block-ops.txt
> index a257087..c97344b 100644
> --- a/docs/live-block-ops.txt
> +++ b/docs/live-block-ops.txt
> @@ -2,7 +2,8 @@ LIVE BLOCK OPERATIONS
>  =====================
>  
>  High level description of live block operations. Note these are not
> -supported for use with the raw format at the moment.
> +supported for use with the raw format at the moment, but we can use
> +add-cow as metadata to suport raw format.
>  
>  Snapshot live merge
>  ===================
> @@ -56,3 +57,10 @@ into that image. Example:
>  (qemu) block_stream ide0-hd0
>  
>  
> +
> +Raw is not supported, but we can use add-cow in the 1st step:
> +
> +(qemu) snapshot_blkdev ide0-hd0 /new-path/disk.img add-cow
> +
> +It will create a raw file named disk.img.raw, with the same virtual size of
> +ide0-hd0 first, and then create disk.img.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] add-cow file format
  2012-06-13 14:36 ` [Qemu-devel] [PATCH 3/6] add-cow file format Dong Xu Wang
@ 2012-06-14 11:13   ` Paolo Bonzini
  2012-06-18  2:08     ` Dong Xu Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2012-06-14 11:13 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: kwolf, qemu-devel

I just took a quick look at the flush code.

Il 13/06/2012 16:36, Dong Xu Wang ha scritto:
> 
> +bool add_cow_cache_set_writethrough(BlockDriverState *bs, AddCowCache *c,
> +    bool enable)
> +{
> +    bool old = c->writethrough;
> +
> +    if (!old && enable) {
> +        add_cow_cache_flush(bs, c);
> +    }
> +
> +    c->writethrough = enable;
> +    return old;
> +}

Writethrough mode should not be needed anymore in 1.2 if the new
implementation of writethrough is added.

And anyway...

> +    if (changed) {
> +        ret = add_cow_cache_flush(bs, s->bitmap_cache);
> +    }

... here you're treating the cache essentially as writethrough.  Is this
flush necessary?

> +    qemu_co_mutex_unlock(&s->lock);
> +    qemu_iovec_destroy(&hd_qiov);
> +    return ret;
> +}
> +
> +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size)
> +{
> +    BDRVAddCowState *s = bs->opaque;
> +    int sector_per_byte = SECTORS_PER_CLUSTER * 8;
> +    int ret;
> +    int64_t bitmap_size =
> +        (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte;
> +
> +    ret = bdrv_truncate(bs->file,
> +        ADD_COW_BITMAP_POS + bitmap_size);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +    bdrv_truncate(s->image_hd, size);
> +    return 0;
> +}
> +
> +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
> +{
> +    BDRVAddCowState *s = bs->opaque;
> +    int ret;
> +
> +    qemu_co_mutex_lock(&s->lock);
> +    ret = add_cow_cache_flush(bs, s->bitmap_cache);
> +    qemu_co_mutex_unlock(&s->lock);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    return bdrv_co_flush(bs->file);

Flushing bs->file here is not needed anymore...

> +}
> +
> +static QEMUOptionParameter add_cow_create_options[] = {
> +    {
> +        .name = BLOCK_OPT_SIZE,
> +        .type = OPT_SIZE,
> +        .help = "Virtual disk size"
> +    },
> +    {
> +        .name = BLOCK_OPT_BACKING_FILE,
> +        .type = OPT_STRING,
> +        .help = "File name of a base image"
> +    },
> +    {
> +        .name = BLOCK_OPT_IMAGE_FILE,
> +        .type = OPT_STRING,
> +        .help = "File name of a image file"
> +    },
> +    {
> +        .name = BLOCK_OPT_BACKING_FMT,
> +        .type = OPT_STRING,
> +        .help = "Image format of the base image"
> +    },
> +    { NULL }
> +};
> +
> +static BlockDriver bdrv_add_cow = {
> +    .format_name                = "add-cow",
> +    .instance_size              = sizeof(BDRVAddCowState),
> +    .bdrv_probe                 = add_cow_probe,
> +    .bdrv_open                  = add_cow_open,
> +    .bdrv_close                 = add_cow_close,
> +    .bdrv_create                = add_cow_create,
> +    .bdrv_co_readv              = add_cow_co_readv,
> +    .bdrv_co_writev             = add_cow_co_writev,
> +    .bdrv_truncate              = bdrv_add_cow_truncate,
> +    .bdrv_co_is_allocated       = add_cow_is_allocated,
> +
> +    .create_options             = add_cow_create_options,
> +    .bdrv_co_flush_to_disk      = add_cow_co_flush,

... and this should be bdrv_co_flush_to_os.

Paolo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-06-14 10:59   ` Kevin Wolf
@ 2012-06-14 11:18     ` Paolo Bonzini
  2012-06-14 11:33       ` Kevin Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2012-06-14 11:18 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Dong Xu Wang, qemu-devel

Il 14/06/2012 12:59, Kevin Wolf ha scritto:
> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>> add-cow will let raw file support snapshot_blkdev indirectly.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> 
> Paolo, what do you think about this magic?

Besides the obvious layering violation (it would be better to add a new
method bdrv_ext_snapshot_create perhaps) I don't see very much a problem
in it.  Passing image creation options sounds like a good idea that we
can add in the future as an extension.

But honestly, I don't really see the point of add-cow in general...  The
raw image is anyway not usable without a pass through qemu-io convert,
and mirroring will also allow collapsing an image to raw (with the
persistent dirty bitmap playing the role of the add-cow metadata).

> I think I can see its use especially for HMP because it's quite
> convenient, but on the other hand it assumes a fixed image path for the
> new raw image. This isn't going to work for block devices, for example.

True, but then probably you would use mode='existing', because you need
to pre-create the logical volume.

> If we don't do it this way, we need to allow passing image creation
> options to the snapshotting command so that you can pass a precreated
> image file.

This sounds like a useful extension anyway, except that passing an
unstructured string for image creation options is ugly...  Perhaps we
can base a better implementation of options on Laszlo's QemuOpts visitor.

Paolo
> 
> Kevin
> 
>> ---
>>  blockdev.c              |   31 +++++++++++++++++++++++++++----
>>  docs/live-block-ops.txt |   10 +++++++++-
>>  2 files changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index 622ecba..2d89e5e 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -783,15 +783,38 @@ void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
>>  
>>          /* create new image w/backing file */
>>          if (mode != NEW_IMAGE_MODE_EXISTING) {
>> -            ret = bdrv_img_create(new_image_file, format,
>> +            if (strcmp(format, "add-cow")) {
>> +                ret = bdrv_img_create(new_image_file, format,
>>                                    states->old_bs->filename,
>>                                    states->old_bs->drv->format_name,
>>                                    NULL, -1, flags);
>> -            if (ret) {
>> -                error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
>> -                goto delete_and_fail;
>> +            } else {
>> +                char image_file[1024];
>> +                char option[1024];
>> +                uint64_t size;
>> +
>> +                bdrv_get_geometry(states->old_bs, &size);
>> +                size *= BDRV_SECTOR_SIZE;
>> +
>> +                sprintf(image_file, "%s.raw", new_image_file);
>> +
>> +                ret = bdrv_img_create(image_file, "raw", NULL,
>> +                                      NULL, NULL, size, flags);
>> +                if (ret) {
>> +                    error_set(errp, QERR_UNDEFINED_ERROR);
>> +                    return;
>> +                }
>> +                sprintf(option, "image_file=%s.raw", new_image_file);
>> +                ret = bdrv_img_create(new_image_file, format,
>> +                                      states->old_bs->filename,
>> +                                      states->old_bs->drv->format_name,
>> +                                      option, -1, flags);
>>              }
>>          }
>> +        if (ret) {
>> +            error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
>> +            goto delete_and_fail;
>> +        }
>>  
>>          /* We will manually add the backing_hd field to the bs later */
>>          states->new_bs = bdrv_new("");
>> diff --git a/docs/live-block-ops.txt b/docs/live-block-ops.txt
>> index a257087..c97344b 100644
>> --- a/docs/live-block-ops.txt
>> +++ b/docs/live-block-ops.txt
>> @@ -2,7 +2,8 @@ LIVE BLOCK OPERATIONS
>>  =====================
>>  
>>  High level description of live block operations. Note these are not
>> -supported for use with the raw format at the moment.
>> +supported for use with the raw format at the moment, but we can use
>> +add-cow as metadata to suport raw format.
>>  
>>  Snapshot live merge
>>  ===================
>> @@ -56,3 +57,10 @@ into that image. Example:
>>  (qemu) block_stream ide0-hd0
>>  
>>  
>> +
>> +Raw is not supported, but we can use add-cow in the 1st step:
>> +
>> +(qemu) snapshot_blkdev ide0-hd0 /new-path/disk.img add-cow
>> +
>> +It will create a raw file named disk.img.raw, with the same virtual size of
>> +ide0-hd0 first, and then create disk.img.
> 
> 
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-06-14 11:18     ` Paolo Bonzini
@ 2012-06-14 11:33       ` Kevin Wolf
  2012-07-19  2:20         ` Dong Xu Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-06-14 11:33 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Dong Xu Wang, qemu-devel

Am 14.06.2012 13:18, schrieb Paolo Bonzini:
> Il 14/06/2012 12:59, Kevin Wolf ha scritto:
>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>> add-cow will let raw file support snapshot_blkdev indirectly.
>>>
>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>
>> Paolo, what do you think about this magic?
> 
> Besides the obvious layering violation (it would be better to add a new
> method bdrv_ext_snapshot_create perhaps) I don't see very much a problem
> in it.  Passing image creation options sounds like a good idea that we
> can add in the future as an extension.
> 
> But honestly, I don't really see the point of add-cow in general...  The
> raw image is anyway not usable without a pass through qemu-io convert,
> and mirroring will also allow collapsing an image to raw (with the
> persistent dirty bitmap playing the role of the add-cow metadata).

Well, the idea was that you stream into add-cow and once the streaming
has completed, you can just drop the bitmap.

There might be some overlap with mirroring, though when we discussed
introducing add-cow, mirrors were not yet on the table. One difference
that remains is that with streaming the guest only writes to the target
image and can't have any problem with convergence.

>> I think I can see its use especially for HMP because it's quite
>> convenient, but on the other hand it assumes a fixed image path for the
>> new raw image. This isn't going to work for block devices, for example.
> 
> True, but then probably you would use mode='existing', because you need
> to pre-create the logical volume.

I think it might be convenient to have the raw volume precreated (you
need to do that anyway), but create the COW bitmap only during the
snapshot command. But yeah, not really important.

>> If we don't do it this way, we need to allow passing image creation
>> options to the snapshotting command so that you can pass a precreated
>> image file.
> 
> This sounds like a useful extension anyway, except that passing an
> unstructured string for image creation options is ugly...  Perhaps we
> can base a better implementation of options on Laszlo's QemuOpts visitor.

Yes, I wouldn't want to use a string here, we should use something
structured. Image creation still uses the old-style options instead of
QemuOpts, but maybe this is the opportunity to convert it.

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-14 10:51   ` Kevin Wolf
@ 2012-06-14 14:06     ` Dong Xu Wang
  2012-06-14 14:11       ` Kevin Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-14 14:06 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Hajnoczi; +Cc: qemu-devel

On Thu, Jun 14, 2012 at 6:51 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>> add-cow file can't live alone, must together will image_file and backing_file.
>> If we implement qemu-img convert operation for add-cow file format, we must
>> create image_file and backing_file manually, that will be confused for users,
>> so we just ignore add-cow format while doing converting.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>
> NACK.
>
> This stupid "let's drop the feature, it might confuse users" attitude is
> known from Gnome, but not from qemu.
>
> There's no technical reason to forbid it and anyone who manages to
> create a valid add-cow image will also be able to specify the very same
> options to a convert command. Also, having image format specific code in
> qemu-img is evil.
>

If I implement add-cow convert command, I am wondering which method should
I use:
1) create add-cow, and its backing_file, and its image_file,  then we
need 3 files.
2) create add-cow(with all bitmap marked to allocated), and its
image_file, then we
    need 2 files.

2) will be easier, I should let .add-cow file can live only with
image_file, without backing_file.

I think both 1) and 2) need add code to qemu-img.c. Or I will have to create
image_file automaticly in add_cow_create function.

Can you give some comments on how to implement convert? Thanks.

> Kevin
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-14 14:06     ` Dong Xu Wang
@ 2012-06-14 14:11       ` Kevin Wolf
  2012-06-14 14:17         ` Dong Xu Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-06-14 14:11 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Stefan Hajnoczi, qemu-devel

Am 14.06.2012 16:06, schrieb Dong Xu Wang:
> On Thu, Jun 14, 2012 at 6:51 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>> add-cow file can't live alone, must together will image_file and backing_file.
>>> If we implement qemu-img convert operation for add-cow file format, we must
>>> create image_file and backing_file manually, that will be confused for users,
>>> so we just ignore add-cow format while doing converting.
>>>
>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>
>> NACK.
>>
>> This stupid "let's drop the feature, it might confuse users" attitude is
>> known from Gnome, but not from qemu.
>>
>> There's no technical reason to forbid it and anyone who manages to
>> create a valid add-cow image will also be able to specify the very same
>> options to a convert command. Also, having image format specific code in
>> qemu-img is evil.
>>
> 
> If I implement add-cow convert command, I am wondering which method should
> I use:
> 1) create add-cow, and its backing_file, and its image_file,  then we
> need 3 files.
> 2) create add-cow(with all bitmap marked to allocated), and its
> image_file, then we
>     need 2 files.
> 
> 2) will be easier, I should let .add-cow file can live only with
> image_file, without backing_file.
> 
> I think both 1) and 2) need add code to qemu-img.c. Or I will have to create
> image_file automaticly in add_cow_create function.
> 
> Can you give some comments on how to implement convert? Thanks.

Just leave it alone and it will work.

qemu-img convert takes the same options as qemu-img create. So like for
any other image you specify the backing file with -b or -o backing_file,
and for add-cow images to be successfully created you also need to
specify -o image_file.

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-14 14:11       ` Kevin Wolf
@ 2012-06-14 14:17         ` Dong Xu Wang
  2012-06-14 14:24           ` Kevin Wolf
  0 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-14 14:17 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Stefan Hajnoczi, qemu-devel

On Thu, Jun 14, 2012 at 10:11 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 14.06.2012 16:06, schrieb Dong Xu Wang:
>> On Thu, Jun 14, 2012 at 6:51 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>>> add-cow file can't live alone, must together will image_file and backing_file.
>>>> If we implement qemu-img convert operation for add-cow file format, we must
>>>> create image_file and backing_file manually, that will be confused for users,
>>>> so we just ignore add-cow format while doing converting.
>>>>
>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>
>>> NACK.
>>>
>>> This stupid "let's drop the feature, it might confuse users" attitude is
>>> known from Gnome, but not from qemu.
>>>
>>> There's no technical reason to forbid it and anyone who manages to
>>> create a valid add-cow image will also be able to specify the very same
>>> options to a convert command. Also, having image format specific code in
>>> qemu-img is evil.
>>>
>>
>> If I implement add-cow convert command, I am wondering which method should
>> I use:
>> 1) create add-cow, and its backing_file, and its image_file,  then we
>> need 3 files.
>> 2) create add-cow(with all bitmap marked to allocated), and its
>> image_file, then we
>>     need 2 files.
>>
>> 2) will be easier, I should let .add-cow file can live only with
>> image_file, without backing_file.
>>
>> I think both 1) and 2) need add code to qemu-img.c. Or I will have to create
>> image_file automaticly in add_cow_create function.
>>
>> Can you give some comments on how to implement convert? Thanks.
>
> Just leave it alone and it will work.
>
> qemu-img convert takes the same options as qemu-img create. So like for
> any other image you specify the backing file with -b or -o backing_file,
> and for add-cow images to be successfully created you also need to
> specify -o image_file.
>
"-o image_file", the image_file should be precreated? Or I should created it
manually?
> Kevin
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-14 14:17         ` Dong Xu Wang
@ 2012-06-14 14:24           ` Kevin Wolf
  2012-06-14 14:26             ` Dong Xu Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-06-14 14:24 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Stefan Hajnoczi, qemu-devel

Am 14.06.2012 16:17, schrieb Dong Xu Wang:
> On Thu, Jun 14, 2012 at 10:11 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 14.06.2012 16:06, schrieb Dong Xu Wang:
>>> On Thu, Jun 14, 2012 at 6:51 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>>>> add-cow file can't live alone, must together will image_file and backing_file.
>>>>> If we implement qemu-img convert operation for add-cow file format, we must
>>>>> create image_file and backing_file manually, that will be confused for users,
>>>>> so we just ignore add-cow format while doing converting.
>>>>>
>>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>>
>>>> NACK.
>>>>
>>>> This stupid "let's drop the feature, it might confuse users" attitude is
>>>> known from Gnome, but not from qemu.
>>>>
>>>> There's no technical reason to forbid it and anyone who manages to
>>>> create a valid add-cow image will also be able to specify the very same
>>>> options to a convert command. Also, having image format specific code in
>>>> qemu-img is evil.
>>>>
>>>
>>> If I implement add-cow convert command, I am wondering which method should
>>> I use:
>>> 1) create add-cow, and its backing_file, and its image_file,  then we
>>> need 3 files.
>>> 2) create add-cow(with all bitmap marked to allocated), and its
>>> image_file, then we
>>>     need 2 files.
>>>
>>> 2) will be easier, I should let .add-cow file can live only with
>>> image_file, without backing_file.
>>>
>>> I think both 1) and 2) need add code to qemu-img.c. Or I will have to create
>>> image_file automaticly in add_cow_create function.
>>>
>>> Can you give some comments on how to implement convert? Thanks.
>>
>> Just leave it alone and it will work.
>>
>> qemu-img convert takes the same options as qemu-img create. So like for
>> any other image you specify the backing file with -b or -o backing_file,
>> and for add-cow images to be successfully created you also need to
>> specify -o image_file.
>>
> "-o image_file", the image_file should be precreated? Or I should created it
> manually?

Yes, it must already exist. Just like with qemu-img create. To convert
an existing qcow2 image that has a backing file 'base.img' to an add-cow
image, you would use something like:

$ qemu-img create -f raw target.img 4G
$ qemu-img convert -O add-cow -o image_file=target.img -B base.img
source.qcow2 target.add-cow

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert
  2012-06-14 14:24           ` Kevin Wolf
@ 2012-06-14 14:26             ` Dong Xu Wang
  0 siblings, 0 replies; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-14 14:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Stefan Hajnoczi, qemu-devel

On Thu, Jun 14, 2012 at 10:24 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 14.06.2012 16:17, schrieb Dong Xu Wang:
>> On Thu, Jun 14, 2012 at 10:11 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>> Am 14.06.2012 16:06, schrieb Dong Xu Wang:
>>>> On Thu, Jun 14, 2012 at 6:51 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>>>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>>>>> add-cow file can't live alone, must together will image_file and backing_file.
>>>>>> If we implement qemu-img convert operation for add-cow file format, we must
>>>>>> create image_file and backing_file manually, that will be confused for users,
>>>>>> so we just ignore add-cow format while doing converting.
>>>>>>
>>>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>>>
>>>>> NACK.
>>>>>
>>>>> This stupid "let's drop the feature, it might confuse users" attitude is
>>>>> known from Gnome, but not from qemu.
>>>>>
>>>>> There's no technical reason to forbid it and anyone who manages to
>>>>> create a valid add-cow image will also be able to specify the very same
>>>>> options to a convert command. Also, having image format specific code in
>>>>> qemu-img is evil.
>>>>>
>>>>
>>>> If I implement add-cow convert command, I am wondering which method should
>>>> I use:
>>>> 1) create add-cow, and its backing_file, and its image_file,  then we
>>>> need 3 files.
>>>> 2) create add-cow(with all bitmap marked to allocated), and its
>>>> image_file, then we
>>>>     need 2 files.
>>>>
>>>> 2) will be easier, I should let .add-cow file can live only with
>>>> image_file, without backing_file.
>>>>
>>>> I think both 1) and 2) need add code to qemu-img.c. Or I will have to create
>>>> image_file automaticly in add_cow_create function.
>>>>
>>>> Can you give some comments on how to implement convert? Thanks.
>>>
>>> Just leave it alone and it will work.
>>>
>>> qemu-img convert takes the same options as qemu-img create. So like for
>>> any other image you specify the backing file with -b or -o backing_file,
>>> and for add-cow images to be successfully created you also need to
>>> specify -o image_file.
>>>
>> "-o image_file", the image_file should be precreated? Or I should created it
>> manually?
>
> Yes, it must already exist. Just like with qemu-img create. To convert
> an existing qcow2 image that has a backing file 'base.img' to an add-cow
> image, you would use something like:
>
> $ qemu-img create -f raw target.img 4G
> $ qemu-img convert -O add-cow -o image_file=target.img -B base.img
> source.qcow2 target.add-cow
>
> Kevin
>
Got it, thanks

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format
  2012-06-14 10:47     ` Kevin Wolf
@ 2012-06-18  2:08       ` Dong Xu Wang
  2012-06-18 15:33         ` Eric Blake
  0 siblings, 1 reply; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-18  2:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Eric Blake, qemu-devel

On Thu, Jun 14, 2012 at 6:47 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 14.06.2012 05:06, schrieb Dong Xu Wang:
>> On Wed, Jun 13, 2012 at 11:10 PM, Eric Blake <eblake@redhat.com> wrote:
>>> On 06/13/2012 08:36 AM, Dong Xu Wang wrote:
>>>> Introduce a new file format:add-cow. The usage can be found at this patch.
>>>>
>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>> ---
>>>>  docs/specs/add-cow.txt |   87 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  1 files changed, 87 insertions(+), 0 deletions(-)
>>>>  create mode 100644 docs/specs/add-cow.txt
>
>>>> +
>>>> +== Header ==
>>>> +
>>>> +The Header is included in the first bytes:
>>>> +
>>>> +    Byte    0 -  7:     magic
>>>> +                        add-cow magic string ("ADD_COW\xff")
>>>> +
>>>> +            8 -  11:    version
>>>> +                        Version number (only valid value is 1 now)
>>>> +
>>>> +            12 - 15:    backing_filename_offset
>>>> +                        Offset in the add-cow file at which the backing file name
>>>> +                        is stored (NB: The string is not null terminated). 0 if the
>>>> +                        image doesn't have a backing file.
>>>
>>> Mention that if this is not 0, then it must be between 36 and 4094 (a
>>> file name must be at least 1 byte).  What are the semantics if the
>>> filename is relative?
>>
>> relative filename is ok, I tested it just now.
>
> I believe Eric wanted to know what a relative path means, i.e. that it's
> relative to the image file rather than relative to the working directory.
>
Now it is relative to working directory.
>>>> +
>>>> +            16 - 19:    backing_filename_size
>>>> +                        Length of the backing file name in bytes. Undefined if the
>>>> +                        image doesn't have a backing file.
>>>
>>> Better to require 0 if backing_filename_offset is 0, than to leave this
>>> field undefined; also if backing_filename_offset is non-zero, then this
>>> must be non-zero.  Must be less than 4096-36 to fit in the reserved part
>>> of the header.
>>>
>>
>> Okay.
>
> Does an add-cow image without a backing file even make sense?
>

Okay, I think so, it will be v11.

>>>> +            28 - 35:    features
>>>> +                        Currently only 2 feature bits are used:
>>>> +                        Feature bits:
>>>> +                            The image uses a backing file:
>>>> +                            * ADD_COW_F_BACKING_FILE = 0x01.
>>>> +                            The backing file's format is raw:
>>>> +                            * ADD_COW_F_BACKING_FORMAT_NO_PROBE = 0x02.
>>>
>>> Should this follow the qcow2v3 proposal of splitting into mandatory vs.
>>> optional feature bits?
>>>
>>> I agree that ADD_COW_F_BACKING_FORMAT_NO_PROBE is sufficient to avoid
>>> security implications, but do we want the extra flexibility of
>>> specifying the backing format file format rather than just requiring
>>> probes on all but raw?
>>
>> Kevin, or Stefan, can you give some comments for this? thanks.
>
> I tend to agree that a format name is better than relying on probing.
>
> Also, I think we need the same thing for image_file. add-cow is not only
> useful for raw images, but also for other image format types for which
> we don't support backing files.
>

Okay.

>>>> +== Reserved ==
>>>> +
>>>> +    Byte    36 - 4095:  Reserved field:
>>>> +                        It is used to make sure COW bitmap field starts at the
>>>> +                        4096th byte, backing_file name and image_file name will
>>>> +                        be stored here.
>>>
>>> Do we want to keep a fixed-size header, or should we be planning on the
>>> possibility of future extensions requiring enough other header
>>> extensions that a variable-sized header would be wiser?  That is, I'm
>>> fine with requiring that the header be a multiple of 4k, but maybe it
>>> would make sense to have a mandatory header field that states how many
>>> header pages are present before the COW bitmap begins.  In the first
>>> round of implementation, this header field can be required to be 1 (that
>>> is, for now, we require exactly 4k header), but having the field would
>>> let us change in the future to a design with an 8k header to hold more
>>> metadata as needed.
>
> I have the impression that this simple add-cow hack is starting to get
> seriously overengineered... :-)
>
> Kevin
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] add-cow file format
  2012-06-14 11:13   ` Paolo Bonzini
@ 2012-06-18  2:08     ` Dong Xu Wang
  0 siblings, 0 replies; 26+ messages in thread
From: Dong Xu Wang @ 2012-06-18  2:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, qemu-devel

On Thu, Jun 14, 2012 at 7:13 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> I just took a quick look at the flush code.
>
> Il 13/06/2012 16:36, Dong Xu Wang ha scritto:
>>
>> +bool add_cow_cache_set_writethrough(BlockDriverState *bs, AddCowCache *c,
>> +    bool enable)
>> +{
>> +    bool old = c->writethrough;
>> +
>> +    if (!old && enable) {
>> +        add_cow_cache_flush(bs, c);
>> +    }
>> +
>> +    c->writethrough = enable;
>> +    return old;
>> +}
>
> Writethrough mode should not be needed anymore in 1.2 if the new
> implementation of writethrough is added.
>
> And anyway...
>
>> +    if (changed) {
>> +        ret = add_cow_cache_flush(bs, s->bitmap_cache);
>> +    }
>
> ... here you're treating the cache essentially as writethrough.  Is this
> flush necessary?
>
>> +    qemu_co_mutex_unlock(&s->lock);
>> +    qemu_iovec_destroy(&hd_qiov);
>> +    return ret;
>> +}
>> +
>> +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size)
>> +{
>> +    BDRVAddCowState *s = bs->opaque;
>> +    int sector_per_byte = SECTORS_PER_CLUSTER * 8;
>> +    int ret;
>> +    int64_t bitmap_size =
>> +        (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte;
>> +
>> +    ret = bdrv_truncate(bs->file,
>> +        ADD_COW_BITMAP_POS + bitmap_size);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +    bdrv_truncate(s->image_hd, size);
>> +    return 0;
>> +}
>> +
>> +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
>> +{
>> +    BDRVAddCowState *s = bs->opaque;
>> +    int ret;
>> +
>> +    qemu_co_mutex_lock(&s->lock);
>> +    ret = add_cow_cache_flush(bs, s->bitmap_cache);
>> +    qemu_co_mutex_unlock(&s->lock);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
>> +    return bdrv_co_flush(bs->file);
>
> Flushing bs->file here is not needed anymore...
>
>> +}
>> +
>> +static QEMUOptionParameter add_cow_create_options[] = {
>> +    {
>> +        .name = BLOCK_OPT_SIZE,
>> +        .type = OPT_SIZE,
>> +        .help = "Virtual disk size"
>> +    },
>> +    {
>> +        .name = BLOCK_OPT_BACKING_FILE,
>> +        .type = OPT_STRING,
>> +        .help = "File name of a base image"
>> +    },
>> +    {
>> +        .name = BLOCK_OPT_IMAGE_FILE,
>> +        .type = OPT_STRING,
>> +        .help = "File name of a image file"
>> +    },
>> +    {
>> +        .name = BLOCK_OPT_BACKING_FMT,
>> +        .type = OPT_STRING,
>> +        .help = "Image format of the base image"
>> +    },
>> +    { NULL }
>> +};
>> +
>> +static BlockDriver bdrv_add_cow = {
>> +    .format_name                = "add-cow",
>> +    .instance_size              = sizeof(BDRVAddCowState),
>> +    .bdrv_probe                 = add_cow_probe,
>> +    .bdrv_open                  = add_cow_open,
>> +    .bdrv_close                 = add_cow_close,
>> +    .bdrv_create                = add_cow_create,
>> +    .bdrv_co_readv              = add_cow_co_readv,
>> +    .bdrv_co_writev             = add_cow_co_writev,
>> +    .bdrv_truncate              = bdrv_add_cow_truncate,
>> +    .bdrv_co_is_allocated       = add_cow_is_allocated,
>> +
>> +    .create_options             = add_cow_create_options,
>> +    .bdrv_co_flush_to_disk      = add_cow_co_flush,
>
> ... and this should be bdrv_co_flush_to_os.
>
> Paolo
>
Okay, thanks for your review, Paolo. :)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format
  2012-06-18  2:08       ` Dong Xu Wang
@ 2012-06-18 15:33         ` Eric Blake
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Blake @ 2012-06-18 15:33 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Kevin Wolf, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 946 bytes --]

On 06/17/2012 08:08 PM, Dong Xu Wang wrote:

>>>> Mention that if this is not 0, then it must be between 36 and 4094 (a
>>>> file name must be at least 1 byte).  What are the semantics if the
>>>> filename is relative?
>>>
>>> relative filename is ok, I tested it just now.
>>
>> I believe Eric wanted to know what a relative path means, i.e. that it's
>> relative to the image file rather than relative to the working directory.
>>
> Now it is relative to working directory.

Yuck.  So I have to change directories before opening an add-cow image
in order to interpret it the same way as the creator?  You really need
to fix this to be like qcow2 backing files; relative path names there
are interpreted relative to the qcow2 file, not to the current working
directory of the process that opened the qcow2 file.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-06-14 11:33       ` Kevin Wolf
@ 2012-07-19  2:20         ` Dong Xu Wang
  2012-07-19  8:17           ` Kevin Wolf
  2012-07-19  9:57           ` Stefan Hajnoczi
  0 siblings, 2 replies; 26+ messages in thread
From: Dong Xu Wang @ 2012-07-19  2:20 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel, Stefan Hajnoczi

On Thu, Jun 14, 2012 at 7:33 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 14.06.2012 13:18, schrieb Paolo Bonzini:
>> Il 14/06/2012 12:59, Kevin Wolf ha scritto:
>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>>> add-cow will let raw file support snapshot_blkdev indirectly.
>>>>
>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>
>>> Paolo, what do you think about this magic?
>>
>> Besides the obvious layering violation (it would be better to add a new
>> method bdrv_ext_snapshot_create perhaps) I don't see very much a problem
>> in it.  Passing image creation options sounds like a good idea that we
>> can add in the future as an extension.
>>
>> But honestly, I don't really see the point of add-cow in general...  The
>> raw image is anyway not usable without a pass through qemu-io convert,
>> and mirroring will also allow collapsing an image to raw (with the
>> persistent dirty bitmap playing the role of the add-cow metadata).
>
> Well, the idea was that you stream into add-cow and once the streaming
> has completed, you can just drop the bitmap.
>
> There might be some overlap with mirroring, though when we discussed
> introducing add-cow, mirrors were not yet on the table. One difference
> that remains is that with streaming the guest only writes to the target
> image and can't have any problem with convergence.
>
>>> I think I can see its use especially for HMP because it's quite
>>> convenient, but on the other hand it assumes a fixed image path for the
>>> new raw image. This isn't going to work for block devices, for example.
>>
>> True, but then probably you would use mode='existing', because you need
>> to pre-create the logical volume.
>
> I think it might be convenient to have the raw volume precreated (you
> need to do that anyway), but create the COW bitmap only during the
> snapshot command. But yeah, not really important.
>
>>> If we don't do it this way, we need to allow passing image creation
>>> options to the snapshotting command so that you can pass a precreated
>>> image file.
>>
>> This sounds like a useful extension anyway, except that passing an
>> unstructured string for image creation options is ugly...  Perhaps we
>> can base a better implementation of options on Laszlo's QemuOpts visitor.
>
> Yes, I wouldn't want to use a string here, we should use something
> structured. Image creation still uses the old-style options instead of
> QemuOpts, but maybe this is the opportunity to convert it.

Kevin, do you mean I need to replace QEMUOptionParameter with QemuOpts?

If true, other image formats should also be changed, I noticed Stefan
has an un-comleted patch:
http://repo.or.cz/w/qemu/stefanha.git/commitdiff/b49babb2c8b476a36357cfd7276ca45a11039ca5

then I can work based on Stefan's patch.


>
> Kevin
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-07-19  2:20         ` Dong Xu Wang
@ 2012-07-19  8:17           ` Kevin Wolf
  2012-07-19 13:18             ` Luiz Capitulino
  2012-07-19  9:57           ` Stefan Hajnoczi
  1 sibling, 1 reply; 26+ messages in thread
From: Kevin Wolf @ 2012-07-19  8:17 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Paolo Bonzini, Luiz Capitulino, qemu-devel, Stefan Hajnoczi

Am 19.07.2012 04:20, schrieb Dong Xu Wang:
> On Thu, Jun 14, 2012 at 7:33 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 14.06.2012 13:18, schrieb Paolo Bonzini:
>>> Il 14/06/2012 12:59, Kevin Wolf ha scritto:
>>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>>>> add-cow will let raw file support snapshot_blkdev indirectly.
>>>>>
>>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>>
>>>> Paolo, what do you think about this magic?
>>>
>>> Besides the obvious layering violation (it would be better to add a new
>>> method bdrv_ext_snapshot_create perhaps) I don't see very much a problem
>>> in it.  Passing image creation options sounds like a good idea that we
>>> can add in the future as an extension.
>>>
>>> But honestly, I don't really see the point of add-cow in general...  The
>>> raw image is anyway not usable without a pass through qemu-io convert,
>>> and mirroring will also allow collapsing an image to raw (with the
>>> persistent dirty bitmap playing the role of the add-cow metadata).
>>
>> Well, the idea was that you stream into add-cow and once the streaming
>> has completed, you can just drop the bitmap.
>>
>> There might be some overlap with mirroring, though when we discussed
>> introducing add-cow, mirrors were not yet on the table. One difference
>> that remains is that with streaming the guest only writes to the target
>> image and can't have any problem with convergence.
>>
>>>> I think I can see its use especially for HMP because it's quite
>>>> convenient, but on the other hand it assumes a fixed image path for the
>>>> new raw image. This isn't going to work for block devices, for example.
>>>
>>> True, but then probably you would use mode='existing', because you need
>>> to pre-create the logical volume.
>>
>> I think it might be convenient to have the raw volume precreated (you
>> need to do that anyway), but create the COW bitmap only during the
>> snapshot command. But yeah, not really important.
>>
>>>> If we don't do it this way, we need to allow passing image creation
>>>> options to the snapshotting command so that you can pass a precreated
>>>> image file.
>>>
>>> This sounds like a useful extension anyway, except that passing an
>>> unstructured string for image creation options is ugly...  Perhaps we
>>> can base a better implementation of options on Laszlo's QemuOpts visitor.
>>
>> Yes, I wouldn't want to use a string here, we should use something
>> structured. Image creation still uses the old-style options instead of
>> QemuOpts, but maybe this is the opportunity to convert it.
> 
> Kevin, do you mean I need to replace QEMUOptionParameter with QemuOpts?
> 
> If true, other image formats should also be changed, I noticed Stefan
> has an un-comleted patch:
> http://repo.or.cz/w/qemu/stefanha.git/commitdiff/b49babb2c8b476a36357cfd7276ca45a11039ca5
> 
> then I can work based on Stefan's patch.

Yes, the direction of this patch from Stefan's repo looks good.

I'm not sure whether it will immediately allow passing structured image
creation options in QMP, though. It might work using the old-style
monitor commands, just getting a QDict from the options. Not quite sure
how this is going to work with QAPI.

Kevin

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-07-19  2:20         ` Dong Xu Wang
  2012-07-19  8:17           ` Kevin Wolf
@ 2012-07-19  9:57           ` Stefan Hajnoczi
  1 sibling, 0 replies; 26+ messages in thread
From: Stefan Hajnoczi @ 2012-07-19  9:57 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: Kevin Wolf, Paolo Bonzini, qemu-devel, Stefan Hajnoczi

On Thu, Jul 19, 2012 at 3:20 AM, Dong Xu Wang
<wdongxu@linux.vnet.ibm.com> wrote:
> On Thu, Jun 14, 2012 at 7:33 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> Am 14.06.2012 13:18, schrieb Paolo Bonzini:
>>> Il 14/06/2012 12:59, Kevin Wolf ha scritto:
>>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
>>>>> add-cow will let raw file support snapshot_blkdev indirectly.
>>>>>
>>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>>>
>>>> Paolo, what do you think about this magic?
>>>
>>> Besides the obvious layering violation (it would be better to add a new
>>> method bdrv_ext_snapshot_create perhaps) I don't see very much a problem
>>> in it.  Passing image creation options sounds like a good idea that we
>>> can add in the future as an extension.
>>>
>>> But honestly, I don't really see the point of add-cow in general...  The
>>> raw image is anyway not usable without a pass through qemu-io convert,
>>> and mirroring will also allow collapsing an image to raw (with the
>>> persistent dirty bitmap playing the role of the add-cow metadata).
>>
>> Well, the idea was that you stream into add-cow and once the streaming
>> has completed, you can just drop the bitmap.
>>
>> There might be some overlap with mirroring, though when we discussed
>> introducing add-cow, mirrors were not yet on the table. One difference
>> that remains is that with streaming the guest only writes to the target
>> image and can't have any problem with convergence.
>>
>>>> I think I can see its use especially for HMP because it's quite
>>>> convenient, but on the other hand it assumes a fixed image path for the
>>>> new raw image. This isn't going to work for block devices, for example.
>>>
>>> True, but then probably you would use mode='existing', because you need
>>> to pre-create the logical volume.
>>
>> I think it might be convenient to have the raw volume precreated (you
>> need to do that anyway), but create the COW bitmap only during the
>> snapshot command. But yeah, not really important.
>>
>>>> If we don't do it this way, we need to allow passing image creation
>>>> options to the snapshotting command so that you can pass a precreated
>>>> image file.
>>>
>>> This sounds like a useful extension anyway, except that passing an
>>> unstructured string for image creation options is ugly...  Perhaps we
>>> can base a better implementation of options on Laszlo's QemuOpts visitor.
>>
>> Yes, I wouldn't want to use a string here, we should use something
>> structured. Image creation still uses the old-style options instead of
>> QemuOpts, but maybe this is the opportunity to convert it.
>
> Kevin, do you mean I need to replace QEMUOptionParameter with QemuOpts?
>
> If true, other image formats should also be changed, I noticed Stefan
> has an un-comleted patch:
> http://repo.or.cz/w/qemu/stefanha.git/commitdiff/b49babb2c8b476a36357cfd7276ca45a11039ca5
>
> then I can work based on Stefan's patch.

It's been a while so I don't remember details, but if you want to
discuss more let me know and I'll refresh my memory.

Stefan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev
  2012-07-19  8:17           ` Kevin Wolf
@ 2012-07-19 13:18             ` Luiz Capitulino
  0 siblings, 0 replies; 26+ messages in thread
From: Luiz Capitulino @ 2012-07-19 13:18 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, Dong Xu Wang, qemu-devel, Stefan Hajnoczi

On Thu, 19 Jul 2012 10:17:10 +0200
Kevin Wolf <kwolf@redhat.com> wrote:

> Am 19.07.2012 04:20, schrieb Dong Xu Wang:
> > On Thu, Jun 14, 2012 at 7:33 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> >> Am 14.06.2012 13:18, schrieb Paolo Bonzini:
> >>> Il 14/06/2012 12:59, Kevin Wolf ha scritto:
> >>>> Am 13.06.2012 16:36, schrieb Dong Xu Wang:
> >>>>> add-cow will let raw file support snapshot_blkdev indirectly.
> >>>>>
> >>>>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> >>>>
> >>>> Paolo, what do you think about this magic?
> >>>
> >>> Besides the obvious layering violation (it would be better to add a new
> >>> method bdrv_ext_snapshot_create perhaps) I don't see very much a problem
> >>> in it.  Passing image creation options sounds like a good idea that we
> >>> can add in the future as an extension.
> >>>
> >>> But honestly, I don't really see the point of add-cow in general...  The
> >>> raw image is anyway not usable without a pass through qemu-io convert,
> >>> and mirroring will also allow collapsing an image to raw (with the
> >>> persistent dirty bitmap playing the role of the add-cow metadata).
> >>
> >> Well, the idea was that you stream into add-cow and once the streaming
> >> has completed, you can just drop the bitmap.
> >>
> >> There might be some overlap with mirroring, though when we discussed
> >> introducing add-cow, mirrors were not yet on the table. One difference
> >> that remains is that with streaming the guest only writes to the target
> >> image and can't have any problem with convergence.
> >>
> >>>> I think I can see its use especially for HMP because it's quite
> >>>> convenient, but on the other hand it assumes a fixed image path for the
> >>>> new raw image. This isn't going to work for block devices, for example.
> >>>
> >>> True, but then probably you would use mode='existing', because you need
> >>> to pre-create the logical volume.
> >>
> >> I think it might be convenient to have the raw volume precreated (you
> >> need to do that anyway), but create the COW bitmap only during the
> >> snapshot command. But yeah, not really important.
> >>
> >>>> If we don't do it this way, we need to allow passing image creation
> >>>> options to the snapshotting command so that you can pass a precreated
> >>>> image file.
> >>>
> >>> This sounds like a useful extension anyway, except that passing an
> >>> unstructured string for image creation options is ugly...  Perhaps we
> >>> can base a better implementation of options on Laszlo's QemuOpts visitor.
> >>
> >> Yes, I wouldn't want to use a string here, we should use something
> >> structured. Image creation still uses the old-style options instead of
> >> QemuOpts, but maybe this is the opportunity to convert it.
> > 
> > Kevin, do you mean I need to replace QEMUOptionParameter with QemuOpts?
> > 
> > If true, other image formats should also be changed, I noticed Stefan
> > has an un-comleted patch:
> > http://repo.or.cz/w/qemu/stefanha.git/commitdiff/b49babb2c8b476a36357cfd7276ca45a11039ca5
> > 
> > then I can work based on Stefan's patch.
> 
> Yes, the direction of this patch from Stefan's repo looks good.
> 
> I'm not sure whether it will immediately allow passing structured image
> creation options in QMP, though. It might work using the old-style
> monitor commands, just getting a QDict from the options. Not quite sure
> how this is going to work with QAPI.

We did it for netdev_add and it works as you said above. The schema
entry looks like this:

 { 'command': 'netdev_add',
   'data': {'type': 'str', 'id': 'str', '*props': '**'},
   'gen': 'no' }

Where 'props' is:

 # @props: #optional a list of properties to be passed to the backend in
 #         the format 'name=value', like 'ifname=tap0,script=no'


Then we have the qmp front-end, using the old style monitor signature:

 int qmp_netdev_add(Monitor *mon, const QDict *qdict, QObject **ret);

And we have the HMP front-end (which should be used elsewhere too):

 void netdev_add(QemuOpts *opts, Error **errp);

As Paolo said, Laszlo's work will help us to this the right way.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-07-19 13:17 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-13 14:36 [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Dong Xu Wang
2012-06-13 14:36 ` [Qemu-devel] [PATCH 2/6] block: make some functions public Dong Xu Wang
2012-06-13 14:36 ` [Qemu-devel] [PATCH 3/6] add-cow file format Dong Xu Wang
2012-06-14 11:13   ` Paolo Bonzini
2012-06-18  2:08     ` Dong Xu Wang
2012-06-13 14:36 ` [Qemu-devel] [PATCH 4/6] qemu-img: add-cow will not support convert Dong Xu Wang
2012-06-14 10:51   ` Kevin Wolf
2012-06-14 14:06     ` Dong Xu Wang
2012-06-14 14:11       ` Kevin Wolf
2012-06-14 14:17         ` Dong Xu Wang
2012-06-14 14:24           ` Kevin Wolf
2012-06-14 14:26             ` Dong Xu Wang
2012-06-13 14:36 ` [Qemu-devel] [PATCH 5/6] add-cow: support snapshot_blkdev Dong Xu Wang
2012-06-14 10:59   ` Kevin Wolf
2012-06-14 11:18     ` Paolo Bonzini
2012-06-14 11:33       ` Kevin Wolf
2012-07-19  2:20         ` Dong Xu Wang
2012-07-19  8:17           ` Kevin Wolf
2012-07-19 13:18             ` Luiz Capitulino
2012-07-19  9:57           ` Stefan Hajnoczi
2012-06-13 14:36 ` [Qemu-devel] [PATCH 6/6] add-cow: support qemu-iotests Dong Xu Wang
2012-06-13 15:10 ` [Qemu-devel] [PATCH 1/6 v10] docs: spec for add-cow file format Eric Blake
2012-06-14  3:06   ` Dong Xu Wang
2012-06-14 10:47     ` Kevin Wolf
2012-06-18  2:08       ` Dong Xu Wang
2012-06-18 15:33         ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.