All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH V19 0/8] add-cow file format
@ 2013-05-30 10:00 Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 1/8] V18: docs: document for " Dongxu Wang
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dongxu Wang, wdongxu, stefanha

It will introduce a new file format: add-cow.

The add-cow file format makes it possible to perform copy-on-write on top of
a raw disk image.  When we know that no backing file clusters remain visible
(e.g. we have streamed the entire image and copied all data from the backing
file), then it is possible to discard the add-cow file and use the raw image
file directly.

This feature adds the copy-on-write feature to raw files (which cannot support
it natively) while allowing us to get full performance again later when we no
longer need copy-on-write.

add-cow can benefit from other available functions, such as path_has_protocol
and qed_read_string, so we will make them public.

snapshot_blkdev are not supported now for add-cow. Will add it in futher patches.

These patches are using QemuOpts parser, former patches could be found here:
http://patchwork.ozlabs.org/patch/247508/


v18->v19:
1) support parallel aio write.
2) fix flush method.
3) other small fix.
v17 -> v18:
1) remove version field.
2) header size is maximum value and cluster size value.
3) fix type.
4) move struct to source file.
5) cluster_size->table_size.
6) use error_report, not fprintf.
7) remove version field from header.
8) header_size is MAX(cluster_size, 4096).
9) introduce s->cluster_sectors.
10) use BLKDBG_L2_LOAD/UPDATE.
11) add 037 and 038 tests.

v16->v17):
1) Use stringify.

v15->v16):
1) Rebased on QEMU upstream source tree.
2) Judge if opts is null in add_cow_create function.

v14->v15:
1) Fix typo and make some sentences more readable in docs.
2) Introduce STRINGIZER macro.

v13->v14:
1) Make some sentences more clear in docs.
2) Make MAGIC from 8 bytes to 4 bytes.
3) Fix some bugs.

v12->v13:
1) Use QemuOpts, not QEMUOptionParameter
2) cluster_size configuable
3) Refactor block-cache.c
4) Correct qemu-iotests script.
5) Other bug fix.

v11->v12:
1) Removed un-used feature bit.
2) Share cache code with qcow2.c.
3) Remove snapshot_blkdev support, will add it in another patch.
5) COW Bitmap field in add-cow file will be multiple of 65536.
6) fix grammer and typo.

Dong Xu Wang (8):
  V18: docs: document for add-cow file format
  make path_has_protocol non static
  qed_read_string to bdrv_read_string
  rename qcow2-cache.c to block-cache.c
  Make block-cache.c be common interface
  add debug event for add-cow
  add-cow file format core code.
  qemu-iotests: add add-cow iotests support

 block.c                      |  29 +-
 block/Makefile.objs          |   4 +-
 block/add-cow.c              | 812 +++++++++++++++++++++++++++++++++++++++++++
 block/blkdebug.c             |   3 +
 block/block-cache.c          | 342 ++++++++++++++++++
 block/qcow2-cache.c          | 323 -----------------
 block/qcow2-cluster.c        |  52 +--
 block/qcow2-refcount.c       |  62 ++--
 block/qcow2.c                |  21 +-
 block/qcow2.h                |  24 +-
 block/qed.c                  |  34 +-
 docs/specs/add-cow.txt       | 172 +++++++++
 include/block/block-cache.h  |  59 ++++
 include/block/block.h        |   6 +
 include/block/block_int.h    |   2 +
 tests/qemu-iotests/017       |   2 +-
 tests/qemu-iotests/020       |   2 +-
 tests/qemu-iotests/037       |   2 +-
 tests/qemu-iotests/038       |   2 +-
 tests/qemu-iotests/common    |   6 +
 tests/qemu-iotests/common.rc |  15 +-
 trace-events                 |  13 +-
 22 files changed, 1536 insertions(+), 451 deletions(-)
 create mode 100644 block/add-cow.c
 create mode 100644 block/block-cache.c
 delete mode 100644 block/qcow2-cache.c
 create mode 100644 docs/specs/add-cow.txt
 create mode 100644 include/block/block-cache.h

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 1/8] V18: docs: document for add-cow file format
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-06-03  1:48   ` Fam Zheng
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 2/8] make path_has_protocol non static Dongxu Wang
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Document for add-cow format, the usage and spec of add-cow are
introduced.

v18-v19:
1) backing_fmt and image_fmt NUL-terminated.
2) other fix.
V17->V18:
1) remove version field.
2) header size is maximum value and cluster size value.
3) fix type.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 docs/specs/add-cow.txt | 172 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 172 insertions(+)
 create mode 100644 docs/specs/add-cow.txt

diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..fba74dd
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,172 @@
+== General ==
+
+The raw file format does not support backing files or copy on write
+feature. The add-cow image format makes it possible to use backing
+files with an image by keeping a separate .add-cow metadata file.
+Once all clusters have been written into the image it is safe to
+discard the .add-cow and backing files, then we can use the image
+directly.
+
+An example usage of add-cow would look like:
+(ubuntu.img is a disk image which has an installed OS.)
+    1)  Create an image, such as raw format, with the same size of
+        ubuntu.img:
+            qemu-img create -f raw test.raw 8G
+    2)  Create an add-cow image which will store dirty bitmap
+            qemu-img create -f add-cow test.add-cow \
+                -o backing_file=ubuntu.img,image_file=test.raw
+    3)  Run qemu with add-cow image
+            qemu -drive if=virtio,file=test.add-cow
+
+test.raw may be larger than ubuntu.img, in that case, the size of
+test.add-cow will be calculated from the size of test.raw.
+
+image_fmt can be omitted, in that case image_fmt is assumed to be
+"raw". backing_fmt can also be omitted, add-cow should do a probe
+operation and determine what the backing file's format is. It is
+recommended to always specify the format for any raw file, because
+probing a raw file is a security hole.
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------------------------+
+ |     Header    |           COW bitmap          |
+ +---------------+-------------------------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+(HEADER_SIZE is defined in 40-43 bytes.)
+    Byte    0  -  3:    magic
+                        add-cow magic string ("ACOW"). It is coded in
+                        free-form ASCII.
+
+            4  -  7:    backing file name offset
+                        Offset in the add-cow file at which the backing
+                        file name is stored (NB: The string is NOT
+                        NUL-terminated).
+                        If backing file name does NOT exist, this field
+                        will be 0. Must be between 76 and [HEADER_SIZE
+                        - 2](a file name must be at least 1 byte).
+
+            8  - 11:    backing file name size
+                        Length of the backing file name in bytes. It
+                        will be 0 if the backing file name offset is
+                        0. If backing file name offset is non-zero,
+                        then it must be non-zero. Must be less than
+                        [HEADER_SIZE - 76] to fit in the reserved
+                        part of the header. Backing file name offset
+                        + size must be no more than HEADER_SIZE.
+
+            12 - 15:    image file name offset
+                        Offset in the add-cow file at which the image
+                        file name is stored (NB: The string is NOT
+                        NUL-terminated). It must be between 76 and
+                        [HEADER_SIZE - 2]. Image file name size + offset
+                        must be no more than HEADER_SIZE.
+
+            16 - 19:    image file name size
+                        Length of the image file name in bytes.
+                        Must be less than [HEADER_SIZE - 76] to fit in
+                        the reserved part of the header.
+
+            20 - 23:    cluster bits
+                        Number of bits that are used for addressing an
+                        offset within a cluster (1 << cluster_bits is
+                        the cluster size). Must not be less than 12
+                        (i.e. 4096 byte clusters).
+
+                        Note: qemu as of today has an implementation
+                        limit of 2 MB as the maximum cluster size and
+                        won't be able to open images with larger cluster
+                        sizes.
+
+            24 - 31:    features
+                        Bitmask of features. If a feature bit is set
+                        but not recognized, the opening add-cow file must
+                        fail.  No features bits are currently defined.
+
+                        Bits 0-63:  Reserved (set to 0)
+
+            32 - 39:    compatible features
+                        Bitmask of compatible features. An implementation
+                        can safely ignore any unknown bits that are set.
+                        Bit 0:      All allocated bit.  If this bit is
+                                    set then backing file and COW bitmap
+                                    will not be used, and can read from
+                                    or write to image file directly.
+
+                        Bits 1-63:  Reserved (set to 0)
+
+            40 - 43:    HEADER_SIZE
+                        The header field is variable-sized. This field
+                        indicates how many bytes will be used to store
+                        add-cow header. By default, it is maximum value
+                        of 4096 and cluster size value.
+
+            44 - 59:    backing file format
+                        Format of backing file. It will be filled with
+                        0 if backing file name offset is 0. If backing
+                        file name offset is non-empty, it must be
+                        non-empty. It is coded in free-form ASCII, and
+                        is NUL-terminated. Zero padded on the right. If
+                        backing_fmt is omitted, must do a probe operation
+                        and store the real format here.
+
+            60 - 75:    image file format
+                        Format of image file. It must be non-empty. It
+                        is coded in free-form ASCII, and is
+                        NUL-terminated. Zero padded on the right. If
+                        image_fmt is omitted, "raw" will be stored
+                        here.
+
+            76 - [HEADER_SIZE - 1]:
+                        It is used to make sure COW bitmap field starts
+                        at the HEADER_SIZE byte, backing file name and
+                        image file name will be stored here. The bytes
+                        that are not pointing to backing file and image
+                        file names must be set to 0.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap
+related to backing file and image file.  It is tracking whether the
+cluster in image file is allocated or not.
+
+Each bit in the bitmap tracks one cluster's status. For example, if
+cluster bit is 16, then each bit tracks one cluster, (1 << 16) = 65536
+bytes. The image file size is rounded up to cluster size (where any
+bytes in the last cluster that do not fit in the image are ignored),
+then if the number of clusters is not a multiple of 8, then remaining
+bits in the bitmap will be set to 0.
+
+The size of bitmap is calculated according to virtual size of image
+file, and the size of bitmap should be multiple of add-cow file's
+cluster size, the bits not used will be set to 0. Within each byte,
+the least significant bit covers the first cluster. Bit orders in one
+byte look like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, it indicates the cluster has not been allocated in
+image file, data should be loaded from backing file while reading; if
+the bit is 1, it indicates the related cluster has been dirty, should
+be loaded from image file while reading. Writing to a cluster causes
+the corresponding bit to be set to 1. If there is no backing file, or
+if the image file is larger than the backing file and the offset is
+beyond the end of the backing file, then the data should be read as
+all zero bytes instead.
+
+If image file is not an even multiple of cluster bytes, bits that
+correspond to bytes beyond the image file size in add-cow must be written
+as 0 and must be ignored when reading.
+
+Image file name and backing file name must NOT be the same, we prevent
+this while creating add-cow files via qemu-img. If image file name and
+backing file name are the same, the add-cow image must be treated as
+invalid.
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 2/8] make path_has_protocol non static
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 1/8] V18: docs: document for " Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 3/8] qed_read_string to bdrv_read_string Dongxu Wang
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

We will use path_has_protocol outside block.c, so just make it public.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 block.c               | 2 +-
 include/block/block.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 710362f..5135044 100644
--- a/block.c
+++ b/block.c
@@ -193,7 +193,7 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs,
 }
 
 /* check if the path starts with "<protocol>:" */
-static int path_has_protocol(const char *path)
+int path_has_protocol(const char *path)
 {
     const char *p;
 
diff --git a/include/block/block.h b/include/block/block.h
index 90f95ef..fd776a1 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -345,6 +345,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size);
 int path_is_absolute(const char *path);
+int path_has_protocol(const char *path);
 void path_combine(char *dest, int dest_size,
                   const char *base_path,
                   const char *filename);
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 3/8] qed_read_string to bdrv_read_string
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 1/8] V18: docs: document for " Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 2/8] make path_has_protocol non static Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 4/8] rename qcow2-cache.c to block-cache.c Dongxu Wang
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Make qed_read_string function to a common interface, so move it to
block.c.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 block.c               | 27 +++++++++++++++++++++++++++
 block/qed.c           | 34 ++++------------------------------
 include/block/block.h |  2 ++
 3 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/block.c b/block.c
index 5135044..65c5e76 100644
--- a/block.c
+++ b/block.c
@@ -210,6 +210,33 @@ int path_has_protocol(const char *path)
     return *p == ':';
 }
 
+/**
+ * Read a string of known length from the image file
+ *
+ * @bs:         Image file
+ * @offset:     File offset to start of string, in bytes
+ * @n:          String length in bytes
+ * @buf:        Destination buffer
+ * @buflen:     Destination buffer length in bytes
+ * @ret:        0 on success, -errno on failure
+ *
+ * The string is NUL-terminated.
+ */
+int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n,
+                     char *buf, size_t buflen)
+{
+    int ret;
+    if (n >= buflen) {
+        return -EINVAL;
+    }
+    ret = bdrv_pread(bs, offset, buf, n);
+    if (ret < 0) {
+        return ret;
+    }
+    buf[n] = '\0';
+    return 0;
+}
+
 int path_is_absolute(const char *path)
 {
 #ifdef _WIN32
diff --git a/block/qed.c b/block/qed.c
index 88daf22..077da76 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -217,33 +217,6 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size,
 }
 
 /**
- * Read a string of known length from the image file
- *
- * @file:       Image file
- * @offset:     File offset to start of string, in bytes
- * @n:          String length in bytes
- * @buf:        Destination buffer
- * @buflen:     Destination buffer length in bytes
- * @ret:        0 on success, -errno on failure
- *
- * The string is NUL-terminated.
- */
-static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n,
-                           char *buf, size_t buflen)
-{
-    int ret;
-    if (n >= buflen) {
-        return -EINVAL;
-    }
-    ret = bdrv_pread(file, offset, buf, n);
-    if (ret < 0) {
-        return ret;
-    }
-    buf[n] = '\0';
-    return 0;
-}
-
-/**
  * Allocate new clusters
  *
  * @s:          QED state
@@ -437,9 +410,10 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags)
             return -EINVAL;
         }
 
-        ret = qed_read_string(bs->file, s->header.backing_filename_offset,
-                              s->header.backing_filename_size, bs->backing_file,
-                              sizeof(bs->backing_file));
+        ret = bdrv_read_string(bs->file, s->header.backing_filename_offset,
+                               s->header.backing_filename_size,
+                               bs->backing_file,
+                               sizeof(bs->backing_file));
         if (ret < 0) {
             return ret;
         }
diff --git a/include/block/block.h b/include/block/block.h
index fd776a1..2989da6 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -175,6 +175,8 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset,
     const void *buf, int count);
 int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
     int nb_sectors, QEMUIOVector *qiov);
+int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n,
+                     char *buf, size_t buflen);
 int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
 int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num,
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 4/8] rename qcow2-cache.c to block-cache.c
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (2 preceding siblings ...)
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 3/8] qed_read_string to bdrv_read_string Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 5/8] Make block-cache.c be common interface Dongxu Wang
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Block layer will use qcow2-cache as common cache code, so rename
it to block-cache.c.

v18->v19:
1) only rename, did not touch other code.

Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 block/Makefile.objs    |   3 +-
 block/block-cache.c    | 323 +++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2-cache.c    | 323 -------------------------------------------------
 block/qcow2-cluster.c  |  48 ++++----
 block/qcow2-refcount.c |  42 +++----
 block/qcow2.c          |  18 +--
 block/qcow2.h          |  28 ++---
 trace-events           |  12 +-
 8 files changed, 399 insertions(+), 398 deletions(-)
 create mode 100644 block/block-cache.c
 delete mode 100644 block/qcow2-cache.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 5f0358a..16e574a 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,6 @@
 block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
+block-obj-y += block-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-y += vhdx.o
diff --git a/block/block-cache.c b/block/block-cache.c
new file mode 100644
index 0000000..bc057a8
--- /dev/null
+++ b/block/block-cache.c
@@ -0,0 +1,323 @@
+/*
+ * L2/refcount table cache for the QCOW2 format
+ *
+ * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block/block_int.h"
+#include "qemu-common.h"
+#include "qcow2.h"
+#include "trace.h"
+
+typedef struct BlockCachedTable {
+    void*   table;
+    int64_t offset;
+    bool    dirty;
+    int     cache_hits;
+    int     ref;
+} BlockCachedTable;
+
+struct BlockCache {
+    BlockCachedTable*       entries;
+    struct BlockCache*      depends;
+    int                     size;
+    bool                    depends_on_flush;
+};
+
+BlockCache *block_cache_create(BlockDriverState *bs, int num_tables)
+{
+    BDRVQcowState *s = bs->opaque;
+    BlockCache *c;
+    int i;
+
+    c = g_malloc0(sizeof(*c));
+    c->size = num_tables;
+    c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
+
+    for (i = 0; i < c->size; i++) {
+        c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
+    }
+
+    return c;
+}
+
+int block_cache_destroy(BlockDriverState* bs, BlockCache *c)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        assert(c->entries[i].ref == 0);
+        qemu_vfree(c->entries[i].table);
+    }
+
+    g_free(c->entries);
+    g_free(c);
+
+    return 0;
+}
+
+static int block_cache_flush_dependency(BlockDriverState *bs, BlockCache *c)
+{
+    int ret;
+
+    ret = block_cache_flush(bs, c->depends);
+    if (ret < 0) {
+        return ret;
+    }
+
+    c->depends = NULL;
+    c->depends_on_flush = false;
+
+    return 0;
+}
+
+static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
+{
+    BDRVQcowState *s = bs->opaque;
+    int ret = 0;
+
+    if (!c->entries[i].dirty || !c->entries[i].offset) {
+        return 0;
+    }
+
+    trace_block_cache_entry_flush(qemu_coroutine_self(),
+                                  c == s->l2_table_cache, i);
+
+    if (c->depends) {
+        ret = block_cache_flush_dependency(bs, c);
+    } else if (c->depends_on_flush) {
+        ret = bdrv_flush(bs->file);
+        if (ret >= 0) {
+            c->depends_on_flush = false;
+        }
+    }
+
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (c == s->refcount_block_cache) {
+        BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
+    } else if (c == s->l2_table_cache) {
+        BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
+    }
+
+    ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
+        s->cluster_size);
+    if (ret < 0) {
+        return ret;
+    }
+
+    c->entries[i].dirty = false;
+
+    return 0;
+}
+
+int block_cache_flush(BlockDriverState *bs, BlockCache *c)
+{
+    BDRVQcowState *s = bs->opaque;
+    int result = 0;
+    int ret;
+    int i;
+
+    trace_block_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache);
+
+    for (i = 0; i < c->size; i++) {
+        ret = block_cache_entry_flush(bs, c, i);
+        if (ret < 0 && result != -ENOSPC) {
+            result = ret;
+        }
+    }
+
+    if (result == 0) {
+        ret = bdrv_flush(bs->file);
+        if (ret < 0) {
+            result = ret;
+        }
+    }
+
+    return result;
+}
+
+int block_cache_set_dependency(BlockDriverState *bs, BlockCache *c,
+    BlockCache *dependency)
+{
+    int ret;
+
+    if (dependency->depends) {
+        ret = block_cache_flush_dependency(bs, dependency);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    if (c->depends && (c->depends != dependency)) {
+        ret = block_cache_flush_dependency(bs, c);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    c->depends = dependency;
+    return 0;
+}
+
+void block_cache_depends_on_flush(BlockCache *c)
+{
+    c->depends_on_flush = true;
+}
+
+static int block_cache_find_entry_to_replace(BlockCache *c)
+{
+    int i;
+    int min_count = INT_MAX;
+    int min_index = -1;
+
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].ref) {
+            continue;
+        }
+
+        if (c->entries[i].cache_hits < min_count) {
+            min_index = i;
+            min_count = c->entries[i].cache_hits;
+        }
+
+        /* Give newer hits priority */
+        /* TODO Check how to optimize the replacement strategy */
+        c->entries[i].cache_hits /= 2;
+    }
+
+    if (min_index == -1) {
+        /* This can't happen in current synchronous code, but leave the check
+         * here as a reminder for whoever starts using AIO with the cache */
+        abort();
+    }
+    return min_index;
+}
+
+static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
+    uint64_t offset, void **table, bool read_from_disk)
+{
+    BDRVQcowState *s = bs->opaque;
+    int i;
+    int ret;
+
+    trace_block_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
+                          offset, read_from_disk);
+
+    /* Check if the table is already cached */
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].offset == offset) {
+            goto found;
+        }
+    }
+
+    /* If not, write a table back and replace it */
+    i = block_cache_find_entry_to_replace(c);
+    trace_block_cache_get_replace_entry(qemu_coroutine_self(),
+                                        c == s->l2_table_cache, i);
+    if (i < 0) {
+        return i;
+    }
+
+    ret = block_cache_entry_flush(bs, c, i);
+    if (ret < 0) {
+        return ret;
+    }
+
+    trace_block_cache_get_read(qemu_coroutine_self(),
+                               c == s->l2_table_cache, i);
+    c->entries[i].offset = 0;
+    if (read_from_disk) {
+        if (c == s->l2_table_cache) {
+            BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
+        }
+
+        ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    /* Give the table some hits for the start so that it won't be replaced
+     * immediately. The number 32 is completely arbitrary. */
+    c->entries[i].cache_hits = 32;
+    c->entries[i].offset = offset;
+
+    /* And return the right table */
+found:
+    c->entries[i].cache_hits++;
+    c->entries[i].ref++;
+    *table = c->entries[i].table;
+
+    trace_block_cache_get_done(qemu_coroutine_self(),
+                               c == s->l2_table_cache, i);
+
+    return 0;
+}
+
+int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
+    void **table)
+{
+    return block_cache_do_get(bs, c, offset, table, true);
+}
+
+int block_cache_get_empty(BlockDriverState *bs, BlockCache *c, uint64_t offset,
+    void **table)
+{
+    return block_cache_do_get(bs, c, offset, table, false);
+}
+
+int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].table == *table) {
+            goto found;
+        }
+    }
+    return -ENOENT;
+
+found:
+    c->entries[i].ref--;
+    *table = NULL;
+
+    assert(c->entries[i].ref >= 0);
+    return 0;
+}
+
+void block_cache_entry_mark_dirty(BlockCache *c, void *table)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].table == table) {
+            goto found;
+        }
+    }
+    abort();
+
+found:
+    c->entries[i].dirty = true;
+}
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
deleted file mode 100644
index 2f3114e..0000000
--- a/block/qcow2-cache.c
+++ /dev/null
@@ -1,323 +0,0 @@
-/*
- * L2/refcount table cache for the QCOW2 format
- *
- * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to deal
- * in the Software without restriction, including without limitation the rights
- * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- * copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
- * THE SOFTWARE.
- */
-
-#include "block/block_int.h"
-#include "qemu-common.h"
-#include "qcow2.h"
-#include "trace.h"
-
-typedef struct Qcow2CachedTable {
-    void*   table;
-    int64_t offset;
-    bool    dirty;
-    int     cache_hits;
-    int     ref;
-} Qcow2CachedTable;
-
-struct Qcow2Cache {
-    Qcow2CachedTable*       entries;
-    struct Qcow2Cache*      depends;
-    int                     size;
-    bool                    depends_on_flush;
-};
-
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
-{
-    BDRVQcowState *s = bs->opaque;
-    Qcow2Cache *c;
-    int i;
-
-    c = g_malloc0(sizeof(*c));
-    c->size = num_tables;
-    c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
-
-    for (i = 0; i < c->size; i++) {
-        c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
-    }
-
-    return c;
-}
-
-int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
-{
-    int i;
-
-    for (i = 0; i < c->size; i++) {
-        assert(c->entries[i].ref == 0);
-        qemu_vfree(c->entries[i].table);
-    }
-
-    g_free(c->entries);
-    g_free(c);
-
-    return 0;
-}
-
-static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c)
-{
-    int ret;
-
-    ret = qcow2_cache_flush(bs, c->depends);
-    if (ret < 0) {
-        return ret;
-    }
-
-    c->depends = NULL;
-    c->depends_on_flush = false;
-
-    return 0;
-}
-
-static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
-{
-    BDRVQcowState *s = bs->opaque;
-    int ret = 0;
-
-    if (!c->entries[i].dirty || !c->entries[i].offset) {
-        return 0;
-    }
-
-    trace_qcow2_cache_entry_flush(qemu_coroutine_self(),
-                                  c == s->l2_table_cache, i);
-
-    if (c->depends) {
-        ret = qcow2_cache_flush_dependency(bs, c);
-    } else if (c->depends_on_flush) {
-        ret = bdrv_flush(bs->file);
-        if (ret >= 0) {
-            c->depends_on_flush = false;
-        }
-    }
-
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (c == s->refcount_block_cache) {
-        BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
-    } else if (c == s->l2_table_cache) {
-        BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
-    }
-
-    ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
-        s->cluster_size);
-    if (ret < 0) {
-        return ret;
-    }
-
-    c->entries[i].dirty = false;
-
-    return 0;
-}
-
-int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
-{
-    BDRVQcowState *s = bs->opaque;
-    int result = 0;
-    int ret;
-    int i;
-
-    trace_qcow2_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache);
-
-    for (i = 0; i < c->size; i++) {
-        ret = qcow2_cache_entry_flush(bs, c, i);
-        if (ret < 0 && result != -ENOSPC) {
-            result = ret;
-        }
-    }
-
-    if (result == 0) {
-        ret = bdrv_flush(bs->file);
-        if (ret < 0) {
-            result = ret;
-        }
-    }
-
-    return result;
-}
-
-int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
-    Qcow2Cache *dependency)
-{
-    int ret;
-
-    if (dependency->depends) {
-        ret = qcow2_cache_flush_dependency(bs, dependency);
-        if (ret < 0) {
-            return ret;
-        }
-    }
-
-    if (c->depends && (c->depends != dependency)) {
-        ret = qcow2_cache_flush_dependency(bs, c);
-        if (ret < 0) {
-            return ret;
-        }
-    }
-
-    c->depends = dependency;
-    return 0;
-}
-
-void qcow2_cache_depends_on_flush(Qcow2Cache *c)
-{
-    c->depends_on_flush = true;
-}
-
-static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
-{
-    int i;
-    int min_count = INT_MAX;
-    int min_index = -1;
-
-
-    for (i = 0; i < c->size; i++) {
-        if (c->entries[i].ref) {
-            continue;
-        }
-
-        if (c->entries[i].cache_hits < min_count) {
-            min_index = i;
-            min_count = c->entries[i].cache_hits;
-        }
-
-        /* Give newer hits priority */
-        /* TODO Check how to optimize the replacement strategy */
-        c->entries[i].cache_hits /= 2;
-    }
-
-    if (min_index == -1) {
-        /* This can't happen in current synchronous code, but leave the check
-         * here as a reminder for whoever starts using AIO with the cache */
-        abort();
-    }
-    return min_index;
-}
-
-static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
-    uint64_t offset, void **table, bool read_from_disk)
-{
-    BDRVQcowState *s = bs->opaque;
-    int i;
-    int ret;
-
-    trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
-                          offset, read_from_disk);
-
-    /* Check if the table is already cached */
-    for (i = 0; i < c->size; i++) {
-        if (c->entries[i].offset == offset) {
-            goto found;
-        }
-    }
-
-    /* If not, write a table back and replace it */
-    i = qcow2_cache_find_entry_to_replace(c);
-    trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(),
-                                        c == s->l2_table_cache, i);
-    if (i < 0) {
-        return i;
-    }
-
-    ret = qcow2_cache_entry_flush(bs, c, i);
-    if (ret < 0) {
-        return ret;
-    }
-
-    trace_qcow2_cache_get_read(qemu_coroutine_self(),
-                               c == s->l2_table_cache, i);
-    c->entries[i].offset = 0;
-    if (read_from_disk) {
-        if (c == s->l2_table_cache) {
-            BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
-        }
-
-        ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
-        if (ret < 0) {
-            return ret;
-        }
-    }
-
-    /* Give the table some hits for the start so that it won't be replaced
-     * immediately. The number 32 is completely arbitrary. */
-    c->entries[i].cache_hits = 32;
-    c->entries[i].offset = offset;
-
-    /* And return the right table */
-found:
-    c->entries[i].cache_hits++;
-    c->entries[i].ref++;
-    *table = c->entries[i].table;
-
-    trace_qcow2_cache_get_done(qemu_coroutine_self(),
-                               c == s->l2_table_cache, i);
-
-    return 0;
-}
-
-int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
-    void **table)
-{
-    return qcow2_cache_do_get(bs, c, offset, table, true);
-}
-
-int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
-    void **table)
-{
-    return qcow2_cache_do_get(bs, c, offset, table, false);
-}
-
-int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
-{
-    int i;
-
-    for (i = 0; i < c->size; i++) {
-        if (c->entries[i].table == *table) {
-            goto found;
-        }
-    }
-    return -ENOENT;
-
-found:
-    c->entries[i].ref--;
-    *table = NULL;
-
-    assert(c->entries[i].ref >= 0);
-    return 0;
-}
-
-void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table)
-{
-    int i;
-
-    for (i = 0; i < c->size; i++) {
-        if (c->entries[i].table == table) {
-            goto found;
-        }
-    }
-    abort();
-
-found:
-    c->entries[i].dirty = true;
-}
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 76f30e5..ede64af 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -75,7 +75,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
         return new_l1_table_offset;
     }
 
-    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    ret = block_cache_flush(bs, s->refcount_block_cache);
     if (ret < 0) {
         goto fail;
     }
@@ -125,7 +125,7 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
     BDRVQcowState *s = bs->opaque;
     int ret;
 
-    ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
+    ret = block_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
 
     return ret;
 }
@@ -186,7 +186,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
         return l2_offset;
     }
 
-    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    ret = block_cache_flush(bs, s->refcount_block_cache);
     if (ret < 0) {
         goto fail;
     }
@@ -194,7 +194,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
     /* allocate a new entry in the l2 cache */
 
     trace_qcow2_l2_allocate_get_empty(bs, l1_index);
-    ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table);
+    ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table);
     if (ret < 0) {
         return ret;
     }
@@ -209,7 +209,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 
         /* if there was an old l2 table, read it from the disk */
         BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
-        ret = qcow2_cache_get(bs, s->l2_table_cache,
+        ret = block_cache_get(bs, s->l2_table_cache,
             old_l2_offset & L1E_OFFSET_MASK,
             (void**) &old_table);
         if (ret < 0) {
@@ -218,7 +218,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 
         memcpy(l2_table, old_table, s->cluster_size);
 
-        ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &old_table);
+        ret = block_cache_put(bs, s->l2_table_cache, (void**) &old_table);
         if (ret < 0) {
             goto fail;
         }
@@ -228,8 +228,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
     BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
 
     trace_qcow2_l2_allocate_write_l2(bs, l1_index);
-    qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+    ret = block_cache_flush(bs, s->l2_table_cache);
     if (ret < 0) {
         goto fail;
     }
@@ -248,7 +248,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 
 fail:
     trace_qcow2_l2_allocate_done(bs, l1_index, ret);
-    qcow2_cache_put(bs, s->l2_table_cache, (void**) table);
+    block_cache_put(bs, s->l2_table_cache, (void**) table);
     s->l1_table[l1_index] = old_l2_offset;
     return ret;
 }
@@ -484,7 +484,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         abort();
     }
 
-    qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
 
     nb_available = (c * s->cluster_sectors);
 
@@ -594,13 +594,13 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
      * allocated. */
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
     if (cluster_offset & L2E_OFFSET_MASK) {
-        qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
         return 0;
     }
 
     cluster_offset = qcow2_alloc_bytes(bs, compressed_size);
     if (cluster_offset < 0) {
-        qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
         return 0;
     }
 
@@ -615,9 +615,9 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
     /* compressed clusters never have the copied flag */
 
     BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
-    qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+    block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
     l2_table[l2_index] = cpu_to_be64(cluster_offset);
-    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (ret < 0) {
         return 0;
     }
@@ -649,7 +649,7 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
      * need to be sure that the refcounts have been increased and COW was
      * handled.
      */
-    qcow2_cache_depends_on_flush(s->l2_table_cache);
+    block_cache_depends_on_flush(s->l2_table_cache);
 
     return 0;
 }
@@ -682,7 +682,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
         qcow2_mark_dirty(bs);
     }
     if (qcow2_need_accurate_refcounts(s)) {
-        qcow2_cache_set_dependency(bs, s->l2_table_cache,
+        block_cache_set_dependency(bs, s->l2_table_cache,
                                    s->refcount_block_cache);
     }
 
@@ -690,7 +690,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     if (ret < 0) {
         goto err;
     }
-    qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+    block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
 
     for (i = 0; i < m->nb_clusters; i++) {
         /* if two concurrent writes happen to the same unallocated cluster
@@ -707,7 +707,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      }
 
 
-    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (ret < 0) {
         goto err;
     }
@@ -916,7 +916,7 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
 
     /* Cleanup */
 out:
-    pret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    pret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (pret < 0) {
         return pret;
     }
@@ -1044,7 +1044,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
      * wrong with our code. */
     assert(nb_clusters > 0);
 
-    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (ret < 0) {
         return ret;
     }
@@ -1335,14 +1335,14 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
         }
 
         /* First remove L2 entries */
-        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+        block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
         l2_table[l2_index + i] = cpu_to_be64(0);
 
         /* Then decrease the refcount */
         qcow2_free_any_clusters(bs, old_offset, 1);
     }
 
-    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (ret < 0) {
         return ret;
     }
@@ -1412,7 +1412,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
         old_offset = be64_to_cpu(l2_table[l2_index + i]);
 
         /* Update L2 entries */
-        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+        block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
         if (old_offset & QCOW_OFLAG_COMPRESSED) {
             l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
             qcow2_free_any_clusters(bs, old_offset, 1);
@@ -1421,7 +1421,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
         }
     }
 
-    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     if (ret < 0) {
         return ret;
     }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index b32738f..332af27 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -71,7 +71,7 @@ static int load_refcount_block(BlockDriverState *bs,
     int ret;
 
     BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD);
-    ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
+    ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
         refcount_block);
 
     return ret;
@@ -98,7 +98,7 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
     if (!refcount_block_offset)
         return 0;
 
-    ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
+    ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
         (void**) &refcount_block);
     if (ret < 0) {
         return ret;
@@ -108,7 +108,7 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
         ((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
     refcount = be16_to_cpu(refcount_block[block_index]);
 
-    ret = qcow2_cache_put(bs, s->refcount_block_cache,
+    ret = block_cache_put(bs, s->refcount_block_cache,
         (void**) &refcount_block);
     if (ret < 0) {
         return ret;
@@ -201,7 +201,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
     *refcount_block = NULL;
 
     /* We write to the refcount table, so we might depend on L2 tables */
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    ret = block_cache_flush(bs, s->l2_table_cache);
     if (ret < 0) {
         return ret;
     }
@@ -220,7 +220,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
 
     if (in_same_refcount_block(s, new_block, cluster_index << s->cluster_bits)) {
         /* Zero the new refcount block before updating it */
-        ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block,
+        ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block,
             (void**) refcount_block);
         if (ret < 0) {
             goto fail_block;
@@ -240,14 +240,14 @@ static int alloc_refcount_block(BlockDriverState *bs,
             goto fail_block;
         }
 
-        ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+        ret = block_cache_flush(bs, s->refcount_block_cache);
         if (ret < 0) {
             goto fail_block;
         }
 
         /* Initialize the new refcount block only after updating its refcount,
          * update_refcount uses the refcount cache itself */
-        ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block,
+        ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block,
             (void**) refcount_block);
         if (ret < 0) {
             goto fail_block;
@@ -258,8 +258,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
 
     /* Now the new refcount block needs to be written to disk */
     BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_WRITE);
-    qcow2_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block);
-    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    block_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block);
+    ret = block_cache_flush(bs, s->refcount_block_cache);
     if (ret < 0) {
         goto fail_block;
     }
@@ -279,7 +279,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
         return 0;
     }
 
-    ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
+    ret = block_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
     if (ret < 0) {
         goto fail_block;
     }
@@ -413,7 +413,7 @@ fail_table:
     g_free(new_table);
 fail_block:
     if (*refcount_block != NULL) {
-        qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
+        block_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
     }
     return ret;
 }
@@ -439,7 +439,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
     }
 
     if (addend < 0) {
-        qcow2_cache_set_dependency(bs, s->refcount_block_cache,
+        block_cache_set_dependency(bs, s->refcount_block_cache,
             s->l2_table_cache);
     }
 
@@ -456,7 +456,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         /* Load the refcount block and allocate it if needed */
         if (table_index != old_table_index) {
             if (refcount_block) {
-                ret = qcow2_cache_put(bs, s->refcount_block_cache,
+                ret = block_cache_put(bs, s->refcount_block_cache,
                     (void**) &refcount_block);
                 if (ret < 0) {
                     goto fail;
@@ -470,7 +470,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         }
         old_table_index = table_index;
 
-        qcow2_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block);
+        block_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block);
 
         /* we can update the count and save it */
         block_index = cluster_index &
@@ -493,7 +493,7 @@ fail:
     /* Write last changed block to disk */
     if (refcount_block) {
         int wret;
-        wret = qcow2_cache_put(bs, s->refcount_block_cache,
+        wret = block_cache_put(bs, s->refcount_block_cache,
             (void**) &refcount_block);
         if (wret < 0) {
             return ret < 0 ? ret : wret;
@@ -671,7 +671,7 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
      * or explicitly by update_cluster_refcount().  Refcount blocks must be
      * flushed before the caller's L2 table updates.
      */
-    qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
+    block_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
     return offset;
 }
 
@@ -767,7 +767,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
             old_l2_offset = l2_offset;
             l2_offset &= L1E_OFFSET_MASK;
 
-            ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
+            ret = block_cache_get(bs, s->l2_table_cache, l2_offset,
                 (void**) &l2_table);
             if (ret < 0) {
                 goto fail;
@@ -811,16 +811,16 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                     }
                     if (offset != old_offset) {
                         if (addend > 0) {
-                            qcow2_cache_set_dependency(bs, s->l2_table_cache,
+                            block_cache_set_dependency(bs, s->l2_table_cache,
                                 s->refcount_block_cache);
                         }
                         l2_table[j] = cpu_to_be64(offset);
-                        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+                        block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
                     }
                 }
             }
 
-            ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+            ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
             if (ret < 0) {
                 goto fail;
             }
@@ -847,7 +847,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     ret = bdrv_flush(bs);
 fail:
     if (l2_table) {
-        qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     }
 
     /* Update L1 only if it isn't deleted anyway (addend = -1) */
diff --git a/block/qcow2.c b/block/qcow2.c
index dcd63d1..c3340ad 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -454,8 +454,8 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     }
 
     /* alloc L2 table/refcount block cache */
-    s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE);
-    s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE);
+    s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE);
+    s->refcount_block_cache = block_cache_create(bs, REFCOUNT_CACHE_SIZE);
 
     s->cluster_cache = g_malloc(s->cluster_size);
     /* one more sector for decompressed data alignment */
@@ -556,7 +556,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     qcow2_refcount_close(bs);
     g_free(s->l1_table);
     if (s->l2_table_cache) {
-        qcow2_cache_destroy(bs, s->l2_table_cache);
+        block_cache_destroy(bs, s->l2_table_cache);
     }
     g_free(s->cluster_cache);
     qemu_vfree(s->cluster_data);
@@ -921,13 +921,13 @@ static void qcow2_close(BlockDriverState *bs)
     BDRVQcowState *s = bs->opaque;
     g_free(s->l1_table);
 
-    qcow2_cache_flush(bs, s->l2_table_cache);
-    qcow2_cache_flush(bs, s->refcount_block_cache);
+    block_cache_flush(bs, s->l2_table_cache);
+    block_cache_flush(bs, s->refcount_block_cache);
 
     qcow2_mark_clean(bs);
 
-    qcow2_cache_destroy(bs, s->l2_table_cache);
-    qcow2_cache_destroy(bs, s->refcount_block_cache);
+    block_cache_destroy(bs, s->l2_table_cache);
+    block_cache_destroy(bs, s->refcount_block_cache);
 
     g_free(s->unknown_header_fields);
     cleanup_unknown_header_ext(bs);
@@ -1632,14 +1632,14 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
     int ret;
 
     qemu_co_mutex_lock(&s->lock);
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
+    ret = block_cache_flush(bs, s->l2_table_cache);
     if (ret < 0) {
         qemu_co_mutex_unlock(&s->lock);
         return ret;
     }
 
     if (qcow2_need_accurate_refcounts(s)) {
-        ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+        ret = block_cache_flush(bs, s->refcount_block_cache);
         if (ret < 0) {
             qemu_co_mutex_unlock(&s->lock);
             return ret;
diff --git a/block/qcow2.h b/block/qcow2.h
index 6959c6a..6e63217 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -97,8 +97,8 @@ typedef struct QCowSnapshot {
     uint64_t vm_clock_nsec;
 } QCowSnapshot;
 
-struct Qcow2Cache;
-typedef struct Qcow2Cache Qcow2Cache;
+struct BlockCache;
+typedef struct BlockCache BlockCache;
 
 typedef struct Qcow2UnknownHeaderExtension {
     uint32_t magic;
@@ -149,8 +149,8 @@ typedef struct BDRVQcowState {
     uint64_t l1_table_offset;
     uint64_t *l1_table;
 
-    Qcow2Cache* l2_table_cache;
-    Qcow2Cache* refcount_block_cache;
+    BlockCache* l2_table_cache;
+    BlockCache* refcount_block_cache;
 
     uint8_t *cluster_cache;
     uint8_t *cluster_data;
@@ -393,19 +393,19 @@ void qcow2_free_snapshots(BlockDriverState *bs);
 int qcow2_read_snapshots(BlockDriverState *bs);
 
 /* qcow2-cache.c functions */
-Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables);
-int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
+BlockCache *block_cache_create(BlockDriverState *bs, int num_tables);
+int block_cache_destroy(BlockDriverState* bs, BlockCache *c);
 
-void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
-int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
-int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
-    Qcow2Cache *dependency);
-void qcow2_cache_depends_on_flush(Qcow2Cache *c);
+void block_cache_entry_mark_dirty(BlockCache *c, void *table);
+int block_cache_flush(BlockDriverState *bs, BlockCache *c);
+int block_cache_set_dependency(BlockDriverState *bs, BlockCache *c,
+    BlockCache *dependency);
+void block_cache_depends_on_flush(BlockCache *c);
 
-int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
+int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
     void **table);
-int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
+int block_cache_get_empty(BlockDriverState *bs, BlockCache *c, uint64_t offset,
     void **table);
-int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
+int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table);
 
 #endif
diff --git a/trace-events b/trace-events
index c5f1ccb..b7edcc9 100644
--- a/trace-events
+++ b/trace-events
@@ -503,12 +503,12 @@ qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d"
 qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d"
 qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d"
 
-qcow2_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d"
-qcow2_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-qcow2_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-qcow2_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-qcow2_cache_flush(void *co, int c) "co %p is_l2_cache %d"
-qcow2_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d"
+block_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d"
+block_cache_flush(void *co, int c) "co %p is_l2_cache %d"
+block_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d"
 
 # block/qed-l2-cache.c
 qed_alloc_l2_cache_entry(void *l2_cache, void *entry) "l2_cache %p entry %p"
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 5/8] Make block-cache.c be common interface
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (3 preceding siblings ...)
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 4/8] rename qcow2-cache.c to block-cache.c Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow Dongxu Wang
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

To use block-cache.c be common cache interface, we need to add
some parameter to identify cache's type.

Define a struct named BlockTableType, pass BlockTableType and
table size parameters to block cache initialization function.

v17-v18:
1) move struct to source file.
2) cluster_size->table_size.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 block/block-cache.c         | 75 ++++++++++++++++++++++++++-------------------
 block/qcow2-cluster.c       | 30 +++++++++---------
 block/qcow2-refcount.c      | 30 +++++++++---------
 block/qcow2.c               |  7 +++--
 block/qcow2.h               | 24 ++-------------
 include/block/block-cache.h | 55 +++++++++++++++++++++++++++++++++
 trace-events                |  1 +
 7 files changed, 139 insertions(+), 83 deletions(-)
 create mode 100644 include/block/block-cache.h

diff --git a/block/block-cache.c b/block/block-cache.c
index bc057a8..f5d75d1 100644
--- a/block/block-cache.c
+++ b/block/block-cache.c
@@ -1,4 +1,8 @@
 /*
+ * QEMU Block Layer Cache
+ *
+ * This file is based on qcow2-cache.c, see its copyrights below:
+ *
  * L2/refcount table cache for the QCOW2 format
  *
  * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
@@ -24,11 +28,11 @@
 
 #include "block/block_int.h"
 #include "qemu-common.h"
-#include "qcow2.h"
 #include "trace.h"
+#include "block/block-cache.h"
 
 typedef struct BlockCachedTable {
-    void*   table;
+    void    *table;
     int64_t offset;
     bool    dirty;
     int     cache_hits;
@@ -36,30 +40,34 @@ typedef struct BlockCachedTable {
 } BlockCachedTable;
 
 struct BlockCache {
-    BlockCachedTable*       entries;
-    struct BlockCache*      depends;
-    int                     size;
-    bool                    depends_on_flush;
+    BlockCachedTable    *entries;
+    struct BlockCache   *depends;
+    int                 size;
+    size_t              table_size;
+    BlockTableType      table_type;
+    bool                depends_on_flush;
 };
 
-BlockCache *block_cache_create(BlockDriverState *bs, int num_tables)
+BlockCache *block_cache_create(BlockDriverState *bs, int num_tables,
+                               size_t table_size, BlockTableType type)
 {
-    BDRVQcowState *s = bs->opaque;
     BlockCache *c;
     int i;
 
     c = g_malloc0(sizeof(*c));
     c->size = num_tables;
     c->entries = g_malloc0(sizeof(*c->entries) * num_tables);
+    c->table_type = type;
+    c->table_size = table_size;
 
     for (i = 0; i < c->size; i++) {
-        c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
+        c->entries[i].table = qemu_blockalign(bs, table_size);
     }
 
     return c;
 }
 
-int block_cache_destroy(BlockDriverState* bs, BlockCache *c)
+int block_cache_destroy(BlockDriverState *bs, BlockCache *c)
 {
     int i;
 
@@ -91,15 +99,13 @@ static int block_cache_flush_dependency(BlockDriverState *bs, BlockCache *c)
 
 static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
 {
-    BDRVQcowState *s = bs->opaque;
     int ret = 0;
 
     if (!c->entries[i].dirty || !c->entries[i].offset) {
         return 0;
     }
 
-    trace_block_cache_entry_flush(qemu_coroutine_self(),
-                                  c == s->l2_table_cache, i);
+    trace_block_cache_entry_flush(qemu_coroutine_self(), c->table_type, i);
 
     if (c->depends) {
         ret = block_cache_flush_dependency(bs, c);
@@ -114,14 +120,16 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
         return ret;
     }
 
-    if (c == s->refcount_block_cache) {
+    if (c->table_type == BLOCK_TABLE_REF) {
         BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART);
-    } else if (c == s->l2_table_cache) {
+    } else if (c->table_type == BLOCK_TABLE_L2) {
         BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
+    } else if (c->table_type == BLOCK_TABLE_BITMAP) {
+        BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
     }
 
-    ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
-        s->cluster_size);
+    ret = bdrv_pwrite(bs->file, c->entries[i].offset,
+                      c->entries[i].table, c->table_size);
     if (ret < 0) {
         return ret;
     }
@@ -133,12 +141,11 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
 
 int block_cache_flush(BlockDriverState *bs, BlockCache *c)
 {
-    BDRVQcowState *s = bs->opaque;
     int result = 0;
     int ret;
     int i;
 
-    trace_block_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache);
+    trace_block_cache_flush(qemu_coroutine_self(), c->table_type);
 
     for (i = 0; i < c->size; i++) {
         ret = block_cache_entry_flush(bs, c, i);
@@ -157,8 +164,9 @@ int block_cache_flush(BlockDriverState *bs, BlockCache *c)
     return result;
 }
 
-int block_cache_set_dependency(BlockDriverState *bs, BlockCache *c,
-    BlockCache *dependency)
+int block_cache_set_dependency(BlockDriverState *bs,
+                               BlockCache *c,
+                               BlockCache *dependency)
 {
     int ret;
 
@@ -216,13 +224,13 @@ static int block_cache_find_entry_to_replace(BlockCache *c)
 }
 
 static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
-    uint64_t offset, void **table, bool read_from_disk)
+                              uint64_t offset, void **table,
+                              bool read_from_disk)
 {
-    BDRVQcowState *s = bs->opaque;
     int i;
     int ret;
 
-    trace_block_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
+    trace_block_cache_get(qemu_coroutine_self(), c->table_type,
                           offset, read_from_disk);
 
     /* Check if the table is already cached */
@@ -235,7 +243,7 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
     /* If not, write a table back and replace it */
     i = block_cache_find_entry_to_replace(c);
     trace_block_cache_get_replace_entry(qemu_coroutine_self(),
-                                        c == s->l2_table_cache, i);
+                                        c->table_type, i);
     if (i < 0) {
         return i;
     }
@@ -246,14 +254,17 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
     }
 
     trace_block_cache_get_read(qemu_coroutine_self(),
-                               c == s->l2_table_cache, i);
+                               c->table_type, i);
     c->entries[i].offset = 0;
     if (read_from_disk) {
-        if (c == s->l2_table_cache) {
+        if (c->table_type == BLOCK_TABLE_L2) {
             BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
+        } else if (c->table_type == BLOCK_TABLE_BITMAP) {
+            BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
         }
 
-        ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
+        ret = bdrv_pread(bs->file, offset, c->entries[i].table,
+                         c->table_size);
         if (ret < 0) {
             return ret;
         }
@@ -271,19 +282,19 @@ found:
     *table = c->entries[i].table;
 
     trace_block_cache_get_done(qemu_coroutine_self(),
-                               c == s->l2_table_cache, i);
+                               c->table_type, i);
 
     return 0;
 }
 
 int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
-    void **table)
+                    void **table)
 {
     return block_cache_do_get(bs, c, offset, table, true);
 }
 
-int block_cache_get_empty(BlockDriverState *bs, BlockCache *c, uint64_t offset,
-    void **table)
+int block_cache_get_empty(BlockDriverState *bs, BlockCache *c,
+                          uint64_t offset, void **table)
 {
     return block_cache_do_get(bs, c, offset, table, false);
 }
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index ede64af..7a01c27 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -125,7 +125,8 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
     BDRVQcowState *s = bs->opaque;
     int ret;
 
-    ret = block_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
+    ret = block_cache_get(bs, s->l2_table_cache, l2_offset,
+                          (void **) l2_table);
 
     return ret;
 }
@@ -194,7 +195,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
     /* allocate a new entry in the l2 cache */
 
     trace_qcow2_l2_allocate_get_empty(bs, l1_index);
-    ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table);
+    ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset,
+                                (void **) table);
     if (ret < 0) {
         return ret;
     }
@@ -211,14 +213,14 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
         BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
         ret = block_cache_get(bs, s->l2_table_cache,
             old_l2_offset & L1E_OFFSET_MASK,
-            (void**) &old_table);
+            (void **) &old_table);
         if (ret < 0) {
             goto fail;
         }
 
         memcpy(l2_table, old_table, s->cluster_size);
 
-        ret = block_cache_put(bs, s->l2_table_cache, (void**) &old_table);
+        ret = block_cache_put(bs, s->l2_table_cache, (void **) &old_table);
         if (ret < 0) {
             goto fail;
         }
@@ -248,7 +250,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 
 fail:
     trace_qcow2_l2_allocate_done(bs, l1_index, ret);
-    block_cache_put(bs, s->l2_table_cache, (void**) table);
+    block_cache_put(bs, s->l2_table_cache, (void **) table);
     s->l1_table[l1_index] = old_l2_offset;
     return ret;
 }
@@ -484,7 +486,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
         abort();
     }
 
-    block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
 
     nb_available = (c * s->cluster_sectors);
 
@@ -594,13 +596,13 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
      * allocated. */
     cluster_offset = be64_to_cpu(l2_table[l2_index]);
     if (cluster_offset & L2E_OFFSET_MASK) {
-        block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
         return 0;
     }
 
     cluster_offset = qcow2_alloc_bytes(bs, compressed_size);
     if (cluster_offset < 0) {
-        block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
         return 0;
     }
 
@@ -617,7 +619,7 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
     BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
     block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
     l2_table[l2_index] = cpu_to_be64(cluster_offset);
-    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     if (ret < 0) {
         return 0;
     }
@@ -707,7 +709,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      }
 
 
-    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     if (ret < 0) {
         goto err;
     }
@@ -916,7 +918,7 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
 
     /* Cleanup */
 out:
-    pret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    pret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     if (pret < 0) {
         return pret;
     }
@@ -1044,7 +1046,7 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
      * wrong with our code. */
     assert(nb_clusters > 0);
 
-    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     if (ret < 0) {
         return ret;
     }
@@ -1342,7 +1344,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
         qcow2_free_any_clusters(bs, old_offset, 1);
     }
 
-    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     if (ret < 0) {
         return ret;
     }
@@ -1421,7 +1423,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
         }
     }
 
-    ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+    ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     if (ret < 0) {
         return ret;
     }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 332af27..c7427e8 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -99,7 +99,7 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
         return 0;
 
     ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
-        (void**) &refcount_block);
+        (void **) &refcount_block);
     if (ret < 0) {
         return ret;
     }
@@ -109,7 +109,7 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
     refcount = be16_to_cpu(refcount_block[block_index]);
 
     ret = block_cache_put(bs, s->refcount_block_cache,
-        (void**) &refcount_block);
+        (void **) &refcount_block);
     if (ret < 0) {
         return ret;
     }
@@ -172,7 +172,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
         /* If it's already there, we're done */
         if (refcount_block_offset) {
              return load_refcount_block(bs, refcount_block_offset,
-                 (void**) refcount_block);
+                 (void **) refcount_block);
         }
     }
 
@@ -221,7 +221,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
     if (in_same_refcount_block(s, new_block, cluster_index << s->cluster_bits)) {
         /* Zero the new refcount block before updating it */
         ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block,
-            (void**) refcount_block);
+            (void **) refcount_block);
         if (ret < 0) {
             goto fail_block;
         }
@@ -248,7 +248,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
         /* Initialize the new refcount block only after updating its refcount,
          * update_refcount uses the refcount cache itself */
         ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block,
-            (void**) refcount_block);
+            (void **) refcount_block);
         if (ret < 0) {
             goto fail_block;
         }
@@ -279,7 +279,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
         return 0;
     }
 
-    ret = block_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
+    ret = block_cache_put(bs, s->refcount_block_cache,
+                          (void **) refcount_block);
     if (ret < 0) {
         goto fail_block;
     }
@@ -402,7 +403,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
     qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t));
     s->free_cluster_index = old_free_cluster_index;
 
-    ret = load_refcount_block(bs, new_block, (void**) refcount_block);
+    ret = load_refcount_block(bs, new_block, (void **) refcount_block);
     if (ret < 0) {
         return ret;
     }
@@ -413,7 +414,7 @@ fail_table:
     g_free(new_table);
 fail_block:
     if (*refcount_block != NULL) {
-        block_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
+        block_cache_put(bs, s->refcount_block_cache, (void **) refcount_block);
     }
     return ret;
 }
@@ -457,7 +458,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         if (table_index != old_table_index) {
             if (refcount_block) {
                 ret = block_cache_put(bs, s->refcount_block_cache,
-                    (void**) &refcount_block);
+                    (void **) &refcount_block);
                 if (ret < 0) {
                     goto fail;
                 }
@@ -494,7 +495,7 @@ fail:
     if (refcount_block) {
         int wret;
         wret = block_cache_put(bs, s->refcount_block_cache,
-            (void**) &refcount_block);
+            (void **) &refcount_block);
         if (wret < 0) {
             return ret < 0 ? ret : wret;
         }
@@ -768,7 +769,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
             l2_offset &= L1E_OFFSET_MASK;
 
             ret = block_cache_get(bs, s->l2_table_cache, l2_offset,
-                (void**) &l2_table);
+                (void **) &l2_table);
             if (ret < 0) {
                 goto fail;
             }
@@ -815,12 +816,13 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                                 s->refcount_block_cache);
                         }
                         l2_table[j] = cpu_to_be64(offset);
-                        block_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+                        block_cache_entry_mark_dirty(s->l2_table_cache,
+                                                     l2_table);
                     }
                 }
             }
 
-            ret = block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+            ret = block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
             if (ret < 0) {
                 goto fail;
             }
@@ -847,7 +849,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     ret = bdrv_flush(bs);
 fail:
     if (l2_table) {
-        block_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+        block_cache_put(bs, s->l2_table_cache, (void **) &l2_table);
     }
 
     /* Update L1 only if it isn't deleted anyway (addend = -1) */
diff --git a/block/qcow2.c b/block/qcow2.c
index c3340ad..cec1c26 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -454,8 +454,11 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     }
 
     /* alloc L2 table/refcount block cache */
-    s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE);
-    s->refcount_block_cache = block_cache_create(bs, REFCOUNT_CACHE_SIZE);
+    s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE,
+                                           s->cluster_size, BLOCK_TABLE_L2);
+    s->refcount_block_cache = block_cache_create(bs, REFCOUNT_CACHE_SIZE,
+                                                 s->cluster_size,
+                                                 BLOCK_TABLE_REF);
 
     s->cluster_cache = g_malloc(s->cluster_size);
     /* one more sector for decompressed data alignment */
diff --git a/block/qcow2.h b/block/qcow2.h
index 6e63217..7494814 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "qemu/aes.h"
 #include "block/coroutine.h"
+#include "block/block-cache.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -97,9 +98,6 @@ typedef struct QCowSnapshot {
     uint64_t vm_clock_nsec;
 } QCowSnapshot;
 
-struct BlockCache;
-typedef struct BlockCache BlockCache;
-
 typedef struct Qcow2UnknownHeaderExtension {
     uint32_t magic;
     uint32_t len;
@@ -149,8 +147,8 @@ typedef struct BDRVQcowState {
     uint64_t l1_table_offset;
     uint64_t *l1_table;
 
-    BlockCache* l2_table_cache;
-    BlockCache* refcount_block_cache;
+    BlockCache  *l2_table_cache;
+    BlockCache  *refcount_block_cache;
 
     uint8_t *cluster_cache;
     uint8_t *cluster_data;
@@ -392,20 +390,4 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
 void qcow2_free_snapshots(BlockDriverState *bs);
 int qcow2_read_snapshots(BlockDriverState *bs);
 
-/* qcow2-cache.c functions */
-BlockCache *block_cache_create(BlockDriverState *bs, int num_tables);
-int block_cache_destroy(BlockDriverState* bs, BlockCache *c);
-
-void block_cache_entry_mark_dirty(BlockCache *c, void *table);
-int block_cache_flush(BlockDriverState *bs, BlockCache *c);
-int block_cache_set_dependency(BlockDriverState *bs, BlockCache *c,
-    BlockCache *dependency);
-void block_cache_depends_on_flush(BlockCache *c);
-
-int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
-    void **table);
-int block_cache_get_empty(BlockDriverState *bs, BlockCache *c, uint64_t offset,
-    void **table);
-int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table);
-
 #endif
diff --git a/include/block/block-cache.h b/include/block/block-cache.h
new file mode 100644
index 0000000..1992765
--- /dev/null
+++ b/include/block/block-cache.h
@@ -0,0 +1,55 @@
+/*
+ * QEMU Block Layer Cache
+ *
+ * This file is based on qcow2-cache.c, see its copyrights below:
+ *
+ * L2/refcount table cache for the QCOW2 format
+ *
+ * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef BLOCK_CACHE_H
+#define BLOCK_CACHE_H
+
+typedef enum {
+    BLOCK_TABLE_REF,
+    BLOCK_TABLE_L2,
+    BLOCK_TABLE_BITMAP,
+} BlockTableType;
+
+struct BlockCache;
+typedef struct BlockCache BlockCache;
+
+BlockCache *block_cache_create(BlockDriverState *bs, int num_tables,
+                               size_t table_size, BlockTableType type);
+int block_cache_destroy(BlockDriverState *bs, BlockCache *c);
+int block_cache_flush(BlockDriverState *bs, BlockCache *c);
+int block_cache_set_dependency(BlockDriverState *bs,
+                               BlockCache *c,
+                               BlockCache *dependency);
+void block_cache_depends_on_flush(BlockCache *c);
+int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
+                    void **table);
+int block_cache_get_empty(BlockDriverState *bs, BlockCache *c,
+                          uint64_t offset, void **table);
+int block_cache_put(BlockDriverState *bs, BlockCache *c, void **table);
+void block_cache_entry_mark_dirty(BlockCache *c, void *table);
+#endif /* BLOCK_CACHE_H */
diff --git a/trace-events b/trace-events
index b7edcc9..b701e01 100644
--- a/trace-events
+++ b/trace-events
@@ -503,6 +503,7 @@ qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d"
 qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d"
 qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d"
 
+# block/block-cache.c
 block_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d"
 block_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d"
 block_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d"
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (4 preceding siblings ...)
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 5/8] Make block-cache.c be common interface Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-06-05 13:35   ` Stefan Hajnoczi
  2013-06-05 13:46   ` Kevin Wolf
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 7/8] add-cow file format core code Dongxu Wang
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 block/blkdebug.c      | 3 +++
 block/block-cache.c   | 4 ++--
 include/block/block.h | 3 +++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 71f99e4..2bd6a53 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -182,6 +182,9 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
     [BLKDBG_CLUSTER_ALLOC]                  = "cluster_alloc",
     [BLKDBG_CLUSTER_ALLOC_BYTES]            = "cluster_alloc_bytes",
     [BLKDBG_CLUSTER_FREE]                   = "cluster_free",
+
+    [BLKDBG_ADDCOW_READ]                    = "add_cow_read",
+    [BLKDBG_ADDCOW_WRITE]                   = "add_cow_write",
 };
 
 static int get_event_by_name(const char *name, BlkDebugEvent *event)
diff --git a/block/block-cache.c b/block/block-cache.c
index f5d75d1..454269c 100644
--- a/block/block-cache.c
+++ b/block/block-cache.c
@@ -125,7 +125,7 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
     } else if (c->table_type == BLOCK_TABLE_L2) {
         BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
     } else if (c->table_type == BLOCK_TABLE_BITMAP) {
-        BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
+        BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_WRITE);
     }
 
     ret = bdrv_pwrite(bs->file, c->entries[i].offset,
@@ -260,7 +260,7 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
         if (c->table_type == BLOCK_TABLE_L2) {
             BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
         } else if (c->table_type == BLOCK_TABLE_BITMAP) {
-            BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
+            BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_READ);
         }
 
         ret = bdrv_pread(bs->file, offset, c->entries[i].table,
diff --git a/include/block/block.h b/include/block/block.h
index 2989da6..3573e3e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -451,6 +451,9 @@ typedef enum {
     BLKDBG_CLUSTER_ALLOC_BYTES,
     BLKDBG_CLUSTER_FREE,
 
+    BLKDBG_ADDCOW_READ,
+    BLKDBG_ADDCOW_WRITE,
+
     BLKDBG_EVENT_MAX,
 } BlkDebugEvent;
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 7/8] add-cow file format core code.
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (5 preceding siblings ...)
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-06-05 14:21   ` Stefan Hajnoczi
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 8/8] qemu-iotests: add add-cow iotests support Dongxu Wang
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

add-cow file format core code. It use block-cache.c as cache code.
It lacks of snapshot_blkdev support.

v18->v19:
1) add aio parallel write support.
2) fix flush method.

v17-v18:
1) use error_report, not fprintf.
2) remove version field from header.
3) header_size is MAX(cluster_size, 4096).
4) introduce s->cluster_sectors.
5) use BLKDBG_L2_LOAD/UPDATE.

v16->v17:
1) Use stringify.

v15->v16:
1) Judge if opts is null in add_cow_create function.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

Conflicts:
	block/block-cache.c
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 block/Makefile.objs         |   1 +
 block/add-cow.c             | 812 ++++++++++++++++++++++++++++++++++++++++++++
 block/block-cache.c         |  16 +-
 include/block/block-cache.h |   4 +
 include/block/block_int.h   |   2 +
 5 files changed, 831 insertions(+), 4 deletions(-)
 create mode 100644 block/add-cow.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 16e574a..f666a7a 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,6 +1,7 @@
 block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
 block-obj-y += block-cache.o
+block-obj-y += add-cow.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-y += vhdx.o
diff --git a/block/add-cow.c b/block/add-cow.c
new file mode 100644
index 0000000..290da3c
--- /dev/null
+++ b/block/add-cow.c
@@ -0,0 +1,812 @@
+/*
+ * QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ *  Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "block/block_int.h"
+#include "qemu/module.h"
+#include "block/block-cache.h"
+
+#define ACOW_CLUSTER_SIZE 65536
+
+typedef struct AddCowMeta
+{
+    uint64_t start_sector;
+    int remaining_sectors;
+    CoQueue dependent_requests;
+    struct AddCowMeta *next;
+
+    QLIST_ENTRY(AddCowMeta) next_in_flight;
+} AddCowMeta;
+
+enum {
+    /* compat_features bit */
+    ACOW_F_ALL_ALLOCATED    = 0X01,
+
+    /* none feature bit used now */
+    ACOW_FEATURE_MASK       = 0,
+
+    ACOW_MAGIC              = 'A' | 'C' << 8 | 'O' << 16 | 'W' << 24,
+    ACOW_CACHE_SIZE         = 16,
+    DEFAULT_HEADER_SIZE     = 4096,
+    MIN_CLUSTER_BITS        = 12,
+    MAX_CLUSTER_BITS        = 21,
+};
+
+typedef struct AddCowHeader {
+    uint32_t    magic;
+
+    uint32_t    backing_offset;
+    uint32_t    backing_size;
+    uint32_t    image_offset;
+    uint32_t    image_size;
+
+    uint32_t    cluster_bits;
+    uint64_t    features;
+    uint64_t    compat_features;
+    uint32_t    header_size;
+
+    char        backing_fmt[16];
+    char        image_fmt[16];
+} QEMU_PACKED AddCowHeader;
+
+typedef struct BDRVAddCowState {
+    BlockDriverState    *image_hd;
+    CoMutex             lock;
+    int                 cluster_size;
+    int                 cluster_sectors;
+    BlockCache          *bitmap_cache;
+    uint64_t            bitmap_size;
+    AddCowHeader        header;
+    char                backing_fmt[16];
+    char                image_fmt[16];
+    BlockCache          *image_hd_cache;
+    QLIST_HEAD(AddCowAlloc, AddCowMeta) cluster_allocs;
+} BDRVAddCowState;
+
+/* Convert sector_num to offset in bitmap */
+static inline int64_t offset_in_bitmap(int64_t sector_num,
+                                       int64_t cluster_sectors)
+{
+    int64_t cluster_num = sector_num / cluster_sectors;
+    return cluster_num / 8;
+}
+
+static inline bool is_cluster_head(int64_t sector_num,
+                                   int64_t cluster_sectors)
+{
+    return sector_num % cluster_sectors == 0;
+}
+
+static inline bool is_cluster_tail(int64_t sector_num,
+                                   int64_t cluster_sectors)
+{
+    return (sector_num + 1) % cluster_sectors == 0;
+}
+
+static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu)
+{
+    cpu->magic              = le32_to_cpu(le->magic);
+
+    cpu->backing_offset     = le32_to_cpu(le->backing_offset);
+    cpu->backing_size       = le32_to_cpu(le->backing_size);
+    cpu->image_offset       = le32_to_cpu(le->image_offset);
+    cpu->image_size         = le32_to_cpu(le->image_size);
+
+    cpu->cluster_bits       = le32_to_cpu(le->cluster_bits);
+    cpu->features           = le64_to_cpu(le->features);
+    cpu->compat_features    = le64_to_cpu(le->compat_features);
+    cpu->header_size        = le32_to_cpu(le->header_size);
+
+    memcpy(cpu->backing_fmt, le->backing_fmt, sizeof(cpu->backing_fmt));
+    memcpy(cpu->image_fmt, le->image_fmt, sizeof(cpu->image_fmt));
+}
+
+static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le)
+{
+    le->magic               = cpu_to_le32(cpu->magic);
+
+    le->backing_offset      = cpu_to_le32(cpu->backing_offset);
+    le->backing_size        = cpu_to_le32(cpu->backing_size);
+    le->image_offset        = cpu_to_le32(cpu->image_offset);
+    le->image_size          = cpu_to_le32(cpu->image_size);
+
+    le->cluster_bits        = cpu_to_le32(cpu->cluster_bits);
+    le->features            = cpu_to_le64(cpu->features);
+    le->compat_features     = cpu_to_le64(cpu->compat_features);
+    le->header_size         = cpu_to_le32(cpu->header_size);
+    memcpy(le->backing_fmt, cpu->backing_fmt, sizeof(le->backing_fmt));
+    memcpy(le->image_fmt, cpu->image_fmt, sizeof(le->image_fmt));
+}
+
+static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename)
+{
+    const AddCowHeader *header = (const AddCowHeader *)buf;
+
+    if (buf_size < sizeof(*header)) {
+        return 0;
+    }
+    if (le32_to_cpu(header->magic) == ACOW_MAGIC) {
+        return 100;
+    }
+    return 0;
+}
+
+static int add_cow_create(const char *filename, QemuOpts *opts)
+{
+    AddCowHeader header = {
+        .magic      = ACOW_MAGIC,
+        .features   = 0,
+        .compat_features = 0,
+    };
+    AddCowHeader le_header;
+    int64_t image_len = 0;
+    const char *backing_filename = NULL, *backing_fmt = NULL;
+    const char *image_filename = NULL, *image_format = NULL;
+    size_t cluster_size = ACOW_CLUSTER_SIZE;
+    BlockDriverState *bs, *backing_bs;
+    BlockDriver *drv = bdrv_find_format("add-cow");
+    int ret;
+
+    image_len = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
+    backing_filename = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
+    backing_fmt = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FMT);
+    image_filename = qemu_opt_get_del(opts, BLOCK_OPT_IMAGE_FILE);
+    image_format = qemu_opt_get_del(opts, BLOCK_OPT_IMAGE_FMT);
+    cluster_size = qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
+                                         ACOW_CLUSTER_SIZE);
+
+    header.cluster_bits = ffs(cluster_size) - 1;
+    if (header.cluster_bits < MIN_CLUSTER_BITS ||
+        header.cluster_bits > MAX_CLUSTER_BITS ||
+        (1 << header.cluster_bits) != cluster_size) {
+        error_report(
+            "Cluster size must be a power of two between %d and %dk",
+            1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
+        return -EINVAL;
+    }
+
+    header.header_size = MAX(cluster_size, DEFAULT_HEADER_SIZE);
+    if (backing_filename) {
+        header.backing_offset = sizeof(header);
+        header.backing_size = strlen(backing_filename);
+
+        if (!backing_fmt) {
+            backing_bs = bdrv_new("image");
+            ret = bdrv_open(backing_bs, backing_filename, NULL,
+                            BDRV_O_RDWR | BDRV_O_CACHE_WB, NULL);
+            if (ret < 0) {
+                bdrv_delete(backing_bs);
+                return ret;
+            }
+            backing_fmt = bdrv_get_format_name(backing_bs);
+            bdrv_delete(backing_bs);
+        }
+    } else {
+        header.compat_features |= ACOW_F_ALL_ALLOCATED;
+    }
+
+    if (image_filename) {
+        header.image_offset = sizeof(header) + header.backing_size;
+        header.image_size = strlen(image_filename);
+    } else {
+        error_report("Error: image_file should be given.");
+        return -EINVAL;
+    }
+
+    if (backing_filename && !strcmp(backing_filename, image_filename)) {
+        error_report("Error: Trying to create an image with the "
+                     "same backing file name as the image file name");
+        return -EINVAL;
+    }
+
+    if (!strcmp(filename, image_filename)) {
+        error_report("Error: Trying to create an image with the "
+                     "same filename as the image file name");
+        return -EINVAL;
+    }
+
+    if (header.image_offset + header.image_size > header.header_size) {
+        error_report("image_file name or backing_file name too long.");
+        return -ENOSPC;
+    }
+
+    ret = bdrv_file_open(&bs, image_filename, NULL, 0);
+    if (ret < 0) {
+        return ret;
+    }
+    bdrv_close(bs);
+
+    ret = bdrv_create_file(filename, NULL);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_file_open(&bs, filename, NULL, BDRV_O_RDWR);
+    if (ret < 0) {
+        return ret;
+    }
+    assert(strlen(backing_fmt ? backing_fmt : "") < sizeof(header.backing_fmt));
+    strncpy(header.backing_fmt, backing_fmt ? backing_fmt : "",
+            sizeof(header.backing_fmt));
+    assert(strlen(image_format ? image_format : "raw") < sizeof(header.image_fmt));
+    strncpy(header.image_fmt, image_format ? image_format : "raw",
+            sizeof(header.image_fmt));
+    add_cow_header_cpu_to_le(&header, &le_header);
+    ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header));
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    ret = bdrv_pwrite(bs, header.image_offset, image_filename,
+                      header.image_size);
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    if (backing_filename) {
+        ret = bdrv_pwrite(bs, header.backing_offset, backing_filename,
+                          header.backing_size);
+        if (ret < 0) {
+            bdrv_delete(bs);
+            return ret;
+        }
+    }
+    bdrv_close(bs);
+
+    ret = bdrv_open(bs, filename, NULL, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv);
+    if (ret < 0) {
+        bdrv_delete(bs);
+        return ret;
+    }
+
+    ret = bdrv_truncate(bs, image_len);
+    bdrv_delete(bs);
+    return ret;
+}
+
+static int add_cow_open(BlockDriverState *bs, QDict *options, int flags)
+{
+    char                image_filename[PATH_MAX];
+    char                tmp_name[PATH_MAX];
+    int                 ret;
+    int                 sector_per_byte;
+    BDRVAddCowState     *s = bs->opaque;
+    AddCowHeader        le_header;
+    char backing_filename[PATH_MAX];
+
+    ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header));
+    if (ret < 0) {
+        goto fail;
+    }
+
+    add_cow_header_le_to_cpu(&le_header, &s->header);
+
+    if (s->header.magic != ACOW_MAGIC) {
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    if (s->header.features & ~ACOW_FEATURE_MASK) {
+        char buf[64];
+        snprintf(buf, sizeof(buf), "Feature Flags: %" PRIx64,
+                 s->header.features & ~ACOW_FEATURE_MASK);
+        qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+                      bs->device_name, "add-cow", buf);
+        return -ENOTSUP;
+    }
+
+    if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
+        snprintf(bs->backing_format, sizeof(bs->backing_format),
+                 "%s", s->header.backing_fmt);
+    }
+
+    if (s->header.cluster_bits < MIN_CLUSTER_BITS ||
+        s->header.cluster_bits > MAX_CLUSTER_BITS) {
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    s->cluster_size = 1 << s->header.cluster_bits;
+    if (s->header.header_size != MAX(s->cluster_size, DEFAULT_HEADER_SIZE)) {
+        char buf[64];
+        snprintf(buf, sizeof(buf), "Header size: %u",
+                 s->header.header_size);
+        qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
+                      bs->device_name, "add-cow", buf);
+        return -ENOTSUP;
+    }
+
+    if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
+        ret = bdrv_read_string(bs->file, s->header.backing_offset,
+                               s->header.backing_size,
+                               bs->backing_file,
+                               sizeof(bs->backing_file));
+        if (ret < 0) {
+            goto fail;
+        }
+    }
+
+    ret = bdrv_read_string(bs->file, s->header.image_offset,
+                           s->header.image_size, tmp_name,
+                           sizeof(tmp_name));
+    if (ret < 0) {
+        goto fail;
+    }
+
+    if (path_has_protocol(tmp_name)) {
+        pstrcpy(image_filename, sizeof(image_filename), tmp_name);
+    } else {
+        path_combine(image_filename, sizeof(image_filename),
+                     bs->filename, tmp_name);
+    }
+    bdrv_get_full_backing_filename(bs, backing_filename,
+                                   sizeof(backing_filename));
+    if (!strcmp(backing_filename, image_filename)) {
+        error_report("Error: backing file and image file are same: %s\n",
+                     backing_filename);
+        ret = EINVAL;
+        goto fail;
+    }
+    s->image_hd = bdrv_new("");
+    ret = bdrv_open(s->image_hd, image_filename, NULL, flags,
+                    bdrv_find_format(s->header.image_fmt));
+    if (ret < 0) {
+        bdrv_delete(s->image_hd);
+        goto fail;
+    }
+
+    bs->total_sectors = bdrv_getlength(s->image_hd) / BDRV_SECTOR_SIZE;
+    s->cluster_sectors = s->cluster_size / BDRV_SECTOR_SIZE;
+    sector_per_byte = s->cluster_sectors * 8;
+    s->bitmap_size =
+        (bs->total_sectors + sector_per_byte - 1) / sector_per_byte;
+    s->bitmap_cache =  block_cache_create(bs, ACOW_CACHE_SIZE,
+                                          s->cluster_size,
+                                          BLOCK_TABLE_BITMAP);
+    s->image_hd_cache = block_cache_create(s->image_hd, 0, 0, 0);
+    block_cache_depends_on_flush(s->image_hd_cache);
+    qemu_co_mutex_init(&s->lock);
+    QLIST_INIT(&s->cluster_allocs);
+    return 0;
+fail:
+    return ret;
+}
+
+static void add_cow_close(BlockDriverState *bs)
+{
+    BDRVAddCowState *s = bs->opaque;
+    if (s->bitmap_cache) {
+        block_cache_destroy(bs, s->bitmap_cache);
+    }
+    if (s->image_hd_cache) {
+        block_cache_destroy(s->image_hd, s->image_hd_cache);
+    }
+    bdrv_delete(s->image_hd);
+}
+
+static bool is_allocated(BlockDriverState *bs, int64_t sector_num)
+{
+    BDRVAddCowState *s  = bs->opaque;
+    int64_t cluster_num = sector_num / s->cluster_sectors;
+    uint8_t *table      = NULL;
+    bool val = false;
+    int ret;
+
+    uint64_t offset = s->header.header_size +
+        (offset_in_bitmap(sector_num, s->cluster_sectors)
+            & (~(s->cluster_size - 1)));
+    ret = block_cache_get(bs, s->bitmap_cache, offset, (void **)&table);
+    if (ret < 0) {
+        return ret;
+    }
+
+    val = table[cluster_num / 8 % s->cluster_size] & (1 << (cluster_num % 8));
+    ret = block_cache_put(bs, s->bitmap_cache, (void **)&table);
+    if (ret < 0) {
+        return ret;
+    }
+    return val;
+}
+
+static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors, int *num_same)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int changed;
+
+    if (nb_sectors == 0) {
+        *num_same = 0;
+        return 0;
+    }
+
+    if (s->header.compat_features & ACOW_F_ALL_ALLOCATED) {
+        *num_same = nb_sectors;
+        return 1;
+    }
+    changed = is_allocated(bs, sector_num);
+
+    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
+        if (is_allocated(bs, sector_num + *num_same) != changed) {
+            break;
+        }
+    }
+    return changed;
+}
+
+static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov,
+                                int64_t sector_num, int nb_sectors)
+{
+    int n1;
+    if ((sector_num + nb_sectors) <= bs->total_sectors) {
+        return nb_sectors;
+    }
+    if (sector_num >= bs->total_sectors) {
+        n1 = 0;
+    } else {
+        n1 = bs->total_sectors - sector_num;
+    }
+
+    qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1,
+                      0, BDRV_SECTOR_SIZE * (nb_sectors - n1));
+
+    return n1;
+}
+
+static coroutine_fn int add_cow_co_readv(BlockDriverState *bs,
+                                         int64_t sector_num,
+                                         int remaining_sectors,
+                                         QEMUIOVector *qiov)
+{
+    BDRVAddCowState *s  = bs->opaque;
+    int cur_nr_sectors;
+    uint64_t bytes_done = 0;
+    QEMUIOVector hd_qiov;
+    int n1, ret = 0;
+
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    qemu_co_mutex_lock(&s->lock);
+    while (remaining_sectors != 0) {
+        cur_nr_sectors = remaining_sectors;
+        if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors,
+                                 &cur_nr_sectors)) {
+            qemu_iovec_reset(&hd_qiov);
+            qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
+                              cur_nr_sectors * BDRV_SECTOR_SIZE);
+            qemu_co_mutex_unlock(&s->lock);
+            ret = bdrv_co_readv(s->image_hd, sector_num,
+                                cur_nr_sectors, &hd_qiov);
+            qemu_co_mutex_lock(&s->lock);
+            if (ret < 0) {
+                goto fail;
+            }
+        } else {
+            if (bs->backing_hd) {
+                qemu_iovec_reset(&hd_qiov);
+                qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
+                                  cur_nr_sectors * BDRV_SECTOR_SIZE);
+                n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov,
+                                          sector_num, cur_nr_sectors);
+                if (n1 > 0) {
+                    qemu_co_mutex_unlock(&s->lock);
+                    ret = bdrv_co_readv(bs->backing_hd, sector_num,
+                                        cur_nr_sectors, &hd_qiov);
+                    qemu_co_mutex_lock(&s->lock);
+                    if (ret < 0) {
+                        goto fail;
+                    }
+                }
+            } else {
+                qemu_iovec_memset(&hd_qiov, 0, 0,
+                                  BDRV_SECTOR_SIZE * cur_nr_sectors);
+            }
+        }
+        remaining_sectors -= cur_nr_sectors;
+        sector_num += cur_nr_sectors;
+        bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE;
+    }
+fail:
+    qemu_co_mutex_unlock(&s->lock);
+    qemu_iovec_destroy(&hd_qiov);
+    return ret;
+}
+
+static int coroutine_fn copy_sectors(BlockDriverState *bs,
+                                     int n_start, int n_end)
+{
+    BDRVAddCowState *s = bs->opaque;
+    QEMUIOVector qiov;
+    struct iovec iov;
+    int n, ret;
+
+    n = n_end - n_start;
+    if (n <= 0) {
+        return 0;
+    }
+
+    iov.iov_len = n * BDRV_SECTOR_SIZE;
+    iov.iov_base = qemu_blockalign(bs, iov.iov_len);
+
+    qemu_iovec_init_external(&qiov, &iov, 1);
+
+    ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov);
+    if (ret < 0) {
+        goto out;
+    }
+    ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = 0;
+out:
+    qemu_vfree(iov.iov_base);
+    return ret;
+}
+
+static int handle_dependencies(BlockDriverState *bs, uint64_t sector_num,
+                               int remaining_sectors)
+{
+    BDRVAddCowState *s = bs->opaque;
+    AddCowMeta *old_alloc;
+
+    QLIST_FOREACH(old_alloc, &s->cluster_allocs, next_in_flight) {
+
+        uint64_t start = sector_num;
+        uint64_t end = sector_num + remaining_sectors;
+        uint64_t old_start = old_alloc->start_sector;
+        uint64_t old_end = old_alloc->start_sector + old_alloc->remaining_sectors;
+        if (end <= old_start || start >= old_end) {
+            /* No intersection */
+        } else {
+            qemu_co_mutex_unlock(&s->lock);
+            qemu_co_queue_wait(&old_alloc->dependent_requests);
+            qemu_co_mutex_lock(&s->lock);
+            return -EAGAIN;
+        }
+    }
+    return 0;
+}
+
+static coroutine_fn int add_cow_co_writev(BlockDriverState *bs,
+                                          int64_t sector_num,
+                                          int remaining_sectors,
+                                          QEMUIOVector *qiov)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int ret = 0, i;
+    QEMUIOVector hd_qiov;
+    uint8_t *table;
+    uint64_t offset;
+    int mask = s->cluster_sectors - 1;
+    int cluster_mask = s->cluster_size - 1;
+    AddCowMeta *meta = NULL;
+
+    if (remaining_sectors == 0) {
+        return ret;
+    }
+
+    qemu_co_mutex_lock(&s->lock);
+
+    meta = g_malloc0(sizeof(*meta));
+again:
+    *meta = (AddCowMeta) {
+        .next           = QLIST_FIRST(&s->cluster_allocs),
+        .start_sector   = sector_num,
+        .remaining_sectors  = remaining_sectors,
+    };
+    if (handle_dependencies(bs, sector_num, remaining_sectors) == -EAGAIN) {
+        goto again;
+    }
+    qemu_co_queue_init(&meta->dependent_requests);
+    QLIST_INSERT_HEAD(&s->cluster_allocs, meta, next_in_flight);
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    ret = bdrv_co_writev(s->image_hd, sector_num, remaining_sectors, qiov);
+
+    if (ret < 0) {
+        goto fail;
+    }
+    if ((s->header.compat_features & ACOW_F_ALL_ALLOCATED) == 0) {
+        /* Copy content of unmodified sectors */
+        if (!is_cluster_head(sector_num, s->cluster_sectors)
+            && !is_allocated(bs, sector_num)) {
+            ret = copy_sectors(bs, sector_num & ~mask, sector_num);
+            if (ret < 0) {
+                goto fail;
+            }
+        }
+
+        if (!is_cluster_tail(sector_num + remaining_sectors - 1,
+                             s->cluster_sectors)
+            && !is_allocated(bs, sector_num + remaining_sectors - 1)) {
+            ret = copy_sectors(bs, sector_num + remaining_sectors,
+                               ((sector_num + remaining_sectors) | mask) + 1);
+            if (ret < 0) {
+                goto fail;
+            }
+        }
+
+        for (i = sector_num / s->cluster_sectors;
+            i <= (sector_num + remaining_sectors - 1) / s->cluster_sectors;
+            i++) {
+            offset = s->header.header_size
+                + (offset_in_bitmap(i * s->cluster_sectors,
+                s->cluster_sectors) & (~cluster_mask));
+            ret = block_cache_get(bs, s->bitmap_cache, offset, (void **)&table);
+            if (ret < 0) {
+                goto fail;
+            }
+
+            block_cache_set_dependency_two_bs(bs, s->bitmap_cache,
+                                              s->image_hd, s->image_hd_cache);
+            if ((table[(i / 8) & (s->cluster_size - 1)] & (1 << (i % 8))) == 0) {
+                table[(i / 8) & (s->cluster_size - 1)] |= (1 << (i % 8));
+                block_cache_entry_mark_dirty(s->bitmap_cache, table);
+            }
+
+            ret = block_cache_put(bs, s->bitmap_cache, (void **) &table);
+            if (ret < 0) {
+                goto fail;
+            }
+        }
+    }
+    ret = 0;
+fail:
+    qemu_co_mutex_unlock(&s->lock);
+    qemu_iovec_destroy(&hd_qiov);
+    QLIST_REMOVE(meta, next_in_flight);
+    g_free(meta);
+    return ret;
+}
+
+static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int sector_per_byte = s->cluster_sectors * 8;
+    int ret;
+    int64_t bitmap_size =
+        (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte;
+    bitmap_size = (bitmap_size + s->cluster_size - 1)
+        & (~(s->cluster_size - 1));
+
+    ret = bdrv_truncate(bs->file, s->header.header_size + bitmap_size);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_truncate(s->image_hd, size);
+    if (ret < 0) {
+        return ret;
+    }
+    return 0;
+}
+
+static int add_cow_reopen_prepare(BDRVReopenState *state,
+                                  BlockReopenQueue *queue, Error **errp)
+{
+    BDRVAddCowState *s;
+    int ret = -1;
+
+    assert(state != NULL);
+    assert(state->bs != NULL);
+
+    if (queue == NULL) {
+        error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+                  "No reopen queue for add-cow");
+        goto exit;
+    }
+
+    s = state->bs->opaque;
+
+    assert(s != NULL);
+
+
+    bdrv_reopen_queue(queue, s->image_hd, state->flags);
+    ret = 0;
+
+exit:
+    return ret;
+}
+
+
+static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
+{
+    BDRVAddCowState *s = bs->opaque;
+    int ret = 0;
+
+    qemu_co_mutex_lock(&s->lock);
+    if (s->image_hd_cache) {
+        ret = block_cache_flush(s->image_hd, s->image_hd_cache);
+        if (ret < 0) {
+            qemu_co_mutex_unlock(&s->lock);
+            return ret;
+        }
+    }
+    if (s->bitmap_cache) {
+        ret = block_cache_flush(bs, s->bitmap_cache);
+    }
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
+}
+
+static int add_cow_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    BDRVAddCowState *s = bs->opaque;
+    bdi->cluster_size = s->cluster_size;
+    return 0;
+}
+
+static QemuOptsList add_cow_create_opts = {
+    .name = "add-cow-create-opts",
+    .head = QTAILQ_HEAD_INITIALIZER(add_cow_create_opts.head),
+    .desc = {
+        {
+            .name = BLOCK_OPT_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "Virtual disk size"
+        },
+        {
+            .name = BLOCK_OPT_BACKING_FILE,
+            .type = QEMU_OPT_STRING,
+            .help = "File name of a base image"
+        },
+        {
+            .name = BLOCK_OPT_BACKING_FMT,
+            .type = QEMU_OPT_STRING,
+            .help = "Image format of the base image"
+        },
+        {
+            .name = BLOCK_OPT_IMAGE_FILE,
+            .type = QEMU_OPT_STRING,
+            .help = "File name of a image file"
+        },
+        {
+            .name = BLOCK_OPT_IMAGE_FMT,
+            .type = QEMU_OPT_STRING,
+            .help = "Image format of the image file"
+        },
+        {
+            .name = BLOCK_OPT_CLUSTER_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "add-cow cluster size",
+            .def_value_str = stringify(ACOW_CLUSTER_SIZE)
+        },
+        { /* end of list */ }
+    }
+};
+
+static BlockDriver bdrv_add_cow = {
+    .format_name                = "add-cow",
+    .instance_size              = sizeof(BDRVAddCowState),
+    .bdrv_probe                 = add_cow_probe,
+    .bdrv_open                  = add_cow_open,
+    .bdrv_close                 = add_cow_close,
+    .bdrv_create                = add_cow_create,
+    .bdrv_co_readv              = add_cow_co_readv,
+    .bdrv_co_writev             = add_cow_co_writev,
+    .bdrv_truncate              = bdrv_add_cow_truncate,
+    .bdrv_co_is_allocated       = add_cow_is_allocated,
+    .bdrv_reopen_prepare        = add_cow_reopen_prepare,
+    .bdrv_get_info              = add_cow_get_info,
+
+    .bdrv_create_opts           = &add_cow_create_opts,
+    .bdrv_co_flush_to_os        = add_cow_co_flush,
+};
+
+static void bdrv_add_cow_init(void)
+{
+    bdrv_register(&bdrv_add_cow);
+}
+
+block_init(bdrv_add_cow_init);
diff --git a/block/block-cache.c b/block/block-cache.c
index 454269c..a05418f 100644
--- a/block/block-cache.c
+++ b/block/block-cache.c
@@ -164,14 +164,15 @@ int block_cache_flush(BlockDriverState *bs, BlockCache *c)
     return result;
 }
 
-int block_cache_set_dependency(BlockDriverState *bs,
-                               BlockCache *c,
-                               BlockCache *dependency)
+int block_cache_set_dependency_two_bs(BlockDriverState *bs,
+                                      BlockCache *c,
+                                      BlockDriverState *depend_bs,
+                                      BlockCache *dependency)
 {
     int ret;
 
     if (dependency->depends) {
-        ret = block_cache_flush_dependency(bs, dependency);
+        ret = block_cache_flush_dependency(depend_bs, dependency);
         if (ret < 0) {
             return ret;
         }
@@ -188,6 +189,13 @@ int block_cache_set_dependency(BlockDriverState *bs,
     return 0;
 }
 
+int block_cache_set_dependency(BlockDriverState *bs,
+                               BlockCache *c,
+                               BlockCache *dependency)
+{
+    return block_cache_set_dependency_two_bs(bs, c, bs, dependency);
+}
+
 void block_cache_depends_on_flush(BlockCache *c)
 {
     c->depends_on_flush = true;
diff --git a/include/block/block-cache.h b/include/block/block-cache.h
index 1992765..d057d9e 100644
--- a/include/block/block-cache.h
+++ b/include/block/block-cache.h
@@ -45,6 +45,10 @@ int block_cache_flush(BlockDriverState *bs, BlockCache *c);
 int block_cache_set_dependency(BlockDriverState *bs,
                                BlockCache *c,
                                BlockCache *dependency);
+int block_cache_set_dependency_two_bs(BlockDriverState *bs,
+                                      BlockCache *c,
+                                      BlockDriverState *depend_bs,
+                                      BlockCache *dependency);
 void block_cache_depends_on_flush(BlockCache *c);
 int block_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
                     void **table);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 02f52ac..b40be77 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -57,6 +57,8 @@
 #define BLOCK_OPT_COMPAT_LEVEL      "compat"
 #define BLOCK_OPT_LAZY_REFCOUNTS    "lazy_refcounts"
 #define BLOCK_OPT_ADAPTER_TYPE      "adapter_type"
+#define BLOCK_OPT_IMAGE_FILE        "image_file"
+#define BLOCK_OPT_IMAGE_FMT         "image_fmt"
 
 typedef struct BdrvTrackedRequest BdrvTrackedRequest;
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH V19 8/8] qemu-iotests: add add-cow iotests support
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (6 preceding siblings ...)
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 7/8] add-cow file format core code Dongxu Wang
@ 2013-05-30 10:00 ` Dongxu Wang
  2013-06-03  1:47 ` [Qemu-devel] [PATCH V19 0/8] add-cow file format Fam Zheng
  2013-07-22  5:10 ` Wenchao Xia
  9 siblings, 0 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-05-30 10:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang, wdongxu, stefanha

From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>

This patch will use qemu-iotests to test add-cow file format.

v17-v18:
1) add 037 and 038 tests.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
---
 tests/qemu-iotests/017       |  2 +-
 tests/qemu-iotests/020       |  2 +-
 tests/qemu-iotests/037       |  2 +-
 tests/qemu-iotests/038       |  2 +-
 tests/qemu-iotests/common    |  6 ++++++
 tests/qemu-iotests/common.rc | 15 ++++++++++++++-
 6 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/017 b/tests/qemu-iotests/017
index 45f2c0b..ae0e627 100755
--- a/tests/qemu-iotests/017
+++ b/tests/qemu-iotests/017
@@ -40,7 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.pattern
 
 # Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 2fb0ff8..3dbb495 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -42,7 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.pattern
 
 # Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/037 b/tests/qemu-iotests/037
index c11460b..683e73d 100755
--- a/tests/qemu-iotests/037
+++ b/tests/qemu-iotests/037
@@ -38,7 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.rc
 . ./common.filter
 
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/038 b/tests/qemu-iotests/038
index 36125ea..10bacfb 100755
--- a/tests/qemu-iotests/038
+++ b/tests/qemu-iotests/038
@@ -38,7 +38,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.rc
 . ./common.filter
 
-_supported_fmt qcow2 qed
+_supported_fmt qcow2 qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 6826ea7..5110fc1 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -128,6 +128,7 @@ common options
 check options
     -raw                test raw (default)
     -cow                test cow
+    -add-cow            test add-cow
     -qcow               test qcow
     -qcow2              test qcow2
     -qed                test qed
@@ -165,6 +166,11 @@ testlist options
 	    xpand=false
 	    ;;
 
+    -add-cow)
+        IMGFMT=add-cow
+        xpand=false
+        ;;
+
 	-qcow)
 	    IMGFMT=qcow
 	    xpand=false
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index e9ba358..b203f04 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -110,6 +110,16 @@ _make_test_img()
     fi
     if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
         optstr=$(_optstr_add "$optstr" "cluster_size=$CLUSTER_SIZE")
+    elif [ "$IMGFMT" = "add-cow" ]; then
+        local IMG="$TEST_IMG"".raw"
+        if [ "$1" = "-b" ]; then
+            IMG="$IMG"".b"
+            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+            extra_img_options="-o image_file=$IMG $extra_img_options"
+        else
+            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+            extra_img_options="-o image_file=$IMG"
+        fi
     fi
 
     if [ -n "$optstr" ]; then
@@ -130,7 +140,8 @@ _make_test_img()
             -e "s# zeroed_grain=\\(on\\|off\\)##g" \
             -e "s# subformat='[^']*'##g" \
             -e "s# adapter_type='[^']*'##g" \
-            -e "s# lazy_refcounts=\\(on\\|off\\)##g"
+            -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
+            -e "s# image_file='[^']*'##g"
 
     # Start an NBD server on the image file, which is what we'll be talking to
     if [ $IMGPROTO = "nbd" ]; then
@@ -152,6 +163,8 @@ _cleanup_test_img()
             rm -f $TEST_DIR/t.$IMGFMT
             rm -f $TEST_DIR/t.$IMGFMT.orig
             rm -f $TEST_DIR/t.$IMGFMT.base
+            rm -f $TEST_DIR/t.$IMGFMT.raw
+            rm -f $TEST_DIR/t.$IMGFMT.raw.b
             ;;
 
         rbd)
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 0/8] add-cow file format
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (7 preceding siblings ...)
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 8/8] qemu-iotests: add add-cow iotests support Dongxu Wang
@ 2013-06-03  1:47 ` Fam Zheng
  2013-07-22  5:10 ` Wenchao Xia
  9 siblings, 0 replies; 17+ messages in thread
From: Fam Zheng @ 2013-06-03  1:47 UTC (permalink / raw)
  To: Dongxu Wang; +Cc: kwolf, wdongxu, qemu-devel, stefanha

On Thu, 05/30 18:00, Dongxu Wang wrote:
> It will introduce a new file format: add-cow.
> 
> The add-cow file format makes it possible to perform copy-on-write on top of
> a raw disk image.  When we know that no backing file clusters remain visible
> (e.g. we have streamed the entire image and copied all data from the backing
> file), then it is possible to discard the add-cow file and use the raw image
> file directly.
> 
> This feature adds the copy-on-write feature to raw files (which cannot support
> it natively) while allowing us to get full performance again later when we no
> longer need copy-on-write.
> 
> add-cow can benefit from other available functions, such as path_has_protocol
> and qed_read_string, so we will make them public.
> 
> snapshot_blkdev are not supported now for add-cow. Will add it in futher patches.
> 
> These patches are using QemuOpts parser, former patches could be found here:
> http://patchwork.ozlabs.org/patch/247508/
> 
> 
> v18->v19:
> 1) support parallel aio write.
> 2) fix flush method.
> 3) other small fix.
> v17 -> v18:
> 1) remove version field.
> 2) header size is maximum value and cluster size value.
> 3) fix type.
> 4) move struct to source file.
> 5) cluster_size->table_size.
> 6) use error_report, not fprintf.
> 7) remove version field from header.
> 8) header_size is MAX(cluster_size, 4096).
> 9) introduce s->cluster_sectors.
> 10) use BLKDBG_L2_LOAD/UPDATE.
> 11) add 037 and 038 tests.
> 
> v16->v17):
> 1) Use stringify.
> 
> v15->v16):
> 1) Rebased on QEMU upstream source tree.
> 2) Judge if opts is null in add_cow_create function.
> 
> v14->v15:
> 1) Fix typo and make some sentences more readable in docs.
> 2) Introduce STRINGIZER macro.
> 
> v13->v14:
> 1) Make some sentences more clear in docs.
> 2) Make MAGIC from 8 bytes to 4 bytes.
> 3) Fix some bugs.
> 
> v12->v13:
> 1) Use QemuOpts, not QEMUOptionParameter
> 2) cluster_size configuable
> 3) Refactor block-cache.c
> 4) Correct qemu-iotests script.
> 5) Other bug fix.
> 
> v11->v12:
> 1) Removed un-used feature bit.
> 2) Share cache code with qcow2.c.
> 3) Remove snapshot_blkdev support, will add it in another patch.
> 5) COW Bitmap field in add-cow file will be multiple of 65536.
> 6) fix grammer and typo.
> 
> Dong Xu Wang (8):
>   V18: docs: document for add-cow file format

Why mention V18 here?

-- 
Fam

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 1/8] V18: docs: document for add-cow file format
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 1/8] V18: docs: document for " Dongxu Wang
@ 2013-06-03  1:48   ` Fam Zheng
  2013-06-03  1:55     ` Dongxu Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Fam Zheng @ 2013-06-03  1:48 UTC (permalink / raw)
  To: Dongxu Wang; +Cc: kwolf, wdongxu, qemu-devel, stefanha

On Thu, 05/30 18:00, Dongxu Wang wrote:
> From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> 
> Document for add-cow format, the usage and spec of add-cow are
> introduced.
> 
> v18-v19:
> 1) backing_fmt and image_fmt NUL-terminated.
> 2) other fix.
> V17->V18:
> 1) remove version field.
> 2) header size is maximum value and cluster size value.
> 3) fix type.
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>

Changed git config? I suggest you could remove the old SOB line (if
respin).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 1/8] V18: docs: document for add-cow file format
  2013-06-03  1:48   ` Fam Zheng
@ 2013-06-03  1:55     ` Dongxu Wang
  0 siblings, 0 replies; 17+ messages in thread
From: Dongxu Wang @ 2013-06-03  1:55 UTC (permalink / raw)
  To: qemu-devel, kwolf, wdongxu, stefanha

On 2013/6/3 9:48, Fam Zheng wrote:
> On Thu, 05/30 18:00, Dongxu Wang wrote:
>> From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>>
>> Document for add-cow format, the usage and spec of add-cow are
>> introduced.
>>
>> v18-v19:
>> 1) backing_fmt and image_fmt NUL-terminated.
>> 2) other fix.
>> V17->V18:
>> 1) remove version field.
>> 2) header size is maximum value and cluster size value.
>> 3) fix type.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
>
> Changed git config? I suggest you could remove the old SOB line (if
> respin).
>
Oh, yes, I changed my gitconfig file, and I did not notice these wrong 
SOB line, please ignore these boring lines. :)
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow Dongxu Wang
@ 2013-06-05 13:35   ` Stefan Hajnoczi
  2013-06-05 13:46   ` Kevin Wolf
  1 sibling, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2013-06-05 13:35 UTC (permalink / raw)
  To: Dongxu Wang; +Cc: kwolf, wdongxu, qemu-devel, stefanha

On Thu, May 30, 2013 at 06:00:23PM +0800, Dongxu Wang wrote:
> diff --git a/block/block-cache.c b/block/block-cache.c
> index f5d75d1..454269c 100644
> --- a/block/block-cache.c
> +++ b/block/block-cache.c
> @@ -125,7 +125,7 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
>      } else if (c->table_type == BLOCK_TABLE_L2) {
>          BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
>      } else if (c->table_type == BLOCK_TABLE_BITMAP) {
> -        BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
> +        BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_WRITE);
>      }
>  
>      ret = bdrv_pwrite(bs->file, c->entries[i].offset,
> @@ -260,7 +260,7 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
>          if (c->table_type == BLOCK_TABLE_L2) {
>              BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
>          } else if (c->table_type == BLOCK_TABLE_BITMAP) {
> -            BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
> +            BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_READ);
>          }
>  
>          ret = bdrv_pread(bs->file, offset, c->entries[i].table,

These patches are in the wrong order.  The previous patch should not use
the incorrect BLKDBG_COW_READ/BLKDBG_COW_WRITE values.

Put the blkdebug.c and block.h changes from this patch before the
previous patch so block-cache.c can use
BLKDBG_ADDCOW_READ/BLKDBG_ADDCOW_WRITE from the start.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow Dongxu Wang
  2013-06-05 13:35   ` Stefan Hajnoczi
@ 2013-06-05 13:46   ` Kevin Wolf
  2013-06-06  7:30     ` Stefan Hajnoczi
  1 sibling, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2013-06-05 13:46 UTC (permalink / raw)
  To: Dongxu Wang; +Cc: wdongxu, qemu-devel, stefanha

Am 30.05.2013 um 12:00 hat Dongxu Wang geschrieben:
> From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> 
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>

One of these should surely be enough?

> ---
>  block/blkdebug.c      | 3 +++
>  block/block-cache.c   | 4 ++--
>  include/block/block.h | 3 +++
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 71f99e4..2bd6a53 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -182,6 +182,9 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
>      [BLKDBG_CLUSTER_ALLOC]                  = "cluster_alloc",
>      [BLKDBG_CLUSTER_ALLOC_BYTES]            = "cluster_alloc_bytes",
>      [BLKDBG_CLUSTER_FREE]                   = "cluster_free",
> +
> +    [BLKDBG_ADDCOW_READ]                    = "add_cow_read",
> +    [BLKDBG_ADDCOW_WRITE]                   = "add_cow_write",
>  };
>  
>  static int get_event_by_name(const char *name, BlkDebugEvent *event)
> diff --git a/block/block-cache.c b/block/block-cache.c
> index f5d75d1..454269c 100644
> --- a/block/block-cache.c
> +++ b/block/block-cache.c
> @@ -125,7 +125,7 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
>      } else if (c->table_type == BLOCK_TABLE_L2) {
>          BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
>      } else if (c->table_type == BLOCK_TABLE_BITMAP) {
> -        BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
> +        BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_WRITE);
>      }
>  
>      ret = bdrv_pwrite(bs->file, c->entries[i].offset,
> @@ -260,7 +260,7 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
>          if (c->table_type == BLOCK_TABLE_L2) {
>              BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
>          } else if (c->table_type == BLOCK_TABLE_BITMAP) {
> -            BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
> +            BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_READ);
>          }

Doesn't this break qcow2 test cases that don't get the
BLKDBG_COW_READ/WRITE events any more now?

I guess you need to extend the cache creation function to also take
function pointers for the blkdebug events that should be generated in
the various places. Maybe it's enough data now that it would be worth
having a BlockCache struct somewhere (e.g. embedded in BDRVQcow2State)
and change the cache creation function to an init function that only
takes a pointer to this struct.

Kevin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 7/8] add-cow file format core code.
  2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 7/8] add-cow file format core code Dongxu Wang
@ 2013-06-05 14:21   ` Stefan Hajnoczi
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2013-06-05 14:21 UTC (permalink / raw)
  To: Dongxu Wang; +Cc: kwolf, wdongxu, qemu-devel, stefanha

On Thu, May 30, 2013 at 06:00:24PM +0800, Dongxu Wang wrote:
> +static coroutine_fn int add_cow_co_writev(BlockDriverState *bs,
> +                                          int64_t sector_num,
> +                                          int remaining_sectors,
> +                                          QEMUIOVector *qiov)
> +{
> +    BDRVAddCowState *s = bs->opaque;
> +    int ret = 0, i;
> +    QEMUIOVector hd_qiov;
> +    uint8_t *table;
> +    uint64_t offset;
> +    int mask = s->cluster_sectors - 1;
> +    int cluster_mask = s->cluster_size - 1;
> +    AddCowMeta *meta = NULL;
> +
> +    if (remaining_sectors == 0) {
> +        return ret;
> +    }
> +
> +    qemu_co_mutex_lock(&s->lock);
> +
> +    meta = g_malloc0(sizeof(*meta));
> +again:
> +    *meta = (AddCowMeta) {
> +        .next           = QLIST_FIRST(&s->cluster_allocs),
> +        .start_sector   = sector_num,
> +        .remaining_sectors  = remaining_sectors,
> +    };
> +    if (handle_dependencies(bs, sector_num, remaining_sectors) == -EAGAIN) {
> +        goto again;
> +    }
> +    qemu_co_queue_init(&meta->dependent_requests);
> +    QLIST_INSERT_HEAD(&s->cluster_allocs, meta, next_in_flight);
> +    qemu_iovec_init(&hd_qiov, qiov->niov);

hd_qiov is unused.

> +    ret = bdrv_co_writev(s->image_hd, sector_num, remaining_sectors, qiov);

s->lock is held so write requests block other requests, this is not
parallel.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow
  2013-06-05 13:46   ` Kevin Wolf
@ 2013-06-06  7:30     ` Stefan Hajnoczi
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Hajnoczi @ 2013-06-06  7:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: wdongxu, Dongxu Wang, qemu-devel

On Wed, Jun 05, 2013 at 03:46:43PM +0200, Kevin Wolf wrote:
> Am 30.05.2013 um 12:00 hat Dongxu Wang geschrieben:
> > From: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> > 
> > Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> > Signed-off-by: Dongxu Wang <wdongxu@linux.vnet.ibm.com>
> 
> One of these should surely be enough?
> 
> > ---
> >  block/blkdebug.c      | 3 +++
> >  block/block-cache.c   | 4 ++--
> >  include/block/block.h | 3 +++
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/blkdebug.c b/block/blkdebug.c
> > index 71f99e4..2bd6a53 100644
> > --- a/block/blkdebug.c
> > +++ b/block/blkdebug.c
> > @@ -182,6 +182,9 @@ static const char *event_names[BLKDBG_EVENT_MAX] = {
> >      [BLKDBG_CLUSTER_ALLOC]                  = "cluster_alloc",
> >      [BLKDBG_CLUSTER_ALLOC_BYTES]            = "cluster_alloc_bytes",
> >      [BLKDBG_CLUSTER_FREE]                   = "cluster_free",
> > +
> > +    [BLKDBG_ADDCOW_READ]                    = "add_cow_read",
> > +    [BLKDBG_ADDCOW_WRITE]                   = "add_cow_write",
> >  };
> >  
> >  static int get_event_by_name(const char *name, BlkDebugEvent *event)
> > diff --git a/block/block-cache.c b/block/block-cache.c
> > index f5d75d1..454269c 100644
> > --- a/block/block-cache.c
> > +++ b/block/block-cache.c
> > @@ -125,7 +125,7 @@ static int block_cache_entry_flush(BlockDriverState *bs, BlockCache *c, int i)
> >      } else if (c->table_type == BLOCK_TABLE_L2) {
> >          BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
> >      } else if (c->table_type == BLOCK_TABLE_BITMAP) {
> > -        BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
> > +        BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_WRITE);
> >      }
> >  
> >      ret = bdrv_pwrite(bs->file, c->entries[i].offset,
> > @@ -260,7 +260,7 @@ static int block_cache_do_get(BlockDriverState *bs, BlockCache *c,
> >          if (c->table_type == BLOCK_TABLE_L2) {
> >              BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
> >          } else if (c->table_type == BLOCK_TABLE_BITMAP) {
> > -            BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
> > +            BLKDBG_EVENT(bs->file, BLKDBG_ADDCOW_READ);
> >          }
> 
> Doesn't this break qcow2 test cases that don't get the
> BLKDBG_COW_READ/WRITE events any more now?

This event is only raised for BLKDBG_TABLE_BITMAP, which is new for
add-cow only.  qcow2 only uses L2 and refcount table types, not bitmap.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH V19 0/8] add-cow file format
  2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
                   ` (8 preceding siblings ...)
  2013-06-03  1:47 ` [Qemu-devel] [PATCH V19 0/8] add-cow file format Fam Zheng
@ 2013-07-22  5:10 ` Wenchao Xia
  9 siblings, 0 replies; 17+ messages in thread
From: Wenchao Xia @ 2013-07-22  5:10 UTC (permalink / raw)
  To: Dongxu Wang; +Cc: kwolf, wdongxu, qemu-devel, stefanha

于 2013-5-30 18:00, Dongxu Wang 写道:
> It will introduce a new file format: add-cow.
> 
> The add-cow file format makes it possible to perform copy-on-write on top of
> a raw disk image.  When we know that no backing file clusters remain visible
> (e.g. we have streamed the entire image and copied all data from the backing
> file), then it is possible to discard the add-cow file and use the raw image
> file directly.
> 
> This feature adds the copy-on-write feature to raw files (which cannot support
> it natively) while allowing us to get full performance again later when we no
> longer need copy-on-write.
> 
> add-cow can benefit from other available functions, such as path_has_protocol
> and qed_read_string, so we will make them public.
> 
> snapshot_blkdev are not supported now for add-cow. Will add it in futher patches.
> 
> These patches are using QemuOpts parser, former patches could be found here:
> http://patchwork.ozlabs.org/patch/247508/
> 
> 
> v18->v19:
> 1) support parallel aio write.
> 2) fix flush method.
> 3) other small fix.
> v17 -> v18:
> 1) remove version field.
> 2) header size is maximum value and cluster size value.
> 3) fix type.
> 4) move struct to source file.
> 5) cluster_size->table_size.
> 6) use error_report, not fprintf.
> 7) remove version field from header.
> 8) header_size is MAX(cluster_size, 4096).
> 9) introduce s->cluster_sectors.
> 10) use BLKDBG_L2_LOAD/UPDATE.
> 11) add 037 and 038 tests.
> 
> v16->v17):
> 1) Use stringify.
> 
> v15->v16):
> 1) Rebased on QEMU upstream source tree.
> 2) Judge if opts is null in add_cow_create function.
> 
> v14->v15:
> 1) Fix typo and make some sentences more readable in docs.
> 2) Introduce STRINGIZER macro.
> 
> v13->v14:
> 1) Make some sentences more clear in docs.
> 2) Make MAGIC from 8 bytes to 4 bytes.
> 3) Fix some bugs.
> 
> v12->v13:
> 1) Use QemuOpts, not QEMUOptionParameter
> 2) cluster_size configuable
> 3) Refactor block-cache.c
> 4) Correct qemu-iotests script.
> 5) Other bug fix.
> 
> v11->v12:
> 1) Removed un-used feature bit.
> 2) Share cache code with qcow2.c.
> 3) Remove snapshot_blkdev support, will add it in another patch.
> 5) COW Bitmap field in add-cow file will be multiple of 65536.
> 6) fix grammer and typo.
> 
> Dong Xu Wang (8):
>    V18: docs: document for add-cow file format
>    make path_has_protocol non static
>    qed_read_string to bdrv_read_string
>    rename qcow2-cache.c to block-cache.c
>    Make block-cache.c be common interface
>    add debug event for add-cow
>    add-cow file format core code.
>    qemu-iotests: add add-cow iotests support
> 
>   block.c                      |  29 +-
>   block/Makefile.objs          |   4 +-
>   block/add-cow.c              | 812 +++++++++++++++++++++++++++++++++++++++++++
>   block/blkdebug.c             |   3 +
>   block/block-cache.c          | 342 ++++++++++++++++++
>   block/qcow2-cache.c          | 323 -----------------
>   block/qcow2-cluster.c        |  52 +--
>   block/qcow2-refcount.c       |  62 ++--
>   block/qcow2.c                |  21 +-
>   block/qcow2.h                |  24 +-
>   block/qed.c                  |  34 +-
>   docs/specs/add-cow.txt       | 172 +++++++++
>   include/block/block-cache.h  |  59 ++++
>   include/block/block.h        |   6 +
>   include/block/block_int.h    |   2 +
>   tests/qemu-iotests/017       |   2 +-
>   tests/qemu-iotests/020       |   2 +-
>   tests/qemu-iotests/037       |   2 +-
>   tests/qemu-iotests/038       |   2 +-
>   tests/qemu-iotests/common    |   6 +
>   tests/qemu-iotests/common.rc |  15 +-
>   trace-events                 |  13 +-
>   22 files changed, 1536 insertions(+), 451 deletions(-)
>   create mode 100644 block/add-cow.c
>   create mode 100644 block/block-cache.c
>   delete mode 100644 block/qcow2-cache.c
>   create mode 100644 docs/specs/add-cow.txt
>   create mode 100644 include/block/block-cache.h
> 
  To add dirty block tracking about block, I think add-cow file format
will help, hope it can progress fast.

-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-07-22  5:11 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-30 10:00 [Qemu-devel] [PATCH V19 0/8] add-cow file format Dongxu Wang
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 1/8] V18: docs: document for " Dongxu Wang
2013-06-03  1:48   ` Fam Zheng
2013-06-03  1:55     ` Dongxu Wang
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 2/8] make path_has_protocol non static Dongxu Wang
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 3/8] qed_read_string to bdrv_read_string Dongxu Wang
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 4/8] rename qcow2-cache.c to block-cache.c Dongxu Wang
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 5/8] Make block-cache.c be common interface Dongxu Wang
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 6/8] add debug event for add-cow Dongxu Wang
2013-06-05 13:35   ` Stefan Hajnoczi
2013-06-05 13:46   ` Kevin Wolf
2013-06-06  7:30     ` Stefan Hajnoczi
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 7/8] add-cow file format core code Dongxu Wang
2013-06-05 14:21   ` Stefan Hajnoczi
2013-05-30 10:00 ` [Qemu-devel] [PATCH V19 8/8] qemu-iotests: add add-cow iotests support Dongxu Wang
2013-06-03  1:47 ` [Qemu-devel] [PATCH V19 0/8] add-cow file format Fam Zheng
2013-07-22  5:10 ` Wenchao Xia

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.